Friday 23 November 2012

Data Structures using „C‟ - Searching Methods

Data Structures using „C‟ Unit 8

Unit 8 Searching Methods
Structure:
8.0 Introduction
8.1 Basics Searching Techniques
Self Assessment Questions
8.2 Algorithmic Notation
8.2.1 Sequential Search [Linear search]
8.2.2 Binary Search
8.3 Illustration of C programmes
8.4 Summary
8.5 Terminal questions
8.0 Introduction
Information retrieval is one of the most important applications of computers. It usually involves giving a piece of information called the key, and ask to find a record that contains other associated information. This is achieved by first going through the list to find if the given key exists or not, a process called searching. Computer systems are often used to store large amounts of data from which individual records must be retrieved according to some search criterion. The process of searching for an item in a data structure can be quit straightforward or very complex. Searching can be done on internal data structures or on external data structures. Information retrieval in the required format is the central activity in all computer applications. This involves searching. This block deals with searching techniques.
Searching methods are designed to take advantage of the file organisation and optimize the search for a particular record or to establish its absence.
Data Structures using „C‟ Unit 8
Sikkim Manipal University Page No.: 227
The file organisation and searching method chosen can make a substantial difference to an application's performance.
Objectives
At the end of this unit, you will be able to understand the:
 Brief discussion on Basics Searching Techniques.
 Algorithmic Notation such as The average time, The worst-case time and, The best possible time.
 Introduction of Sequential Search [Linear search]
 Introduction of Binary Search
8.1 Basics Searching Techniques
Consider a list of n elements or can represent a file of n records, where each element is a key / number. The task is to find a particular key in the list in the shortest possible time.
If you know you are going to search for an item in a set, you will need to think carefully about what type of data structure you will use for that set. At low level, the only searches that get mentioned are for sorted and unsorted arrays. However, these are not the only data types that are useful for searching.
 Linear search: Start at the beginning of the list and check every element of the list. Very slow [order O(n) ] but works on an unsorted list.
 Binary Search : This is used for searching in a sorted array. Test the middle element of the array. If it is too big. Repeat the process in the left half of the array, and the right half if it‟s too small. In this way, the amount of space that needs to be searched is halved every time, so the time is O(log n)
Data Structures using „C‟ Unit 8
Sikkim Manipal University Page No.: 228
 Hash Search : Searching a hash table is easy and extremely fast, just find the hash value for the item you‟re looking for then go to that index and start searching the array until you find what you are looking for or you hit a blank spot. The order is pretty close to o(1), depending on how full your hash table is.
 Binary Tree search: Search a binary tree is just as easy as searching a hash table, but it is usually slower (especially if the tree is badly unbalanced). Just start at the root. Then go down the left subtree if the root is too big and the right subtree if is too small. Repeat until you find what you want or the subtree you want isn‟t there. The running time is O(log n) on average and O(n) in the worst case.
Self Assessment Questions
1. Discuss the various types of Searching techniques.
8.2 Algorithmic Notation
let‟s examine how long it will take to find an item matching a key in the collections. We are interested in:
1. The average time
2. The worst-case time and
3. The best possible time.
However, we will generally be most concerned with the worst-case time as calculations based on worst-case time can lead to guaranteed performance predictions. Conveniently, the worst-case time are generally easier to calculate than average time.
If there are n items in our collection whether it is stored as an array or as linked list-then it is obvious that in the worst case, when there is no item in
Data Structures using „C‟ Unit 8
Sikkim Manipal University Page No.: 229
the collection with the desired key, then n comparisons of the key with keys of the items in the collection will have to be made.
To simplify analysis and comparison of algorithms, we look for a dominated operation and count the number of times that dominant operation has to be performed. In the case of searching, the dominant operation is the comparison, since the search requires n comparisons in the worst case, we say this is O(n)
(pronounce this “big-Oh-n” or “Oh-n”) algorithm.
The best case-in which the first comparison returns a match-requires a single comparison and is O(1).
The average time depends on the probability that the key will be found in the collection-this is something that we would not expected to know in the majority of cases. Thus in this case, as in most others, estimation of the average time is of little utility. If the performance of the system is vital, i.e. it‟s part of a life-critical system, then we must use the worst case in our design calculations as it represents the best guaranteed performance.
We will now discuss two searching methods and analyze their performance. These two methods are:
 The sequential search
 The binary search
8.2.1 Sequential Search [Linear search]
This is the most natural searching method. Simply put it means to go through a list or a file till the required record is found. It makes no demands on the ordering of records. The algorithm for a sequential search procedure is now presented.
Data Structures using „C‟ Unit 8
Sikkim Manipal University Page No.: 230
Algorithm: Sequential Search
This represents the algorithm to search a list of values of to find the required one.
INPUT: List of size N. Target value T
OUTPUT: Position of T in the list –I
BEGIN
1. Set FOUND to false
Set I to 0
2. While (I<=N) and (FOUND is false)
If List [I] = T
FOUND = true
Else
I=I+1
END
3. If FOUND is false
T is not present in List.
END
This algorithm can easily be extended for searching for a record with a matching key value.
Analysis of Sequential Search
Whether the sequential search is carried out on lists implemented as arrays or linked lists or on files, the criterial part in performance is the comparison loop step 2. Obviously the fewer the number of comparisons, the sooner the algorithm will terminate.
The fewest possible comparisons = 1. When the required item is the first item in the list. The maximum comparisons = N when the required item is the last item in the list. Thus if the required item is in position I in the list, I comparisons are required.
Data Structures using „C‟ Unit 8
Sikkim Manipal University Page No.: 231
Hence the average number of comparisons done by sequential search is
1+2+3.....+N
N
= -N (N+1)
2*N
= (N + 1)/2
Sequential search is easy to write and efficient for short lists. It does not require sorted data. However it is disastrous for long lists. There is no way of quickly establishing that the required item is not in the list or of finding all occurrences of a required item at one place.
We can overcome these deficiencies with the next searching method namely the Binary search.
Example : Program to search for an item using linear search.
#include<stdio.h>
/* Search for key in the table */
int seq_search(int key, int a[], int n)
{
Int I;
for(i=0;i<n;i++)
{
If(a[i]==key) return i+1
}
return 0;
}
void main()
{
int I,n,key,pos,a[20];
printf(“Enter the value of n\n”);
Data Structures using „C‟ Unit 8
Sikkim Manipal University Page No.: 232
scanf(“%d”,&n);
printf(“Enter n values\n”;
for(i=0;i<n;i++)
scanf(%d”,&a[i]);
printf(“Enter the item to be searched\n”);
scanf(“%d”, &key);
pos= seq_search(key,n,a);
if(pos==0)
printf(“Search unscccessful \n”);
else
printf(“key found at position = %d \n”,pos);
}
8.2.2 Binary Search
The drawbacks of sequential search can be eliminated if it becomes possible to eliminate large portions of the list from consideration in subsequent iterations. The binary search method just that, it halves the size of the list to search in each iterations.
Binary search can be explained simply by the analogy of searching for a page in a book. Suppose you were searching for page 90 in book of 150 pages. You would first open it at random towards the later half of the book. If the page is less than 90, you would open at a page to the right, it is greater than 90 you would open at a page to the left, repeating the process till page 90 was found. As you can see, by the first instinctive search, you dramatically reduced the number of pages to search.
Binary search requires sorted data to operate on since the data may not be contiguous like the pages of a book. We cannot guess which quarter of the data the required item may be in. So we divide the list in the centre each time.
Data Structures using „C‟ Unit 8
Sikkim Manipal University Page No.: 233
We will first illustrate binary search with an example before going on to formulate the algorithm and analysing it.
Example: Use the binary search method to find 'Scorpio' in the following list of 11 zodiac signs.
Aries 1
Comparison 1 (Leo Scorpio)
Aquarius 2
Cancer 3
Comparison 2
(Sagittarius Scorpio)
Capricorn 4
Comparison 3
( =scorpio)
Gemini 5
Leo 6
Libra 7
Pisces 8
Sagittarius 9
Scorpio 10
Taurus 11
This is a sorted list of size 11. The first comparison is with the middle element number 6 i.e. Leo. This eliminates the first 5 elements. The second comparison is with the middle element from 7 to 11, i.e. 9 Sagittarius. This eliminates 7 to 9. The third comparison is with the middle element from 9 to 11, i.e. 10 Scorpio. Thus we have found the target in 3 comparisons. Sequential search would be taken 10 comparisons. We will now formulate the algorithm for binary search.
Algorithm Binary Search
This represents the binary search method to find a required item in a list sorted in increasing order .
INPUT: Sorted LIST of size N, Target Value T
OUTPUT: Position of T in the LIST = I
BEGIN
Data Structures using „C‟ Unit 8
Sikkim Manipal University Page No.: 234
1. MAX = N
MIN = 1
FOUND = false
2. WHILE (FOUND is false) and (MAX > = MIN)
2.1 MID = (MAX + MIN)DIV 2
2.2 If T = LIST [MID]
I=MID
FOUND = true
Else If T < LIST[MID]
MAX = MID-1
Else
MIN = MD+1
END
It is recommended that the student apply this algorithm to some examples.
Analysis of Binary Search :
In general, the binary search method needs no; more than [Iog2n] + 1 comparisons. This implies that for an array of a million entries, only about twenty comparisons will be needed. Contrast this with the case of sequential search which on the average will need comparisons.
The conditions (MAX = MIN) is necessary to ensure that step 2 terminates even in the case that the required element is not present. Consider the example of Zodiac signs. Suppose the l0th item was Solar (an imaginary Zodiac sign). Then at that point we would have
MID = 10
MAX =11
MIN = 9
And from 2.2 get
MAX = MID-l = 9
(n+1)
2
Data Structures using „C‟ Unit 8
Sikkim Manipal University Page No.: 235
In the next iteration we get
(2.1) MID = (9 + 9) DIV 2 = 9
(2.2) MAX= 9-1 = 8.
Since MAX <MIN, the loop terminates. Since FOUND is false, we consider
the target was not found.
In the binary search method just described above, it is always the key in the
middle of the list currently being examined that is used for comparison. The
splitting of the list can be illustrated through a binary decision tree in which
the value of a node is the index of the key being tested. Suppose there are
31 records, then the first key compared is at location 16 of the list since (1 +
31)/2 = 16. If the key is less than the key at location 16 the location 8 is
tested since (1 + 15)/2 = 8; or if key is less than the key at location 16, then
the location 24 is tested. The binary tree describing this process is shown
below Figure.
8.3 Illustrations of C Programmes
Example : Program to search for an item using Binary Search.
[ interpolation search ]
#include<stdio.h>
int search(item, a,low, high)
Data Structures using „C‟ Unit 8
Sikkim Manipal University Page No.: 236
int item; /* Element to search */
int a[]; /* Element to be searched */
int low; /* Points to the first element */
int high; /* Point tot the last element */
{
int mid; /* Point to the middle element of the table */
if(low>high) /* No item found */
return -1;
mid= low+(high-low) * ((item-a[low])/(a[high]-a[low]));
return(item==a[mid]?mid+1: /* return the middle element */
item<a[mid]?
search(item,a,low,mid-1): /* search left part */
search(item,a,mid+1,high)) /* search right part */
}
void main()
{
int n, i,a[20],item,pos;
printf(“enter the number of elements \n”);
scanf(“%d”,&n);
printf(“Enter %d items \n”,n);
for(i=0;i<n;i++)
{
scanf(“%d”,&a[i]);
}
printf(“Enter the item to be searched \n”);
scanf(“%d”,&item);
pos=search(item,a,0,n-1); /* 0- low index and n-1 is the high index */
if(pos== -1)
printf(“Item not found \n”);
Data Structures using „C‟ Unit 8
Sikkim Manipal University Page No.: 237
else
printf(“Item found at %d position \n”,pos);
8.4 Summary
In this book discussed about the Basics Searching Techniques, Algorithmic Notation and mainly two type of searching techniques that are Sequential Search [Linear search], Binary Search Searching methods are designed to take advantage of the file organisation and optimize the search for a particular record or to establish its absence. The file organisation and searching method chosen can make a substantial difference to an application's performance.
8.5 Terminal Questions
1. Explain the types of Basic Searching Technique.
2. Explain the Sequential (Linear Search) Search with appropriate algorithm.
3. Write a „C‟ program to search for an item using linear search.
4. Write an algorithm with analysis steps for Binary Search.
5. Write a „C‟ program to search for an item using Binary Search.

No comments:

Post a Comment