Anyone searching for a name in a paper phone book will open the phone book towards the end when searching for Valez and towards the beginning when searching for Brown. Having records in sorted order helps one find things faster. If the phone book has the names in arbitrary order, one would have to scan all the entries from beginning to end until the name is found or the end of the list is reached.
Let’s switch from names to integers as it is easier to see which one of two entries is smaller when we are dealing with integers. If the list is in unsorted order, a systematic solution is to look through the list, one record at a time, starting at one of the two ends. The process terminates if the entry is found or the end of the list is reached (in which case the entry is not in the list).
Assume we have the sorted list consisting of the following 15 numbers:
12, 15, 44, 47, 88, 90, 92, 140, 150, 200, 300, 303, 550, 966, 980
We want to know if a given number, say 90, is a number in the list. In the unsorted list, we may have to look at every entry. Can one do better in a sorted list? Yes!
Look at the entry in middle of the list – 140
12, 15, 44, 47, 88, 90, 92, 140, 150, 200, 300, 303, 550, 966, 980
Since we are looking for 90 and 90 < 140 holds, we don’t have to look at any of the entries in the right half of the list. We only need to look at the names to the left of 140 (the underlined entries).
We now repeat the process on a smaller list (from the beginning to the entry just before 140). This is the principle underlying binary search.
The process terminates when we find 90 or we have nothing left to search in. At every step we discard half of the elements currently under consideration. If we start with n entries, the next phase works with n/2 entries, then n/4, then n/8, etc until only one entry is left or we find what we are looking for. At this point we stop. Binary search is a very fast way to search in a sorted sequence.
Program: Algorithms_4_binarysearch.py
The program contains a non-recursive version of binary search. It uses a break-statement to exit the loop after the element has been found.
Discussion Topics
- The idea is simple, but anyone writing the actual code will realize that one has to be careful with the needed bookkeeping (which becomes clear when working out a larger example). When we have a list of 15 numbers, the “middle” is well defined (7 numbers to the left and 7 number to the right). If we had 16 numbers, the middle would be either the 7th or 8th number in the list.
- There exists lots of literature on binary search. The Wikipedia entry http://en.wikipedia.org/wiki/Binary_search_algorithm is complete. Expect that both the recursive and non-recursive solutions (which uses a while loop) are likely to appear.
- Depending on the background of students, discussing recursion in connection with binary search makes sense.
- How many elements in the list do we look at before the search terminates? If n = 2k, at most we look at k elements (k = log n, where the log is base 2).
Logarithms of base 2 play a role in the design and analysis of algorithms as well as in the representation of numbers in binary notation. - Compare linear search and binary search. See Example: Linear search and Video: Linear search in Python