BlackWaspTM

This web site uses cookies. By using the site you accept the cookie policy.This message is for compliance with the UK ICO law.

Algorithms and Data Structures
.NET 1.1+

Binary Search Algorithm

The binary search algorithm provides fast searching within sorted arrays or collections. This iterative approach to searching disregards approximately half of the values in the collection with each pass to give O(log n) scalability.

Binary Search

When you need to find the position of an item within a collection that is sorted, the binary search algorithm provides a suitable and fast approach. This algorithm is included in the generic List class, which was introduced in the .NET framework version 2.0. Although this means that you will not normally be required to recreate it, it is interesting to understand how this algorithm works.

The binary search algorithm is iterative. During each iteration you pick the central item in the array, according to its index. If the value found is the one that you are looking for, the algorithm is complete. Assuming that the array is sorted with the lowest values first, if the value at the midpoint is lower than the item being searched for, the midpoint and all items before it are discarded, halving the size of the remaining elements. If the midpoint value is higher than the search item, all of the items from that index to the end of the array are discarded. These items can be disregarded because the collection is sorted. If it is not, the binary search approach does not work.

With the array halved in size, the process is repeated. Each iteration removes half of the possible values or identifies the item that you wish to find. This continues until the item is found or until all items have been excluded from the search, at which point you know that the search item was not present.

In reality, it is usually inefficient to actually remove elements from the array or collection. It is much more likely that you would use pointers to identify the first and last index of the search. These are initialised to point at the first and last elements in the sequence and one of the pointers is moved during each iteration to halve the effective length of the collection. In either case, this gives the algorithm O(log n) complexity.

Let's walk through an example using the values shown below, searching for the number thirty-six.

10, 15, 18, 25, 29, 31, 36, 41, 44, 48, 49, 50, 52, 54, 58, 61, 65, 70, 95, 99

In the first iteration we would start by identifying the midpoint. As there is an even number of items in the collection, we can pick either the tenth or eleventh item. The decision is arbitrary so let's use the leftmost of the midpoint values, which is forty-eight.

10, 15, 18, 25, 29, 31, 36, 41, 44, [48], 49, 50, 52, 54, 58, 61, 65, 70, 95, 99

Forty-eight is larger than the number for which we are searching. This means that all of the items with higher indexes must also be larger than thirty-six, so we can disregard the higher half of the collection. We can then look for the midpoint of the smaller set of values, which is twenty-nine.

10, 15, 18, 25, [29], 31, 36, 41, 44

Twenty-nine is lower than the number that we wish to find so we can disregard it and all values beneath it. This leaves just four items to search. When we locate the midpoint of these remaining elements we find that it is the thirty-six that we set out to find. This means that we managed to search twenty elements in just three iterations.

31, [36], 41, 44

Implementation

Let's implement the binary search algorithm using C#. You can follow the example code using a Console Application project, or download the code using the link at the start of this article. We'll create a method to perform a binary search over a sorted array of integers. We'll assume that the array is sorted in ascending order. The return value for the method will be the index of the search item, or -1 if the item could not be found.

To begin, add the following method signature within the class that was automatically generated when you created the project. The method has two parameters. The first receives the array to be searched and the second the number to find.

public static int BinarySearch(int[] items, int find)
{
}

Rather than trying to reduce the size of the array during each iteration we'll use pointers for the start and end of the "chunk" of the array that could still hold the search item. At the start of the process this is the entire array. The start pointer will therefore be pointing at index zero and the end pointer will be one less than the length of the array.

To set up the pointers and call the iterative part of the algorithm, add the following code within the BinarySearch method:

int start = 0;
int end = items.Length - 1;

return RecursiveFind(items, find, start, end);

Recursive Find Method

The recursive part of the algorithm has four steps:

  1. Check that the start and end pointers are still valid, indicating a length of array that is greater than zero. If the item we are looking for does not exist in the array, after the final iteration the start and end pointers will overlap, giving a chunk of array with no items. If this happens, we'll return -1 to indicate that the item was not present.
  2. Find the index of the midpoint of the remaining items. This is calculated easily using the start pointer and the length of the remaining chunk of items.
  3. Check if the item at the midpoint is the value that we wish to find. If it is, we'll return the midpoint index and the algorithm will end.
  4. Check if the item at the midpoint is larger than the search term. If it is, we'll move the end pointer to one item before the midpoint and start the next iteration from step two above. If not, the midpoint value must be less than the search item so we'll move the start pointer to the item immediately following the midpoint and return to step two.

You can see these steps in the method code below:

private static int RecursiveFind(int[] items, int find, int start, int end)
{
    int chunkSize = 1 + (end - start);

    if (chunkSize == 0)
        return -1;

    int midpoint = start + (chunkSize / 2);

    if (items[midpoint] == find)
        return midpoint;
    else if (items[midpoint] > find)
        return RecursiveFind(items, find, start, midpoint - 1);
    else
        return RecursiveFind(items, find, midpoint + 1, end);
}
31 July 2013