BlackWasp
Algorithms
.NET 2.0+

A Generic Searchable Range Collection

A common programming task is to match a value to a group of ranges in order to find a value associated with that range. The .NET framework does not provide a collection class to support this functionality so a new generic collection must be created.

Tabular Lookup

There are many programming situations where a one value is linked to another and, given the first value, you need to determine the second. In some cases the link is a simple calculation within the code. For example, the relationship between a value and its square can be determined with a simple multiplication.

In more complicated situations, or where the link between two values is arbitrary, a tabular lookup may be used. Here you may decide to store the information in a dictionary collection such as a SortedList or Hashtable. The collection's key element can hold the lookup value and the value element can contain the related item. This is ideal for relationships where calculations would be inefficient or impossible such as those in the table of customer codes below:

Customer Code (Key)Customer Name (Value)
ABC001ABC Technology Limited
ABC002ABC Construction Plc
JOH001John Smith
......

As the table uses discrete values for the customer codes, finding a related name is easy. When stored using a Hashtable, the items can be added as key / value pairs and found using the unique key as an indexer.

Hashtable customers = new Hashtable();

customers.Add("ABC001", "ABC Technology Limited");
customers.Add("ABC002", "ABC Construction Plc");
customers.Add("JOH001", "John Smith");

Console.WriteLine(customers["ABC002"]);     // Outputs "ABC Construction Plc"

The tabular lookup method becomes problematic when the key values are divided into ranges. As an example, the following table shows the available rankings for the 1980's computer game "Elite" and the number of enemies that must be destroyed to achieve each rating.

Kills (Key)Rating (Value)
0 - 3Harmless
4 - 7Mostly Harmless
8 - 15Poor
16 - 31Below Average
32 - 63Average
64 - 127Above Average
128 - 999Competent
1,000 - 2,999Dangerous
3,000 - 5,999Deadly
6,000 +Elite

This information could be held in a simple dictionary but to permit a lookup using the standard indexer, each item would need to be repeated multiple times. As an example, the "Harmless" option would require four entries so that a lookup could be performed for zero, one, two or three battles won. To support all of the possible searches the collection would include 6,001 items and a rule would be added to the code to check for the special case of over six thousands kills.

The problem of duplicating entries becomes even more serious when the lookup values are not integers. If strings or floating-point values are included in the key column of the table, the table size is limited only by the length of string or accuracy of the fractional values. In either case, a simple lookup is unusable except for very small tables.

The resolution to this problem is to have a collection class that represents value ranges. Unfortunately no such class exists in the .NET framework. The remainder of this article will describe the process and the code required for creating one.

RangeCollection Requirements

The RangeCollection class will hold ranges of keys with each range linked to a value. The new dictionary will meet the following requirements:

  • The collection's keys will be of any type that implements IEnumerable. The use of this interface will permit the range keys to be compared with a search value to determine which of two is the larger.
  • The collection's related values will be of any type.
  • Each key in the collection will indicate the lower boundary of a range. The upper boundary will be determined by the next highest key in the collection or, if there is no higher key, the largest value that can be held in the key's data type.
  • The class will include a method that performs a binary search on the ranges to quickly find the range in which a specific value falls, even when there are many thousands, or even millions, of ranges in the collection.

Creating the RangeCollection Project

For simplicity in this article, the RangeCollection class will be created in a console application project. You may wish to create the collection in a class library instead for use within your own projects. To begin, create a new console application named "RangeCollectionDemo". Add a new class file to the project named "RangeCollection".

Class Definition

The RangeCollection will hold two values for each item in the collection. The key value will hold the start point for a range and the value element will contain the related value. The data type of each item will be variable according to a specific instance's use. This means that the class will be very similar to the generic SortedList collection. In fact, the only difference between RangeCollection and SortedList is the additional binary search method. For this reason, the RangeCollection class will inherit its base functionality from SortedList.

The generic SortedList class is in the System.Collections.Generic namespace. To keep the code neat, ensure that the following using directive appears at the top of the RangeCollection class file.

using System.Collections.Generic;

With the using directive in place, the class can be declared. A declaration will automatically have been added to the code but this must be modified to indicate that the new class is a subclass of SortedList. Change the declaration as follows:

public class RangeCollection<key, value>
    : SortedList<key, value> where key : IComparable
{
}

There are several items of note in the updated declaration. The class is derived from the generic SortedList class. The "key" and "value" items specify the names of the two generic types that will be used for the key and value elements in each collection entry. As we will be performing comparison operations on the key's data type, the key must implement the IComparable interface. This rule is enforced by the where element.

Constructors

The generic SortedList class provides six constructors. These constructors allow the initial capacity to be set, permit selection of the IComparer that will be used to sort the list and initialise the collection using the contents of another dictionary.

The RangeCollection currently includes only the default constructor, as constructors are not automatically inherited from the parent class. Three constructors will be defined for the new class. These will be the default constructor, a constructor that permits the capacity to be set and a constructor that provides an initialising dictionary. To declare the three constructors and have each simply call the matching underlying SortedList function, add the following code to the class:

public RangeCollection() : base() { }
public RangeCollection(int capacity) : base(capacity) { }
public RangeCollection(IDictionary<key, value> dictionary) : base(dictionary) { }

FindValueInRange Method

To complete the RangeCollection class the search functionality needs to be added. This will be provided by a new method named "FindValueInRange" that performs a binary search of the dictionary keys and returns the corresponding value. As it will be possible to pass a value that does not appear within any range, a default value will be supplied as a second parameter to the method. Where the search value is not found the default value will be returned instead.

The search value must be of the same generic type as the keys in the collection and the default value must match the type of the related values. Therefore, the declaration for the new method to be added to the class is as follows:

public value FindValueInRange(key lookup, value defaultValue)
{
}

Initial Checks

The first task in the search method is to test for unusual conditions. An if statement checks that the RangeCollection has been populated with at least one range. A second condition ensures that the value to search for is greater than or equal to the lowest range key. If either condition is not met, the default value will be returned immediately.

To add the conditions, add the following code to the FindValueInRange method:

if (this.Count == 0) return defaultValue;
if (lookup.CompareTo(this.Keys[0]) < 0) return defaultValue;

NB: The lowest range will be the item at index zero as the collection is sorted into ascending order by the underlying SortedList class.

Preparing the Binary Search

The search will be performed within a while loop that continues until a matching range is found. The matching range will be the range that starts at a value equal to or less than the search value and is either the last range in the collection or is directly before another range that has a key higher than the search value.

To maximise the performance of the process for large collections, a binary search algorithm will be used. Initially the boundaries of the search will be the first and last items. Each time a check is made, at least half of the range will be discarded before the loop is restarted.

To initialise the search range and declare the loop, add the following to the method:

int searchStart = 0;
int searchEnd = this.Count - 1;

while (searchStart <= searchEnd)
{
}

Get the Range to Check

During each iteration of the loop, the range item half way between the two search boundaries will be checked. As the search boundaries will change each time, the middle item is calculated by adding the following line within the loop.

int checkRange = searchStart + (searchEnd - searchStart) / 2;

Comparing the Check Range

Once the range to check has been identified, several conditions are checked. Firstly, if the value being searched for is greater than or equal to the check range's key there is a possibility of a match. If this is not the case, there is no match and the end of the search range can be lowered before the loop restarts.

The second condition tests if the value being searched for is the same as the key of the range to check. If so, a match has been found and the related value for the item can be returned. If not, a match is still possible if the range being checked is that last item in the collection or if the next range has a key larger than the search value. If this check gives no match, the lower boundary of the search can be raised before the loop restarts.

To include these conditions, add the following code to the loop:

// Check the lookup value against the check range.
if (lookup.CompareTo(this.Keys[checkRange]) >= 0)
{
    if (checkRange == searchEnd)
    {
        return this.Values[checkRange];
    }
    else
    {
        // Is the next range above the check value, meaning
        // that the check range is the correct range.
        if (lookup.CompareTo(this.Keys[checkRange + 1]) < 0)
        {
            return this.Values[checkRange];
        }
        else
        {
            searchStart = checkRange + 1;
        }
    }
}
else
{
    // Range to find is lower than the check range
    // so move the end of the search.
    searchEnd = checkRange - 1;
}

Checking for Unexpected Errors

The use of ranges with no maximum value and the initial testing for a value lower than the smallest range means that there should be no situation where the loop exits because the search boundaries overlap. However, in case this does happen for an unexpected reason, the following line can be added after the end brace (}) of the loop to throw an exception.

throw new InvalidOperationException("Search unexpectedly exhausted.");

Testing the RangeCollection

The RangeCollection class can now be tested. For an example test, change the Main method in the Program class as follows and execute the program:

static void Main(string[] args)
{
    RangeCollection<int, string> ranges = new RangeCollection<int, string>();

    ranges.Add(0, "Harmless");
    ranges.Add(4, "Mostly Harmless");
    ranges.Add(8, "Poor");
    ranges.Add(16, "Below Average");
    ranges.Add(32, "Average");
    ranges.Add(64, "Above Average");
    ranges.Add(128, "Competent");
    ranges.Add(1000, "Dangerous");
    ranges.Add(3000, "Deadly");
    ranges.Add(6000, "Elite");

    for (int i = -1; i < 10000; i = (i + 1) * 2)
    {
        string rating = ranges.FindValueInRange(i, "Unknown");
        Console.WriteLine("{0} kills = {1}", i, rating);
    }
}

/* OUTPUT

-1 kills = Unknown
0 kills = Harmless
2 kills = Harmless
6 kills = Mostly Harmless
14 kills = Poor
30 kills = Below Average
62 kills = Average
126 kills = Above Average
254 kills = Competent
510 kills = Competent
1022 kills = Dangerous
2046 kills = Dangerous
4094 kills = Deadly
8190 kills = Elite

*/
Link to this Page8 March 2008
TwitterTwitter RSS Feed RSS