BlackWasp™
LINQ
.NET 3.5+

A LINQ Style Mode Operator

Language Integrated Query (LINQ) includes the Average operator that can be used to calculate the mean value of a sequence. This article implements a LINQ operator that determines the mode, which is the most common value or group of values.

Mode

In previous articles we have seen the Average standard query operator, which calculates the mean of a sequence of values by summing them and dividing by the number of elements present. We have also implemented a Median operator, which places the sequence in order and selects the middle value, or calculates the mean of the two middle values in a sequence with an even number of elements.

Another type of average is the mode. The mode of a sequence is the value that appears most frequently. Unlike the mean and median, a sequence can have more than one mode: when two or more values share the highest number of occurrences, each of them is a mode. For example, the sequence 1, 2, 2, 3, 4, 5, 5 is bimodal. The values 2 and 5, which both occur twice, are the set's modes.
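To make the definition concrete, the following is a minimal sketch that finds the modes of a sequence of integers using the standard GroupBy operator. The class name BimodalExample and the method name ModesOf are hypothetical, chosen for illustration only, and this is not the operator that the article goes on to build.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not the article's Mode operator.
public static class BimodalExample
{
    // Returns every value whose frequency equals the highest frequency found.
    public static List<int> ModesOf(IEnumerable<int> values)
    {
        // Group equal values together and count the members of each group.
        var counts = values
            .GroupBy(v => v)
            .Select(g => new { Value = g.Key, Count = g.Count() })
            .ToList();

        // An empty sequence has no mode.
        if (counts.Count == 0) return new List<int>();

        // The highest frequency decides which values qualify as modes.
        int highest = counts.Max(c => c.Count);

        return counts
            .Where(c => c.Count == highest)
            .Select(c => c.Value)
            .ToList();
    }
}
```

Calling BimodalExample.ModesOf(new[] { 1, 2, 2, 3, 4, 5, 5 }) returns a list containing 2 and 5, matching the bimodal example above.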

In this article we will create a method with a syntax similar to the existing Language-Integrated Query (LINQ) standard query operators. It will calculate the mode, or modes, for a sequence and return them as a new sequence.

Creating the Class

To begin, we need to create a static class to contain the extension method. Create a new project and add a class named "ModeExtensions". Modify the class's definition as follows:

public static class ModeExtensions
{
}

Creating the Method

The custom Mode operator will have several overloaded versions. Together they allow it to work with a sequence of any type, optionally using a custom equality comparer and a projection function that obtains the values from which to generate the mode. We'll implement the most complex overload, which accepts both, first. Later we'll add simpler versions, each calling the original.

Let's start by creating the signature for the Mode method:

public static IEnumerable<R> Mode<T, R>(
    this IEnumerable<T> source, Func<T, R> selector, IEqualityComparer<R> comparer)
{
}

This may look complicated at first glance but it's actually quite simple. We start by declaring that the method returns an IEnumerable<R>, a sequence containing the mode values. The type parameter 'R' is the result type, which, after projection, may differ from that of the input sequence.

The first parameter, source, is the input sequence. This is declared as IEnumerable<T>, where 'T' is the type of the elements in the sequence. The selector function will be applied to each item from the source sequence before the modes are calculated. When the two type parameters of the generic method differ, it is this function that converts each item from 'T' to 'R'. Finally, the comparer parameter allows you to provide an alternative to the default equality comparer. For example, if you are calculating the mode for a series of strings, you may decide to use a case-insensitive comparer. Later we will create overloads that do not include a projection function or a comparer parameter.
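To show how the three parameters could work together, here is a rough sketch of one possible body for a method with this signature. It is an assumption for illustration, not necessarily the implementation that the article develops. The method is named ModeSketch and placed in a hypothetical class, ModeSketchExtensions, to keep it distinct from the real operator. It relies on the documented behaviour of GroupBy, which falls back to the default equality comparer when given a null comparer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical class and method names; one possible body, assumed for illustration.
public static class ModeSketchExtensions
{
    public static IEnumerable<R> ModeSketch<T, R>(
        this IEnumerable<T> source, Func<T, R> selector, IEqualityComparer<R> comparer)
    {
        // Project each T to an R, then group equal R values using the comparer.
        // GroupBy uses the default equality comparer when comparer is null.
        var groups = source
            .Select(selector)
            .GroupBy(v => v, comparer)
            .Select(g => new { Value = g.Key, Count = g.Count() })
            .ToList();

        // An empty source has no modes.
        if (groups.Count == 0) return Enumerable.Empty<R>();

        // Keep every projected value that occurs the maximum number of times.
        int highest = groups.Max(g => g.Count);
        return groups
            .Where(g => g.Count == highest)
            .Select(g => g.Value)
            .ToList();
    }
}
```

With the words "apple", "Apple", "pear", "PEAR" and "fig", calling ModeSketch(w => w, StringComparer.OrdinalIgnoreCase) treats differently cased words as equal and yields "apple" and "pear", each representing a group of two. Calling ModeSketch(w => w.Length, null) on the same words projects each string to an int, so 'T' is string and 'R' is int, and yields the lengths 5 and 4.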

When the method is called, it is possible that a comparer will not be provided and that this argument will be null. In this case, we will use the default comparer. To detect this possibility and obtain the comparer that will be used, add the following line to the method:

var actualComparer = comparer == null ? EqualityComparer<R>.Default : comparer;
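The effect of this fallback can be seen in isolation with a small sketch. The class ComparerFallbackExample and its Resolve method are hypothetical names used only for this illustration. The sketch uses the null-coalescing operator, which should behave the same as the conditional expression above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper that isolates the comparer fallback for illustration.
public static class ComparerFallbackExample
{
    // Returns the supplied comparer, or the default comparer for R when it is null.
    public static IEqualityComparer<R> Resolve<R>(IEqualityComparer<R> comparer)
    {
        // Equivalent to: comparer == null ? EqualityComparer<R>.Default : comparer
        return comparer ?? EqualityComparer<R>.Default;
    }
}
```

Here, Resolve<string>(null).Equals("a", "A") returns false because the default string comparer is case-sensitive, whereas Resolve<string>(StringComparer.OrdinalIgnoreCase).Equals("a", "A") returns true.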
18 May 2011