.NET 3.5+

Fisher-Yates-Durstenfeld Shuffle

by Richard Carr, published at http://www.blackwasp.co.uk/FisherYatesShuffle.aspx

Sometimes it is necessary to randomise the order of a sequence of values. The Fisher-Yates algorithm provides a paper-based method, which was later computerised by Richard Durstenfeld. This article implements the algorithm as a custom LINQ operator.

Previous: Page 1

Creating the Class

To begin we need to create a static class to contain our extension method. In the downloadable sample I've created this in a console application. For real-world use you would probably want to use a class library. Create a new class named, "ShuffleExtensions" and modify its declaration to make it public and static as shown below. We'll also need a random number generator, which is a private, static field.

public static class ShuffleExtensions
{
    static Random _random = new Random();
}

Creating the Public Method

The extension method is simple. It validates the input sequence to ensure that it is not null before calling an implementation method named, "ShuffleImpl". This means that an invalid sequence is identified immediately. However, as ShuffleImpl provides an iterator, the processing of the sequence will be deferred until the results are first used.

public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> sequence)
{
    if (sequence == null) throw new ArgumentNullException("sequence");

    return ShuffleImpl(sequence);
}

Creating the Implementation Method

The implementation method has several steps. Firstly, the sequence is materialised in an array with the ToArray operator. This allows the length of the sequence to be determined without the possibility of the items being exhausted, preventing further processing. This would happen for sequences that only support a single pass.

Once the array is populated, the Durstenfeld algorithm can commence. In this method we loop from the extent of the last array index to one. We don't loop all the way to index zero; this does not require randomisation as we'll have only one remaining item. During each iteration we call the ExtractRandomItem method to perform steps 3 and 4 of the algorithm and yield a random item from the source sequence. Once the loop is completed we return the one remaining item to complete the process.

private static IEnumerable<T> ShuffleImpl<T>(IEnumerable<T> sequence)
{
    var buffer = sequence.ToArray();

    for (int max = buffer.Length - 1; max > 0; max--)
    {
        yield return ExtractRandomItem(buffer, max);
    }

    yield return (buffer[0]);
}

The extraction process has two steps. First we pick a random number between zero and the value of the loop control variable from the for loop. This represents the index of the next item in the shuffled sequence. We then swap the positions of this item and the item at the loop control value's index, ensuring that the same item will not be selected twice. Finally we return the selected value.

private static T ExtractRandomItem<T>(T[] buffer, int max)
{
    int random = GetRandomNumber(max + 1);
    SwapOut(buffer, random, max);
    return buffer[max];
}

When generating random numbers we need to use the Next method from the Random class. This method is not guaranteed to be thread-safe so to ensure that we don't create problems in multithreaded applications we should only call Next in a critical section, contained within a lock statement.

private static int GetRandomNumber(int max)
{
    lock (_random)
    {
        return _random.Next(max);
    }
}

We can complete the class by adding the method that swaps two items in the array.

private static void SwapOut<T>(T[] buffer, int randomItem, int swapItem)
{
    if (randomItem != swapItem)
    {
        T temp = buffer[swapItem];
        buffer[swapItem] = buffer[randomItem];
        buffer[randomItem] = temp;
    }
}

Next: Page 3

2 January 2013