Synchronisation and Aggregation
One common use of loops is aggregation, where a single value is calculated over the entire loop, with each iteration potentially modifying that value. A simple example is summing a series of values. This is straightforward with the sequential for and foreach loops. With parallel loops, as we have seen earlier in the tutorial, synchronising the aggregated value between threads becomes a problem. If two or more threads access the shared, mutable value simultaneously, they may work with inconsistent values, leading to an incorrect result.
In this article we will look at two ways to apply synchronisation that are useful when aggregating values. To follow the examples, create a console application project and add the following using directive to simplify the use of the Parallel class. The examples also call Enumerable.Range, which lives in the System.Linq namespace, so add a using directive for that namespace too if your project template does not already include one.
using System.Threading.Tasks;
Aggregation in Sequential Loops
To demonstrate aggregation in parallel loops we will recreate the same functionality several times. Each example sums all of the integers from one to one hundred million. This is basic functionality, but it demonstrates the problems caused when parallel iterations are incorrectly synchronised. Let's start with a sequential foreach loop that acts upon the sequence generated by the LINQ method, Enumerable.Range. The total is declared as a long because the result is far too large for a 32-bit int. As the loop runs on a single thread, there is no risk of synchronisation issues, so the result is always correct:
long total = 0;
foreach (int value in Enumerable.Range(1, 100000000))
{
    total += value;
}
// total = 5000000050000000
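The expected total can be checked without running the loop, because the sum of the integers from 1 to n is n(n + 1) / 2. The sketch below is an illustration only and not part of the original example; the class and method names (SumCheck, TriangularSum, LoopSum) are hypothetical.

```csharp
using System;
using System.Linq;

public static class SumCheck
{
    // Closed-form sum of the integers 1..n. Hypothetical helper used
    // only to confirm the value shown in the comment above.
    public static long TriangularSum(long n)
    {
        return n * (n + 1) / 2;
    }

    // The sequential loop from the example, with the upper bound
    // passed in so that smaller ranges can be checked quickly.
    public static long LoopSum(int n)
    {
        long total = 0;
        foreach (int value in Enumerable.Range(1, n))
        {
            total += value;
        }
        return total;
    }
}
```

Calling SumCheck.TriangularSum(100000000) gives 5000000050000000, which matches the total produced by the sequential loop.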
Aggregation in Parallel Loops
If we attempt to parallelise the above loop by converting the foreach syntax to a call to Parallel.ForEach, we introduce synchronisation errors. Try running the code below.
long total = 0;
Parallel.ForEach(Enumerable.Range(1, 100000000), value =>
{
    total += value;
});
// total = 2769693850306679
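The result shown in the comment was obtained on a dual-core processor; your own result will probably differ and may vary from run to run. The statement total += value is not atomic. It reads the current total, adds to it and writes the new value back. When two threads interleave these steps, both can read the same old total, and one thread's update then overwrites the other's, so an addition is lost. This happened many times during the loop, leaving a result that is a little more than half of the correct total. On machines with more than two cores the result is likely to be much smaller and further from the correct value. On a single-processor machine, however, you may consistently get the correct answer, masking the problem.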
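To make the failure concrete, the sketch below places the unsynchronised loop next to a variant in which each addition is applied as a single atomic operation using Interlocked.Add from System.Threading. This is one possible fix shown for illustration, and it is not necessarily one of the two techniques this article covers. The class and method names (ParallelSum, UnsafeSum, InterlockedSum) are hypothetical.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelSum
{
    // Unsynchronised version, as in the example above. The result is
    // not predictable because concurrent updates can be lost.
    public static long UnsafeSum(int n)
    {
        long total = 0;
        Parallel.ForEach(Enumerable.Range(1, n), value =>
        {
            // Read, add and write are separate steps, so two threads
            // can read the same old total and one addition is lost.
            total += value;
        });
        return total;
    }

    // Sketch of a synchronised version: Interlocked.Add performs the
    // read-modify-write on the shared total as one atomic operation.
    public static long InterlockedSum(int n)
    {
        long total = 0;
        Parallel.ForEach(Enumerable.Range(1, n), value =>
        {
            Interlocked.Add(ref total, value);
        });
        return total;
    }
}
```

InterlockedSum should return the same total as the sequential loop for the same range. Because every iteration contends for the same variable, it may well run slower than the sequential version, so treat it as a correctness demonstration and not as a performance recommendation.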
1 September 2011