Parallel Programming

by Richard Carr, published at http://www.blackwasp.co.uk/ParallelProgramming.aspx

This is the first in a series of articles introducing the parallel programming techniques that are available in the C# programming language and the .NET framework version 4.0. The first part describes some of the concepts of parallel programming.

Tutorial Prerequisites

In this tutorial I will be introducing parallel programming using the new facilities provided by the .NET framework version 4.0. We will look at the use of various parts of the framework, including the Task Parallel Library and the parallel version of Language-Integrated Query, known as PLINQ. To follow the sample code in the articles you will need to be comfortable with C#, object-oriented programming and LINQ. If you are not, you should consider reading the C# Fundamentals tutorial, the C# Object-Oriented Programming tutorial and the LINQ to Objects tutorial before you begin.

To compile the samples and run them you will need to install the .NET framework 4.0. Ideally, you should compile and run the samples from within Visual Studio 2010. If this is not available you can use Visual C# Express 2010, which can be downloaded free-of-charge from Microsoft, or an alternative development tool that can target the .NET framework 4.0. Finally, although not strictly necessary to run the sample code, you will need a computer with multiple cores to see the effects of parallelism.

Why is Parallel Programming Required?

In the early years of personal computers, machines were built with a single central processing unit (CPU). Some CPUs could run at different clock speeds and some could be overclocked to improve their performance. As new, improved CPU designs were made available, the clock speeds of processors increased dramatically.

Between the early 1990s and the mid-2000s, the clock speed of the CPU in a personal computer increased from a mere 33 megahertz to around 3.5 gigahertz. This alone represents an increase in performance of over one hundred times. In addition, each new processor model introduced additional efficiency improvements and extra technology to make the speed improvement even greater. For example, many CPUs in 1990 could not natively perform floating point operations, so floating point arithmetic was many times slower on them than on CPUs with that facility running at the same clock speed.

During this period, if you wanted your program to run faster, you could either optimise it or buy better hardware. Businesses would often use the latter approach as it was generally more cost-effective; the price of a new computer could be less than the charges for a day or two of programmer time.

Since 2005, the increase in CPU clock speed has stalled. One of the key reasons is that faster processors produce many times more heat than slower ones. Dissipating this heat to keep the processor operating within a safe temperature range is much more difficult. There are other reasons too, linked to the design of CPUs and the amount of additional power required for higher clock speeds.

The solution that the major CPU designers have selected is to move away from trying to increase clock speed and instead focus on adding more processor cores. Each core acts like a single processor that can do work. If you have two cores in your processor, it can process two independent tasks in parallel without the inefficiency of task-switching. As you increase the number of cores, you also increase the amount of code or data that can be processed in parallel, leading to an overall performance improvement without a change in clock speed.

At the time of writing it is becoming difficult to purchase a new computer that has a processor with only one core. Desktop computers commonly have quad-core or six-core CPUs with technology that gives eight or twelve virtual processors. Notebook computers usually include at least a dual-core processor and often include four cores. Netbooks, which are designed for web browsing and are less powerful than notebooks, often include dual-core CPUs too. Even some mobile telephones have more than one core. This trend is likely to continue, with companies such as Intel indicating that future CPUs may include a thousand cores.

Sequential Code

The problem that we face as developers is that we have been trained to think about programming in a sequential manner. If we continue to program in this way our software will not take advantage of the improvements made available by parallel processing. A standard .NET program that does not create new threads will only use a single core. On current hardware this may mean that our code can use only a half or a quarter of the available processing power. In the future, programs like these may only use a tiny fraction of the processor. Similar software that fully utilises parallel programming will perform better and is likely to be favoured by our users.

Before .NET 4.0, C# developers could obtain the improved performance of newer CPUs by creating multi-threaded software. Often this type of software only creates a few additional threads to speed up a process or to allow the user interface to remain responsive whilst a background task is completed. If the number of threads is increased far beyond the number of cores, the overhead of creating and switching between them may cause the program to run slower overall. If there aren't enough threads to keep every core busy, the program may underperform on some hardware. You could query the number of processors in your code and spawn new threads accordingly, but this can be complex.
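
As a rough illustration of the manual approach described above, the following sketch splits a simple summing operation across one thread per processor. It is only one possible arrangement, not code from this tutorial series. The class and method names (ManualThreadingSketch, SumRange) are invented for the example, and the only framework members used are Environment.ProcessorCount and the Thread class.

```csharp
using System;
using System.Threading;

// Hypothetical example: names are illustrative, not from the original article.
public static class ManualThreadingSketch
{
    // Sums the integers from 0 to count - 1 using one thread per logical processor.
    public static long SumRange(int count)
    {
        int threadCount = Environment.ProcessorCount;
        long[] partialSums = new long[threadCount];
        Thread[] threads = new Thread[threadCount];

        // Round up so that every value falls into exactly one chunk.
        int chunkSize = (count + threadCount - 1) / threadCount;

        for (int t = 0; t < threadCount; t++)
        {
            // Copy the loop variable so each lambda captures its own value.
            int index = t;
            int start = Math.Min(index * chunkSize, count);
            int end = Math.Min(start + chunkSize, count);

            threads[index] = new Thread(() =>
            {
                long sum = 0;
                for (int i = start; i < end; i++)
                {
                    sum += i;
                }

                // Each thread writes only to its own slot, so no lock is needed.
                partialSums[index] = sum;
            });
            threads[index].Start();
        }

        // Wait for every thread to finish, then combine the partial results.
        long total = 0;
        for (int t = 0; t < threadCount; t++)
        {
            threads[t].Join();
            total += partialSums[t];
        }

        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SumRange(1000000));
    }
}
```

Even for a task this simple, the code must partition the work, avoid sharing the loop variable between threads, wait for each thread to complete and merge the results. This is the kind of complexity that grows quickly in real software.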

With .NET 4.0, Microsoft introduced new tools that are designed to simplify the creation of parallel code. These remove some, but not all, of the complexities of multi-threading. They also allow the same code to run on different computers with varying numbers of cores, taking advantage of all of the available processors.
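
To give a flavour of these tools, the sketch below sums a range of integers in two ways: with Parallel.For from the Task Parallel Library and with a PLINQ query. It is a minimal illustration rather than a recommended pattern. The class and method names (TplSketch, SumWithParallelFor, SumWithPlinq) are made up for the example. In both cases the framework, not our code, decides how to divide the work between the available cores.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical example: names are illustrative, not from the original article.
public static class TplSketch
{
    // Sums 0 to count - 1 with Parallel.For, using a per-thread subtotal
    // so that the shared total is only updated once per worker.
    public static long SumWithParallelFor(int count)
    {
        long total = 0;

        Parallel.For<long>(
            0,
            count,
            () => 0L,                                         // initial subtotal for each worker
            (i, loopState, subtotal) => subtotal + i,         // body: add the current index
            subtotal => Interlocked.Add(ref total, subtotal)  // merge each subtotal atomically
        );

        return total;
    }

    // Sums the same range with PLINQ; the projection to long avoids overflow.
    public static long SumWithPlinq(int count)
    {
        return ParallelEnumerable.Range(0, count).Sum(i => (long)i);
    }

    public static void Main()
    {
        Console.WriteLine(SumWithParallelFor(1000000));
        Console.WriteLine(SumWithPlinq(1000000));
    }
}
```

Note that some care is still required. The shared total is updated with Interlocked.Add because several workers may finish at the same time, which is an example of the complexities that the new tools do not remove.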

8 August 2011