BlackWasp™
LINQ
.NET 3.5+

Language-Integrated Query

by Richard Carr, published at http://www.blackwasp.co.uk/Linq.aspx

The first part of the LINQ to Objects tutorial describes the language-integrated query (LINQ) features that were introduced in version 3.5 of the .NET Framework. LINQ provides a standardised means to query information from many different data sources.

What is LINQ?

A common development task is to query information, extracting a filtered list of items that may be ordered, grouped or aggregated. The data may be retrieved from an existing in-memory collection, a database, an XML file or many other sources. Usually, the type of data source dictates the syntax of the query, and that syntax can vary greatly between sources, reducing the code's portability. Often the query is built within a string. This increases the risk of invalid or inaccurate queries, as the compiler cannot check the syntax of a string's contents and the integrated development environment (IDE) offers little or no support.

To alleviate these problems, Microsoft introduced language-integrated query (LINQ) in version 3.5 of the .NET Framework. LINQ provides a set of standard query operators that can be used to perform simple or complex queries against a number of different data sources. The queries are integrated with other source code written in .NET languages such as C# or Visual Basic. This allows Visual Studio to provide syntax checking and IntelliSense support.

In this tutorial we will be examining LINQ to Objects, which allows you to execute queries against in-memory data structures. Further LINQ providers are available to allow querying against SQL Server databases, XML, DataSets and many other data sources. You can also create your own providers to attach to domain-specific information.

A number of new features were added to the .NET Framework and the C# programming language to support LINQ. Extension methods and lambda expressions are used extensively to build queries. To allow data to be returned from a query without the need to first define a class or structure, LINQ queries often return their results as anonymous types. In addition, the previously available generics features are important. If you are unsure of any of these topics, it is worth reviewing them before continuing.
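As a rough illustration of how these language features combine, the sketch below uses the Select operator (an extension method) with a lambda expression that creates an anonymous type. The names array and the AnonymousTypeSketch class are hypothetical, invented for this example only.

```csharp
using System;
using System.Linq;

public static class AnonymousTypeSketch
{
    public static void Main()
    {
        // Hypothetical sample data; any IEnumerable<string> would do.
        string[] names = { "Bob", "Alice", "Christopher" };

        // Select is an extension method defined in System.Linq.
        // The lambda expression builds an anonymous type with two
        // properties, so no class needs to be declared for the results.
        var summaries = names.Select(n => new { Name = n, Length = n.Length });

        foreach (var summary in summaries)
        {
            Console.WriteLine("{0} has {1} characters", summary.Name, summary.Length);
        }
    }
}
```

Because the anonymous type has no name that you can write in code, the var keyword is needed to hold both the query and each item within the loop.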

Standard Query Operators

The System.Linq namespace contains a number of standard query operators. These are extension methods, most of which are available for any class that implements the generic IEnumerable<T> interface. Collections that implement only the non-generic IEnumerable interface can be converted with the Cast or OfType operators before the other operators are applied. The methods allow collections to be queried, aggregated and sorted, and can be chained together to generate more complex queries. The easiest way to understand the query operators is to see them in action. Try executing the following code in a console application. Ensure you include using directives for the System, System.IO and System.Linq namespaces.

var folders = Directory.GetDirectories(@"C:\").Where(d => d.Length > 10).OrderBy(d => d.Length);

foreach (string folder in folders)
{
    Console.WriteLine(folder);
}

The above sample is quite simple but performs a task that would otherwise require several lines of code instead of just one. The first line is the important one. It retrieves the folders directly beneath the path "C:\" and filters them. The Where operator and its lambda expression parameter specify that only paths longer than ten characters are kept. Note that Directory.GetDirectories returns full paths, so the "C:\" prefix counts towards the length. The OrderBy method sorts the results by length, shortest first. The remainder of the code simply outputs the results of the operation.
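If you do not have a "C:\" drive, the same chain of operators can be tried against an in-memory array. The following is a minimal sketch under that assumption. The sample paths, the InMemoryQuerySketch class and the LongPathsByLength method are hypothetical and stand in for the results of Directory.GetDirectories.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InMemoryQuerySketch
{
    // Same shape as the folder query: keep strings longer than ten
    // characters, then sort the remaining items by length.
    public static IEnumerable<string> LongPathsByLength(IEnumerable<string> paths)
    {
        return paths.Where(p => p.Length > 10).OrderBy(p => p.Length);
    }

    public static void Main()
    {
        // Hypothetical paths; lengths are 16, 7, 12 and 10 respectively.
        string[] paths =
        {
            @"C:\Program Files",
            @"C:\Temp",
            @"C:\Documents",
            @"C:\Windows"
        };

        foreach (string path in LongPathsByLength(paths))
        {
            Console.WriteLine(path);
        }
    }
}
```

With this data the loop outputs "C:\Documents" followed by "C:\Program Files". "C:\Windows" is exactly ten characters long, so it is excluded by the greater-than comparison.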

One of the advantages of LINQ is that it employs a declarative style of programming, whereas traditional C# code is imperative. Imperative programming requires that you specify exactly how an algorithm operates. An imperative version of the previous sample code would require you to use a loop and an if statement to determine which folder names should be added to a collection. You would then sort the collection, either by implementing a sorting algorithm or by calling an existing one. The declarative approach adds a layer of abstraction, allowing you to specify what you wish to achieve without knowing the underlying algorithms that will be applied.
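To make the contrast concrete, the sketch below shows one possible imperative version next to the declarative query. It is an illustration only. The class and method names are hypothetical, the input is an in-memory array rather than a real folder list, and List<T>.Sort is used in place of a hand-written sorting algorithm.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ImperativeVersusDeclarative
{
    // Imperative: spell out the loop, the test and the sort.
    public static List<string> Imperative(IEnumerable<string> paths)
    {
        var result = new List<string>();
        foreach (string path in paths)
        {
            if (path.Length > 10)
            {
                result.Add(path);
            }
        }

        // Sort in place by length. Unlike OrderBy, List<T>.Sort is not
        // a stable sort, so items of equal length may change order.
        result.Sort((a, b) => a.Length.CompareTo(b.Length));
        return result;
    }

    // Declarative: state what is wanted and let LINQ decide how.
    public static List<string> Declarative(IEnumerable<string> paths)
    {
        return paths.Where(p => p.Length > 10).OrderBy(p => p.Length).ToList();
    }

    public static void Main()
    {
        // Hypothetical paths with distinct lengths.
        string[] paths = { @"C:\Program Files", @"C:\Temp", @"C:\Documents" };

        Console.WriteLine(string.Join(", ", Imperative(paths).ToArray()));
        Console.WriteLine(string.Join(", ", Declarative(paths).ToArray()));
    }
}
```

Both methods return the same list for this input. The imperative method fixes every step in advance, whereas the declarative method leaves the iteration and sorting strategy to the query operators.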

12 June 2010