
Language-Integrated Query (2)
.NET 3.5+
The first part of the LINQ to Objects tutorial describes the language-integrated query (LINQ) features that were introduced in version 3.5 of the .NET framework. LINQ provides a standardised means to query information from many different data sources.
Query Expression Syntax
In addition to the standard query operators, LINQ provides a query expression syntax. This allows queries to be written in a format that some developers find more natural, similar to the Structured Query Language (SQL) used to query databases. We can recreate the previous example using the query syntax as follows:
var folders =
    from d in Directory.GetDirectories(@"C:\")
    where d.Length > 10
    orderby d.Length
    select d;

foreach (string folder in folders)
{
    Console.WriteLine(folder);
}
The above query is quite easy to read. Although it is a single statement, I have split it across several lines to highlight the key operations. First, the data source is specified with the from keyword. Next, the where clause applies the filter. The orderby clause determines the sort order and the select clause specifies the items to include in the results.
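To make the relationship between the two styles concrete, the following is a minimal sketch that runs the same filter and sort using both query expression syntax and the standard query operators. The class name, method names and sample paths are hypothetical and are used only for illustration; an in-memory array stands in for the call to Directory.GetDirectories so that the results do not depend on the machine's file system.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QuerySyntaxSketch
{
    // Query expression syntax: from / where / orderby / select.
    public static IEnumerable<string> WithQuerySyntax(IEnumerable<string> names)
    {
        return from n in names
               where n.Length > 10
               orderby n.Length
               select n;
    }

    // The same filter and sort written with standard query operators and lambdas.
    public static IEnumerable<string> WithMethodSyntax(IEnumerable<string> names)
    {
        return names
            .Where(n => n.Length > 10)
            .OrderBy(n => n.Length);
    }

    public static void Main()
    {
        // Hypothetical sample data standing in for Directory.GetDirectories(@"C:\").
        string[] sample =
        {
            @"C:\Windows",
            @"C:\Program Files",
            @"C:\Users",
            @"C:\Documents and Settings"
        };

        foreach (string folder in WithQuerySyntax(sample))
        {
            Console.WriteLine(folder);
        }

        foreach (string folder in WithMethodSyntax(sample))
        {
            Console.WriteLine(folder);
        }
    }
}
```

With this sample data, both loops should list only the paths longer than ten characters, shortest first. Which style to use is largely a matter of readability; the two can also be mixed within one query.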
Deferred Execution
At first glance, you might imagine that the LINQ queries described above return a collection of strings as soon as the line of code is processed. In reality, LINQ employs deferred execution, a form of lazy evaluation. Queries that return a single value or object do execute immediately. However, those that return a sequence of items are usually not executed until the results are enumerated, and they run again each time the results are enumerated.
Deferred execution provides several benefits. If you have a number of queries, with each retrieving details from the results of another, the queries can be combined into a single operation. With LINQ to Objects, this reduces the memory overhead that would be required for the interim lists and potentially improves the query's performance. For other LINQ providers it may minimise network traffic or database activity. The main disadvantage of this approach is that you can be surprised by the results of a query if the data changes after the query is defined but before it is executed.
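The way that dependent queries combine can be sketched as follows. This is an illustrative example only: the class name, the BuildQuery method and the PredicateCalls counter are hypothetical, and the counter exists purely to show when the filter actually runs. One query filters a list and a second query is built on the results of the first; no interim list is created and no work is done until the final results are enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ComposedQuerySketch
{
    // Counts how many times the filter predicate is invoked.
    public static int PredicateCalls;

    public static IEnumerable<int> BuildQuery(IEnumerable<int> source)
    {
        // First query: keep the even numbers. Nothing is evaluated yet.
        var evens = source.Where(n =>
        {
            PredicateCalls++;
            return n % 2 == 0;
        });

        // Second query, built on the first. Still nothing is evaluated.
        var squares = evens.Select(n => n * n);
        return squares;
    }

    public static void Main()
    {
        var numbers = new List<int> { 1, 2, 3, 4, 5, 6 };
        var query = BuildQuery(numbers);

        // Defining the queries has not invoked the predicate.
        Console.WriteLine(PredicateCalls); // 0

        // Enumerating runs both steps together in one pass over the source.
        foreach (int value in query)
        {
            Console.WriteLine(value); // 4, 16, 36
        }

        Console.WriteLine(PredicateCalls); // 6
    }
}
```

Each number flows through the filter and the projection in turn, so the list of even numbers is never stored separately.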
We can demonstrate deferred execution by running the following sample code. This creates a list of strings containing three items and then defines a query that returns all values from the list. After the query is defined, another item is added to the original source list and the results of the query are written to the console. If the query had been executed when it was defined, the "values" variable would contain three results. However, because the query is executed when the results are first read, the additional value is included and four strings are written to the console.
var source = new List<string> { "A", "B", "C" };

var values =
    from s in source
    select s;

source.Add("D");

foreach (string value in values)
{
    Console.WriteLine(value);
}

/* OUTPUT

A
B
C
D

*/
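If you need the results as they stood when the query was defined, one common option is to force the query to run immediately. The sketch below is illustrative, with a hypothetical class and method name. It uses the standard ToList operator to take a snapshot, and the Count operator as an example of a query that returns a single value and therefore executes straight away.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ImmediateExecutionSketch
{
    // Returns three counts: the deferred query counted before the change,
    // the ToList snapshot, and the deferred query counted after the change.
    public static int[] Run()
    {
        var source = new List<string> { "A", "B", "C" };

        // Deferred: only the definition of the query is stored here.
        IEnumerable<string> deferred = from s in source select s;

        // Immediate: ToList enumerates the query now and copies the results.
        List<string> snapshot = (from s in source select s).ToList();

        // Immediate: Count returns a single value, so it runs the query now.
        int countBefore = deferred.Count();

        source.Add("D");

        // The snapshot is unchanged, but the deferred query sees the new item.
        return new[] { countBefore, snapshot.Count, deferred.Count() };
    }

    public static void Main()
    {
        foreach (int count in Run())
        {
            Console.WriteLine(count); // 3, 3, 4
        }
    }
}
```

The trade-off is that the snapshot holds its own copy of the results in memory, so this is best reserved for cases where a stable result set matters.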
LINQ to Objects Tutorial
The articles in this tutorial will describe the use of LINQ to Objects for querying in-memory data structures. The first group of articles will describe how to construct queries using basic operators and query expression syntax. This will include how information from multiple sources can be joined and how to aggregate numeric data. The later articles will describe groups of related standard query operators.
12 June 2010