BlackWasp
LINQ
.NET 3.5+

LINQ One-to-Many Projection

The fourth part of the LINQ to Objects tutorial continues the investigation of the projection operators and clauses. This article describes the use of the SelectMany operator, which allows one-to-many object hierarchies to be flattened and queried.

One-to-Many Relationships

One-to-many relationships are commonplace in object-oriented programming. They usually take the form of a parent-child relationship. For example, an Employee class may include a property that holds a collection of skills. In this case the parent object is the employee and the skills are the children.

Parent-child relationships are difficult to query with LINQ's Select standard query operator. Using this operator you could easily query the skills of a single employee but when you have a collection of employees, each with a set of skills, querying all employees' skills is more complex. However, you can easily perform such operations using the SelectMany standard query operator.

SelectMany Operator

The SelectMany standard query operator is an extension method that allows any collection that implements the IEnumerable<T> interface to be queried. The method's parameter is used to extract a child collection from each parent item. Finally, all of the child collections are combined, or flattened, into a single IEnumerable. Further operations may then be performed upon the new set of data.

This can initially sound like a complex operation. It can be easier to understand when given an example. Let's consider a system that includes a collection of employees, each with a set of skills. The employees and their skills are listed below:

  • Bob (Senior Developer)
    • ASP.NET
    • C#
    • JavaScript
    • SQL
    • XML
  • Sam (Developer)
    • ASP.NET
    • C#
    • Oracle
    • XML
  • Mel (Developer)
    • C#
    • C++
    • SQL
  • Jim (Developer)
    • HTML
    • Visual Basic

The employees are represented using Employee objects that each include a collection of strings holding the skills. When the SelectMany operator is used to read the Skills collections, the four sets of skills are extracted and then combined into a single IEnumerable. The final result, once the collection is read, is the following list. This assumes that no further projection or filtering is applied:

  • ASP.NET
  • C#
  • JavaScript
  • SQL
  • XML
  • ASP.NET
  • C#
  • Oracle
  • XML
  • C#
  • C++
  • SQL
  • HTML
  • Visual Basic

You can see that all of the skills are now in a single set, which may be queried further as desired. To demonstrate the use of the SelectMany method, we will recreate the example data above using C# code and perform some queries. To begin, we need to create the Employee class. This is shown below:

public class Employee
{
    public string Name { get; set; }
    public string Title { get; set; }
    public Collection<string> Skills { get; set; }
}

We can now create the collection of employees and skills as per the example data:

var bob = new Employee
{
    Name = "Bob",
    Title = "Senior Developer",
    Skills = new Collection<string> { "ASP.NET", "C#", "JavaScript", "SQL", "XML" }
};
var sam = new Employee
{
    Name = "Sam",
    Title = "Developer",
    Skills = new Collection<string> { "ASP.NET", "C#", "Oracle", "XML" }
};
var mel = new Employee
{
    Name = "Mel",
    Title = "Developer",
    Skills = new Collection<string> { "C#", "C++", "SQL", }
};
var jim = new Employee
{
    Name = "Jim",
    Title = "Junior Programmer",
    Skills = new Collection<string> { "HTML", "Visual Basic" }
};

var employees = new List<Employee> { bob, sam, mel, jim };

One-to-Many Projection

For the first example we will perform a very simple one-to-many query of the employees' skills. This query returns the complete list of skills with no filtering. As you can see in the sample code below, the employees collection is queried using the SelectMany method. The method is provided with a delegate in the form of a lambda expression. This lambda expression is executed for each employee and is used to return the Skills collections. The four collections are then combined into a single set of strings.

var skills = employees.SelectMany(e => e.Skills);

/* RESULTS

ASP.NET
C#
JavaScript
SQL
XML
ASP.NET
C#
Oracle
XML
C#
C++
SQL
HTML
Visual Basic

*/

Accessing Parent Object Data

In some cases you will wish to project the results of a SelectMany query and include data from both the parent and child objects. If the child collection contains objects that include a reference to their parent then this may be used. If there is no link to the parent from the children, you can use an overloaded version of the SelectMany method. This version adds a second parameter, called the result selector, that also accepts a delegate. In this case, the delegate will be called with two parameters. The first parameter represents the parent object and the second provides the child. The delegate can use these arguments to create the object to be included in the results. Therefore, to project the results of the query to an anonymous type that contains the parent and child objects, you could use the lambda expression "(p, c) => new { p, c }".

In the following example, the results include the name from the employee object and the entire Skill string.

var skills = employees.SelectMany(e => e.Skills, (e, s) => new { e.Name, s });

/* RESULTS

{ Name = Bob, s = ASP.NET }
{ Name = Bob, s = C# }
{ Name = Bob, s = JavaScript }
{ Name = Bob, s = SQL }
{ Name = Bob, s = XML }
{ Name = Sam, s = ASP.NET }
{ Name = Sam, s = C# }
{ Name = Sam, s = Oracle }
{ Name = Sam, s = XML }
{ Name = Mel, s = C# }
{ Name = Mel, s = C++ }
{ Name = Mel, s = SQL }
{ Name = Jim, s = HTML }
{ Name = Jim, s = Visual Basic }

*/>

Filtering Data

For the final example we will add some filtering to the query. Filtering can be applied to both the parent and the child objects. For the parent object, use the Where method to apply filtering against the parent collection. Add the SelectMany after this Where call. To apply filtering to the child objects, add a Where operator after the SelectMany.

The next example filters the employee list so that only developers are returned. This uses the lambda expression, e => e.Title == "Developer". The SelectMany operator then retrieves all of the skills for the filtered developers, returning each as an anonymous type containing the employee's name and the skill. Finally, these employee and skill combinations are filtered to include only skills that start with the letter C, using the lambda, empAndSkill => empAndSkill.s.StartsWith("C").

var skills = employees
    .Where(e => e.Title == "Developer")
    .SelectMany(e => e.Skills, (e, s) => new { e.Name, s })
    .Where(empAndSkill => empAndSkill.s.StartsWith("C"));

/* RESULTS

{ Name = Sam, s = C# }
{ Name = Mel, s = C# }
{ Name = Mel, s = C++ }

*/>

One-to-Many Projection with Query Expression Syntax

In the final sections of this article we will recreate the previous example queries using query expression syntax. This syntax does not include a new clause for SelectMany. Instead, you can add an extra from clause, which obtains data from the results of the first from clause. When parsing such a query, the first from clause translates to a Select standard query operator. All other from clauses become SelectMany operations.

The following code recreates the initial example, which created a combined list of skills from all of the employees. Note that the second from clause is used to obtain the skills from the Skills property of the results of the first from clause.

var skills =
    from e in employees
    from s in e.Skills
    select s;

In the second example we combined the employees' names and skills in an anonymous type. With query expression syntax this becomes a simple projection that uses values from the two range variables.

var skills =
    from e in employees
    from s in e.Skills
    select new { e.Name, s };

The third example performed filtering of both the parent and child collections using two Where operators. The query syntax variation of this is very similar. Two where clauses are used, each using the range variable for the collection being filtered.

var skills =
    from e in employees
    where e.Title == "Developer"
    from s in e.Skills
    where s.StartsWith("C")
    select new { e.Name, s };
Link to this Page15 July 2010
TwitterTwitter RSS Feed RSS