BlackWaspTM

This web site uses cookies. By using the site you accept the cookie policy.This message is for compliance with the UK ICO law.

LINQ
.NET 3.5+

LINQ Joins

The eighth part of the LINQ to Objects tutorial investigates the Join standard query operator and the equivalent use of the join clause. These allow two or more sets of data to be combined according to key information in each collection.

Joins

It is common practice to normalise data, both in databases and other storage, and in objects in memory. In simple terms, normalisation is the process of extracting duplicated information from a single set of data and holding it without duplication in another set. A key value present in both lists is used to link the information back together at a later time. For example, the stock items in a stock control system may include category information. As all of the items in the "Fresh Produce" category share the same category information, this can be extracted and held separately in a category list. The normalisation process reduces the amount of data that must be held and allows a single update to a parent item to be reflected in all of its children.

When working with databases you may achieve this normalisation by creating two tables and a foreign key relationship, often called a one-to-many or parent-child relationship. In XML you could create the category as an XML node and use child nodes for the products. When working with objects you may include a reference to a parent category in the stock item class using a property. In some cases you may have two separate collections, with one holding all categories and one holding all of the products. If there is a key property or field that can be used to match a stock item to its category, you can join the two collections using LINQ.

In this article I will demonstrate the use of joins using the Join standard query operator and the join clause of the query expression syntax. To begin, we need two sample classes to work with. The first represents a stock item with a property that will provide a link to its category:

public class StockItem
{
    public string Name { get; set; }
    public string Category { get; set; }
    public double Price { get; set; }

    public StockItem(string name, string category, double price)
    {
        Name = name;
        Category = category;
        Price = price;
    }
}

The second class will hold the details of the categories of stock item that are available. To keep the examples simple, the class includes a Name property, which will match the category from the stock item. All stock items in the same category will also share a major category. Later we will join data of both types to create lists of objects containing both the stock item data and category and major category details.

public class StockCategory
{
    public string Name { get; set; }
    public string MajorCategory { get; set; }

    public StockCategory(string name, string majorCategory)
    {
        Name = name;
        MajorCategory = majorCategory;
    }
}

To create a list of sample stock items, add the following code:

var stock = new List<StockItem>
{
    new StockItem("Apple", "Fruit", 0.30),
    new StockItem("Banana", "Fruit", 0.35),
    new StockItem("Orange", "Fruit", 0.29),
    new StockItem("Cabbage", "Vegetable", 0.49),
    new StockItem("Carrot", "Vegetable", 0.29),
    new StockItem("Lettuce", "Vegetable", 0.30),
    new StockItem("Milk", "Dairy", 1.12)
};

To complete the test data add the following code to create sample categories:

var categories = new List<StockCategory>
{
    new StockCategory("Dairy", "Chilled"),
    new StockCategory("Fruit", "Fresh"),
    new StockCategory("Vegetable", "Fresh")
};

Join Standard Query Operator

In our first example we will join the two collections using the basic version of the Join standard query operator. This is an extension method of all types that support IEnumerable<T>. It joins two lists, known as the outer and inner data, using four parameters. The first parameter supplies the inner list; the outer list is the object that the extension method is being executed against. The other three arguments accept Func delegates, each of which we will supply as a lambda expression. The three delegates are as follows:

  • Outer Key Selector. This delegate accepts a single argument of the type of the outer collection's items. It returns a key value, which may be of any type.
  • Inner Key Selector. This delegate is similar to the outer key selector. It defines how the key values are extracted from the inner collection. Where the outer and inner keys match, the data will be joined in the final results.
  • Result Selector. The third Func delegate defines how the results of the join will be projected. The delegate has two parameters, which will receive the outer and inner items for each matching key pair. The returned value is added to the collection returned by the Join method.

To see the Join operator in action, try executing the following code:

var joined = stock.Join(categories, s => s.Category, c => c.Name,
    (stockItem, category) => new
    {
        Name = stockItem.Name,
        Price = stockItem.Price,
        MinorCategory = category.Name,
        MajorCategory = category.MajorCategory
    });

Here the outer collection is provided by the stock variable and the inner collection by the categories list. The first lambda expression obtains a key from each stock item by returning the Category property. The second argument creates a key for each category by returning the Name property. Wherever these two items are equal, a resultant item will be created using the third lambda. This generates an anonymously typed object containing the name and price from the stock item and the category and major category from the stock category object. The results of the operation are as follows:

{ Name = Apple, Price = 0.3, MinorCategory = Fruit, MajorCategory = Fresh }
{ Name = Banana, Price = 0.35, MinorCategory = Fruit, MajorCategory = Fresh }
{ Name = Orange, Price = 0.29, MinorCategory = Fruit, MajorCategory = Fresh }
{ Name = Cabbage, Price = 0.49, MinorCategory = Vegetable, MajorCategory = Fresh }
{ Name = Carrot, Price = 0.29, MinorCategory = Vegetable, MajorCategory = Fresh }
{ Name = Lettuce, Price = 0.3, MinorCategory = Vegetable, MajorCategory = Fresh }
{ Name = Milk, Price = 1.12, MinorCategory = Dairy, MajorCategory = Chilled }
21 August 2010