
.NET 3.5+LINQ Outer Joins
One commonly used feature of Language-Integrated Query (LINQ) is the facility to combine two sequences of related data using joins. The standard join operation provides an inner join but with a minor modification can be changed to give a left outer join.
Inner and Outer Joins
When you use the LINQ join clause in the query expression syntax to combine two sets of related information, you perform an inner join. This means that you provide an expression that determines for each item in the first sequence, the matching items in the second. Where there is a match, one item is added to the resultant sequence. Where there are several matches for a single item, several results are generated. If an item in the first sequence has no matching items in the second, that item is excluded from the results.
An inner join is useful in many situations but there are some circumstances where the omission of non-matching results is undesirable. For example, imagine that you have a set of data containing product categories and another containing stock items. If you wish to generate a list of categories and their products, an inner join presents the possibility that an empty product category will be missing from the results. If you wish to include empty categories you need to perform an outer join.
An outer join includes items in the resultant sequence that were present in only one of the input collections. LINQ allows you to create left outer joins using a slight modification to the standard join syntax. This allows you to select all of the items from the first sequence, which appears on the left of the statement, even if they have no matching values in the second. If there are no matches, one result is generated using a combination of data from the first collection and provided default values. If there are multiple matching items, the behaviour is unchanged; multiple results are generated.
Example Data
To demonstrate the difference between an inner join and a left outer join we need some sample classes and data. The first class we will use represents a category of stock. The category includes a name and the name of the major category that it belongs to.
public class StockCategory
{
public string Name { get; set; }
public string MajorCategory { get; set; }
public StockCategory(string name, string majorCategory)
{
Name = name;
MajorCategory = majorCategory;
}
}
The second class is used to represent stock items. Each includes a name and the name of the category that it belong to.
public class StockItem
{
public string Name { get; set; }
public string Category { get; set; }
public StockItem(string name, string category)
{
Name = name;
Category = category;
}
}
With the classes created we can create some sample stock items and categories. Note that the categories list includes the item, "Ices", which has no related stock items.
var stock = new List<StockItem>
{
new StockItem("Apple", "Fruit"),
new StockItem("Banana", "Fruit"),
new StockItem("Orange", "Fruit"),
new StockItem("Cabbage", "Vegetable"),
new StockItem("Carrot", "Vegetable"),
new StockItem("Lettuce", "Vegetable"),
new StockItem("Milk", "Dairy")
};
var categories = new List<StockCategory>
{
new StockCategory("Dairy", "Chilled"),
new StockCategory("Ices", "Frozen"),
new StockCategory("Fruit", "Fresh"),
new StockCategory("Vegetable", "Fresh")
};
27 October 2011