Creating a grouped join on two sequences using the LINQ GroupJoin operator
April 6, 2017
The GroupJoin operator is very similar to the Join operator. I won’t repeat the things we discussed there, so I recommend you read that post first if you’re not familiar with that operator.
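With GroupJoin we can not only join elements in two sequences but also group them at the same time: each element of the outer sequence is paired with the collection of all matching elements from the inner sequence.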
GroupJoin has the same signature as Join with one exception. The result selector in Join has the following type:
Func<TOuter, TInner, TResult>
…whereas the result selector in GroupJoin looks as follows:
Func<TOuter, IEnumerable<TInner>, TResult>
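To make the difference between the two result selectors concrete, here is a minimal sketch. The class name SelectorShapeDemo and the letters/words data are made up for illustration and are not part of the singer example used in this post. The Join selector runs once per matching pair, while the GroupJoin selector runs once per outer element and receives all of its matches together.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SelectorShapeDemo
{
    // Hypothetical sample data, used only to show the selector shapes.
    static readonly string[] Letters = { "a", "b" };
    static readonly string[] Words = { "apple", "avocado", "banana" };

    // Join: the result selector is called once per matching (outer, inner) pair.
    public static List<string> JoinPairs()
    {
        Func<string, string, string> resultSelector =
            (letter, word) => letter + ":" + word;
        return Letters
            .Join(Words, l => l, w => w.Substring(0, 1), resultSelector)
            .ToList();
    }

    // GroupJoin: the result selector is called once per outer element
    // and receives every matching inner element as a sequence.
    public static List<string> GroupJoinGroups()
    {
        Func<string, IEnumerable<string>, string> resultSelector =
            (letter, words) => letter + ":" + string.Join(",", words);
        return Letters
            .GroupJoin(Words, l => l, w => w.Substring(0, 1), resultSelector)
            .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" | ", JoinPairs()));       // a:apple | a:avocado | b:banana
        Console.WriteLine(string.Join(" | ", GroupJoinGroups())); // a:apple,avocado | b:banana
    }
}
```

Join produces three results for three matching pairs, whereas GroupJoin produces two, one per letter.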
We have the following sample data structure. The two static methods live in a class called DemoCollections, which is how the query further down calls them:
public class Singer
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class Concert
{
    public int SingerId { get; set; }
    public int ConcertCount { get; set; }
    public int Year { get; set; }
}

public static IEnumerable<Singer> GetSingers()
{
    return new List<Singer>()
    {
        new Singer(){Id = 1, FirstName = "Freddie", LastName = "Mercury"}
        , new Singer(){Id = 2, FirstName = "Elvis", LastName = "Presley"}
        , new Singer(){Id = 3, FirstName = "Chuck", LastName = "Berry"}
        , new Singer(){Id = 4, FirstName = "Ray", LastName = "Charles"}
        , new Singer(){Id = 5, FirstName = "David", LastName = "Bowie"}
    };
}

public static IEnumerable<Concert> GetConcerts()
{
    return new List<Concert>()
    {
        new Concert(){SingerId = 1, ConcertCount = 53, Year = 1979}
        , new Concert(){SingerId = 1, ConcertCount = 74, Year = 1980}
        , new Concert(){SingerId = 1, ConcertCount = 38, Year = 1981}
        , new Concert(){SingerId = 2, ConcertCount = 43, Year = 1970}
        , new Concert(){SingerId = 2, ConcertCount = 64, Year = 1968}
        , new Concert(){SingerId = 3, ConcertCount = 32, Year = 1960}
        , new Concert(){SingerId = 3, ConcertCount = 51, Year = 1961}
        , new Concert(){SingerId = 3, ConcertCount = 95, Year = 1962}
        , new Concert(){SingerId = 4, ConcertCount = 42, Year = 1950}
        , new Concert(){SingerId = 4, ConcertCount = 12, Year = 1951}
        , new Concert(){SingerId = 5, ConcertCount = 53, Year = 1983}
    };
}
In the example on Join we joined the single elements in the singers list with the concert list based on the singer ids. We will do the same here but at the same time we want to read the total number of concerts for each singer. The following query will do just that:
IEnumerable<Singer> singers = DemoCollections.GetSingers();
IEnumerable<Concert> concerts = DemoCollections.GetConcerts();

var singerConcerts = singers.GroupJoin(concerts, s => s.Id, c => c.SingerId, (s, co) => new
{
    Id = s.Id
    , SingerName = string.Concat(s.FirstName, " ", s.LastName)
    , ConcertCount = co.Sum(c => c.ConcertCount)
});

foreach (var res in singerConcerts)
{
    Console.WriteLine(string.Concat(res.Id, ": ", res.SingerName, ", ", res.ConcertCount));
}
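One consequence of the grouping is worth a quick sketch: an outer element with no matching inner elements still shows up in the result, just with an empty group. The example below is a reduced, self-contained variant of the query above. The class name EmptyGroupDemo and the singer with Id 99 are hypothetical and were added only to show this behaviour.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Singer
{
    public int Id { get; set; }
    public string LastName { get; set; }
}

public class Concert
{
    public int SingerId { get; set; }
    public int ConcertCount { get; set; }
}

public static class EmptyGroupDemo
{
    public static List<string> Totals()
    {
        var singers = new List<Singer>
        {
            new Singer { Id = 1, LastName = "Mercury" },
            // Hypothetical singer without any concert records (illustration only).
            new Singer { Id = 99, LastName = "Nobody" }
        };
        var concerts = new List<Concert>
        {
            new Concert { SingerId = 1, ConcertCount = 53 },
            new Concert { SingerId = 1, ConcertCount = 74 }
        };

        // The result selector runs for every singer; 'co' is empty for Id 99,
        // and Sum over an empty sequence of ints returns 0.
        return singers
            .GroupJoin(concerts, s => s.Id, c => c.SingerId,
                (s, co) => s.LastName + ": " + co.Sum(c => c.ConcertCount))
            .ToList();
    }

    public static void Main()
    {
        foreach (var line in Totals())
        {
            Console.WriteLine(line); // Mercury: 127, then Nobody: 0
        }
    }
}
```

A plain Join over the same data would leave the second singer out entirely, because there is no pair to pass to its result selector.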
Each type in the lambda expressions will be inferred by the compiler:
- ‘s’ is of type Singer in the outer key selector and in the result selector
- ‘c’ is of type Concert in the inner key selector
- ‘co’ is of type IEnumerable<Concert>: it holds the concerts matched to the current singer and is what the aggregate function, Sum in this case, operates on
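To see the inferred types spelled out, the same call can be written with explicitly typed lambda parameters. This is only a sketch for illustration: the class name ExplicitTypesDemo is made up, the data is a two-singer subset of the sample data, and the result is built as a string instead of an anonymous type so that the method can return it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Singer
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class Concert
{
    public int SingerId { get; set; }
    public int ConcertCount { get; set; }
    public int Year { get; set; }
}

public static class ExplicitTypesDemo
{
    public static List<string> Run()
    {
        IEnumerable<Singer> singers = new List<Singer>
        {
            new Singer { Id = 1, FirstName = "Freddie", LastName = "Mercury" },
            new Singer { Id = 2, FirstName = "Elvis", LastName = "Presley" }
        };
        IEnumerable<Concert> concerts = new List<Concert>
        {
            new Concert { SingerId = 1, ConcertCount = 53, Year = 1979 },
            new Concert { SingerId = 1, ConcertCount = 74, Year = 1980 },
            new Concert { SingerId = 1, ConcertCount = 38, Year = 1981 },
            new Concert { SingerId = 2, ConcertCount = 43, Year = 1970 },
            new Concert { SingerId = 2, ConcertCount = 64, Year = 1968 }
        };

        // Every lambda parameter carries the type the compiler would otherwise infer.
        return singers
            .GroupJoin(
                concerts,
                (Singer s) => s.Id,                        // outer key selector
                (Concert c) => c.SingerId,                 // inner key selector
                (Singer s, IEnumerable<Concert> co) =>     // result selector
                    s.Id + ": " + s.FirstName + " " + s.LastName + ", "
                    + co.Sum((Concert c) => c.ConcertCount))
            .ToList();
    }

    public static void Main()
    {
        foreach (var line in Run())
        {
            Console.WriteLine(line); // 1: Freddie Mercury, 165 and 2: Elvis Presley, 107
        }
    }
}
```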
Just like the Join operator, GroupJoin can also accept an IEqualityComparer of TKey in case the comparison key requires some special treatment. In the above example only integers are compared so there’s no need for anything extra.
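As a rough sketch of when a comparer is useful, assume the sequences are joined on names whose casing differs between the two sources. The data and the class name ComparerDemo below are hypothetical. StringComparer.OrdinalIgnoreCase is a built-in IEqualityComparer<string> and is passed as the last argument.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ComparerDemo
{
    public static List<string> Totals()
    {
        // Hypothetical data: the keys differ only in casing.
        var names = new[] { "Mercury", "Bowie" };
        var concerts = new[]
        {
            new KeyValuePair<string, int>("MERCURY", 53),
            new KeyValuePair<string, int>("mercury", 74),
            new KeyValuePair<string, int>("bowie", 53)
        };

        // Without the comparer argument no key would match,
        // since the default string equality is case-sensitive.
        return names
            .GroupJoin(
                concerts,
                n => n,
                c => c.Key,
                (n, co) => n + ": " + co.Sum(c => c.Value),
                StringComparer.OrdinalIgnoreCase)
            .ToList();
    }

    public static void Main()
    {
        foreach (var line in Totals())
        {
            Console.WriteLine(line); // Mercury: 127, then Bowie: 53
        }
    }
}
```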