Grouping elements in LINQ .NET using GroupBy and an EqualityComparer
September 13, 2016
The GroupBy operator has the same function as GROUP BY in SQL: it groups the elements of a sequence by a common key. GroupBy comes with 8 different signatures. Those that take no result selector return a sequence of objects that implement the IGrouping interface of type K (the key type) and T (the type of the objects in the sequence). IGrouping implements IEnumerable of T, so when we iterate through the result we can first walk the outer sequence of keys and then the inner sequence of objects that share each key.
The simplest version of GroupBy accepts a Func delegate of T and K, which acts as the key selector. It compares the selected keys using the default equality comparer for the key type. E.g. if you want to group the objects by their integer IDs then you can let the default comparer do its job. Another version of GroupBy lets you supply your own comparer, which is useful when you need a custom grouping or when the key is an object for which you want to define your own rules for equality.
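As a minimal sketch of the simplest overload, the following groups a few concerts by SingerId and relies on the default comparer. The Concert class is a stand-in assumed for illustration, using the property names from the sample data in this post; DefaultComparerDemo and GroupBySinger are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Concert class used in this post.
public class Concert
{
    public int SingerId { get; set; }
    public int ConcertCount { get; set; }
    public int Year { get; set; }
}

public static class DefaultComparerDemo
{
    // Groups concerts by SingerId. No comparer is passed, so the keys
    // are compared with EqualityComparer<int>.Default.
    public static List<IGrouping<int, Concert>> GroupBySinger(IEnumerable<Concert> concerts)
    {
        return concerts.GroupBy(c => c.SingerId).ToList();
    }

    public static void Main()
    {
        var concerts = new List<Concert>
        {
            new Concert { SingerId = 1, ConcertCount = 53, Year = 1979 },
            new Concert { SingerId = 1, ConcertCount = 74, Year = 1980 },
            new Concert { SingerId = 2, ConcertCount = 43, Year = 1970 }
        };

        // Outer loop: one IGrouping per key. Inner loop: the elements with that key.
        foreach (IGrouping<int, Concert> group in GroupBySinger(concerts))
        {
            Console.WriteLine("Singer {0}:", group.Key);
            foreach (Concert concert in group)
            {
                Console.WriteLine("  {0} concerts in {1}", concert.ConcertCount, concert.Year);
            }
        }
    }
}
```

With the default comparer every distinct SingerId gets its own group, so this sample produces one group for singer 1 and one for singer 2.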
We’ll need an example sequence whose elements have an ID. In the posts on LINQ we often take the following collections for the demos:
IEnumerable<Singer> singers = new List<Singer>()
{
    new Singer(){Id = 1, FirstName = "Freddie", LastName = "Mercury"},
    new Singer(){Id = 2, FirstName = "Elvis", LastName = "Presley"},
    new Singer(){Id = 3, FirstName = "Chuck", LastName = "Berry"},
    new Singer(){Id = 4, FirstName = "Ray", LastName = "Charles"},
    new Singer(){Id = 5, FirstName = "David", LastName = "Bowie"}
};

IEnumerable<Concert> concerts = new List<Concert>()
{
    new Concert(){SingerId = 1, ConcertCount = 53, Year = 1979},
    new Concert(){SingerId = 1, ConcertCount = 74, Year = 1980},
    new Concert(){SingerId = 1, ConcertCount = 38, Year = 1981},
    new Concert(){SingerId = 2, ConcertCount = 43, Year = 1970},
    new Concert(){SingerId = 2, ConcertCount = 64, Year = 1968},
    new Concert(){SingerId = 3, ConcertCount = 32, Year = 1960},
    new Concert(){SingerId = 3, ConcertCount = 51, Year = 1961},
    new Concert(){SingerId = 3, ConcertCount = 95, Year = 1962},
    new Concert(){SingerId = 4, ConcertCount = 42, Year = 1950},
    new Concert(){SingerId = 4, ConcertCount = 12, Year = 1951},
    new Concert(){SingerId = 5, ConcertCount = 53, Year = 1983}
};
The singers collection won’t actually be needed for the code example; it simply shows what the SingerId values in the concerts collection refer to. Let’s imagine that our singers collection includes both male and female singers, that IDs below 3 belong to female singers and that all other IDs belong to male singers. Our goal is to group the concerts by the singer’s gender using this rule. We can write the following custom equality comparer, which treats two singer IDs as equal when they fall on the same side of the limit:
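public class SingerGenderComparer : IEqualityComparer<int>
{
    // IDs below this limit are treated as female singers.
    private readonly int _femaleSingerIdLimit = 3;

    // Two singer IDs are equal when both are female or both are male.
    public bool Equals(int x, int y)
    {
        return IsPerformedByFemaleSinger(x) == IsPerformedByFemaleSinger(y);
    }

    // Equal keys must return the same hash code, so there is one value per gender.
    public int GetHashCode(int obj)
    {
        return IsPerformedByFemaleSinger(obj) ? 1 : 2;
    }

    public bool IsPerformedByFemaleSinger(int singerId)
    {
        return singerId < _femaleSingerIdLimit;
    }
}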
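The IEqualityComparer contract requires that keys which Equals treats as equal also return the same hash code, which is why the comparer above returns one of two fixed values. The sketch below checks that for the five singer IDs; the comparer is repeated so the snippet compiles on its own, and ComparerContractCheck and HashCodesAgreeWithEquals are hypothetical helper names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same logic as the comparer above, repeated to keep the sketch self-contained.
public class SingerGenderComparer : IEqualityComparer<int>
{
    private readonly int _femaleSingerIdLimit = 3;

    public bool Equals(int x, int y)
    {
        return IsPerformedByFemaleSinger(x) == IsPerformedByFemaleSinger(y);
    }

    public int GetHashCode(int obj)
    {
        return IsPerformedByFemaleSinger(obj) ? 1 : 2;
    }

    public bool IsPerformedByFemaleSinger(int singerId)
    {
        return singerId < _femaleSingerIdLimit;
    }
}

public static class ComparerContractCheck
{
    // Returns true when every pair of IDs that the comparer calls equal
    // also shares a hash code.
    public static bool HashCodesAgreeWithEquals(IEqualityComparer<int> comparer, int[] ids)
    {
        foreach (int a in ids)
        {
            foreach (int b in ids)
            {
                if (comparer.Equals(a, b) && comparer.GetHashCode(a) != comparer.GetHashCode(b))
                {
                    return false;
                }
            }
        }
        return true;
    }

    public static void Main()
    {
        var comparer = new SingerGenderComparer();
        int[] singerIds = { 1, 2, 3, 4, 5 };

        Console.WriteLine(HashCodesAgreeWithEquals(comparer, singerIds)); // True
        // Distinct keeps one representative per equivalence class: IDs 1 and 3.
        Console.WriteLine(singerIds.Distinct(comparer).Count()); // 2
    }
}
```

If GetHashCode returned the ID itself, IDs 1 and 2 would be equal according to Equals but carry different hash codes, and hash-based operators such as GroupBy could then place them in separate groups.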
Here’s the grouping of the Concerts collection using the custom comparer:
SingerGenderComparer comparer = new SingerGenderComparer();
IEnumerable<IGrouping<int, Concert>> concertGroups = concerts.GroupBy(c => c.SingerId, comparer);
foreach (IGrouping<int, Concert> concertGroup in concertGroups)
{
    Console.WriteLine("Concerts of {0} singers: ", comparer.IsPerformedByFemaleSinger(concertGroup.Key) ? "female" : "male");
    foreach (Concert concert in concertGroup)
    {
        Console.WriteLine("Number of concerts: {0}, in the year of {1} by singer {2}.", concert.ConcertCount, concert.Year, concert.SingerId);
    }
}
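This yields two groups. The key of each group is the first singer ID encountered for that gender, and the concerts keep their original order within the group:
Concerts of female singers:
Number of concerts: 53, in the year of 1979 by singer 1.
Number of concerts: 74, in the year of 1980 by singer 1.
Number of concerts: 38, in the year of 1981 by singer 1.
Number of concerts: 43, in the year of 1970 by singer 2.
Number of concerts: 64, in the year of 1968 by singer 2.
Concerts of male singers:
Number of concerts: 32, in the year of 1960 by singer 3.
Number of concerts: 51, in the year of 1961 by singer 3.
Number of concerts: 95, in the year of 1962 by singer 3.
Number of concerts: 42, in the year of 1950 by singer 4.
Number of concerts: 12, in the year of 1951 by singer 4.
Number of concerts: 53, in the year of 1983 by singer 5.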
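You can define the type of object that will be selected from the base sequence using another version of GroupBy, which takes an element selector in addition to the key selector. Here each group holds only the ConcertCount values instead of whole Concert objects: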
IEnumerable<IGrouping<int, int>> concertGroupsFiltered = concerts.GroupBy(c => c.SingerId, c => c.ConcertCount, comparer);
foreach (IGrouping<int, int> concertGroup in concertGroupsFiltered)
{
    Console.WriteLine("Concerts of {0} singers: ", comparer.IsPerformedByFemaleSinger(concertGroup.Key) ? "female" : "male");
    foreach (int concertCount in concertGroup)
    {
        Console.WriteLine("Number of concerts: {0}.", concertCount);
    }
}
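…which gives the following output:
Concerts of female singers:
Number of concerts: 53.
Number of concerts: 74.
Number of concerts: 38.
Number of concerts: 43.
Number of concerts: 64.
Concerts of male singers:
Number of concerts: 32.
Number of concerts: 51.
Number of concerts: 95.
Number of concerts: 42.
Number of concerts: 12.
Number of concerts: 53.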