Joining unique values from two sequences with the LINQ Union operator
February 10, 2016
Say you have two sequences of the same element type:
string[] first = new string[] { "hello", "hi", "good evening", "good day", "good morning", "goodbye" };
string[] second = new string[] { "whatsup", "how are you", "hello", "bye", "hi" };
You’d like to combine the two sequences into one that contains the values from both, with duplicates filtered out. Here’s how to achieve that with the first overload of the LINQ Union operator:
IEnumerable<string> union = first.Union(second);
foreach (string value in union)
{
    Console.WriteLine(value);
}
You’ll see that “hello” and “hi” were filtered out from the second sequence as they already appear in the first. This overload of the Union operator uses the default equality comparer to compare the string values. As .NET compares strings by value by default, you can rely on that to filter out duplicates.
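To make the result concrete, here is a minimal, self-contained sketch of the string example. The class name UnionStringsDemo and the helper BuildUnion are made up for illustration and are not part of LINQ. Union yields the distinct elements of the first sequence followed by the elements of the second sequence that haven't been seen yet, so the original order is preserved.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UnionStringsDemo
{
    // Hypothetical helper: builds the union of the two sample arrays from the post.
    public static List<string> BuildUnion()
    {
        string[] first = { "hello", "hi", "good evening", "good day", "good morning", "goodbye" };
        string[] second = { "whatsup", "how are you", "hello", "bye", "hi" };

        // Union is lazily evaluated; ToList() forces the enumeration.
        return first.Union(second).ToList();
    }

    public static void Main()
    {
        // Prints the six values of first, followed by
        // "whatsup", "how are you" and "bye" (nine lines in total).
        foreach (string value in BuildUnion())
        {
            Console.WriteLine(value);
        }
    }
}
```

Note that “hello” and “hi” keep their positions from the first array; only their repeats in the second array are skipped.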
However, if you have custom classes that don’t override Equals and GetHashCode, then .NET won’t know how to compare them by value, so the comparison will be based on reference equality, which is usually not what you want. Say you have the following class:
public class Singer
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}
…and the following sequences:
IEnumerable<Singer> singersA = new List<Singer>()
{
    new Singer() { Id = 1, FirstName = "Freddie", LastName = "Mercury" },
    new Singer() { Id = 2, FirstName = "Elvis", LastName = "Presley" },
    new Singer() { Id = 3, FirstName = "Chuck", LastName = "Berry" }
};

IEnumerable<Singer> singersB = new List<Singer>()
{
    new Singer() { Id = 1, FirstName = "Freddie", LastName = "Mercury" },
    new Singer() { Id = 2, FirstName = "Elvis", LastName = "Presley" },
    new Singer() { Id = 4, FirstName = "Ray", LastName = "Charles" },
    new Singer() { Id = 5, FirstName = "David", LastName = "Bowie" }
};
If you try the following:
IEnumerable<Singer> singersUnion = singersA.Union(singersB);
foreach (Singer s in singersUnion)
{
    Console.WriteLine(s.Id);
}
…then you’ll see that the duplicates weren’t filtered out, which is expected: all seven objects are separate instances, so none of them are reference-equal. This is where the second overload of Union enters the picture, which lets you provide a custom comparer like the following:
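public class DefaultSingerComparer : IEqualityComparer<Singer>
{
    public bool Equals(Singer x, Singer y)
    {
        // Guard against nulls before comparing Ids
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.Id == y.Id;
    }

    public int GetHashCode(Singer obj)
    {
        return obj.Id.GetHashCode();
    }
}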
You can use this comparer as follows:
IEnumerable<Singer> singersUnion = singersA.Union(singersB, new DefaultSingerComparer());
foreach (Singer s in singersUnion)
{
    Console.WriteLine(s.Id);
}
Problem solved!
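The difference between the two overloads can be checked side by side. The following is a rough sketch rather than a definitive implementation: it repeats the Singer class and the comparer so that it compiles on its own, and the names UnionSingersDemo, SingersA, SingersB, UnionIds and UnionIdsById are invented for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Singer
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// Same idea as the comparer in the post: singers are equal when their Ids match.
public class DefaultSingerComparer : IEqualityComparer<Singer>
{
    public bool Equals(Singer x, Singer y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.Id == y.Id;
    }

    public int GetHashCode(Singer obj)
    {
        return obj.Id.GetHashCode();
    }
}

public static class UnionSingersDemo
{
    // Hypothetical helpers that rebuild the sample data from the post.
    public static List<Singer> SingersA()
    {
        return new List<Singer>
        {
            new Singer { Id = 1, FirstName = "Freddie", LastName = "Mercury" },
            new Singer { Id = 2, FirstName = "Elvis", LastName = "Presley" },
            new Singer { Id = 3, FirstName = "Chuck", LastName = "Berry" }
        };
    }

    public static List<Singer> SingersB()
    {
        return new List<Singer>
        {
            new Singer { Id = 1, FirstName = "Freddie", LastName = "Mercury" },
            new Singer { Id = 2, FirstName = "Elvis", LastName = "Presley" },
            new Singer { Id = 4, FirstName = "Ray", LastName = "Charles" },
            new Singer { Id = 5, FirstName = "David", LastName = "Bowie" }
        };
    }

    // Default comparer: reference equality, so no Singer is treated as a duplicate.
    public static List<int> UnionIds()
    {
        return SingersA().Union(SingersB()).Select(s => s.Id).ToList();
    }

    // Custom comparer: singers with the same Id are treated as duplicates.
    public static List<int> UnionIdsById()
    {
        return SingersA().Union(SingersB(), new DefaultSingerComparer())
                         .Select(s => s.Id).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", UnionIds()));     // 1, 2, 3, 1, 2, 4, 5
        Console.WriteLine(string.Join(", ", UnionIdsById())); // 1, 2, 3, 4, 5
    }
}
```

GetHashCode matters as much as Equals here: Union tracks the elements it has already seen in a hash-based set, so two singers that should count as equal must also produce the same hash code.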