Finding the set difference between two sequences using the LINQ Except operator

Say you have the following two sequences:

string[] first = new string[] {"hello", "hi", "good evening", "good day", "good morning", "goodbye" };
string[] second = new string[] {"whatsup", "how are you", "hello", "bye", "hi"};

To find the values that appear only in “first”, you can use the LINQ Except operator:

IEnumerable<string> except = first.Except(second);
foreach (string value in except)
	Console.WriteLine(value);

The “except” sequence will contain “good evening”, “good day”, “good morning” and “goodbye”, because “hello” and “hi” also appear in “second”.
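Except works as a set operation, so two details are easy to miss: an element that occurs several times in the first sequence is yielded only once, and strings are compared case-sensitively by default. The following is a minimal sketch of that behaviour; the class name ExceptSemanticsDemo, the helper Difference and the sample data are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExceptSemanticsDemo
{
    // Hypothetical helper: materializes the set difference into a list.
    public static List<string> Difference(IEnumerable<string> first, IEnumerable<string> second)
    {
        return first.Except(second).ToList();
    }

    public static void Main()
    {
        string[] first = { "hi", "good day", "hi", "good day", "Hello" };
        string[] second = { "hi", "hello" };

        // "hi" is dropped because it also occurs in second.
        // "good day" is yielded once although it occurs twice in first.
        // "Hello" is kept because the default string comparison is case-sensitive.
        foreach (string value in Difference(first, second))
        {
            Console.WriteLine(value);
        }
    }
}
```

The result holds two elements, “good day” and “Hello”. If you need to keep duplicates from the first sequence, Except is not the right tool; a Where filter against a HashSet built from the second sequence is one alternative.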

The Except operator uses an equality comparer to determine whether two elements are equal. .NET has a built-in default comparer for strings, so you didn’t have to implement one here. However, if the two sequences contain custom objects whose class doesn’t override Equals and GetHashCode, the default comparer falls back to reference equality, which won’t be enough:

public class Singer
{
	public int Id { get; set; }
	public string FirstName { get; set; }
	public string LastName { get; set; }
}

IEnumerable<Singer> singersA = new List<Singer>()
{
	new Singer(){Id = 1, FirstName = "Freddie", LastName = "Mercury"}
	, new Singer(){Id = 2, FirstName = "Elvis", LastName = "Presley"}
	, new Singer(){Id = 3, FirstName = "Chuck", LastName = "Berry"}
};


IEnumerable<Singer> singersB = new List<Singer>()
{
	new Singer(){Id = 1, FirstName = "Freddie", LastName = "Mercury"}
	, new Singer(){Id = 2, FirstName = "Elvis", LastName = "Presley"}
	, new Singer(){Id = 4, FirstName = "Ray", LastName = "Charles"}
	, new Singer(){Id = 5, FirstName = "David", LastName = "Bowie"}
};

IEnumerable<Singer> singersDiff = singersA.Except(singersB);
foreach (Singer s in singersDiff)
	Console.WriteLine(s.Id);

The singersDiff sequence will contain every element of singersA, because each Singer object is a separate instance and the default comparer only compares references. This is where the second overload of the operator comes in, which lets you supply your own equality comparer:

public class DefaultSingerComparer : IEqualityComparer<Singer>
{
	public bool Equals(Singer x, Singer y)
	{
		return x.Id == y.Id;
	}

	public int GetHashCode(Singer obj)
	{
		return obj.Id.GetHashCode();
	}
}

So we say that two singers are equal if their IDs are equal. You can use this comparer as follows:

IEnumerable<Singer> singersDiff = singersA.Except(singersB, new DefaultSingerComparer());
foreach (Singer s in singersDiff)
	Console.WriteLine(s.Id);

singersDiff will now correctly include singer #3 only.
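If you target .NET 6 or later, the ExceptBy operator can produce the same result without a dedicated comparer class, since it compares elements by a key you select. The sketch below assumes that runtime version; the class name ExceptByDemo and the helper DifferenceById are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Singer
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class ExceptByDemo
{
    // Hypothetical helper: singers in first whose Id does not occur in second.
    public static List<Singer> DifferenceById(IEnumerable<Singer> first, IEnumerable<Singer> second)
    {
        // ExceptBy (.NET 6+) takes the keys to exclude and a key selector
        // that is applied to the elements of the first sequence.
        return first.ExceptBy(second.Select(s => s.Id), s => s.Id).ToList();
    }

    public static void Main()
    {
        var singersA = new List<Singer>
        {
            new Singer { Id = 1, FirstName = "Freddie", LastName = "Mercury" },
            new Singer { Id = 2, FirstName = "Elvis", LastName = "Presley" },
            new Singer { Id = 3, FirstName = "Chuck", LastName = "Berry" }
        };
        var singersB = new List<Singer>
        {
            new Singer { Id = 1, FirstName = "Freddie", LastName = "Mercury" },
            new Singer { Id = 2, FirstName = "Elvis", LastName = "Presley" },
            new Singer { Id = 4, FirstName = "Ray", LastName = "Charles" },
            new Singer { Id = 5, FirstName = "David", LastName = "Bowie" }
        };

        foreach (Singer s in DifferenceById(singersA, singersB))
        {
            Console.WriteLine($"{s.Id} {s.FirstName} {s.LastName}"); // prints "3 Chuck Berry"
        }
    }
}
```

A comparer class is still the better fit when the same equality rule is reused in several places, for example with Distinct, Union or a Dictionary; ExceptBy is convenient for one-off comparisons on a single key.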


About Andras Nemes
I'm a .NET/Java developer living and working in Stockholm, Sweden.
