Using the HashSet of T object in C# .NET to store unique elements

The generic HashSet of T is at first sight a not very “sexy” collection. It simply stores objects with no order, index or key to look up individual elements.

Here’s a simple HashSet with integers:

HashSet<int> intHashSet = new HashSet<int>();
intHashSet.Add(1);
intHashSet.Add(3);
intHashSet.Add(5);
intHashSet.Add(2);
intHashSet.Add(10);

HashSets can be handy when you want to guarantee uniqueness. The following example will only put the unique integers in the set:

HashSet<int> intHashSet = new HashSet<int>();
intHashSet.Add(1);
intHashSet.Add(1);
intHashSet.Add(3);
intHashSet.Add(1);
intHashSet.Add(3);

Only 1 and 3 will be available in the set.

Uniqueness is based on equality. The above works fine for value types where equality is well defined within .NET. E.g. it’s easy to determine whether two integers are equal.

However what about reference types like your own objects, such as this one?

public class Band
{
	public string Name { get; set; }
	public int YearFormed { get; set; }
	public int NumberOfMembers { get; set; }
	public int NumberOfRecords { get; set; }
}
HashSet<Band> bands = new HashSet<Band>();
bands.Add(new Band() { YearFormed = 1979, Name = "Great band", NumberOfMembers = 4, NumberOfRecords = 10 });
bands.Add(new Band() { YearFormed = 1985, Name = "Best band", NumberOfMembers = 5, NumberOfRecords = 15 });
bands.Add(new Band() { YearFormed = 1979, Name = "Great band", NumberOfMembers = 4, NumberOfRecords = 10 });
bands.Add(new Band() { YearFormed = 1985, Name = "Best band", NumberOfMembers = 5, NumberOfRecords = 15 });
bands.Add(new Band() { YearFormed = 1985, Name = "Best band", NumberOfMembers = 5, NumberOfRecords = 15 });
bands.Add(new Band() { YearFormed = 1979, Name = "Great band", NumberOfMembers = 4, NumberOfRecords = 10 });
Debug.WriteLine(bands.Count);

All 6 Band objects will be inserted into the HashSet even if we clearly see that there are only 2 unique bands. Why? Each Band object is a reference type meaning they have a position in memory denoted by an integer. As the 6 Band objects are instantiated one after the other they will all point to different positions in memory. C# has no way of knowing how these band object are compared and equated. How would it know automatically?

Luckily for us there’s an easy solution: we need to supply an IEqualityComparer of T object like we saw in the post dedicated to that interface. We’ll reuse it here. We say that two bands are equal if their names are equal:

public class BandNameComparer : IEqualityComparer<Band>
{
	public bool Equals(Band x, Band y)
	{
		return x.Name.Equals(y.Name, StringComparison.InvariantCultureIgnoreCase);
	}

	public int GetHashCode(Band obj)
	{
		return obj.Name.GetHashCode();
	}
}

Let’s use it in the bands HashSet:

HashSet<Band> bands = new HashSet<Band>(new BandNameComparer());
bands.Add(new Band() { YearFormed = 1979, Name = "Great band", NumberOfMembers = 4, NumberOfRecords = 10 });
bands.Add(new Band() { YearFormed = 1985, Name = "Best band", NumberOfMembers = 5, NumberOfRecords = 15 });
bands.Add(new Band() { YearFormed = 1979, Name = "Great band", NumberOfMembers = 4, NumberOfRecords = 10 });
bands.Add(new Band() { YearFormed = 1985, Name = "Best band", NumberOfMembers = 5, NumberOfRecords = 15 });
bands.Add(new Band() { YearFormed = 1985, Name = "Best band", NumberOfMembers = 5, NumberOfRecords = 15 });
bands.Add(new Band() { YearFormed = 1979, Name = "Great band", NumberOfMembers = 4, NumberOfRecords = 10 });
Debug.WriteLine(bands.Count);

The hash set will now only include 2 Band objects, i.e. the two unique ones.

View all various C# language feature related posts here.

Advertisements

About Andras Nemes
I'm a .NET/Java developer living and working in Stockholm, Sweden.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

ultimatemindsettoday

A great WordPress.com site

Elliot Balynn's Blog

A directory of wonderful thoughts

Robin Sedlaczek's Blog

Developer on Microsoft Technologies

HarsH ReaLiTy

A Good Blog is Hard to Find

Softwarearchitektur in der Praxis

Wissenswertes zu Webentwicklung, Domain-Driven Design und Microservices

the software architecture

thoughts, ideas, diagrams,enterprise code, design pattern , solution designs

Technology Talks

on Microsoft technologies, Web, Android and others

Software Engineering

Web development

Disparate Opinions

Various tidbits

chsakell's Blog

Anything around ASP.NET MVC,WEB API, WCF, Entity Framework & AngularJS

Cyber Matters

Bite-size insight on Cyber Security for the not too technical.

Guru N Guns's

OneSolution To dOTnET.

Johnny Zraiby

Measuring programming progress by lines of code is like measuring aircraft building progress by weight.

%d bloggers like this: