The Don’t-Repeat-Yourself (DRY) design principle in .NET Part 3

We’ll finish up the DRY series with the Repeated Execution Pattern. This pattern can be used when you see similar chunks of code repeated at several places. Here we talk about code bits that are not 100% identical but follow the same pattern and can clearly be factored out.

Here’s an example:

static void Main(string[] args)
{
	Console.WriteLine("About to run the DoSomething method");
	DoSomething();
	Console.WriteLine("Finished running the DoSomething method");
	Console.WriteLine("About to run the DoSomethingAgain method");
	DoSomethingAgain();
	Console.WriteLine("Finished running the DoSomethingAgain method");
	Console.WriteLine("About to run the DoSomethingMore method");
	DoSomethingMore();
	Console.WriteLine("Finished running the DoSomethingMore method");
	Console.WriteLine("About to run the DoSomethingExtraordinary method");
	DoSomethingExtraordinary();
	Console.WriteLine("Finished running the DoSomethingExtraordinary method");
	
	Console.ReadLine();
}

private static void DoSomething()
{
	WriteToConsole("Nils", "a good friend", 30);
}

private static void DoSomethingAgain()
{
	WriteToConsole("Christian", "a neighbour", 54);
}

private static void DoSomethingMore()
{
	WriteToConsole("Eva", "my daughter", 4);
}

private static void DoSomethingExtraordinary()
{
	WriteToConsole("Lilly", "my daughter's best friend", 4);
}

private static void WriteToConsole(string name, string description, int age)
{
	// format and address are the class-level fields introduced in Part 1 of this series
	Console.WriteLine(format, name, description, address, age);
}

We’re simulating a simple logging function every time we run one of these “DoSomething” methods. The pattern is clear: write a message to the console, carry out an action and write another message to the console. The actions have an identical void, parameterless signature. The logging messages all have the same format; only the method name varies. If this chain of actions continues to grow, we have to come back here and add the same type of logging messages. Also, if you later wish to change the logging message format, you’ll have to do it in many different places.

The first step is to factor out a single console-action-console chunk to its own method:

private static void ExecuteStep()
{
	Console.WriteLine("About to run the DoSomething method");
	DoSomething();
	Console.WriteLine("Finished running the DoSomething method");
}

This is of course not good enough, as the method is very rigid: it is hard-coded to execute the first step only. We can vary the action to be executed by passing it in as an Action delegate:

private static void ExecuteStep(Action action)
{
	Console.WriteLine("About to run the DoSomething method");
	action();
	Console.WriteLine("Finished running the DoSomething method");
}

We can call this method as follows:

static void Main(string[] args)
{
	ExecuteStep(DoSomething);
	ExecuteStep(DoSomethingAgain);
	ExecuteStep(DoSomethingMore);
	ExecuteStep(DoSomethingExtraordinary);
	Console.ReadLine();
}

Except that we’re not logging the method names correctly: they are still hard-coded to “DoSomething”. That’s easy to fix, as every delegate, including Action, exposes a Method property whose Name holds the name of the target method:

private static void ExecuteStep(Action action)
{
	string methodName = action.Method.Name;
	Console.WriteLine("About to run the {0} method", methodName);
	action();
	Console.WriteLine("Finished running the {0} method", methodName);
}
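One caveat is worth illustrating. The sketch below is not part of the original program; MethodNameDemo and NameOf are hypothetical names used only here. It shows that a method group reports its declared name, whereas a lambda reports a compiler-generated name that is of little use in a log message.

```csharp
using System;

public static class MethodNameDemo
{
	public static void DoSomething() { }

	// Returns the name that ExecuteStep would log for the given delegate.
	public static string NameOf(Action action)
	{
		return action.Method.Name;
	}

	public static void Main()
	{
		// A method group carries the declared method name.
		Console.WriteLine(NameOf(DoSomething)); // prints "DoSomething"

		// A lambda has no declared name; the compiler generates one,
		// and its exact text depends on the compiler.
		Action anonymous = () => { };
		Console.WriteLine(NameOf(anonymous));
	}
}
```

If the steps are likely to be passed as lambdas, it may be safer to pass a descriptive label along with the action instead of relying on the method name.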

We’re almost done. If you look at the Main method, ExecuteStep is called four times, each time with a different method. That is also a form of DRY violation. Imagine that you have a long workflow, such as the steps in a chemical experiment. In that case you may need to repeat the call to ExecuteStep many times.

We can instead put the methods to be executed in a collection of actions:

private static IEnumerable<Action> GetExecutionSteps()
{
	return new List<Action>()
	{
		DoSomething
		, DoSomethingAgain
		, DoSomethingMore
		, DoSomethingExtraordinary
	};
}

You can use this from within Main as follows:

static void Main(string[] args)
{
	IEnumerable<Action> actions = GetExecutionSteps();
	foreach (Action action in actions)
	{
		ExecuteStep(action);
	}
	Console.ReadLine();
}

Now it’s not the responsibility of the Main method to define the steps to be executed. It only iterates through a loop and calls ExecuteStep for each action.
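The whole pattern can be collected into one small, testable unit. The following is a minimal sketch rather than the definitive version: the Workflow class, the StepOne and StepTwo methods and the TextWriter parameter are assumptions of this sketch. Passing in the writer makes it possible to capture the log in a test instead of writing straight to the console.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class Workflow
{
	public static void StepOne() { }
	public static void StepTwo() { }

	// The list alone defines which steps run and in what order.
	public static IEnumerable<Action> GetExecutionSteps()
	{
		return new List<Action> { StepOne, StepTwo };
	}

	// Wraps every step in the same pair of log messages.
	public static void Run(IEnumerable<Action> steps, TextWriter log)
	{
		foreach (Action step in steps)
		{
			string name = step.Method.Name;
			log.WriteLine("About to run the {0} method", name);
			step();
			log.WriteLine("Finished running the {0} method", name);
		}
	}

	public static void Main()
	{
		Run(GetExecutionSteps(), Console.Out);
	}
}
```

Adding a step now means adding one entry to the list; the logging format lives in exactly one place.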

View the list of posts on Architecture and Patterns here.

The Don’t-Repeat-Yourself (DRY) design principle in .NET Part 2

We’ll continue with our discussion of the Don’t-Repeat-Yourself principle where we left off in the previous post. The next issue we’ll consider is repetition of logic.

Repeated logic

Consider that you have the following two domain objects:

public class Product 
{
	public long Id { get; set; }
	public string Description { get; set; }
}
public class Order
{
	public long Id { get; set; }
}

Let’s say that the IDs are not automatically assigned when inserting a new row in the database. Instead, they must be calculated. So you come up with the following function to construct an ID which is probably unique:

private long CalculateId()
{
	TimeSpan ts = DateTime.UtcNow - (new DateTime(1970, 1, 1, 0, 0, 0));
	long id = Convert.ToInt64(ts.TotalMilliseconds);
	return id;
}

You might include this type of logic in both domain objects:

public class Product 
{
	public long Id { get; set; }
	public string Description { get; set; }

	public Product()
	{
		Id = CalculateId();
	}

	private long CalculateId()
	{
		TimeSpan ts = DateTime.UtcNow - (new DateTime(1970, 1, 1, 0, 0, 0));
		long id = Convert.ToInt64(ts.TotalMilliseconds);
		return id;
	}
}
public class Order
{
	public long Id { get; set; }

	public Order()
	{
		Id = CalculateId();
	}

	private long CalculateId()
	{
		TimeSpan ts = DateTime.UtcNow - (new DateTime(1970, 1, 1, 0, 0, 0));
		long id = Convert.ToInt64(ts.TotalMilliseconds);
		return id;
	}
}

This situation may arise if the two domain objects were added to your application a long time apart and you’d forgotten about the existing ID generation solution. Also, if you want to keep the ID generation logic independent for each object, you might continue with this solution, thinking that some day the ID generation strategies may diverge. However, suppose that at some point the rules are fixed and all IDs of type long must be constructed using the CalculateId method. Then you absolutely want to have this logic in one place only. If the rule changes later, you don’t want to make the same change in every single domain object, right?

A very common solution is to factor out this logic to a static method:

public class IdHelper
{
	public static long CalculateId()
	{
		TimeSpan ts = DateTime.UtcNow - (new DateTime(1970, 1, 1, 0, 0, 0));
		long id = Convert.ToInt64(ts.TotalMilliseconds);
		return id;
	}
}

The updated objects look as follows:

public class Order
{
	public long Id { get; set; }

	public Order()
	{
		Id = IdHelper.CalculateId();
	}
}
public class Product 
{
	public long Id { get; set; }
	public string Description { get; set; }

	public Product()
	{
		Id = IdHelper.CalculateId();
	}
}

If you’ve followed the discussion on the SOLID design principles then you’ll know by now that static methods can be a design smell that indicates tight coupling. In this case there’s a hard dependency of the Product and Order classes on IdHelper.

If all objects in your domain must have an ID of type long then you may let every object derive from a superclass such as this:

public abstract class EntityBase
{
	public long Id { get; private set; }

	public EntityBase()
	{
		Id = CalculateId();
	}

	private long CalculateId()
	{
		TimeSpan ts = DateTime.UtcNow - (new DateTime(1970, 1, 1, 0, 0, 0));
		long id = Convert.ToInt64(ts.TotalMilliseconds);
		return id;
	}
}

The Product and Order objects will derive from this class:

public class Product : EntityBase
{
	public string Description { get; set; }

	public Product()
	{}
}
public class Order : EntityBase
{
	public Order()
	{}
}

If you then construct a new Order or Product object elsewhere, the ID will be assigned automatically by the EntityBase constructor.
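As a short usage sketch of the base class approach (EntityDemo is a hypothetical name, and the constructor is marked protected here as a stylistic assumption), the derived classes need no ID code of their own:

```csharp
using System;

public abstract class EntityBase
{
	public long Id { get; private set; }

	// Only derived classes can call the constructor of an abstract class.
	protected EntityBase()
	{
		Id = CalculateId();
	}

	private static long CalculateId()
	{
		TimeSpan ts = DateTime.UtcNow - new DateTime(1970, 1, 1, 0, 0, 0);
		return Convert.ToInt64(ts.TotalMilliseconds);
	}
}

public class Order : EntityBase { }

public class Product : EntityBase
{
	public string Description { get; set; }
}

public static class EntityDemo
{
	public static void Main()
	{
		Order order = new Order();
		Product product = new Product { Description = "Sample" };

		// Both IDs are set by the EntityBase constructor. Because the value
		// is a millisecond timestamp, objects created within the same
		// millisecond can end up with the same ID.
		Console.WriteLine("{0} {1}", order.Id, product.Id);
	}
}
```

The comment in Main is the reason the ID is only "probably unique": the scheme relies on objects not being created in the same millisecond.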

If you don’t like the base class approach, constructor injection is another option. We delegate the ID generation logic to an external class which we hide behind an interface:

public interface IIdGenerator
{
	long CalculateId();
}

We have the following implementing class:

public class DefaultIdGenerator : IIdGenerator
{
	public long CalculateId()
	{
		TimeSpan ts = DateTime.UtcNow - (new DateTime(1970, 1, 1, 0, 0, 0));
		long id = Convert.ToInt64(ts.TotalMilliseconds);
		return id;
	}
}

You can inject the interface dependency into the Order object as follows:

public class Order
{
	private readonly IIdGenerator _idGenerator;
	public long Id { get; private set; }

	public Order(IIdGenerator idGenerator)
	{
		if (idGenerator == null) throw new ArgumentNullException("idGenerator");
		_idGenerator = idGenerator;
		Id = _idGenerator.CalculateId();
	}
}

You can apply the same method to the Product object. Of course you can mix the above two solutions with the following EntityBase superclass:

public abstract class EntityBase
{
	private readonly IIdGenerator _idGenerator;

	public long Id { get; private set; }

	public EntityBase(IIdGenerator idGenerator)
	{
		if (idGenerator == null) throw new ArgumentNullException("idGenerator");
		_idGenerator = idGenerator;
		Id = _idGenerator.CalculateId();
	}
}
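To show how the combined approach might be used, here is a minimal sketch. FixedIdGenerator and InjectionDemo are hypothetical names that do not appear in the original code; the stub simply returns a preset value, which is one benefit of hiding the generator behind an interface: the ID becomes predictable in tests.

```csharp
using System;

public interface IIdGenerator
{
	long CalculateId();
}

// Hypothetical stub that always returns the same value; handy in tests.
public class FixedIdGenerator : IIdGenerator
{
	private readonly long _id;

	public FixedIdGenerator(long id) { _id = id; }

	public long CalculateId() { return _id; }
}

public abstract class EntityBase
{
	public long Id { get; private set; }

	protected EntityBase(IIdGenerator idGenerator)
	{
		if (idGenerator == null) throw new ArgumentNullException("idGenerator");
		Id = idGenerator.CalculateId();
	}
}

public class Order : EntityBase
{
	// Derived classes forward the generator to the base constructor.
	public Order(IIdGenerator idGenerator) : base(idGenerator) { }
}

public static class InjectionDemo
{
	public static void Main()
	{
		Order order = new Order(new FixedIdGenerator(42));
		Console.WriteLine(order.Id); // prints 42
	}
}
```

In production code the same constructor would receive DefaultIdGenerator, or whatever implementation the current ID rules call for.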

These are some of the possible solutions that you can employ to factor out common logic so that it becomes available to different objects. If the repeated logic occurs only within a single class, then simply create a private method for it. Consider the following method, where the ID calculation is written out twice:

private void DoRepeatedLogic()
{
	Order order = new Order();
	TimeSpan ts = DateTime.UtcNow - (new DateTime(1970, 1, 1, 0, 0, 0));
	long orderId = Convert.ToInt64(ts.TotalMilliseconds);
	order.Id = orderId;

	Product product = new Product();
	ts = DateTime.UtcNow - (new DateTime(1970, 1, 1, 0, 0, 0));
	long productId = Convert.ToInt64(ts.TotalMilliseconds);
	product.Id = productId;
}

This is of course not very clever, and you can quickly improve it by extracting the calculation into a private method:

private void DoRepeatedLogic()
{
	Order order = new Order();
	order.Id = CalculateId();

	Product product = new Product();
	product.Id = CalculateId();
}

private long CalculateId()
{
	TimeSpan ts = DateTime.UtcNow - (new DateTime(1970, 1, 1, 0, 0, 0));
	long id = Convert.ToInt64(ts.TotalMilliseconds);
	return id;
}

This is more likely to occur in long classes and methods where you lose track of all the code you’ve written. At some point you realise that some logic is repeated over and over again, but it’s deeply nested in a long, complicated method.

If statements

If statements are very important building blocks of an application. It would probably be impossible to write any real-life app without them. However, that does not mean they should be used without any limitation. Consider the following domain objects:

public abstract class Shape
{
}

public class Triangle : Shape
{
	public int Base { get; set; }
	public int Height { get; set; }
}

public class Rectangle : Shape
{
	public int Width { get; set; }
	public int Height { get; set; }
}

Then in Program.cs of a Console app we can simulate a database lookup as follows:

private static IEnumerable<Shape> GetAllShapes()
{
	List<Shape> shapes = new List<Shape>();
	shapes.Add(new Triangle() { Base = 5, Height = 3 });
	shapes.Add(new Rectangle() { Height = 6, Width = 4 });
	shapes.Add(new Triangle() { Base = 9, Height = 5 });
	shapes.Add(new Rectangle() { Height = 3, Width = 2 });
	return shapes;
}

Say you want to calculate the total area of the shapes in the collection. The first approach may look like this:

private static double CalculateTotalArea(IEnumerable<Shape> shapes)
{
	double area = 0.0;
	foreach (Shape shape in shapes)
	{
		if (shape is Triangle)
		{
			Triangle triangle = shape as Triangle;
			// Divide by 2.0 so the result is not truncated by integer division.
			area += (triangle.Base * triangle.Height) / 2.0;
		}
		else if (shape is Rectangle)
		{
			Rectangle rectangle = shape as Rectangle;
			area += rectangle.Height * rectangle.Width;
		}
	}
	return area;
}

This is actually quite a common approach in software designs where the domain objects are mere collections of properties, devoid of any self-contained logic. Look at the Triangle and Rectangle classes: they contain no logic whatsoever, only properties. They are reduced to the role of data-transfer-objects (DTOs). If you don’t understand at first what’s wrong with the above solution then I suggest you go through the Liskov Substitution Principle here. I won’t repeat what’s written in that post.

This post is about DRY, so you may ask what this method has to do with DRY at all, as we do not seem to repeat anything. Yes we do, although indirectly. Our initial intention was to create a class hierarchy so that we can work with the abstract class Shape elsewhere. Well, guess what, we’ve failed miserably. In this method we not only reveal the concrete implementation types of Shape but also force an external class to know about the internals of those concrete types. Every other method that needs an area will have to repeat the same type checks.

This is a typical example of how not to use if statements in software. In the posts on the SOLID design principles we mentioned the Tell-Don’t-Ask (TDA) principle. It basically states that you should not ask an object questions about its current state before you ask it to perform something. Well, this piece of code is a clear violation of TDA, although the lack of logic in the Triangle and Rectangle classes forced us to ask these questions.

The solution – or at least one of the viable solutions – will be to hide this calculation logic behind each concrete Shape class:

public abstract class Shape
{
	public abstract double CalculateArea();
}

public class Triangle : Shape
{
	public int Base { get; set; }
	public int Height { get; set; }

	public override double CalculateArea()
	{
		// Divide by 2.0 so the result is not truncated by integer division.
		return (Base * Height) / 2.0;
	}
}

public class Rectangle : Shape
{
	public int Width { get; set; }
	public int Height { get; set; }

	public override double CalculateArea()
	{
		return Width * Height;
	}
}

The updated total area calculation looks as follows:

private static double CalculateTotalArea(IEnumerable<Shape> shapes)
{
	double area = 0.0;
	foreach (Shape shape in shapes)
	{
		area += shape.CalculateArea();
	}
	return area;
}

We’ve got rid of the if statements, we don’t violate TDA and the logic to calculate the area is hidden behind each concrete type. This even allows us to follow the above-mentioned Liskov Substitution Principle.
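A practical consequence is that new shapes can be added without touching the total area calculation. The sketch below illustrates this under a few assumptions: Circle and AreaDemo are hypothetical additions, and the triangle divides by 2.0 so that the area is not truncated by integer division.

```csharp
using System;
using System.Collections.Generic;

public abstract class Shape
{
	public abstract double CalculateArea();
}

public class Triangle : Shape
{
	public int Base { get; set; }
	public int Height { get; set; }

	// 2.0 forces floating-point division so odd products keep their half.
	public override double CalculateArea() { return (Base * Height) / 2.0; }
}

public class Rectangle : Shape
{
	public int Width { get; set; }
	public int Height { get; set; }

	public override double CalculateArea() { return Width * Height; }
}

// Hypothetical new shape, added without changing CalculateTotalArea.
public class Circle : Shape
{
	public int Radius { get; set; }

	public override double CalculateArea() { return Math.PI * Radius * Radius; }
}

public static class AreaDemo
{
	public static double CalculateTotalArea(IEnumerable<Shape> shapes)
	{
		double area = 0.0;
		foreach (Shape shape in shapes)
		{
			area += shape.CalculateArea();
		}
		return area;
	}

	public static void Main()
	{
		var shapes = new List<Shape>
		{
			new Triangle { Base = 5, Height = 3 },   // area 7.5
			new Rectangle { Height = 6, Width = 4 }, // area 24
			new Circle { Radius = 1 }                // area pi
		};
		Console.WriteLine(CalculateTotalArea(shapes));
	}
}
```

With the if-based version, supporting Circle would have required another branch in every method that inspects the concrete type.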


The Don’t-Repeat-Yourself (DRY) design principle in .NET Part 1

Introduction

The idea behind the Don’t-Repeat-Yourself (DRY) design principle is an easy one: a piece of logic should only be represented once in an application. In other words, avoiding the repetition of any part of a system is a desirable trait. Code that is common to at least two different parts of your system should be factored out into a single location so that both parts call upon it. In plain English, all this means that you should stop doing copy+paste right away in your software. Your motto should be the following:

Repetition is the root of all software evil.

Repetition does not only refer to writing the same piece of logic twice in two different places. It also refers to repetition in your processes: testing, debugging, deployment etc. Repetition in logic is often solved by abstractions or some common service classes, whereas repetition in your process is tackled by automation. A lot of tedious processes can be automated by concepts from Continuous Integration and related automation software such as TeamCity. Unit testing can be automated by testing tools such as NUnit. You can read more on Test Driven Development and unit testing here.

In this short series on DRY I’ll concentrate on the ‘logic’ side of DRY. DRY is known by other names as well: Once and Only Once, and Duplication is Evil (DIE).

Examples

Magic strings

These are hard-coded strings that pop up at different places throughout your code, such as connection strings, formats and constants, as in the following code example:

class Program
{
	static void Main(string[] args)
	{
		DoSomething();
		DoSomethingAgain();
		DoSomethingMore();
		DoSomethingExtraordinary();
		Console.ReadLine();
	}

	private static void DoSomething()
	{
		string address = "Stockholm, Sweden";
		string format = "{0} is {1}, lives in {2}, age {3}";
		Console.WriteLine(format, "Nils", "a good friend", address, 30);
	}

	private static void DoSomethingAgain()
	{
		string address = "Stockholm, Sweden";
		string format = "{0} is {1}, lives in {2}, age {3}";
		Console.WriteLine(format, "Christian", "a neighbour", address, 54);
	}

	private static void DoSomethingMore()
	{
		string address = "Stockholm, Sweden";
		string format = "{0} is {1}, lives in {2}, age {3}";
		Console.WriteLine(format, "Eva", "my daughter", address, 4);
	}

	private static void DoSomethingExtraordinary()
	{
		string address = "Stockholm, Sweden";
		string format = "{0} is {1}, lives in {2}, age {3}";
		Console.WriteLine(format, "Lilly", "my daughter's best friend", address, 4);
	}
}

This is obviously a very simplistic example but imagine that the methods are located in different sections or even different modules in your application. In case you want to change the address you’ll need to find every hard-coded instance of the address. Likewise if you want to change the format you’ll need to update it in several different places. We can put these values into a separate location, such as Constants.cs:

public class Constants
{
	public static readonly string Address = "Stockholm, Sweden";
	public static readonly string StandardFormat = "{0} is {1}, lives in {2}, age {3}";
}

If you have a database connection string then that can be put into the configuration file app.config or web.config.

The updated programme looks as follows:

class Program
{
	static void Main(string[] args)
	{
		DoSomething();
		DoSomethingAgain();
		DoSomethingMore();
		DoSomethingExtraordinary();
		Console.ReadLine();
	}

	private static void DoSomething()
	{
		string address = Constants.Address;
		string format = Constants.StandardFormat;
		Console.WriteLine(format, "Nils", "a good friend", address, 30);
	}

	private static void DoSomethingAgain()
	{
		string address = Constants.Address;
		string format = Constants.StandardFormat;
		Console.WriteLine(format, "Christian", "a neighbour", address, 54);
	}

	private static void DoSomethingMore()
	{
		string address = Constants.Address;
		string format = Constants.StandardFormat;
		Console.WriteLine(format, "Eva", "my daughter", address, 4);
	}

	private static void DoSomethingExtraordinary()
	{
		string address = Constants.Address;
		string format = Constants.StandardFormat;
		Console.WriteLine(format, "Lilly", "my daughter's best friend", address, 4);
	}
}

This is a step in the right direction. If we change the constants in Constants.cs then the change will be propagated through the application. However, we still repeat the following bit over and over again:

string address = Constants.Address;
string format = Constants.StandardFormat;

The VALUES of the constants are now stored in one place, but what if we change the location of our constants to a different file? Or decide to read them from a file or a database? Then again we’ll need to revisit all these locations. We can move those variables to the class level and use them in our code as follows:

class Program
{
	private static string address = Constants.Address;
	private static string format = Constants.StandardFormat;

	static void Main(string[] args)
	{
		DoSomething();
		DoSomethingAgain();
		DoSomethingMore();
		DoSomethingExtraordinary();
		Console.ReadLine();
	}

	private static void DoSomething()
	{
		Console.WriteLine(format, "Nils", "a good friend", address, 30);
	}

	private static void DoSomethingAgain()
	{
		Console.WriteLine(format, "Christian", "a neighbour", address, 54);
	}

	private static void DoSomethingMore()
	{
		Console.WriteLine(format, "Eva", "my daughter", address, 4);
	}

	private static void DoSomethingExtraordinary()
	{
		Console.WriteLine(format, "Lilly", "my daughter's best friend", address, 4);
	}
}

We’ve got rid of the magic string repetition, but we can do better. Notice that each method performs basically the same thing: write to the console. This is an example of duplicate logic. The data written to the console is very similar in each case, so we can factor it out to another method:

private static void WriteToConsole(string name, string description, int age)
{
	Console.WriteLine(format, name, description, address, age);
}

The updated Program class looks as follows:

class Program
{
	private static string address = Constants.Address;
	private static string format = Constants.StandardFormat;

	static void Main(string[] args)
	{
		DoSomething();
		DoSomethingAgain();
		DoSomethingMore();
		DoSomethingExtraordinary();
		Console.ReadLine();
	}

	private static void DoSomething()
	{
		WriteToConsole("Nils", "a good friend", 30);
	}

	private static void DoSomethingAgain()
	{
		WriteToConsole("Christian", "a neighbour", 54);
	}

	private static void DoSomethingMore()
	{
		WriteToConsole("Eva", "my daughter", 4);
	}

	private static void DoSomethingExtraordinary()
	{
		WriteToConsole("Lilly", "my daughter's best friend", 4);
	}

	private static void WriteToConsole(string name, string description, int age)
	{
		Console.WriteLine(format, name, description, address, age);
	}
}

Magic numbers

It’s not only magic strings that can cause trouble but magic numbers as well. Imagine that you have the following class in your application:

public class Employee
{
	public string Name { get; set; }
	public int Age { get; set; }
	public string Department { get; set; }
}

We’ll imitate a database lookup as follows:

private static IEnumerable<Employee> GetEmployees()
{
	return new List<Employee>()
	{
		new Employee(){Age = 30, Department="IT", Name="John"}
		, new Employee(){Age = 34, Department="Marketing", Name="Jane"}
		, new Employee(){Age = 28, Department="Security", Name="Karen"}
		, new Employee(){Age = 40, Department="Management", Name="Dave"}
	};
}

Notice the usage of the index 1 in the following method:

private static void DoMagicInteger()
{
	List<Employee> employees = GetEmployees().ToList();
	if (employees.Count > 1) // index 1 requires at least two employees
	{
		Console.WriteLine(string.Concat("Age: ", employees[1].Age, ", department: ", employees[1].Department
			, ", name: ", employees[1].Name));
	}
}

So we only want to output the properties of the second employee in the list, i.e. the one with index 1. One issue is a conceptual one: why are we only interested in that particular employee? What’s so special about him/her? This is not clear to anyone investigating the code. The second issue is that if we want to change the value of the index then we’ll need to do it in three places. If this particular index is important elsewhere as well then we’ll have to visit those places too and update the index.

We can solve both issues using the same simple techniques as in the previous example. Set a new constant in Constants.cs:

public class Constants
{
	public static readonly string Address = "Stockholm, Sweden";
	public static readonly string StandardFormat = "{0} is {1}, lives in {2}, age {3}";
	public static readonly int IndexOfMyFavouriteEmployee = 1;
}

Then introduce a new class level variable in Program.cs:

private static int indexOfMyFavouriteEmployee = Constants.IndexOfMyFavouriteEmployee;

The updated DoMagicInteger() method looks as follows:

private static void DoMagicInteger()
{
	List<Employee> employees = GetEmployees().ToList();
	if (employees.Count > indexOfMyFavouriteEmployee) // the list must be long enough to contain the index
	{
		Employee favouriteEmployee = employees[indexOfMyFavouriteEmployee];
		Console.WriteLine(string.Concat("Age: ", favouriteEmployee.Age, 
			", department: ", favouriteEmployee.Department
			, ", name: ", favouriteEmployee.Name));
	}
}

