Breaking parallel loops in .NET C# using the Break method

It’s not uncommon to break the execution of a for/foreach loop using the ‘break’ keyword. A for loop can look through a list of integers and, if the loop body finds a matching value, the loop can be exited. It’s another discussion whether ‘while’ and ‘do-while’ loops might be a better alternative, but there you go.

You cannot simply break out of a parallel loop using the break keyword. However, we can achieve the effect with the ParallelLoopState class. In a separate post we looked at using the Stop method of the ParallelLoopState class. Here we’ll look at a slightly different method of the same class: Break(). Let’s say we have the following integer list…:

List<int> integers = new List<int>();

for (int i = 0; i <= 100; i++)
{
	integers.Add(i);
}

…and we want to break the loop as soon as we’ve found a number higher than 50. Both Parallel.For() and Parallel.ForEach() accept an Action of T parameter as we saw before. There is also an overload that takes an Action of T and ParallelLoopState. The loop state is created automatically by the Parallel class. The loop state object has a Break method which stops the loop execution. To be more precise: if Break is called in the 5th iteration, then afterwards only those iterations will be started that are required to process items 1-4. Other iterations may of course have been started by the scheduler already, and they will run to completion. So it is guaranteed that at least the first five items will be processed. Break() can even be called multiple times if several items meet the break condition. In the example below, if n separate iterations receive an integer higher than 50 then Break will be called n times:

Parallel.ForEach(integers, (int item, ParallelLoopState state) =>
{
	if (item > 50)
	{
		Console.WriteLine("Higher than 50: {0}, exiting loop.", item);
		state.Break();
	}
	else
	{
		Console.WriteLine("50 or less: {0}", item);
	}
});

If Break is called more than once then the lowest item will be taken as a boundary by the Parallel class. If Break is called at items 5, 10 and 15 then all iterations required to process items 1-5 will be completed.
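
The boundary described above can be read back from the ParallelLoopResult that the loop returns. The following is a minimal sketch, not part of the original example: it uses Parallel.For over the indexes 0-100 so that the index and the item are the same number, and the names BreakDemo and Run are only for illustration.

```csharp
using System;
using System.Threading.Tasks;

public static class BreakDemo
{
    // Runs a parallel loop over the indexes 0..100 and calls Break()
    // for every index above 50. The class and method names are illustrative.
    public static ParallelLoopResult Run()
    {
        return Parallel.For(0, 101, (int index, ParallelLoopState state) =>
        {
            if (index > 50)
            {
                state.Break();
            }
        });
    }

    public static void Main()
    {
        ParallelLoopResult result = Run();

        // IsCompleted is false because the loop was ended early.
        Console.WriteLine("Completed: {0}", result.IsCompleted);

        // LowestBreakIteration holds the lowest index at which Break() was called.
        Console.WriteLine("Lowest break iteration: {0}", result.LowestBreakIteration);
    }
}
```

Since every iteration below the lowest break point is still executed, index 51 always gets its turn and calls Break itself, so the reported boundary should be 51 no matter which thread called Break first.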

View the list of posts on the Task Parallel Library here.

How to cancel parallel LINQ queries in .NET C#

We saw in several posts on TPL on this blog how the CancellationToken object can be used to cancel Tasks. The same token can be used to cancel parallel queries as well. An instance of the token must be supplied to the WithCancellation extension method.

Define the cancellation token and the data source:

CancellationTokenSource cancellationTokenSource
	= new CancellationTokenSource();

int[] integerArray = new int[10000000];
for (int i = 0; i < integerArray.Length; i++)
{
	integerArray[i] = i;
}

Define the query. Notice how the token is provided to the query:

IEnumerable<double> query = integerArray
	.AsParallel()
	.WithCancellation(cancellationTokenSource.Token)
	.Select(item =>
	{
		return Math.Sqrt(item);
	});

Start a separate task that will cancel the token after 5 seconds:

Task.Factory.StartNew(() =>
{
	Thread.Sleep(5000);
	cancellationTokenSource.Cancel();
	Console.WriteLine("Token source cancelled");
});

Loop through the query results and catch the OperationCanceledException:

try
{
	foreach (double d in query)
	{
		Console.WriteLine("Result: {0}", d);
	}
}
catch (OperationCanceledException)
{
	Console.WriteLine("Caught cancellation exception");
}

Do not assume that cancelling the token will cause the item processing to stop immediately. Items that were already being processed when the token was cancelled will be completed.
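
As a variation on the steps above, the token source can cancel itself after a delay with CancelAfter (available from .NET 4.5), which removes the need for a separate task. This is only a sketch: the Thread.Sleep(1) call is an assumption that simulates slow work so that the query is still running when the timer fires, and the names PlinqCancelDemo and RunAndCancel are made up for the example.

```csharp
using System;
using System.Linq;
using System.Threading;

public static class PlinqCancelDemo
{
    // Returns true if enumerating the query ended with an OperationCanceledException.
    public static bool RunAndCancel(int cancelAfterMilliseconds)
    {
        CancellationTokenSource cancellationTokenSource = new CancellationTokenSource();

        ParallelQuery<double> query = Enumerable.Range(0, 1000000)
            .AsParallel()
            .WithCancellation(cancellationTokenSource.Token)
            .Select(item =>
            {
                Thread.Sleep(1); // simulated slow work, so the query outlives the timer
                return Math.Sqrt(item);
            });

        // Ask the token source to cancel itself after the given delay.
        cancellationTokenSource.CancelAfter(cancelAfterMilliseconds);

        try
        {
            foreach (double d in query)
            {
                // Results are consumed until the cancellation is observed.
            }
            return false;
        }
        catch (OperationCanceledException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine("Cancelled: {0}", RunAndCancel(200));
    }
}
```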


Parallel LINQ in .NET C#: keeping the order

In an earlier post we saw a simple example of PLINQ. We saw that the items are processed in an arbitrary order.

Consider the following:

int[] sourceData = new int[10];
for (int i = 0; i < sourceData.Length; i++)
{
	sourceData[i] = i;
}

IEnumerable<int> parallelResults =
	from item in sourceData.AsParallel()
	where item % 2 == 0
	select item;

foreach (int item in parallelResults)
{
	Console.WriteLine("Item {0}", item);
}

When I ran the code on my machine I got the following output:

0, 4, 6, 8, 2

What if you want the results to come back in the same order as the source, which here means ascending? Just append the AsOrdered() extension method:

IEnumerable<int> parallelResults =
	from item in sourceData.AsParallel().AsOrdered()
	where item % 2 == 0
	select item;

This produces the same order as a sequential query execution but with the benefits of parallel execution. However, there’s a small cost to this. Without ordering PLINQ could freely decide how to start and optimise the threads. Now there’s a performance cost to restore the order with the AsOrdered extension method. Keep this in mind and only use ordering if it really matters. Otherwise you can order the resulting list with “normal” LINQ afterwards.
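
The two options mentioned above can be compared side by side. This is a small sketch with made-up method names. Note that the second variant sorts by value, which matches the source order here only because the source array is ascending.

```csharp
using System;
using System.Linq;

public static class OrderingDemo
{
    // Preserves the source order inside the parallel query.
    public static int[] EvenNumbersOrdered(int[] source)
    {
        return source.AsParallel()
            .AsOrdered()
            .Where(item => item % 2 == 0)
            .ToArray();
    }

    // Runs the query unordered and sorts the materialised result with sequential LINQ.
    public static int[] EvenNumbersSortedAfterwards(int[] source)
    {
        return source.AsParallel()
            .Where(item => item % 2 == 0)
            .ToArray()
            .OrderBy(item => item)
            .ToArray();
    }

    public static void Main()
    {
        int[] sourceData = Enumerable.Range(0, 10).ToArray();
        Console.WriteLine(string.Join(", ", EvenNumbersOrdered(sourceData)));
        Console.WriteLine(string.Join(", ", EvenNumbersSortedAfterwards(sourceData)));
    }
}
```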


Handling exceptions in parallel loops in .NET C#

If a thread in a parallel loop throws an exception then no new iterations will be started. However, iterations already in progress are allowed to run to completion.

You can check if some other iteration has thrown an exception using the ParallelLoopState class. This class has an IsExceptional property:

Parallel.For(0, 10, (int index, ParallelLoopState state) =>
{
	if (state.IsExceptional)
	{
		//do something
	}
});

All exceptions thrown within the parallel loop are collected in an AggregateException object:

List<int> integers = new List<int>();

for (int i = 0; i <= 100; i++)
{
	integers.Add(i);
}

try
{
	Parallel.ForEach(integers, (int item, ParallelLoopState state) =>
	{
		if (item > 50)
		{
			// The single-string ArgumentNullException constructor takes a parameter name, not a message
			throw new InvalidOperationException("Something has happened");
		}
		else
		{
			Console.WriteLine("50 or less: {0}", item);
		}

		if (state.IsExceptional)
		{
			Console.WriteLine("Exception!");
		}

	});
}
catch (AggregateException ae)
{
	ae.Handle((inner) =>
	{
		Console.WriteLine(inner.Message);
		return true;
	});
}
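
Besides Handle, the collected exceptions can be inspected one by one through the InnerExceptions property. The sketch below is an illustration with hypothetical names (LoopExceptionDemo, RunFailingLoop). How many exceptions end up in the collection depends on how many iterations were already running when the first one failed, so the count can differ from run to run.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class LoopExceptionDemo
{
    // Returns the exceptions collected from the loop, or an empty array if none was thrown.
    public static Exception[] RunFailingLoop()
    {
        try
        {
            Parallel.For(0, 101, index =>
            {
                if (index > 50)
                {
                    throw new InvalidOperationException("Failed at index " + index);
                }
            });
            return new Exception[0];
        }
        catch (AggregateException ae)
        {
            // Flatten() unwraps nested AggregateException instances, if there are any.
            return ae.Flatten().InnerExceptions.ToArray();
        }
    }

    public static void Main()
    {
        foreach (Exception inner in RunFailingLoop())
        {
            Console.WriteLine("{0}: {1}", inner.GetType().Name, inner.Message);
        }
    }
}
```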

You can read more about handling exceptions thrown by tasks here, here and here.



Supply loop options to parallel loops in .NET C#

It’s possible to supply additional options to parallel for and foreach loops. The ParallelOptions class has the following properties:

  • CancellationToken: the cancellation token that can be used to cancel a parallel loop
  • MaxDegreeOfParallelism: sets the max concurrency for a parallel loop. A value of -1 means no limit
  • TaskScheduler: this is normally null as the default task scheduler is very efficient. However, if you have a custom task scheduler then you can supply it here
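
The CancellationToken property from the list above can be sketched as follows. To keep the outcome predictable the token is cancelled before the loop starts, which is an assumption made only for this illustration, as are the class and method names. A cancelled loop throws an OperationCanceledException directly rather than wrapping it in an AggregateException.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LoopCancellationDemo
{
    // Returns true if the loop ended with an OperationCanceledException.
    public static bool RunWithCancelledToken()
    {
        CancellationTokenSource cancellationTokenSource = new CancellationTokenSource();
        ParallelOptions parallelOptions = new ParallelOptions
        {
            CancellationToken = cancellationTokenSource.Token
        };

        // Cancel up front so that the result does not depend on timing.
        cancellationTokenSource.Cancel();

        try
        {
            Parallel.For(0, 10, parallelOptions, index =>
            {
                Console.WriteLine("Index {0}", index);
            });
            return false;
        }
        catch (OperationCanceledException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine("Cancelled: {0}", RunWithCancelledToken());
    }
}
```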

We won’t go into building a custom task scheduler as it is a heavy topic and probably 98% of all .NET programmers are unlikely to ever need one.

So let’s see what MaxDegreeOfParallelism can do for us. It cannot do much directly other than setting a limit on the number of concurrent tasks used by a parallel loop. Note the word ‘limit’: setting a high number won’t guarantee that this many Tasks will be started. It is only an upper limit that the task scheduler will take into account. A value of 0, i.e. no concurrency at all, is not allowed: assigning it to MaxDegreeOfParallelism throws an ArgumentOutOfRangeException. A value of 1 is practically equal to sequential execution.

Declare the options object as follows:

ParallelOptions parallelOptions = new ParallelOptions() { MaxDegreeOfParallelism = 1 };

You can supply the options to the Parallel.For and ForEach methods as input parameters:

Parallel.For(0, 10, parallelOptions, index =>
{
	Console.WriteLine("For Index {0} started", index);
	Thread.Sleep(100);
	Console.WriteLine("For Index {0} finished", index);
});

int[] integers = new int[] { 0, 2, 4, 6, 8 };

Parallel.ForEach(integers, parallelOptions, index =>
{
	Console.WriteLine("ForEach Index {0} started", index);
	Thread.Sleep(100);
	Console.WriteLine("ForEach Index {0} finished", index);
});

Run the code and you should see that the index values are processed in a sequential manner. Then set MaxDegreeOfParallelism to e.g. 5 and you should see something different.
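
One way to observe the limit without reading the console output is to count how many iterations are running at the same time. The following is a rough sketch with hypothetical names. The counter logic is not part of the Parallel API, it is just bookkeeping added for the demonstration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ConcurrencyDemo
{
    // Returns the highest number of loop bodies observed running simultaneously.
    public static int MeasurePeakConcurrency(int maxDegreeOfParallelism)
    {
        ParallelOptions parallelOptions = new ParallelOptions
        {
            MaxDegreeOfParallelism = maxDegreeOfParallelism
        };

        int running = 0;
        int peak = 0;
        object gate = new object();

        Parallel.For(0, 20, parallelOptions, index =>
        {
            int now = Interlocked.Increment(ref running);
            lock (gate)
            {
                if (now > peak)
                {
                    peak = now;
                }
            }

            Thread.Sleep(50); // simulated work
            Interlocked.Decrement(ref running);
        });

        return peak;
    }

    public static void Main()
    {
        Console.WriteLine("Peak with limit 1: {0}", MeasurePeakConcurrency(1));
        Console.WriteLine("Peak with limit 5: {0}", MeasurePeakConcurrency(5));
    }
}
```

With a limit of 1 the peak is always 1. With a limit of 5 the peak can be anything from 1 to 5, depending on what the scheduler decides.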


Breaking parallel loops in .NET C# using the Stop method

It’s not uncommon to break the execution of a for/foreach loop using the ‘break’ keyword. A for loop can look through a list of integers and, if the loop body finds a matching value, the loop can be exited. It’s another discussion whether ‘while’ and ‘do-while’ loops might be a better alternative, but there you go.

You cannot simply break out of a parallel loop using the break keyword. However, we can achieve this effect with the ParallelLoopState class. Let’s say we have the following integer list…:

List<int> integers = new List<int>() { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };

…and we want to break the loop as soon as we’ve found a number higher than 5. Both Parallel.For() and Parallel.ForEach() accept an Action of T parameter as we saw before. There is also an overload that takes an Action of T and ParallelLoopState. The loop state is created automatically by the Parallel class. The loop state object has a Stop() method which stops the loop execution. Or more specifically, it requests the loop to be stopped and the task scheduler will go about doing that. Keep in mind that some iterations may continue even after the call to Stop was made:

Parallel.ForEach(integers, (int item, ParallelLoopState state) =>
{
	if (item > 5)
	{
		Console.WriteLine("Higher than 5: {0}, exiting loop.", item);
		state.Stop();
	}
	else
	{
		Console.WriteLine("5 or less: {0}", item);
	}
});

So parallel loops can be exited but not with the same precision as in the case of synchronous loops.
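
Iterations that are still running can find out that a stop was requested by checking the IsStopped property, and the caller can inspect the returned ParallelLoopResult. A minimal sketch, assuming the same ten integers as above (StopDemo and Run are illustrative names):

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class StopDemo
{
    public static ParallelLoopResult Run()
    {
        List<int> integers = new List<int>() { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };

        return Parallel.ForEach(integers, (int item, ParallelLoopState state) =>
        {
            if (state.IsStopped)
            {
                // Another iteration has already called Stop(), so skip the work.
                return;
            }

            if (item > 5)
            {
                state.Stop();
            }
        });
    }

    public static void Main()
    {
        ParallelLoopResult result = Run();

        // IsCompleted is false because the loop was stopped early.
        Console.WriteLine("Completed: {0}", result.IsCompleted);

        // Unlike Break(), Stop() does not record a boundary, so this property has no value.
        Console.WriteLine("Has break iteration: {0}", result.LowestBreakIteration.HasValue);
    }
}
```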


Parallel stepped for loops in .NET C#

We saw in an earlier post how to create a parallel for loop using the Parallel class in .NET. One drawback of that method is that we cannot create the parallel equivalent of a sequential stepped for loop:

for (int i = 0; i < 20; i += 2)

Parallel.For accepts the start index and the end index but not the step value. Parallel.ForEach comes to the rescue: we create an integer list with only those values that should be processed. Create the stepped integer list as follows:

public IEnumerable<int> SteppedIntegerList(int startIndex,
	int endIndex, int stepSize)
{
	// Lazily yields startIndex, startIndex + stepSize, ... up to but excluding endIndex
	for (int i = startIndex; i < endIndex; i += stepSize)
	{
		yield return i;
	}
}

Then call Parallel.ForEach:

Parallel.ForEach(SteppedIntegerList(0, 20, 2), index =>
{
	Console.WriteLine("Index value: {0}", index);
});
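
An alternative that stays with Parallel.For is to loop over the number of steps and compute the actual index inside the loop body. This is a sketch of that idea rather than a replacement for the approach above. It assumes a positive step size, and the helper name VisitStepped and the ConcurrentBag used to collect the visited values are only there for the demonstration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class SteppedLoopDemo
{
    // Visits startIndex, startIndex + stepSize, ... (excluding endIndex) in parallel
    // and returns the visited values in ascending order. Assumes stepSize > 0.
    public static int[] VisitStepped(int startIndex, int endIndex, int stepSize)
    {
        // Number of steps, rounded up.
        int count = endIndex > startIndex
            ? (endIndex - startIndex + stepSize - 1) / stepSize
            : 0;

        ConcurrentBag<int> visited = new ConcurrentBag<int>();

        Parallel.For(0, count, step =>
        {
            int index = startIndex + step * stepSize;
            visited.Add(index);
        });

        return visited.OrderBy(value => value).ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", VisitStepped(0, 20, 2)));
    }
}
```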


Parallel for-each loops in .NET C#

It’s trivial to create parallel for-each loops in .NET using the built-in Parallel class. It has a ForEach() method with a wide range of overloaded varieties. One of the easier ones accepts 2 parameters:

  • An IEnumerable object
  • An Action of T which is used to process each item in the list

The parallel ForEach loop will process any type of object in an IEnumerable enumeration. Example:

List<string> dataList = new List<string>
{
	"this", "is", "random", "sentence", "hello", "goodbye"
};

Parallel.ForEach(dataList, item =>
{
	Console.WriteLine("Item {0} has {1} characters",
		item, item.Length);
});

Run the code and you’ll see that the items in the string list are not processed sequentially.
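
If the position of each item matters even though the processing order does not, one of the other ForEach overloads passes the item's index to the loop body. A short sketch with illustrative names, where each result is written to its own slot in an array:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class IndexedForEachDemo
{
    // Computes the length of every string in parallel, keeping the source positions.
    public static int[] Lengths(IList<string> items)
    {
        int[] lengths = new int[items.Count];

        Parallel.ForEach(items, (string item, ParallelLoopState state, long index) =>
        {
            // Each iteration writes to its own slot, so no locking is needed.
            lengths[index] = item.Length;
        });

        return lengths;
    }

    public static void Main()
    {
        List<string> dataList = new List<string>
        {
            "this", "is", "random", "sentence", "hello", "goodbye"
        };

        Console.WriteLine(string.Join(", ", Lengths(dataList)));
    }
}
```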


Parallel for loops in .NET C#

It’s trivial to start parallel loops in .NET using the built-in Parallel class. The class has a For() method with a wide variety of overloads. One of the easier ones accepts 3 parameters:

  • The start index: this is inclusive, i.e. this will be the first index value in the loop
  • The end index: this is exclusive, so it won’t be processed in the loop
  • An Action of int which represents the method that should be executed in each loop, where int is the actual index value

Example:

Parallel.For(0, 10, index =>
{
	Console.WriteLine("Task Id {0} processing index: {1}",
		Task.CurrentId, index);
});

If you run this code then you’ll see that the index values are indeed processed in a parallel fashion. The printout in the console window may show that index 5 is processed a bit before 4, or that task #1 processed index 2 and task #2 processed index 7. This all depends on the task scheduler, so use parallel loops only if you don’t care about the actual processing order.
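
A typical case where the processing order does not matter is adding up values. The sketch below uses one of the other For overloads, which keeps a subtotal per worker and merges the subtotals at the end. The names ParallelSumDemo and SumRange are made up for the example.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelSumDemo
{
    // Adds up the integers from fromInclusive to toExclusive - 1 in parallel.
    public static long SumRange(int fromInclusive, int toExclusive)
    {
        long total = 0;

        Parallel.For<long>(fromInclusive, toExclusive,
            () => 0L, // initial subtotal for each worker
            (int index, ParallelLoopState state, long subtotal) => subtotal + index,
            subtotal => Interlocked.Add(ref total, subtotal)); // merge step, must be thread-safe

        return total;
    }

    public static void Main()
    {
        Console.WriteLine("Sum of 0..9: {0}", SumRange(0, 10));
    }
}
```

The result is the same on every run even though the order in which the indexes are visited is not.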


Parallel LINQ in .NET C#: the ForAll() extension

The ForAll() extension method is part of PLINQ: it is defined on ParallelQuery rather than on ordinary LINQ to Objects sequences. It executes an Action on each item of a parallel query, typically as the last step after operators such as Where have filtered the items.

The following example takes each item in an integer array, selects only the even numbers and then prints each item and its square root:

int[] integerArray = new int[50];
for (int i = 0; i < integerArray.Length; i++)
{
	integerArray[i] = i;
}

integerArray.AsParallel()
	.Where(item => item % 2 == 0)
	.ForAll(item => Console.WriteLine("Item {0} Result {1}",
		item, Math.Sqrt(item)));
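
Because the Action passed to ForAll runs on several threads at once, anything it writes to should be thread-safe. The following sketch collects the square roots in a ConcurrentBag instead of printing them. The class and method names are hypothetical, and the final sort is only there to make the output stable.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;

public static class ForAllDemo
{
    // Returns the square roots of the even numbers in the source, in ascending order.
    public static double[] EvenSquareRoots(int[] source)
    {
        ConcurrentBag<double> roots = new ConcurrentBag<double>();

        source.AsParallel()
            .Where(item => item % 2 == 0)
            .ForAll(item => roots.Add(Math.Sqrt(item)));

        return roots.OrderBy(root => root).ToArray();
    }

    public static void Main()
    {
        int[] integerArray = Enumerable.Range(0, 50).ToArray();
        double[] roots = EvenSquareRoots(integerArray);
        Console.WriteLine("Collected {0} square roots", roots.Length);
    }
}
```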

