Flatten sequences with the C# LINQ SelectMany operator
February 2, 2017
Suppose that we have an object with a collection of other objects, like a customer with order items. Then we can also have a sequence of customers where each customer has her own list of orders. Sometimes we want to analyse all orders regardless of the customer, for example to find out how many units of product A have been sold. There are several ways to collect all orders from all customers and place them into one unified collection for further analysis.
The C# SelectMany operator has been specifically designed to extract collections of objects and flatten those collections into one. This post will provide a couple of examples to demonstrate its usage.
The demo objects
Imagine that we’re analysing the URLs of a web page. A URL item is modeled by the following object:
using Newtonsoft.Json;

public class PageComponent
{
    public string Name { get; set; }
    public string Type { get; set; }
    public string Extension { get; set; }
    public int SizeBytes { get; set; }

    // Serialize the component to JSON so that it is easy to print
    public override string ToString()
    {
        return JsonConvert.SerializeObject(this);
    }
}
…where the overridden ToString method uses the widely used JSON.NET library to produce the JSON representation of the page component object.
Then we have a WebPage class as well to hold the page name and the URL components:
public class WebPage
{
    public string Name { get; set; }
    public IEnumerable<PageComponent> PageComponents { get; set; }
}
Here’s our demo collection with 3 web pages, each with its own set of URLs:
IEnumerable<WebPage> webPages = new List<WebPage>()
{
    new WebPage()
    {
        Name = "www.greatsite.com",
        PageComponents = new List<PageComponent>()
        {
            new PageComponent() { Name = "home.html", Type = "html", Extension = "html", SizeBytes = 35623 },
            new PageComponent() { Name = "datatables.js", Type = "script", Extension = "js", SizeBytes = 2345 },
            new PageComponent() { Name = "nicepic.img", Type = "image", Extension = "img", SizeBytes = 24140 },
            new PageComponent() { Name = "favicon.ico", Type = "image", Extension = "ico", SizeBytes = 64152 }
        }
    },
    new WebPage()
    {
        Name = "www.awesomeness.com",
        PageComponents = new List<PageComponent>()
        {
            new PageComponent() { Name = "home.html", Type = "html", Extension = "html", SizeBytes = 35623 },
            new PageComponent() { Name = "myscript.ts", Type = "script", Extension = "ts", SizeBytes = 86124 },
            new PageComponent() { Name = "daily.img", Type = "image", Extension = "img", SizeBytes = 52667 },
            new PageComponent() { Name = "selfie.png", Type = "image", Extension = "png", SizeBytes = 22922 },
            new PageComponent() { Name = "mybeautifulface.png", Type = "image", Extension = "png", SizeBytes = 78416 },
            new PageComponent() { Name = "hello.img", Type = "image", Extension = "img", SizeBytes = 65046 }
        }
    },
    new WebPage()
    {
        Name = "www.boring.com",
        PageComponents = new List<PageComponent>()
        {
            new PageComponent() { Name = "datatables.css", Type = "stylesheet", Extension = "css", SizeBytes = 27470 },
            new PageComponent() { Name = "combined.css", Type = "stylesheet", Extension = "css", SizeBytes = 24627 },
            new PageComponent() { Name = "externallink.html", Type = "html", Extension = "html", SizeBytes = 72639 },
            new PageComponent() { Name = "googleads.html", Type = "html", Extension = "html", SizeBytes = 15873 },
            new PageComponent() { Name = "nicepic.img", Type = "image", Extension = "img", SizeBytes = 24140 }
        }
    }
};
Usage examples
The simplest application of SelectMany is to collect all Page Component objects into a single Page Components collection:
List<PageComponent> allPageComponents = webPages
    .SelectMany(wps => wps.PageComponents)
    .ToList();
allPageComponents.ForEach(pc => Console.WriteLine(pc.ToString()));
SelectMany accepts a collection selector which is a function that takes the items in the source collection, i.e. webPages in the above example, and returns a collection of some type. In our case we simply return the page components.
The code produces the following output:
{"Name":"home.html","Type":"html","Extension":"html","SizeBytes":35623}
{"Name":"datatables.js","Type":"script","Extension":"js","SizeBytes":2345}
{"Name":"nicepic.img","Type":"image","Extension":"img","SizeBytes":24140}
{"Name":"favicon.ico","Type":"image","Extension":"ico","SizeBytes":64152}
{"Name":"home.html","Type":"html","Extension":"html","SizeBytes":35623}
{"Name":"myscript.ts","Type":"script","Extension":"ts","SizeBytes":86124}
{"Name":"daily.img","Type":"image","Extension":"img","SizeBytes":52667}
{"Name":"selfie.png","Type":"image","Extension":"png","SizeBytes":22922}
{"Name":"mybeautifulface.png","Type":"image","Extension":"png","SizeBytes":78416}
{"Name":"hello.img","Type":"image","Extension":"img","SizeBytes":65046}
{"Name":"datatables.css","Type":"stylesheet","Extension":"css","SizeBytes":27470}
{"Name":"combined.css","Type":"stylesheet","Extension":"css","SizeBytes":24627}
{"Name":"externallink.html","Type":"html","Extension":"html","SizeBytes":72639}
{"Name":"googleads.html","Type":"html","Extension":"html","SizeBytes":15873}
{"Name":"nicepic.img","Type":"image","Extension":"img","SizeBytes":24140}
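To make the flattening step more concrete, here is a minimal sketch that compares SelectMany with a pair of nested foreach loops. The FlattenWithLoops helper and the trimmed-down classes are assumptions made for illustration only; they are not part of the demo objects above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down versions of the demo classes (illustration only)
public class PageComponent
{
    public string Name { get; set; }
}

public class WebPage
{
    public string Name { get; set; }
    public IEnumerable<PageComponent> PageComponents { get; set; }
}

public static class FlattenDemo
{
    // Hypothetical helper: flattens the pages with explicit nested loops
    public static List<PageComponent> FlattenWithLoops(IEnumerable<WebPage> pages)
    {
        var result = new List<PageComponent>();
        foreach (WebPage page in pages)
        {
            foreach (PageComponent component in page.PageComponents)
            {
                result.Add(component);
            }
        }
        return result;
    }

    public static void Main()
    {
        var pages = new List<WebPage>
        {
            new WebPage
            {
                Name = "www.greatsite.com",
                PageComponents = new List<PageComponent>
                {
                    new PageComponent { Name = "home.html" },
                    new PageComponent { Name = "datatables.js" }
                }
            },
            new WebPage
            {
                Name = "www.boring.com",
                PageComponents = new List<PageComponent>
                {
                    new PageComponent { Name = "combined.css" }
                }
            }
        };

        List<PageComponent> viaLoops = FlattenWithLoops(pages);
        List<PageComponent> viaSelectMany = pages.SelectMany(wp => wp.PageComponents).ToList();

        // Both approaches yield the same objects in the same order
        Console.WriteLine(viaLoops.SequenceEqual(viaSelectMany)); // True
        Console.WriteLine(string.Join(", ", viaSelectMany.Select(pc => pc.Name)));
        // home.html, datatables.js, combined.css
    }
}
```

One difference worth keeping in mind: SelectMany on its own is lazily evaluated, so nothing is enumerated until ToList (or another enumeration) is called, whereas the loop version builds its list immediately.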
What happens if the collection selector returns the web page name?
List<char> chars = webPages.SelectMany(wps => wps.Name).ToList();
chars.ForEach(c => Console.Write(c));
The result is a collection of chars where each char is taken from each web page name:
www.greatsite.comwww.awesomeness.comwww.boring.com
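The reason this compiles is that a string implements IEnumerable<char>, so SelectMany treats each name as a collection of characters. The following sketch contrasts it with Select, which keeps one result per source item. The two-page list and the class name are assumptions chosen to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down WebPage with only the property this example needs
public class WebPage
{
    public string Name { get; set; }
}

public static class SelectVersusSelectManyDemo
{
    public static void Main()
    {
        var pages = new List<WebPage>
        {
            new WebPage { Name = "www.greatsite.com" },
            new WebPage { Name = "www.boring.com" }
        };

        // Select keeps one result per source item: two strings
        List<string> names = pages.Select(wp => wp.Name).ToList();

        // SelectMany enumerates each string as IEnumerable<char>
        // and concatenates the characters into one sequence
        List<char> chars = pages.SelectMany(wp => wp.Name).ToList();

        Console.WriteLine(names.Count); // 2
        Console.WriteLine(chars.Count); // 31 (17 + 14 characters)
        Console.WriteLine(new string(chars.ToArray())); // www.greatsite.comwww.boring.com
    }
}
```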
There’s nothing stopping us from combining SelectMany with other LINQ operators. We can also attach other LINQ operators within the collection selector function if we need to return some other collection type. In the following example we collect the extension of every URL element in the web pages collection, keep only the distinct values and put them in alphabetical order:
var uniqueExtensions = webPages
    .SelectMany(wps => wps.PageComponents
        .Select(pc => new { Extension = pc.Extension }))
    .Distinct()
    .OrderBy(n => n.Extension)
    .ToList();
uniqueExtensions.ForEach(p => Console.WriteLine(p.Extension));
Here’s the result:
css
html
ico
img
js
png
ts
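The anonymous object is not strictly needed when only one property is of interest. As a possible variation, the sketch below flattens first and then projects the extension strings directly. The UniqueExtensions helper and the shortened sample data are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down versions of the demo classes (illustration only)
public class PageComponent
{
    public string Extension { get; set; }
}

public class WebPage
{
    public IEnumerable<PageComponent> PageComponents { get; set; }
}

public static class ExtensionsDemo
{
    // Hypothetical helper: distinct extensions in alphabetical order
    public static List<string> UniqueExtensions(IEnumerable<WebPage> pages)
    {
        return pages
            .SelectMany(wp => wp.PageComponents) // flatten to one sequence of components
            .Select(pc => pc.Extension)          // project each component to its extension
            .Distinct()                          // drop duplicate strings
            .OrderBy(ext => ext)                 // sort alphabetically
            .ToList();
    }

    public static void Main()
    {
        var pages = new List<WebPage>
        {
            new WebPage
            {
                PageComponents = new List<PageComponent>
                {
                    new PageComponent { Extension = "js" },
                    new PageComponent { Extension = "css" }
                }
            },
            new WebPage
            {
                PageComponents = new List<PageComponent>
                {
                    new PageComponent { Extension = "html" },
                    new PageComponent { Extension = "css" }
                }
            }
        };

        Console.WriteLine(string.Join(", ", UniqueExtensions(pages))); // css, html, js
    }
}
```

Whether to flatten first and project afterwards, or to project inside the collection selector as in the original example, is mostly a matter of readability; both produce the same set of extensions.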
SelectMany has a variation where an index is supplied to the collection selector function. The index is the position of the source object that the collection item was taken from. Here’s an example:
var allPageComponentsWithIndex = webPages
    .SelectMany((wps, index) => wps.PageComponents
        .Select(c => new { ComponentName = c.Name, PositionOfContainer = index }))
    .ToList();
allPageComponentsWithIndex.ForEach(pc => Console.WriteLine(string.Concat(pc.PositionOfContainer, ": ", pc.ComponentName)));
Here’s the output:
0: home.html
0: datatables.js
0: nicepic.img
0: favicon.ico
1: home.html
1: myscript.ts
1: daily.img
1: selfie.png
1: mybeautifulface.png
1: hello.img
2: datatables.css
2: combined.css
2: externallink.html
2: googleads.html
2: nicepic.img
home.html has index 0 because it belongs to www.greatsite.com, which has position 0 in the source collection “webPages”. datatables.css appears in the URL collection of www.boring.com, which has position 2 in the webPages source collection.
SelectMany has an overload which accepts a result selector function. The result selector accepts the items in the source collection, i.e. WebPage in our case, and the items generated in the collection selector function. The result selector returns an object of some type that will be added to the resulting collection.
Here’s an example where we pair each URL item with the name of its page:
var pagesWithComponents = webPages
    .SelectMany(
        wps => wps.PageComponents,
        (wp, cp) => new { PageName = wp.Name, ComponentName = cp.Name })
    .ToList();
pagesWithComponents.ForEach(p => Console.WriteLine(string.Concat(p.PageName, ": ", p.ComponentName)));
The “wp” parameter in the result selector will be a WebPage object since that is the element type of the source collection. “cp” will be of type PageComponent because the collection selector function returns page components. The result selector returns an anonymous object with 2 properties: PageName and ComponentName. Here’s the output:
www.greatsite.com: home.html
www.greatsite.com: datatables.js
www.greatsite.com: nicepic.img
www.greatsite.com: favicon.ico
www.awesomeness.com: home.html
www.awesomeness.com: myscript.ts
www.awesomeness.com: daily.img
www.awesomeness.com: selfie.png
www.awesomeness.com: mybeautifulface.png
www.awesomeness.com: hello.img
www.boring.com: datatables.css
www.boring.com: combined.css
www.boring.com: externallink.html
www.boring.com: googleads.html
www.boring.com: nicepic.img
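The same pairing can also be written in LINQ query syntax: two consecutive from clauses followed by a select are translated by the compiler into a SelectMany call with a result selector. The sketch below shows both forms side by side; the shortened sample data and the class name are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down versions of the demo classes (illustration only)
public class PageComponent
{
    public string Name { get; set; }
}

public class WebPage
{
    public string Name { get; set; }
    public IEnumerable<PageComponent> PageComponents { get; set; }
}

public static class QuerySyntaxDemo
{
    public static void Main()
    {
        var pages = new List<WebPage>
        {
            new WebPage
            {
                Name = "www.greatsite.com",
                PageComponents = new List<PageComponent>
                {
                    new PageComponent { Name = "home.html" },
                    new PageComponent { Name = "favicon.ico" }
                }
            },
            new WebPage
            {
                Name = "www.boring.com",
                PageComponents = new List<PageComponent>
                {
                    new PageComponent { Name = "combined.css" }
                }
            }
        };

        // Query syntax: the second "from" becomes the collection selector,
        // the "select" becomes the result selector
        var viaQuery =
            from wp in pages
            from pc in wp.PageComponents
            select new { PageName = wp.Name, ComponentName = pc.Name };

        // Equivalent method syntax
        var viaMethod = pages.SelectMany(
            wp => wp.PageComponents,
            (wp, pc) => new { PageName = wp.Name, ComponentName = pc.Name });

        foreach (var pair in viaQuery)
        {
            Console.WriteLine(string.Concat(pair.PageName, ": ", pair.ComponentName));
        }

        // Anonymous types compare by value, so the two sequences are equal
        Console.WriteLine(viaQuery.SequenceEqual(viaMethod)); // True
    }
}
```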
The above overload of SelectMany has a version with the index parameter which has the same role as the index we saw above. In the final example we add the position of the page to the above output:
var pagesWithComponentsAndIndex = webPages
    .SelectMany(
        (wps, index) => wps.PageComponents
            .Select(c => new { ComponentName = c.Name, PositionOfContainer = index }),
        (wp, cp) => new
        {
            PageName = wp.Name,
            ComponentName = cp.ComponentName,
            PositionOfContainer = cp.PositionOfContainer
        })
    .ToList();
pagesWithComponentsAndIndex.ForEach(c => Console.WriteLine(string.Concat(
    "Page: ", c.PageName, ", position: ", c.PositionOfContainer, ", component: ", c.ComponentName)));
Note that we select anonymous objects in the collection selector function. Each item will have two properties: ComponentName and PositionOfContainer. The “cp” input parameter in the result selector will therefore also be an anonymous object that exposes those two properties. “wp” will still be of type WebPage.
Here’s the output:
Page: www.greatsite.com, position: 0, component: home.html
Page: www.greatsite.com, position: 0, component: datatables.js
Page: www.greatsite.com, position: 0, component: nicepic.img
Page: www.greatsite.com, position: 0, component: favicon.ico
Page: www.awesomeness.com, position: 1, component: home.html
Page: www.awesomeness.com, position: 1, component: myscript.ts
Page: www.awesomeness.com, position: 1, component: daily.img
Page: www.awesomeness.com, position: 1, component: selfie.png
Page: www.awesomeness.com, position: 1, component: mybeautifulface.png
Page: www.awesomeness.com, position: 1, component: hello.img
Page: www.boring.com, position: 2, component: datatables.css
Page: www.boring.com, position: 2, component: combined.css
Page: www.boring.com, position: 2, component: externallink.html
Page: www.boring.com, position: 2, component: googleads.html
Page: www.boring.com, position: 2, component: nicepic.img