Introduction to MongoDb with .NET part 29: aggregation in the .NET driver using strong typing and dedicated functions
May 18, 2016
Introduction
In the previous post we looked at how to build an aggregation pipeline in the MongoDb .NET driver using loosely typed BsonDocument objects and the AppendStage function. AppendStage can accept any type of aggregation stage and each stage is described with plain strings. An advantage of AppendStage is that we can express any kind of aggregation stage with it, including those for which there’s no suitable dedicated function, like $redact.
In this post we’ll look at the strongly typed alternative to the previous solution.
The dedicated aggregation stage functions
The Aggregate function opens a fluent API where AppendStage is just one alternative. As you type…
modelContext.ZipCodes.Aggregate().
…in Visual Studio you’ll see that there are many other functions available that are the dedicated versions of the various stages like $match or $group.
Let’s see how we can use them to rewrite our original JSON query:
db.zipcodes.aggregate([
    {$group: {"_id": "$state", "population": {$sum: "$pop"}}},
    {$match: {population: {$gte: 5000000}}},
    {$sort: {"_id": 1}},
    {$limit: 5}
])
Grouping
The group stage is the most complex of these; the others are straightforward. The Group function has 3 overloads. The third one is the most flexible: it accepts an ID selector and a grouping selector, both expressed as LINQ lambda expressions.
The $group stage above can be expressed as follows:
ModelContext modelContext = ModelContext.Create(new ConfigFileConfigurationRepository(), new AppConfigConnectionStringRepository());
var result = modelContext.ZipCodes.Aggregate()
    .Group(key => key.State, value => new { State = value.Key, Population = value.Sum(key => key.Population) })
    .ToList();
foreach (var item in result)
{
    Console.WriteLine(string.Format("{0}: {1}", item.State, item.Population));
}
The first selector specifies that we want to group by the State field of the zip codes. The second selector builds an anonymous object that describes what we want to have returned: the key, which is the state, and a Population property, which is the sum of the Population field of the zip code objects in that group.
The above statement returns a list of anonymous objects with 2 properties: State and Population. Here are the first 5 elements:
MN: 4372982
SC: 3486703
RI: 1003218
OK: 3145585
MA: 6016425
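If you'd like to see what the two selectors do without a running MongoDb instance, here is a minimal in-memory sketch. It uses LINQ to Objects rather than the driver, so it is only an analogy for the Group stage. The ZipCode class below is a hypothetical stand-in for the model used in this series, and the sample figures are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the zip code model; not the class from the series
public class ZipCode
{
    public string State { get; set; }
    public int Population { get; set; }
}

public static class GroupSketch
{
    // In-memory analogy of Group(key => key.State, value => new { ... }):
    // the first lambda picks the grouping key, the second aggregates each group
    public static Dictionary<string, int> PopulationByState(IEnumerable<ZipCode> zipCodes)
    {
        return zipCodes
            .GroupBy(key => key.State)
            .ToDictionary(value => value.Key, value => value.Sum(z => z.Population));
    }

    public static void Main()
    {
        // Made-up sample data, for illustration only
        var sample = new List<ZipCode>
        {
            new ZipCode { State = "MA", Population = 100 },
            new ZipCode { State = "MA", Population = 200 },
            new ZipCode { State = "RI", Population = 50 }
        };

        foreach (var item in PopulationByState(sample))
        {
            Console.WriteLine(string.Format("{0}: {1}", item.Key, item.Value));
        }
    }
}
```

The difference from the driver version is where the work happens: here the grouping runs in the application's memory, whereas the Group function is translated into a $group stage that the database executes.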
It looks promising indeed. While we are at it we can also discuss how to convert the anonymous object into a concrete one. The generic As extension method helps us with that. We need a class that will hold those properties. We’ll also need to tell MongoDb how those properties map to Bson elements, in particular that State comes from the _id element of the grouped document:
public class ZipCodeGroupResult
{
    [BsonId]
    [BsonElement(elementName: "_id")]
    public string State { get; set; }

    [BsonElement(elementName: "Population")]
    public int Population { get; set; }
}
We can now rewrite the above C# statement into the following:
var resultWithAsFunction = modelContext.ZipCodes.Aggregate()
    .Group(key => key.State, value => new { State = value.Key, Population = value.Sum(key => key.Population) })
    .As<ZipCodeGroupResult>()
    .ToList();
foreach (var item in resultWithAsFunction)
{
    Console.WriteLine(string.Format("{0}: {1}", item.State, item.Population));
}
The above code will produce the exact same output.
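We can even use the standard Select LINQ method after ToList() to map the anonymous object into a “real” one. Since ToList() has already loaded the results, this mapping runs in memory on the client and not in the database. In that case we don’t even need the Bson-related attributes in the ZipCodeGroupResult class: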
var resultWithSelectExtension = modelContext.ZipCodes.Aggregate()
    .Group(key => key.State, value => new { State = value.Key, Population = value.Sum(key => key.Population) })
    .ToList()
    .Select(z => new ZipCodeGroupResult() { Population = z.Population, State = z.State });
The other stages
The other stages are simple and can be easily located by their function names. Here’s the solution which returns a list of anonymous objects:
var fullQueryResultAnon = modelContext.ZipCodes.Aggregate()
    .Group(key => key.State, value => new { State = value.Key, Population = value.Sum(key => key.Population) })
    .Match(z => z.Population >= 5000000) // >= mirrors $gte in the JSON query
    .SortBy(z => z.State)
    .Limit(5)
    .ToList();
foreach (var item in fullQueryResultAnon)
{
    Console.WriteLine(string.Format("{0}: {1}", item.State, item.Population));
}
It produces the same result as the pure JSON solution above:
CA: 29754890
FL: 12686644
GA: 6478216
IL: 11427576
IN: 5544136
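For readers who want to check the filter, sort and limit logic in isolation, the sketch below applies the same three steps to a handful of per-state totals in memory. It is an analogy built on LINQ to Objects, not driver code: Where, OrderBy and Take stand in for Match, SortBy and Limit. The StageSketch class and the TopStates function are hypothetical names, and the totals are copied from the outputs shown earlier in this post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StageSketch
{
    // Hypothetical helper: keeps states at or above the threshold,
    // sorts them by state code and returns the first 'count' codes
    public static List<string> TopStates(
        IEnumerable<KeyValuePair<string, int>> totals, int minPopulation, int count)
    {
        return totals
            .Where(t => t.Value >= minPopulation)          // plays the role of Match / $match
            .OrderBy(t => t.Key, StringComparer.Ordinal)   // plays the role of SortBy / $sort
            .Take(count)                                   // plays the role of Limit / $limit
            .Select(t => t.Key)
            .ToList();
    }

    public static void Main()
    {
        // Per-state totals taken from the outputs printed in this post
        var totals = new Dictionary<string, int>
        {
            { "MA", 6016425 },
            { "RI", 1003218 },
            { "FL", 12686644 },
            { "SC", 3486703 },
            { "CA", 29754890 }
        };

        foreach (var state in TopStates(totals, 5000000, 2))
        {
            Console.WriteLine(state);
        }
        // prints CA, then FL
    }
}
```

The order of the steps matters in both versions: limiting before sorting would return an arbitrary subset, so the limit comes last.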
The As extension function can still be used to map the anonymous object to a concrete one:
var fullQueryResultConcrete = modelContext.ZipCodes.Aggregate()
    .Group(key => key.State, value => new { State = value.Key, Population = value.Sum(key => key.Population) })
    .Match(z => z.Population >= 5000000) // >= mirrors $gte in the JSON query
    .SortBy(z => z.State)
    .Limit(5)
    .As<ZipCodeGroupResult>()
    .ToList();
This is the last post dedicated to the MongoDb aggregation framework in this series. We’ve now seen the most important query types including CRUD operations and aggregations.
The next post will start a slightly different topic, namely indexes.
You can view all posts related to data storage on this blog here.
I’m new to MongoDB and following your blog series – thanks for sharing. I’m surprised this is the last post on the aggregation framework. The docs show a more standard LINQ API with AsQueryable(), but you don’t mention it. How does what you’ve shown here relate?