Introduction to MongoDb with .NET part 8: searching in arrays and nested documents
April 7, 2016 Leave a comment
Introduction
In the previous post we went through 4 logical operators in MongoDb: $or, $and, $not and $nor. We saw a couple examples of their usage. The syntax for $or, $and and $nor are similar in that we have to provide an array of filters that are linked with the operator whereas the syntax for $not is simpler. You’ll probably not use the $and operator too often as AND conditions can be simplified by chaining together the filters by a comma.
In this post we’ll see how to query arrays and embedded documents.
Searching within arrays
We’ll start with the simplest case where a document can have an array with comma separated values. I mean something like the “loc” array with geo-coordinates in the zipcodes collection:
{ "_id" : "01001", "city" : "AGAWAM", "loc" : [ -72.622739, 42.070206 ], "pop" : 15338, "state" : "MA" }
Other typical examples include e.g. a comments string array for an article document or a grades array for a student document. It turns out that we already know how to query within these simple arrays, we just didn’t know about it. Recall our first filter where we queried for documents having a field “name” and value “tv”:
db.products.find({"name" : "tv"})
The above query doesn’t only check whether a document in the products collection has a property “name” and a value “tv”. It also checks whether the document has a property called “name” and whether that property has an array which contains a string value “tv”. MongoDb will look directly within the array but not in a nested array within it, i.e. the search is only one level deep. However, it’s enough for most cases. The following query will find all documents whose “loc” array includes the value 42.070206 regardless of its position within the array. It’s an unlikely scenario but our imported databases do not have a better fit for this example:
db.zipcodes.find({"loc" : 42.070206})
The query will only find one document.
The restaurants collection also has an array called grades but its structure is more complex. It includes an array of sub-documents instead of single values. We need to learn another querying technique in order to search within those documents. We’ll do that right now.
Searching in sub-documents
An example of a sub-document or embedded document is the “address” property of a restaurant document. The embedded document in turn also includes a number of properties such as “building”:
"address" : { "building" : "2780", "coord" : [ -73.98241999999999, 40.579505 ], "street" : "Stillwell Avenue", "zipcode" : "11224" }
How can we find all restaurants whose address has the zipcode 11224? This is where the dot notation enters the scene. Here’s the solution:
db.restaurants.find({"address.zipcode" : "11224"})
The above query means that please MongoDb give me all the documents that have an address field which in turn has a zipcode field with a string value “11224” or has a zipcode array which includes the value “11224”. The query returns 72 documents.
Searching within an array of embedded documents
We now know how to search within the grades array of the restaurant collection. Let’s find all the documents that have received at least one grade “C”. I.e. all documents where the grades array includes an embedded document which has a “grade” property with a value “C”:
db.restaurants.find({"grades.grade" : "C"})
It will find 2708 documents.
Array-related operators
The $all operator will help provide multiple search filters for arrays such as “give me all documents where the grade property of the grades array has both C and B values”:
db.restaurants.find({"grades.grade" : {$all : ["C", "B"]}})
The $all operator introduces an array in the filter just like the $or operator. Only those documents will be returned that have at least one C AND one B grade. If either condition fails the document is not selected. The query returns 1565 documents, here’s an example:
{ "_id" : ObjectId("56edc2ff03a1cd840734dbb5"), "address" : { "building" : "1269", "coord" : [ -73.871194, 40.6730975 ], "street" : "Sutter Avenue", "zipcode" : "11208" }, "borough" : "Brooklyn", "cuisine" : "Chinese", "grades" : [ { "date" : ISODate("2014-09-16T00:00:00Z"), "grade" : "B", "score" : 21 }, { "date" : ISODate("2013-08-28T00:00:00Z"), "grade" : "A", "score" : 7 }, { "date" : ISODate("2013-04-02T00:00:00Z"), "grade" : "C", "score" : 56 }, { "date" : ISODate("2012-08-15T00:00:00Z"), "grade" : "B", "score" : 27 }, { "date" : ISODate("2012-03-28T00:00:00Z"), "grade" : "B", "score" : 27 } ], "name" : "May May Kitchen", "restaurant_id" : "40358429" }
The above document is also returned if we want to find restaurants with all 3 grade types, i.e. A, B and C:
db.restaurants.find({"grades.grade" : {$all : ["C", "B", "A"]}})
Note that the ordering of the elements within the array doesn’t matter.
Another operator slightly related to arrays is the $in operator which corresponds to an IN clause in SQL, e.g. WHERE id IN (1,2,3). The following query will return all the places in the zipcodes collection that are located in Florida (FL), Indiana (IN) and Maine (ME):
db.zipcodes.find({"state" : {$in : ["FL", "IN", "ME"]}})
There are 1890 such places in the zipcodes collection.
Read the next part here.
You can view all posts related to data storage on this blog here.