Introduction to MongoDb with .NET part 33: indexing arrays

Introduction

In the previous post we looked at compound indexes in MongoDb. Compound indexes can be useful if you’d like to make multi-field searches more efficient. E.g. if there’s a frequent search with a filter on “name” and “age” then a compound index is a good candidate to make the search more efficient. The exact sequence of the keys in the compound definition is important. The leftmost key is the primary key, followed by the secondary key and so on. A search on the secondary key field won’t use the compound index.

In this post we’ll look at indexing array fields.

Array indexes

Indexing arrays is in fact not much different from indexing simple data types like integers and strings. Our demo collections – zip codes and restaurants – have no suitable array fields. We’ll therefore create our own little collection. We created one called “people” before but we’ll create it again from scratch. Run the following commands in a MongoDb client shell:

use model
db.people.drop()
db.people.insert({"name" : "John", "sports" : ["football", "badminton", "squash"], "grades" : [50, 60,45, 43]})
db.people.insert({"name" : "Mary", "sports" : ["cricket", "badminton"], "grades" : [90, 60, 53]})
db.people.insert({"name" : "William", "sports" : ["tennis", "basketball", "squash"], "grades" : [30, 40,64, 90, 78]})
db.people.insert({"name" : "Bridget", "sports" : ["baseball", "athletics", "running"], "grades" : [50, 90,95, 40]})
db.people.insert({"name" : "Susan", "sports" : ["football", "boxing", "karate"], "grades" : [70, 80,78]})
db.people.insert({"name" : "Andrew", "sports" : ["cyckling", "swimming", "basketball"], "grades" : [10, 90,60, 50]})
db.people.insert({"name" : "Phil", "sports" : ["handball", "skating", "baseball"], "grades" : [55, 65]})
db.people.insert({"name" : "Joanna", "sports" : ["judo", "karate", "running"], "grades" : [88, 77,66, 55]})
db.people.insert({"name" : "Christina", "sports" : ["running", "swimming", "badminton"], "grades" : [51, 61,46, 44]})
db.people.insert({"name" : "Homer", "sports" : ["football", "skating", "boxing"], "grades" : [45, 47,90, 99]})

That’s 10 documents in total with each having 2 array fields. We now want to find all people that like karate. That should be easy by now I hope:

db.people.find({"sports" : "karate"})

It finds two people: Susan and Joanna.

We add an index on an array field by the same createIndex command we saw before in action:

db.people.createIndex({"sports" : 1})

We can see the index in action by running our good friend the explain() function:

db.people.explain(true).find({"sports" : "karate"})

¨

It tells us that an index scan was performed using the sports_1 index which we’ve just created and that exactly two documents were scanned in total. Since that’s equal to the number of returned documents we’re satisfied.

Can we have a compound index on the name and the sports fields? Let’s see:

db.people.createIndex({"name" : 1, "sports" : 1})

Yes, that’s possible. What if we want to have a compound index on 2 array fields? After all a potential query could be to extract all people who like football and have at least one score above 80. Running the following command…

db.people.createIndex({"sports" : 1, "grades" : 1})

…tells us that compound indexes with 2 or more array fields are not allowed:

{
        "ok" : 0,
        "errmsg" : "cannot index parallel arrays [grades] [sports]",
        "code" : 10088
}

Parallel arrays cannot be indexed. Each combination of the array values, such as (football, 90), (football, 85), (basketball, 55) etc. would need to be indexed which quickly causes the index size to explode. At this point of time compound indexes on 2 or more arrays are simply not allowed in MongoDb.

We’ll take up indexing fields in sub-documents in the next post.

You can view all posts related to data storage on this blog here.

Advertisements

About Andras Nemes
I'm a .NET/Java developer living and working in Stockholm, Sweden.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

ultimatemindsettoday

A great WordPress.com site

Elliot Balynn's Blog

A directory of wonderful thoughts

Robin Sedlaczek's Blog

Developer on Microsoft Technologies

HarsH ReaLiTy

A Good Blog is Hard to Find

Softwarearchitektur in der Praxis

Wissenswertes zu Webentwicklung, Domain-Driven Design und Microservices

the software architecture

thoughts, ideas, diagrams,enterprise code, design pattern , solution designs

Technology Talks

on Microsoft technologies, Web, Android and others

Software Engineering

Web development

Disparate Opinions

Various tidbits

chsakell's Blog

Anything around ASP.NET MVC,WEB API, WCF, Entity Framework & AngularJS

Cyber Matters

Bite-size insight on Cyber Security for the not too technical.

Guru N Guns's

OneSolution To dOTnET.

Johnny Zraiby

Measuring programming progress by lines of code is like measuring aircraft building progress by weight.

%d bloggers like this: