Introduction to MongoDb with .NET part 35: index options

Introduction

In the previous post we investigated how to create indexes within sub-documents. The dot notation which we saw when querying sub-documents also provides the syntax for deep indexing. We also quickly saw how to retrieve all indexes of a collection. In addition we learnt that array indexes are called multi-key indexes in MongoDb. The query plan returned by the explain function shows whether a certain index used for the query execution is multi-key or not.

In this post we’ll look at a couple of options available when creating indexes.

Index name

We’ve already seen this one. We can add a name to the index which overrides the default name given my MongoDb. We created a small people collection in the previous post. If you don’t have it then use the above link to navigate back to the previous part and create the collection.

Let’s create an index on the name field and also give the index a name:

db.people.createIndex({"name" : 1}, {"name" : "index-on-name"})

We can remove this index by using the JSON which created it:

db.people.dropIndex({"name" : 1})

…or you can reference it by its name as a string parameter:

db.people.dropIndex("index-on-name")

Unique indexes

Say you only want to have unique values of certain fields. E.g. you probably don’t want to have 2 customers with the same name in the collection. There is no UNIQUE constraint in MongoDb like in traditional SQL databases. Instead we can create a unique index. Consider the following small companies collection:

use model
db.companies.insert({"name" : "Sony", "phone" : "1234"})
db.companies.insert({"name" : "Microsoft", "phone" : "9876"})
db.companies.insert({"name" : "Ericsson", "phone" : "45765"})
db.companies.insert({"name" : "Apple", "phone" : "42454746"})
db.companies.insert({"name" : "10gen", "phone" : "9283746"})
db.companies.insert({"name" : "Atlassian"})

We want to keep the name values unique. The unique option provides the solution:

db.companies.createIndex({"name" : 1}, {"unique" : true})

If we try to add a new post with an existing company name…:

db.companies.insert({"name" : "Sony", "phone" : "1234"})

…then we get a resounding NO:

WriteResult({
        "nInserted" : 0,
        "writeError" : {
                "code" : 11000,
                "errmsg" : "E11000 duplicate key error collection: model.companies index: name_1 dup key: { : \"Sony\" }"
        }
})

Sparse indexes

The sparse option is used in conjunction with the unique indexes. Consider the phone property in the above company documents. They are certainly unique but Atlassian has no phone defined, i.e. it’s null.

We’ll create a unique index on the phone field:

db.companies.createIndex({"phone" : 1}, {"unique" : true})

That’s fine, the index is created. We now want to add a new company which also lacks a phone number:

db.companies.insert({"name" : "General Motors"})

We’re rejected:

E11000 duplicate key error collection: model.companies index: phone_1 dup key: { : null }

Why is that? The above document lacks a phone field, i.e. it is null. However, we already have a document with a null phone value so we’re facing a duplication issue.

The solution comes in the form of a sparse index. The sparse index will not index those documents that lack the indexed field.

Let’s remove the above index first:

db.companies.dropIndex({"phone" : 1})

…and recreate it with the sparse option on:

db.companies.createIndex({"phone" : 1}, {"unique" : true, "sparse" : true})

We can now insert General Motors:

db.companies.insert({"name" : "General Motors"})

We can easily prove that the two documents with no phone field were not indexed. We’ll first query for a concrete phone number using the explain function:

db.companies.explain(true).find({"phone" : "1234"})

We’ll see an index scan stage and that 1 document was scanned in total and that was also returned.

However, if we run a search for the documents that lack a phone number:

db.companies.explain(true).find({"phone" : null})

…then we’ll see that a total collection scan was performed. All 7 documents were examined and 2 were returned.

Background index creation

By default indexes are created on a foreground thread in MongoDb. This means that index creation has priority and while the index creation process is running all other reads and writes are blocked on the database. If it takes a long time creating the index then you can unintentionally lock down your entire application which relies on the database.

One option to avoid that is to let the index creation run on a background thread which has a lower priority. Reads and writes won’t be blocked. However, the process will also take a longer time compared to creating an index on a foreground thread. The background option can be set to true for this purpose:

db.companies.createIndex({"name" : 1}, {"unique" : true, "background" : true})

In the next post we’ll look some other bits and pieces about the query plan returned by the explain function.

You can view all posts related to data storage on this blog here.

Advertisements

About Andras Nemes
I'm a .NET/Java developer living and working in Stockholm, Sweden.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

ultimatemindsettoday

A great WordPress.com site

iReadable { }

.NET Tips & Tricks

Robin Sedlaczek's Blog

Developer on Microsoft Technologies

HarsH ReaLiTy

A Good Blog is Hard to Find

Softwarearchitektur in der Praxis

Wissenswertes zu Webentwicklung, Domain-Driven Design und Microservices

the software architecture

thoughts, ideas, diagrams,enterprise code, design pattern , solution designs

Technology Talks

on Microsoft technologies, Web, Android and others

Software Engineering

Web development

Disparate Opinions

Various tidbits

chsakell's Blog

Anything around ASP.NET MVC,WEB API, WCF, Entity Framework & AngularJS

Cyber Matters

Bite-size insight on Cyber Security for the not too technical.

Guru N Guns's

OneSolution To dOTnET.

Johnny Zraiby

Measuring programming progress by lines of code is like measuring aircraft building progress by weight.

%d bloggers like this: