Introduction to MongoDb with .NET part 35: index options
May 26, 2016 Leave a comment
Introduction
In the previous post we investigated how to create indexes within sub-documents. The dot notation which we saw when querying sub-documents also provides the syntax for deep indexing. We also quickly saw how to retrieve all indexes of a collection. In addition we learnt that array indexes are called multi-key indexes in MongoDb. The query plan returned by the explain function shows whether a certain index used for the query execution is multi-key or not.
In this post we’ll look at a couple of options available when creating indexes.
Index name
We’ve already seen this one. We can add a name to the index which overrides the default name given my MongoDb. We created a small people collection in the previous post. If you don’t have it then use the above link to navigate back to the previous part and create the collection.
Let’s create an index on the name field and also give the index a name:
db.people.createIndex({"name" : 1}, {"name" : "index-on-name"})
We can remove this index by using the JSON which created it:
db.people.dropIndex({"name" : 1})
…or you can reference it by its name as a string parameter:
db.people.dropIndex("index-on-name")
Unique indexes
Say you only want to have unique values of certain fields. E.g. you probably don’t want to have 2 customers with the same name in the collection. There is no UNIQUE constraint in MongoDb like in traditional SQL databases. Instead we can create a unique index. Consider the following small companies collection:
use model db.companies.insert({"name" : "Sony", "phone" : "1234"}) db.companies.insert({"name" : "Microsoft", "phone" : "9876"}) db.companies.insert({"name" : "Ericsson", "phone" : "45765"}) db.companies.insert({"name" : "Apple", "phone" : "42454746"}) db.companies.insert({"name" : "10gen", "phone" : "9283746"}) db.companies.insert({"name" : "Atlassian"})
We want to keep the name values unique. The unique option provides the solution:
db.companies.createIndex({"name" : 1}, {"unique" : true})
If we try to add a new post with an existing company name…:
db.companies.insert({"name" : "Sony", "phone" : "1234"})
…then we get a resounding NO:
WriteResult({ "nInserted" : 0, "writeError" : { "code" : 11000, "errmsg" : "E11000 duplicate key error collection: model.companies index: name_1 dup key: { : \"Sony\" }" } })
Sparse indexes
The sparse option is used in conjunction with the unique indexes. Consider the phone property in the above company documents. They are certainly unique but Atlassian has no phone defined, i.e. it’s null.
We’ll create a unique index on the phone field:
db.companies.createIndex({"phone" : 1}, {"unique" : true})
That’s fine, the index is created. We now want to add a new company which also lacks a phone number:
db.companies.insert({"name" : "General Motors"})
We’re rejected:
E11000 duplicate key error collection: model.companies index: phone_1 dup key: { : null }
Why is that? The above document lacks a phone field, i.e. it is null. However, we already have a document with a null phone value so we’re facing a duplication issue.
The solution comes in the form of a sparse index. The sparse index will not index those documents that lack the indexed field.
Let’s remove the above index first:
db.companies.dropIndex({"phone" : 1})
…and recreate it with the sparse option on:
db.companies.createIndex({"phone" : 1}, {"unique" : true, "sparse" : true})
We can now insert General Motors:
db.companies.insert({"name" : "General Motors"})
We can easily prove that the two documents with no phone field were not indexed. We’ll first query for a concrete phone number using the explain function:
db.companies.explain(true).find({"phone" : "1234"})
We’ll see an index scan stage and that 1 document was scanned in total and that was also returned.
However, if we run a search for the documents that lack a phone number:
db.companies.explain(true).find({"phone" : null})
…then we’ll see that a total collection scan was performed. All 7 documents were examined and 2 were returned.
Background index creation
By default indexes are created on a foreground thread in MongoDb. This means that index creation has priority and while the index creation process is running all other reads and writes are blocked on the database. If it takes a long time creating the index then you can unintentionally lock down your entire application which relies on the database.
One option to avoid that is to let the index creation run on a background thread which has a lower priority. Reads and writes won’t be blocked. However, the process will also take a longer time compared to creating an index on a foreground thread. The background option can be set to true for this purpose:
db.companies.createIndex({"name" : 1}, {"unique" : true, "background" : true})
In the next post we’ll look some other bits and pieces about the query plan returned by the explain function.
You can view all posts related to data storage on this blog here.