Introduction to MongoDb with .NET part 41: the write concern in a replica set
June 10, 2016
Introduction
In the previous post we introduced the topic of write concern in MongoDb. The write concern consists of two parts. The ingredient “w” stands for “write” and we can set the level of acknowledgement with it. The default is 1 and it means that we wait for an acknowledgement that the write operation was persisted to memory. 0 means fire-and-forget, i.e. we send the write operation to MongoDb and we don’t want to wait for any type of acknowledgement. The other ingredient of the write concern is “j” which stands for journal. The journal is a log where MongoDb registers all changes to the collection and its documents. We can either wait for the journal to be persisted to disk or not worry about it.
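As a quick recap, both settings go into the writeConcern document of a write operation. A mongo shell sketch against a hypothetical "companies" collection:

```javascript
// fire-and-forget: no acknowledgement at all
db.companies.insert({"name" : "Sony"}, {"writeConcern" : {"w" : 0}})

// the default: wait until the write has been applied in memory on the server
db.companies.insert({"name" : "Sony"}, {"writeConcern" : {"w" : 1}})

// additionally wait until the operation has been written to the journal on disk
db.companies.insert({"name" : "Sony"}, {"writeConcern" : {"w" : 1, "j" : true}})
```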
In this post we’ll talk a little bit about replica sets and what the write concern means in that scenario.
Replica set
A replica set in MongoDb is a set of MongoDb databases that behave as a single unit. The databases will most often be located on different servers. The goal of a replica set is that if one server – a node – goes down then the rest can still handle the database needs of your application. It's a similar situation to hosting a web application on multiple web servers: each web server holds the same web application so that the end user gets the same content regardless of which server receives the request from the load balancer.
A replica set is not about load balancing though. It’s really about increasing data availability. A real-life high-traffic web site is quite vulnerable if it only has a single database node in its backend infrastructure. If your backend store is MongoDb then a replica set is a good candidate for increased stability.
A replica set consists of 3 or more servers. There's one "boss" called the primary server; the others are secondaries. All write operations are directed at the primary server, which is then responsible for replicating the data to the secondary servers asynchronously. Therefore each database will eventually have the same set of data in its own copy of the collection. The replication delay is usually very short, a matter of milliseconds.
A replica set can temporarily consist of 2 servers if the original number was 3 – one primary and two secondaries – and one of them went down. If the primary server goes down then the remaining nodes elect a new primary among themselves.
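The shape of a replica set can be set up and inspected from the mongo shell. A minimal sketch, assuming three mongod instances running locally on ports 27017-27019 – the set name and hosts are made-up examples:

```javascript
// initiate a three-member replica set; host names and ports are hypothetical
rs.initiate({
    "_id" : "rs0",
    "members" : [
        { "_id" : 0, "host" : "localhost:27017" },
        { "_id" : 1, "host" : "localhost:27018" },
        { "_id" : 2, "host" : "localhost:27019" }
    ]
})

// shows which member is currently PRIMARY and which are SECONDARY
rs.status()
```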
Write concern in replica sets
All of this has implications for the write concern. We set it to 1 in the previous post and that's the default value. The integer has a very specific meaning, however: 1 means that we want to wait for the acknowledgement from 1 server. In a single-server environment only the values 0 and 1 make sense. In a replica set we can increase that number to anything up to the total number of nodes. E.g. if w = 2 then we wait for the primary and one secondary server to acknowledge the write. We can still set w to 1 in a replica set, meaning that we only wait for the primary server to provide an acknowledgement.
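In shell terms, assuming a three-member replica set, the two cases look like this:

```javascript
// wait for the primary and one secondary to acknowledge the write
db.companies.insert({"name" : "Nokia"}, {"writeConcern" : {"w" : 2}})

// wait only for the primary, just like in a single-server setup
db.companies.insert({"name" : "Nokia"}, {"writeConcern" : {"w" : 1}})
```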
There’s one more ingredient to all this: the write timeout value, abbreviated to wtimeout:
db.companies.insert({"name" : "Samsung", "phone" : 5345346}, {"writeConcern" : {"w" : 3, "j" : false, "wtimeout" : 10000}})
The wtimeout is the maximum time in milliseconds that we're willing to wait for the secondary node(s) to provide their acknowledgements.
Setting j to true in the above example would imply that we also wait for the journal to be saved to disk on the primary node. There's no setting to specify that we want to wait for the journal to be persisted on the secondary nodes as well.
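The variation with the journal flag turned on would look as follows – a sketch only, not run against a live replica set:

```javascript
// wait for all 3 members to acknowledge the write and for the
// primary's journal to reach disk, for at most 10 seconds
db.companies.insert({"name" : "Samsung", "phone" : 5345346},
    {"writeConcern" : {"w" : 3, "j" : true, "wtimeout" : 10000}})
```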
The w parameter can have a special value of "majority". It implies that we want to get an acknowledgement from the majority of nodes in the replica set: 2 out of 3, 3 out of 5 and so on. In addition, w can be set to a tag value. The nodes in the replica set can be assigned tags, like "important" or "crucial" or any other string. If w is set to a tag then we wait for the acknowledgement from the nodes that have that tag value. You can read more about that here.
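In the shell the two variants look roughly like this. Note that the "importantNodes" mode name is a made-up example: tags are assigned to members in the replica set configuration, and a named mode referencing them is defined under the configuration's settings before it can be used as a w value:

```javascript
// wait for a majority of the replica set members to acknowledge
db.companies.insert({"name" : "LG"}, {"writeConcern" : {"w" : "majority"}})

// wait for the nodes covered by a custom, tag-based write concern mode;
// "importantNodes" is a hypothetical mode defined in the replica set settings
db.companies.insert({"name" : "LG"}, {"writeConcern" : {"w" : "importantNodes"}})
```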
Read the next part here where we discuss the read preference in a multi-server scenario.
You can view all posts related to data storage on this blog here.