Introduction to MongoDb with .NET part 38: diagnostics
June 1, 2016 Leave a comment
Introduction
In the previous post we learnt about building indexes for text fields in MongoDb documents. Index fields are very useful if you want to search for certain bits of text within a larger text. The text search does a very good job at finding the best matches. We can even ask MongoDb to rank the matching documents for us.
We’re actually done with the discussion regarding indexes in MongoDb. This post will give a brief overview over the diagnostics tools available in MongoDb. They help us extract a lot of useful information about what is going on within MongoDb.
Default logging of slow queries
Mongod has a log target depending on how it was started. In this series we’ve been starting the Mongo database in the console window with “mongod”. Therefore MongoDb will write its output in that window. By default MongoDb will log every query that took more than 100ms to complete.
If we run one of our earlier aggregation examples on the zipcodes collection…:
db.zipcodes.aggregate([ {$group : { "_id" : {"state" : "$state", "city" : "$city"}, "population" : {$sum : "$pop"}}}, {$sort: {"_id.state" : 1, "population" : -1}}, {$group: {"_id" : {"state" : "$_id.state"}, "largest_city" : {$first: "$_id.city"}, "population" : {$first: "$population"}}}, {$sort : {"_id.state" : 1}}, {$project : {"_id" : 0, "state" : "$_id.state" ,"largest_city" : 1, "population": 1}} ])
…then in the absence of any relevant indexes the query will almost certainly take more than 100ms to complete. My mongod window therefore logged it to the console window:
2016-05-31T20:47:50.861+0200 I COMMAND [conn1] command model.zipcodes command: aggregate { aggregate: "zipcodes", pipeline: [ { $group: { _id: { state: "$state", city: "$city" }, population: { $sum: "$pop" } } }, { $sort: { _id.state: 1.0, population: -1.0 } }, { $group: { _id: { state: "$_id.state" }, largest_city: { $first: "$_id.city" }, population: { $first: "$population" } } }, { $sort: { _id.state: 1.0 } }, { $project: { _id: 0.0, state: "$_id.state", largest_city: 1.0, population: 1.0 } } ], cursor: {} } keyUpdates:0 writeConflicts:0 numYields:229 reslen:3501 locks:{ Global: { acquireCount: { r: 466 } }, Database: { acquireCount: { r: 233 } }, Collection: { acquireCount: { r: 233 } } } protocol:op_command 146ms
If I on the other hand run a find command on a small collection with only 10 documents…:
db.people.find()
…then the mongod console will remain “silent”. So this default logging facility can be a good starting point for finding slow queries.
The built-in profiler
The database profiler will give us more details about the queries. It writes its log to the system.profile collection. The default profiler level of mongod is 0, meaning it is turned off. With level 1 we can specify a response time in milliseconds. If a query takes longer than that to complete then it will be logged into the system.profile collection. Finally we have level 2 where we ask mongod to log every single query made against the database.
Profiling must be turned on when mongod is started as follows:
mongod --profile 1 --slowms 10
…where the slowms argument is the execution time limit in millis mentioned before. We set the limit at 10ms in the above example. Note that we have double dashes in front of te profile and slowms arguments.
If we run the same aggregation example as above then we’ll immediately see in the mongod console that the profile collection was created:
Creating profile collection: model.system.profile
We can get the documents of that collection as follows:
db.system.profile.find()
Here’s how the aggregation query was logged:
{ "op" : "command", "ns" : "model.zipcodes", "command" : { //command ignored }, "keyUpdates" : 0, "writeConflicts" : 0, "numYield" : 229, "locks" : { "Global" : { "acquireCount" : { "r" : NumberLong(466) } }, "Database" : { "acquireCount" : { "r" : NumberLong(233) } }, "Collection" : { "acquireCount" : { "r" : NumberLong(233) } } }, "responseLength" : 3501, "protocol" : "op_command", "millis" : 158, "execStats" : { }, "ts" : ISODate("2016-05-31T19:11:22.192Z"), "client" : "127.0.0.1", "allUsers" : [ ], "user" : "" }
The most interesting information here is the “millis” property which shows the time it took to execute the query. The “ns” property will show us which collection was queried. The “ts” timestamp property shows us when the query occurred. Also, since system.profile is a “normal” collection with BSON documents there’s nothing stopping us from querying it. We can query for the slowest queries, the queries that were executed in a certain time frame etc.
We can turn on profiling after mongod was started using the setProfilingLevel() function. It accepts two integer values: the profiling level and the limit in milliseconds:
db.setProfilingLevel(1, 10)
The getProfilingStatus function returns the current profiling mode:
db.getProfilingStatus()
Finally this is how to turn off profiling:
db.setProfilingLevel(0)
The setProfilingLevel command always returns the profiling level before the change was made.
Mongotop
Mongotop is another interesting tool which shows us what is happening in MongoDb overall. We can start it in a separate command window. We can supply an integer argument to it. It sets the interval for the mongotop output:
mongotop 5
If there’s nothing going on in the database then we’ll get a bunch of 0’s:
ns total read write 2016-05-31T21:35:07+02:00 admin.system.roles 0ms 0ms 0ms admin.system.version 0ms 0ms 0ms loadtest.customers 0ms 0ms 0ms loadtest.products 0ms 0ms 0ms local.startup_log 0ms 0ms 0ms local.system.replset 0ms 0ms 0ms model.companies 0ms 0ms 0ms model.people 0ms 0ms 0ms model.restaurants 0ms 0ms 0ms model.system.profile 0ms 0ms 0ms
You’ll probably understand the output here. It tells us how much time is spent on reads and writes in the various namespaces at the time of the diagnostics output. As soon as we start executing a couple of commands from the mongo shell the mongotop output will change. Here I only show the zipcodes collection stats:
ns total read write 2016-05-31T21:36:52+02:00 model.zipcodes 23ms 23ms 0ms ns total read write 2016-05-31T21:36:57+02:00 model.zipcodes 119ms 119ms 0ms ns total read write 2016-05-31T21:37:02+02:00 model.zipcodes 70ms 70ms 0ms
Mongotop is a great tool when you want to compare the general state of the database before and after an optimisation run. Ideally you’d execute the same set up of commands that should reflect some average traffic and queries of your database.
Mongostat
We can finally mention another diagnostics tool called mongostat. It can be started in a separate window like mongotop. Mongostat accepts a port number where the default value is 27017. We start mongostat with the mongostat command:
mongostat
It will start outputting some overall information about the number of CRUD operations against the database on the given port. The interval for the logs is 1 second and mongostats shows the status during that very second. Here’s some sample output with some queries executed:
insert query update delete getmore command % dirty % used flushes vsize res qr|qw ar|aw netIn netOut conn time *0 3 *0 *0 0 6|0 0.0 0.0 0 190.0M 23.0M 0|0 0|0 671b 49k 2 2016-05-31T21:49:36+02:00 *0 4 *0 *0 0 5|0 0.0 0.0 0 190.0M 23.0M 0|0 0|0 663b 59k 2 2016-05-31T21:49:37+02:00 *0 *0 *0 *0 0 7|0 0.0 0.0 0 190.0M 23.0M 0|0 0|0 518b 19k 2 2016-05-31T21:49:38+02:00 *0 3 *0 *0 0 4|0 0.0 0.0 0 190.0M 23.0M 0|0 0|0 517b 49k 2 2016-05-31T21:49:39+02:00
For more information about the usage, input parameters and output of mongostat consult this documentation page.
In the next post we’ll see how to work with indexes in the .NET driver.
You can view all posts related to data storage on this blog here.