Introduction to MongoDb with .NET part 1: background
March 14, 2016
Introduction
MongoDb is the most popular NoSQL database out there at the time of writing this post. It’s used by a wide range of companies as a data store. Large and small, well-established and freshly started organisations have all embraced this relatively new database technology. The default choice for storing data in a .NET project has most often been SQL Server. While SQL Server is probably still the most popular choice for .NET developers, they can choose from other well-tested alternatives depending on their project needs. MongoDb is very easy to set up and start working with.
In this series we’ll explore a number of features of MongoDb and how it can be used in a .NET project. There is already a series dedicated to MongoDb in .NET on this blog starting here. However, it was written a long time ago and MongoDb, like many other technologies within IT, evolves very quickly so it’s time to revisit it. Also, in this updated series I’d like to devote more time to the raw queries we can send to the MongoDb server than in the previous one. So first we’ll do some querying and data manipulation through the MongoDb shell and then we’ll go over to MongoDb in .NET.
What is MongoDb?
MongoDb is a highly scalable document-based database, meaning that it stores its data in documents. The data is stored as JSON, or to be exact in binary JSON, abbreviated as BSON. I assume you know what JSON is. Here comes a simple example describing a customer:
{
  "name": "NoSuchCustomer Inc.",
  "number-of-employees": 100,
  "number-of-products": 10
}
In relational databases, such as MS SQL, MySql or Oracle, the data is stored in tables. Tables have columns that name the properties and rows that hold the values of those properties. The above JSON could be stored in a table called Customer or Customers, depending on your naming conventions, which has 3 columns: name, number-of-employees and number-of-products. We would also define a schema for the table, i.e. set the data type of each column: name is some version of varchar, the other two would be integers. We’d also likely have an id field of type integer or GUID.
We can define a number of other constraints, like name cannot be null and that we only allow unique names. All these rules become the schema of the table and all records in the table must adhere to it. If “name” is set to NOT NULL then you cannot insert a customer with no name. Also, the table will have these 3 columns and every record will consequently have these properties. You cannot have customers with a varying number of properties in the same table.
If two tables are related then they can be connected with foreign keys. E.g. if a Customer has Products then there can be a separate Product table with each Product record linked to the Customer table through a customer_id foreign key. That is how object graphs are modeled in relational databases. In object-oriented languages, such as Java or C#, we can freely model our objects and how they are related in code:
using System.Collections.Generic;

public class Dog
{
    public string Name { get; }
    public string Type { get; set; }

    public Dog(string name, string type)
    {
        Name = name;
        Type = type;
    }
}

public class DogLover
{
    public string FirstName { get; }
    public string LastName { get; }
    // The dogs live directly on the owner object, no key lookup needed
    public IEnumerable<Dog> Dogs { get; set; }

    public DogLover(string firstName, string lastName, IEnumerable<Dog> dogs)
    {
        FirstName = firstName;
        LastName = lastName;
        Dogs = dogs;
    }
}
A DogLover can have a collection of dogs in the domain model in code. How would we show this in a relational database? We would need 2 tables, DogLover and Dog, and link each Dog record to DogLover by a foreign key that refers to the id of the DogLover. At least that would be a possible solution. Even the simplest hierarchy, like a DogLover having 1 or more Dogs, cannot sensibly be arranged in a single table.
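To make the two-table layout a bit more concrete, here is a rough sketch in C# of what the rows could look like. The type and member names (DogLoverRow, DogRow, DogLoverId, RelationalSketch) are made up for this illustration and the "tables" are plain in-memory lists, so this only shows the shape of the data, not real database access:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type for the DogLover table
public class DogLoverRow
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// Hypothetical row type for the Dog table
public class DogRow
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Type { get; set; }
    public int DogLoverId { get; set; } // foreign key pointing at DogLoverRow.Id
}

public static class RelationalSketch
{
    // Finds the dogs of one owner by matching the foreign key,
    // which is roughly what a SQL join would do for us
    public static List<string> DogNamesOf(int dogLoverId, IEnumerable<DogRow> dogs)
    {
        return dogs.Where(d => d.DogLoverId == dogLoverId)
                   .Select(d => d.Name)
                   .ToList();
    }

    public static void Main()
    {
        var owners = new List<DogLoverRow>
        {
            new DogLoverRow { Id = 1, FirstName = "Fred", LastName = "Smith" },
            new DogLoverRow { Id = 2, FirstName = "Jane", LastName = "Miller" }
        };
        var dogs = new List<DogRow>
        {
            new DogRow { Id = 1, Name = "Fluffy", Type = "Husky", DogLoverId = 1 },
            new DogRow { Id = 2, Name = "Killer", Type = "Bulldog", DogLoverId = 1 },
            new DogRow { Id = 3, Name = "Max", Type = "Terrier", DogLoverId = 2 }
        };

        foreach (var owner in owners)
        {
            var names = DogNamesOf(owner.Id, dogs);
            Console.WriteLine(owner.FirstName + " " + owner.LastName + ": " + string.Join(", ", names));
        }
    }
}
```

Note that the owner row knows nothing about its dogs. The link only exists through DogLoverId, so the object graph has to be reassembled every time we read it.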
How can we then arrange the Dog / DogLover relationship in a non-relational database? Easy: in exactly the same manner as they are represented in code. JSON is a very flexible data representation technique. Here’s an example of 2 entries in the DogLover collection where the dogs of each owner are embedded arrays:
[
  {
    "FirstName": "Fred",
    "LastName": "Smith",
    "Dogs": [
      { "Name": "Fluffy", "Type": "Husky" },
      { "Name": "Killer", "Type": "Bulldog" }
    ]
  },
  {
    "FirstName": "Jane",
    "LastName": "Miller",
    "Dogs": [
      { "Name": "Max", "Type": "Terrier" },
      { "Name": "Daisy", "Type": "Shepherd" }
    ]
  }
]
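As a small sketch of how closely the JSON follows the C# objects, we can serialise a DogLover with the System.Text.Json serialiser that ships with newer versions of .NET. This is only meant to show the shape of the document. It is not how the MongoDb driver stores data, since the driver has its own BSON serialisation, and the DocumentSketch class is a made-up name for this example:

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public class Dog
{
    public string Name { get; }
    public string Type { get; set; }

    public Dog(string name, string type)
    {
        Name = name;
        Type = type;
    }
}

public class DogLover
{
    public string FirstName { get; }
    public string LastName { get; }
    public IEnumerable<Dog> Dogs { get; set; }

    public DogLover(string firstName, string lastName, IEnumerable<Dog> dogs)
    {
        FirstName = firstName;
        LastName = lastName;
        Dogs = dogs;
    }
}

public static class DocumentSketch
{
    // Turns the whole object graph, dogs included, into one JSON string
    public static string ToJson(DogLover owner)
    {
        return JsonSerializer.Serialize(owner);
    }

    public static void Main()
    {
        var fred = new DogLover("Fred", "Smith", new List<Dog>
        {
            new Dog("Fluffy", "Husky"),
            new Dog("Killer", "Bulldog")
        });

        // The dogs end up as an embedded array inside the owner document
        Console.WriteLine(ToJson(fred));
    }
}
```

The owner and the dogs come out as a single nested structure, with no ids or keys needed to hold them together.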
That looks very much like a more precise representation of our object oriented domain model, right?
MongoDb can store just about any JSON within a collection. Well, there’s a size limit for a single document (16 MB) but we’ll come to that later. You’d need a very large object graph to hit that ceiling anyway.
There’s nothing stopping us from adding a new property to the DogLover object without worrying about all the other entries in the collection:
{
  "FirstName": "Joan",
  "LastName": "Clarke",
  "Age": 35,
  "Dogs": [
    { "Name": "Brad", "Type": "Dax" }
  ]
}
We’ve just added Age to a new entry whereas it wasn’t present before. MongoDb won’t complain. Anything that can be represented in valid JSON can be saved in MongoDb too. We can even do something wild and store completely unrelated data in the same collection if we want to. Well, in practice that’s unlikely to happen, but the flexibility of JSON allows it and therefore MongoDb will be happy to store it.
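One consequence of this flexibility is that the code reading the documents has to cope with properties that may or may not be there. Here is a minimal sketch of that idea using System.Text.Json on raw JSON strings. The FlexibleSchemaSketch class and its ReadAge method are hypothetical names, and a real application would read the documents through the MongoDb driver instead:

```csharp
using System;
using System.Text.Json;

public static class FlexibleSchemaSketch
{
    // Returns the "Age" value if the document has one, otherwise null
    public static int? ReadAge(string json)
    {
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            JsonElement age;
            if (doc.RootElement.TryGetProperty("Age", out age)
                && age.ValueKind == JsonValueKind.Number)
            {
                return age.GetInt32();
            }
            return null;
        }
    }

    public static void Main()
    {
        // Two documents from the same collection with different properties
        string fred = "{ \"FirstName\": \"Fred\", \"LastName\": \"Smith\" }";
        string joan = "{ \"FirstName\": \"Joan\", \"LastName\": \"Clarke\", \"Age\": 35 }";

        Console.WriteLine(ReadAge(fred).HasValue); // prints False
        Console.WriteLine(ReadAge(joan));          // prints 35
    }
}
```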
Advantages
There are some key advantages of MongoDb compared to traditional relational databases:
- Dynamic data structure with flexible schemas: you don’t need to define columns and tables. You can in fact store pretty much anything within the same collection
- Due to the lack of strict schemas data migrations become a lot easier too: if you change your domain structure, i.e. your business objects in code, the documents will store the objects correspondingly. A change in your custom objects is automatically reflected in the structure of the documents you save from then on
- You don’t need separate tables to show relationships as shown in the DogLover / Dog collection JSON example above. If you extract a single item from a collection then you’ll immediately get its associated objects: order with all order items, rock band with all concerts, making it a breeze to perform operations on those linked objects
- MongoDb documents therefore allow storing your objects in an object oriented fashion which is sometimes difficult and awkward to solve in SQL Server with separate tables and keys
- Due to the lack of constraints such as foreign keys updating and deleting items will be easier: there’s no cascade delete, no orphans
- Speed: MongoDb is very fast and efficient in querying and inserting items in a collection
- Scalability: MongoDb is highly scalable. We can easily create database clusters with primary and secondary nodes to ensure that our data store is always available
- No fees: MongoDb is free, you don’t need to pay a single penny to build a large MongoDb cluster. Well, you may need to pay for the database server(s) but not for any MongoDb licence
Disadvantages
All of the above is very well and good but there are a couple of things that you need to be aware of if you’re coming from an SQL Server environment, which is probably the case for the vast majority of .NET developers.
- Lack of professional tools: with SQL Server you can use SSMS for some very advanced GUI-based database operations, such as database profiling, SQL jobs, a query editor, IntelliSense and a whole lot more. There’s no equivalent in MongoDb. If you want to work directly with collections without one of the drivers (C#, Java, PHP etc.) then you’ll need to write your code in a console window or develop a custom solution yourself. There’s one visual tool though that you can use to improve the administration of your MongoDb database: RoboMongo. RoboMongo is a nice visual tool to view the records in a MongoDb database but it’s nowhere near SQL Server Management Studio yet
- As of writing this post MongoDb doesn’t support transactions that span multiple documents. Only writes to a single document are atomic
- Many developers will say that the lack of a schema is actually a disadvantage: you cannot associate objects through keys, you cannot enforce a compulsory data structure with rules like “NOT NULL”. However, in return you are forced to implement your constraints in code instead, which helps concentrate the business logic in the domain layer instead of it being hidden in some data store
- No stored procedures and triggers
- Difficult to retrieve lost data
- Business intelligence tools of MS SQL have no counterparts in MongoDb
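The point in the list above about implementing constraints in code can be sketched as follows. The Customer class below is a hypothetical domain object based on the customer JSON from earlier, and the rules it enforces (a compulsory name, no negative head count) are assumptions made for the sake of the example:

```csharp
using System;

public class Customer
{
    public string Name { get; }
    public int NumberOfEmployees { get; }

    public Customer(string name, int numberOfEmployees)
    {
        // Plays the role of a NOT NULL constraint on the name column
        if (string.IsNullOrWhiteSpace(name))
        {
            throw new ArgumentException("A customer must have a name.", nameof(name));
        }
        // Plays the role of a CHECK constraint
        if (numberOfEmployees < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(numberOfEmployees),
                "The number of employees cannot be negative.");
        }
        Name = name;
        NumberOfEmployees = numberOfEmployees;
    }
}

public static class ConstraintSketch
{
    // Returns true if a valid Customer could be built from the input
    public static bool TryCreate(string name, int numberOfEmployees)
    {
        try
        {
            new Customer(name, numberOfEmployees);
            return true;
        }
        catch (ArgumentException)
        {
            // ArgumentOutOfRangeException derives from ArgumentException
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(TryCreate("NoSuchCustomer Inc.", 100)); // prints True
        Console.WriteLine(TryCreate(null, 100));                  // prints False
    }
}
```

Since an invalid Customer can never be constructed, nothing invalid can reach the data store through this class, no matter which database sits behind it.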
Drivers
While you can interact with MongoDb directly in a command window you’ll probably prefer to do that in code through a client library, a.k.a. a driver. There are drivers available for most mainstream languages, like C#, Java or PHP. You can check out the full list of language-specific drivers here.
Other NoSQL databases
There is a whole suite of NoSQL databases out there besides MongoDb. Some examples:
Conclusion
At the time of writing this post MongoDb was the most popular NoSQL database according to db-engines. Even if you’re a staunch relational DBMS advocate and think you’ll never use anything else for your data storage solution it’s still beneficial to know that there are other tools out there. Also, you can have a mixed data storage strategy where you store some objects in a relational DBMS and some others in a document store depending on the objects’ nature and the advantages and disadvantages of each storage mechanism. You know that “we’ve always done it this way” is a dangerous mindset. Studying MongoDb can widen your skill set and offer a different point of view on solving data storage challenges.
The company I currently work for has embraced MongoDb to a great extent. We haven’t said goodbye to MS SQL or anything. Our core business is still stored in relational tables but we have definitely found very good use of MongoDb in a number of our applications.
We’ll continue with the MongoDb installation process in the next post.
You can view all posts related to data storage on this blog here.