Introduction to MongoDb with .NET part 1: background

March 14, 2016

Introduction

MongoDb is the most popular NoSql database out there at the time of writing this post. It’s used by a wide range of companies as a data store. Large and small, well-established and freshly started organisations have all embraced this relatively new database technology. The default choice for storing data in a .NET project has most often been SQL Server. While SQL Server is probably still the most popular choice for .NET developers, they can choose from other well-tested alternatives depending on their project needs. MongoDb is very easy to set up and start working with.

In this series we’ll explore a number of features of MongoDb and how it can be used in a .NET project. There is already a series dedicated to MongoDb in .NET on this blog starting here. However, that was a long time ago and MongoDb, like many other technologies within IT, evolves very quickly so it’s time to revisit it. Also, in this updated series I’d like to devote more time to the raw queries we can send to the MongoDb server than in the previous one. So first we’ll do some querying and data manipulation through the MongoDb shell and then we’ll go over to MongoDb in .NET.

What is MongoDb?

MongoDb is a highly scalable document-based database, meaning that it stores its data in documents. The data is stored as JSON, or to be exact in binary JSON, abbreviated as BSON. I assume you know what JSON is. Here comes a simple example describing a customer:

{
	"name": "NoSuchCustomer Inc.",
	"number-of-employees": 100,
	"number-of-products": 10
}

In relational databases, like MS SQL, MySql or Oracle, the data is stored in tables. Tables have columns that name the properties and rows that hold the values of those properties for each record. The above JSON could be stored in a table called Customer or Customers, depending on your naming conventions, which has 3 columns: name, number-of-employees and number-of-products. We would also define a schema for the table, i.e. set the data type of each column: name is some version of varchar, the other two would be integers. We’d also likely have an id field of type integer or GUID.

We can define a number of other constraints, e.g. that name cannot be null and that names must be unique. All these rules become the schema of the table and all records in the table must adhere to it. If “name” is set to NOT NULL then you cannot insert a customer with no name. Also, the table will have these 3 columns and every record will consequently have these properties. You cannot have customers with a varying number of properties in the same table.

If two tables are related then they can be connected with foreign keys. E.g. if a Customer has Products then there can be a separate Product table with each Product record linked to the Customer table through a customer_id foreign key. That is how object graphs are modeled in relational databases. In object-oriented languages, such as Java or C#, we can freely model our objects and how they are related in code:

using System.Collections.Generic;

// A dog has a fixed name and a type (breed) that can be changed later
public class Dog
{
	public string Name { get; }
	public string Type { get; set; }

	public Dog(string name, string type)
	{
		Name = name;
		Type = type;
	}
}

// A dog lover holds a direct reference to his or her dogs
public class DogLover
{
	public string FirstName { get; }
	public string LastName { get; }
	public IEnumerable<Dog> Dogs { get; set; }

	public DogLover(string firstName, string lastName, IEnumerable<Dog> dogs)
	{
		FirstName = firstName;
		LastName = lastName;
		Dogs = dogs;
	}
}

A DogLover can have a collection of dogs in the domain model in code. How would we show this in a relational database? We would need 2 tables, DogLover and Dog, and link each Dog record to DogLover by a foreign key that refers to the id of the DogLover. At least that would be a possible solution. Even the simplest hierarchy, such as a DogLover with 1 or more Dogs, can hardly be arranged in a single table.
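To make the two-table layout more tangible, here is a minimal sketch that mimics it in memory. The DogLoverRow and DogRow classes, the DogLoverId property and the RelationalSketch helper are all made up for this illustration. They are not generated by any database tool. The LINQ GroupJoin plays the role of the SQL JOIN that reassembles each owner with his or her dogs:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical flat "rows", one class per table. The names are illustrative only.
public class DogLoverRow
{
	public int Id { get; set; }
	public string FirstName { get; set; }
	public string LastName { get; set; }
}

public class DogRow
{
	public int Id { get; set; }
	public string Name { get; set; }
	public string Type { get; set; }
	public int DogLoverId { get; set; } // foreign key that refers to DogLoverRow.Id
}

public static class RelationalSketch
{
	// Rebuilds "owner with dogs" from the two flat lists, like a JOIN on the foreign key
	public static Dictionary<string, List<string>> DogsPerOwner(
		IEnumerable<DogLoverRow> owners, IEnumerable<DogRow> dogs)
	{
		return owners
			.GroupJoin(dogs,
				owner => owner.Id,
				dog => dog.DogLoverId,
				(owner, ownedDogs) => new
				{
					Owner = owner.FirstName + " " + owner.LastName,
					Dogs = ownedDogs.Select(d => d.Name).ToList()
				})
			.ToDictionary(x => x.Owner, x => x.Dogs);
	}

	public static void Main()
	{
		var owners = new List<DogLoverRow>
		{
			new DogLoverRow { Id = 1, FirstName = "Fred", LastName = "Smith" },
			new DogLoverRow { Id = 2, FirstName = "Jane", LastName = "Miller" }
		};
		var dogs = new List<DogRow>
		{
			new DogRow { Id = 1, Name = "Fluffy", Type = "Husky", DogLoverId = 1 },
			new DogRow { Id = 2, Name = "Killer", Type = "Bulldog", DogLoverId = 1 },
			new DogRow { Id = 3, Name = "Max", Type = "Terrier", DogLoverId = 2 }
		};

		// Print each owner followed by the names of the dogs linked to that owner
		foreach (var pair in DogsPerOwner(owners, dogs))
		{
			Console.WriteLine(pair.Key + ": " + string.Join(", ", pair.Value));
		}
	}
}
```

Note how the relationship lives only in the DogLoverId value: neither row class holds a reference to the other, so the object graph must be rebuilt on every read.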

How can we then arrange the Dog / DogLover relationship in a non-relational database? Easy: in exactly the same manner as they are represented in code. JSON is a very flexible data representation technique. Here’s an example of 2 entries in the DogLover collection where the dogs of each owner are embedded arrays:

[{
	"FirstName": "Fred",
	"LastName": "Smith",
	"Dogs": [{
			"Name": "Fluffy",
			"Type": "Husky"
		}, {
			"Name": "Killer",
			"Type": "Bulldog"
		}
	]
}, {
	"FirstName": "Jane",
	"LastName": "Miller",
	"Dogs": [{
			"Name": "Max",
			"Type": "Terrier"
		}, {
			"Name": "Daisy",
			"Type": "Shepherd"
		}
	]
}]

That looks very much like a more precise representation of our object-oriented domain model, right?
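We can check that claim with a small sketch that turns a DogLover object into JSON. Note the assumptions: it uses the System.Text.Json serializer of newer .NET versions (.NET Core 3.0 and later), not the serializer of the MongoDb .NET driver, and the EmbeddingSketch class is just a made-up helper for this illustration. The Dog and DogLover classes are repeated from above so that the example is self-contained:

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public class Dog
{
	public string Name { get; }
	public string Type { get; set; }

	public Dog(string name, string type)
	{
		Name = name;
		Type = type;
	}
}

public class DogLover
{
	public string FirstName { get; }
	public string LastName { get; }
	public IEnumerable<Dog> Dogs { get; set; }

	public DogLover(string firstName, string lastName, IEnumerable<Dog> dogs)
	{
		FirstName = firstName;
		LastName = lastName;
		Dogs = dogs;
	}
}

// Hypothetical helper, not part of any MongoDb library
public static class EmbeddingSketch
{
	// Serializes the owner together with the embedded dogs into one JSON document
	public static string ToJson(DogLover owner)
	{
		var options = new JsonSerializerOptions { WriteIndented = true };
		return JsonSerializer.Serialize(owner, options);
	}

	public static void Main()
	{
		var fred = new DogLover("Fred", "Smith", new List<Dog>
		{
			new Dog("Fluffy", "Husky"),
			new Dog("Killer", "Bulldog")
		});

		// The dogs end up as an array inside the owner, no second "table" is needed
		Console.WriteLine(ToJson(fred));
	}
}
```

The printed document should have the same shape as Fred's entry in the JSON above: the Dogs collection is written as an array nested inside the owner.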

MongoDb can store just about any JSON within a collection. Well, there’s a size limit on a single document (16 MB) but we’ll come to that later. You’d need a very large object graph to hit that ceiling anyway.

There’s nothing stopping us from adding a new property to the DogLover object without worrying about all the other entries in the collection:

{
	"FirstName": "Joan",
	"LastName": "Clarke",
	"Age": 35,
	"Dogs": [{
			"Name": "Brad",
			"Type": "Dax"
		}
	]
}

We’ve just added Age to a new entry whereas it wasn’t present before. MongoDb won’t complain. Anything that can be represented in valid JSON can be saved in MongoDb too. We can even do something wild and store completely unrelated data in the same collection if we want to. Well, in practice that’s unlikely to happen, but the flexibility of JSON would allow it and therefore MongoDb will be happy to store it.
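The flip side of this flexibility is that the reading code can no longer take any field for granted. Here is a minimal sketch of what that means, again using plain System.Text.Json (.NET Core 3.0 and later) instead of the MongoDb driver. The FlexibleSchemaSketch class and its ReadAge method are made up for this illustration, and the "collection" is just a list of JSON strings:

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper, not part of any MongoDb library
public static class FlexibleSchemaSketch
{
	// Returns the Age of a document if the field exists and is a number, otherwise null
	public static int? ReadAge(string documentJson)
	{
		using (JsonDocument doc = JsonDocument.Parse(documentJson))
		{
			JsonElement age;
			if (doc.RootElement.TryGetProperty("Age", out age)
				&& age.ValueKind == JsonValueKind.Number)
			{
				return age.GetInt32();
			}
			return null;
		}
	}

	public static void Main()
	{
		// Two documents of the same "collection" with a different set of properties
		var collection = new List<string>
		{
			"{ \"FirstName\": \"Fred\", \"LastName\": \"Smith\" }",
			"{ \"FirstName\": \"Joan\", \"LastName\": \"Clarke\", \"Age\": 35 }"
		};

		foreach (string document in collection)
		{
			int? age = ReadAge(document);
			Console.WriteLine(age.HasValue ? "Age: " + age.Value : "No Age field");
		}
	}
}
```

A nullable return type is one possible way to model "this property may or may not be there". In a relational table the same question would be answered once, by the schema.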

Advantages

There are some key advantages of MongoDb compared to traditional relational databases:

Dynamic data structure with flexible schemas: you don’t need to define columns and tables. You can in fact store pretty much anything within the same collection
Due to the lack of strict schemas data migrations become a lot easier too: if you change your domain structure, i.e. your business objects in code, the documents will store the objects correspondingly. A change in your custom objects automatically carries over to the schema
You don’t need separate tables to show relationships as shown in the DogLover / Dog collection JSON example above. If you extract a single item from a collection then you’ll immediately get its associated objects: order with all order items, rock band with all concerts, making it a breeze to perform operations on those linked objects
MongoDb documents therefore allow storing your objects in an object-oriented fashion which is sometimes difficult and awkward to solve in SQL Server with separate tables and keys
Due to the lack of constraints such as foreign keys updating and deleting items will be easier: there’s no cascade delete, no orphans
Speed: MongoDb is very fast and efficient in querying and inserting items in a collection
Scalability: MongoDb is highly scalable. We can easily create database clusters with primary and secondary nodes to ensure that our data store is always available
No fees: MongoDb is free, you don’t need to pay a single penny to build a large MongoDb cluster. Well, you may need to pay for the database server(s) but not for any MongoDb licence

Disadvantages

All of the above is very well and good but there are a couple of things that you need to be aware of if you’re coming from an SQL Server environment, as most .NET developers probably are.

Lack of professional tools: with SQL Server you can use SSMS for some very advanced GUI-based database operations, such as database profiling, SQL jobs, a query editor, IntelliSense and a whole lot more. There’s no equivalent in MongoDb. If you want to work directly with collections without one of the drivers (C#, Java, PHP etc.) then you’ll need to write your code in a console window or develop a custom solution yourself. There’s one visual tool though that you can use to improve the administration of your MongoDb database: RoboMongo. RoboMongo is a nice visual tool to view the records in a MongoDb database but it’s nowhere near SQL Server Management Studio yet
As of writing this post MongoDb doesn’t support transactions that span multiple documents. Only operations on a single document are atomic
Many developers will say that the lack of a schema is actually a disadvantage: you cannot associate objects through keys, you cannot enforce a compulsory data structure with rules like “NOT NULL”. However, in return you are forced to implement your constraints in code instead which helps concentrate the business logic in the domain layer instead of it being hidden in some data store
No stored procedures and triggers
Difficult to retrieve lost data
Business intelligence tools of MS SQL have no counterparts in MongoDb
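The point above about implementing constraints in code can be sketched as follows. This is only one possible approach under my own assumptions: the Customer class, its properties and the rules it checks are made up for this illustration, loosely based on the customer JSON at the beginning of the post. The constructor plays the role of the NOT NULL rule that a relational schema would enforce:

```csharp
using System;

// Hypothetical domain object that guards its own rules
public class Customer
{
	public string Name { get; }
	public int NumberOfEmployees { get; }

	public Customer(string name, int numberOfEmployees)
	{
		// The equivalent of NOT NULL on the name column
		if (string.IsNullOrWhiteSpace(name))
		{
			throw new ArgumentException("A customer must have a name.", nameof(name));
		}
		// A simple range rule that a CHECK constraint could express in SQL
		if (numberOfEmployees < 0)
		{
			throw new ArgumentOutOfRangeException(nameof(numberOfEmployees));
		}

		Name = name;
		NumberOfEmployees = numberOfEmployees;
	}
}

public static class ConstraintSketch
{
	public static void Main()
	{
		var valid = new Customer("NoSuchCustomer Inc.", 100);
		Console.WriteLine("Created: " + valid.Name);

		try
		{
			// An invalid object is rejected before it can reach the data store
			var invalid = new Customer(null, 10);
			Console.WriteLine("Created: " + invalid.Name);
		}
		catch (ArgumentException ex)
		{
			Console.WriteLine("Rejected: " + ex.Message);
		}
	}
}
```

With this design an invalid customer cannot even be constructed, so whatever is sent to the database has already passed the rules of the domain layer.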

Drivers

While you can interact with MongoDb directly in a command window you’ll probably prefer to do that in code through a client library, a.k.a. a driver. There are drivers available for most mainstream languages, like C#, Java or PHP. You can check out the full list of language-specific drivers here.

Other NoSql databases

There is a whole suite of NoSQL databases out there besides MongoDb.

Conclusion

At the time of writing this post MongoDb was the most popular NoSQL database according to db-engines. Even if you’re a staunch relational DBMS advocate and think you’ll never use anything else for your data storage solution it’s still beneficial to know that there are other tools out there. Also, you can have a mixed data storage strategy where you store some objects in a relational DBMS and some others in a document store depending on the objects’ nature and the advantages and disadvantages of each storage mechanism. You know that “we’ve always done it this way” is a dangerous mindset. Studying MongoDb can widen your skill set and offer a different point of view on solving data storage challenges.

The company I currently work for has embraced MongoDb to a great extent. We haven’t said goodbye to MS SQL or anything. Our core business is still stored in relational tables but we have definitely found very good use of MongoDb in a number of our applications.

We’ll continue with the MongoDb installation process in the next post.

You can view all posts related to data storage on this blog here.

Filed under .NET, MongoDb Tagged with c#, mongodb

About Andras Nemes
I'm a .NET/Java developer living and working in Stockholm, Sweden.

2 Responses to Introduction to MongoDb with .NET part 1: background

smartis2812 says:

March 14, 2016 at 11:29 am

Thank you Andras! This topic was/is really important to me. Great explanations.

DevMec says:

June 25, 2016 at 3:20 pm

Hi Andras, I learn a lot of things from all the articles you’ve written…
Thanks a lot.

Now, I have a difficult question for you : Can you mix 2 of your articles concerning Mongodb and web.api including odata support from web.api to mongodb ?

So, is there a way to translate web.api odata queries into mongodb queries. (ie with, limit, skip, take, order by etc….).

Thanks in advance,
