Introduction to CouchDB with .NET part 6: batch updates and insertions

Introduction

In the previous post we looked at the CouchDB concurrency implementation and the notion of eventual consistency. Concurrency in a database means that multiple threads might want to access and modify the same data record at the same time. CouchDB solves the concurrency problem with a mechanism called Multi-Version Concurrency Control (MVCC). With MVCC CouchDB keeps multiple revisions of the same document. If a thread wants to read a document while it is being updated then the reading thread gets the most recent complete copy of the document, which in that case is an outdated revision. A subsequent request will then get the updated copy. This scenario is called eventual consistency. CouchDB achieves high availability by avoiding data locks, and in return sacrifices data consistency to some degree.

In this post we’ll look at batch updates and insertions.

Batch modifications

CouchDB can execute groups of updates and insertions in a single batch. It’s really simple actually. The most basic scenario is that we send multiple documents to the CouchDB HTTP API instead of just one. The JSON objects need to be enclosed in an array assigned to the “docs” property. In the previous post we created a database called persons. Let’s add 5 documents to it in one go. We’ll use the same POST request as before but we need to extend the URL with “_bulk_docs”:

POST http://localhost:5984/persons/_bulk_docs

We set the Content-Type header to application/json.

Here’s the JSON body:

{
	"docs": [{
		"first-name": "Elvis",
		"last-name": "Presley",
		"age": 54	
	},
	{
		"first-name": "Marilyn",
		"last-name": "Monroe",
		"age": 75
	},
	{
		"first-name": "Marlon",
		"last-name": "Brando",
		"age": 63
	},
	{
		"first-name": "Roger",
		"last-name": "Moore",
		"age": 87
	},
	{
		"first-name": "Greta",
		"last-name": "Garbo",
		"age": 79
	}]
}

If everything goes well then CouchDB will respond with a 201 Created and a collection of IDs and revision numbers for all 5 documents:

[
  {
    "ok": true,
    "id": "3559d9c81c785b6bfc27a34904018a42",
    "rev": "1-4a82a53090f81a0766448db0de7bf7bb"
  },
  {
    "ok": true,
    "id": "3559d9c81c785b6bfc27a3490401990e",
    "rev": "1-9c71af451db58eda5a2ba985148a5c0c"
  },
  {
    "ok": true,
    "id": "3559d9c81c785b6bfc27a34904019fa5",
    "rev": "1-b77c22d6889fb3595c9a13c55f0aed58"
  },
  {
    "ok": true,
    "id": "3559d9c81c785b6bfc27a3490401af41",
    "rev": "1-7d2c0c0874e94ba64edf168f10f75a77"
  },
  {
    "ok": true,
    "id": "3559d9c81c785b6bfc27a3490401b00f",
    "rev": "1-9d13836fe90d2a56e15ba6d41b3cab80"
  }
]
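
The same request can also be put together in code. Here is a minimal Python sketch that uses only the standard library and assumes the local server and the persons database from above. The helper name build_bulk_request is made up for this illustration and is not part of any CouchDB client library. It only builds the request; sending it needs a running CouchDB instance.

```python
import json
import urllib.request


def build_bulk_request(base_url, database, docs):
    """Build (but do not send) a POST request for the _bulk_docs endpoint."""
    url = f"{base_url.rstrip('/')}/{database}/_bulk_docs"
    # The documents must be wrapped in an array under the "docs" property.
    body = json.dumps({"docs": docs}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


people = [
    {"first-name": "Elvis", "last-name": "Presley", "age": 54},
    {"first-name": "Marilyn", "last-name": "Monroe", "age": 75},
]
request = build_bulk_request("http://localhost:5984", "persons", people)

# Sending the request requires a running CouchDB instance:
# with urllib.request.urlopen(request) as response:
#     results = json.load(response)
```

Keeping the request construction separate from the network call makes it easy to inspect the URL and the body before anything is sent.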

Now let’s try to update the first three documents in one batch. All we need to do is include the ID and revision number of the document to be updated. The URL is the same as for insertions. Here’s the JSON payload I’m going to test:

{
	"docs": [{
		"_id": "3559d9c81c785b6bfc27a34904018a42",
		"_rev": "1-4a82a53090f81a0766448db0de7bf7bb",
		"first-name": "Elvis",
		"last-name": "Presley",
		"age": 68	
	},
	{
		"_id": "3559d9c81c785b6bfc27a3490401990e",
		"_rev": "1-9c71af451db58eda5a2ba985148a5c0c",
		"first-name": "Marilyn",
		"last-name": "Monroe",
		"age": 79
	},
	{
		"_id": "3559d9c81c785b6bfc27a34904019fa5",
		"_rev": "1-b77c22d6889fb3595c9a13c55f0aed58",
		"first-name": "Marlon",
		"last-name": "Brando",
		"age": 66
	}]
}

CouchDB responds with 201 Created and a new set of revision numbers:

[
  {
    "ok": true,
    "id": "3559d9c81c785b6bfc27a34904018a42",
    "rev": "2-f19b6df3bd9015872a7dde724c922c83"
  },
  {
    "ok": true,
    "id": "3559d9c81c785b6bfc27a3490401990e",
    "rev": "2-5eb075ac21567de210d0d080b4293f8a"
  },
  {
    "ok": true,
    "id": "3559d9c81c785b6bfc27a34904019fa5",
    "rev": "2-53da1176c8605ee7c21b7db5a1be0164"
  }
]
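
Since every update needs the current revision number, a client has to carry the values from the last response over to its local copies. A rough Python sketch of that bookkeeping follows. It assumes that the response entries come back in the same order as the submitted documents, which is how the responses in this post look. The function name attach_revisions is invented for the example.

```python
def attach_revisions(docs, results):
    """Return copies of docs carrying the _id and _rev reported by _bulk_docs.

    Assumption: results are in the same order as docs. Entries without
    "ok" (failed writes) leave the corresponding document unchanged.
    """
    updated = []
    for doc, result in zip(docs, results):
        doc = dict(doc)  # shallow copy so the caller's document is not modified
        if result.get("ok"):
            doc["_id"] = result["id"]
            doc["_rev"] = result["rev"]
        updated.append(doc)
    return updated


docs = [{"first-name": "Elvis", "last-name": "Presley", "age": 54}]
results = [
    {
        "ok": True,
        "id": "3559d9c81c785b6bfc27a34904018a42",
        "rev": "1-4a82a53090f81a0766448db0de7bf7bb",
    }
]

tracked = attach_revisions(docs, results)
tracked[0]["age"] = 68
update_payload = {"docs": tracked}  # ready to be posted to _bulk_docs
```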

Now let’s see what happens if one update fails. In the next payload we’ll update the last two Person documents, i.e. Roger Moore and Greta Garbo. Furthermore we’ll update Elvis Presley again but we’ll use the first revision number. We know from the previous post that this operation should fail due to an outdated record being updated. I’ll test with the following JSON payload:

{
	"docs": [{
		"_id": "3559d9c81c785b6bfc27a3490401af41",
		"_rev": "1-7d2c0c0874e94ba64edf168f10f75a77",
		"first-name": "Roger",
		"last-name": "Moore",
		"age": 92	
	},
	{
		"_id": "3559d9c81c785b6bfc27a3490401b00f",
		"_rev": "1-9d13836fe90d2a56e15ba6d41b3cab80",
		"first-name": "Greta",
		"last-name": "Garbo",
		"age": 81
	},
	{
		"_id": "3559d9c81c785b6bfc27a34904018a42",
		"_rev": "1-4a82a53090f81a0766448db0de7bf7bb",
		"first-name": "Elvis",
		"last-name": "Presley",
		"age": 72
	}]
}

CouchDB responds with the following:

[
  {
    "ok": true,
    "id": "3559d9c81c785b6bfc27a3490401af41",
    "rev": "2-37eeec3925f444598e932b66c77bb40d"
  },
  {
    "ok": true,
    "id": "3559d9c81c785b6bfc27a3490401b00f",
    "rev": "2-4c2625988e2c84c74c4e18e52086874b"
  },
  {
    "id": "3559d9c81c785b6bfc27a34904018a42",
    "error": "conflict",
    "reason": "Document update conflict."
  }
]

The first two updates got through but the last one has failed with a conflict error. In other words CouchDB handles each update and insertion in the batch as an independent unit. If one fails then it has no implications for the rest.
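
A client therefore has to inspect every entry in the response rather than rely on the HTTP status code alone. A minimal Python sketch, assuming the response body has already been parsed into a list of dictionaries. The helper name partition_results is hypothetical.

```python
def partition_results(results):
    """Split a parsed _bulk_docs response into successful and failed entries."""
    succeeded = [r for r in results if r.get("ok")]
    failed = [r for r in results if "error" in r]
    return succeeded, failed


# The response shown above, parsed into Python objects.
results = [
    {"ok": True, "id": "3559d9c81c785b6bfc27a3490401af41",
     "rev": "2-37eeec3925f444598e932b66c77bb40d"},
    {"ok": True, "id": "3559d9c81c785b6bfc27a3490401b00f",
     "rev": "2-4c2625988e2c84c74c4e18e52086874b"},
    {"id": "3559d9c81c785b6bfc27a34904018a42",
     "error": "conflict", "reason": "Document update conflict."},
]

succeeded, failed = partition_results(results)
for entry in failed:
    print(f"{entry['id']}: {entry['error']} ({entry['reason']})")
```

The failed entries can then be handled one by one, for example by reading the current revision of the document and deciding whether the update should be retried.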

Transaction… sort of and no more

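It’s worth noting that earlier versions of CouchDB supported the “all_or_nothing” flag to change that behaviour. When set to true, CouchDB 1.x would either store every document in the batch or none of them. Note that, as far as the 1.x documentation describes it, conflict checking was skipped in this mode: a document with an outdated revision was stored as a conflicting revision instead of being rejected, and the batch was only refused as a whole if a document failed validation. We would use it in the following way: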

{
	"all_or_nothing": true,
	"docs": [{
		"_id": "3559d9c81c785b6bfc27a3490401af41",
		"_rev": "2-37eeec3925f444598e932b66c77bb40d",
		"first-name": "Roger",
		"last-name": "Moore",
		"age": 95	
	},
	{
		"_id": "3559d9c81c785b6bfc27a3490401b00f",
		"_rev": "2-4c2625988e2c84c74c4e18e52086874b",
		"first-name": "Greta",
		"last-name": "Garbo",
		"age": 83
	},
	{
		"_id": "3559d9c81c785b6bfc27a34904018a42",
		"_rev": "1-4a82a53090f81a0766448db0de7bf7bb",
		"first-name": "Elvis",
		"last-name": "Presley",
		"age": 78
	}]
}

However, this flag is no longer supported in CouchDB 2. The request above returns a not_implemented error for every document:

[
  {
    "id": "3559d9c81c785b6bfc27a3490401af41",
    "rev": "2-37eeec3925f444598e932b66c77bb40d",
    "error": "not_implemented",
    "reason": "all_or_nothing is not supported"
  },
  {
    "id": "3559d9c81c785b6bfc27a3490401b00f",
    "rev": "2-4c2625988e2c84c74c4e18e52086874b",
    "error": "not_implemented",
    "reason": "all_or_nothing is not supported"
  },
  {
    "id": "3559d9c81c785b6bfc27a34904018a42",
    "rev": "1-4a82a53090f81a0766448db0de7bf7bb",
    "error": "not_implemented",
    "reason": "all_or_nothing is not supported"
  }
]

The all_or_nothing flag provided at least some support for transactions in CouchDB but it only worked in a single node environment. It wouldn’t work in a typical clustered database anyway. However, you might see it applied in a project with an older CouchDB version.

Batch modifications in CouchDB are therefore non-atomic: each document in the batch is written independently, and the batch does not succeed or fail as a unit.

Mixing updates and insertions

We can mix insertions and updates in the same batch. The next JSON payload updates two existing documents and inserts a new one:

{
	"docs": [{
		"_id": "3559d9c81c785b6bfc27a3490401af41",
		"_rev": "2-37eeec3925f444598e932b66c77bb40d",
		"first-name": "Roger",
		"last-name": "Moore",
		"age": 95	
	},
	{
		"_id": "3559d9c81c785b6bfc27a3490401b00f",
		"_rev": "2-4c2625988e2c84c74c4e18e52086874b",
		"first-name": "Greta",
		"last-name": "Garbo",
		"age": 83
	},
	{
		"first-name": "Freddie",
		"last-name": "Mercury",
		"age": 80
	}]
}

The response shows that the third document got its first revision, i.e. it was inserted successfully:

[
  {
    "ok": true,
    "id": "3559d9c81c785b6bfc27a3490401af41",
    "rev": "3-30aadf4451a04569b82b01175b391458"
  },
  {
    "ok": true,
    "id": "3559d9c81c785b6bfc27a3490401b00f",
    "rev": "3-767e04b13edd0ebeda8ad71a1808c768"
  },
  {
    "ok": true,
    "id": "3559d9c81c785b6bfc27a3490401b892",
    "rev": "1-de9b80faccbc3bcd0dc04a8e2eee51a0"
  }
]
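
The number before the dash in a revision string tells us how many times the document has been written. A small Python sketch can use it to pick the new insertions out of a mixed response. The function names are made up for this illustration, and the sketch assumes the inserted documents never existed before, so their revision starts with 1.

```python
def revision_generation(rev):
    """Return the numeric prefix of a revision string such as '3-30aadf...'."""
    return int(rev.split("-", 1)[0])


def newly_inserted(results):
    """Return the IDs of successful entries that are at their first revision."""
    return [
        r["id"]
        for r in results
        if r.get("ok") and revision_generation(r["rev"]) == 1
    ]


# The mixed response shown above, parsed into Python objects.
results = [
    {"ok": True, "id": "3559d9c81c785b6bfc27a3490401af41",
     "rev": "3-30aadf4451a04569b82b01175b391458"},
    {"ok": True, "id": "3559d9c81c785b6bfc27a3490401b00f",
     "rev": "3-767e04b13edd0ebeda8ad71a1808c768"},
    {"ok": True, "id": "3559d9c81c785b6bfc27a3490401b892",
     "rev": "1-de9b80faccbc3bcd0dc04a8e2eee51a0"},
]

inserted_ids = newly_inserted(results)
```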

Batch deletions

We can delete multiple data records in the same batch. For each document we send its ID and current revision number and set the _deleted property to true. The following JSON payload will remove the first three documents from the persons database, i.e. John W. Smith, Elvis Presley and Marilyn Monroe:

{
	"docs": [{
		"_id": "3559d9c81c785b6bfc27a349040177b0",
		"_rev": "3-a11c420a45f2ca5334522e72aefb899e",
		"_deleted": true
	},
	{
		"_id": "3559d9c81c785b6bfc27a34904018a42",
		"_rev": "2-f19b6df3bd9015872a7dde724c922c83",
		"_deleted": true
	},
	{
		"_id": "3559d9c81c785b6bfc27a3490401990e",
		"_rev": "2-5eb075ac21567de210d0d080b4293f8a",
		"_deleted": true
	}]
}

It should succeed and all three documents should get a new revision number:

[
  {
    "ok": true,
    "id": "3559d9c81c785b6bfc27a349040177b0",
    "rev": "4-05a41ea3b37ea0dea540fc34890c9369"
  },
  {
    "ok": true,
    "id": "3559d9c81c785b6bfc27a34904018a42",
    "rev": "3-af434f7f64b7304ae495bb58f05d1521"
  },
  {
    "ok": true,
    "id": "3559d9c81c785b6bfc27a3490401990e",
    "rev": "3-03208d7c4416be70af857a2b8ab22fbe"
  }
]
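
A deletion payload like the one above is easy to generate from a list of IDs and revision numbers. A minimal Python sketch, where deletion_payload is a hypothetical helper name:

```python
import json


def deletion_payload(id_rev_pairs):
    """Build a _bulk_docs body that marks each (id, rev) pair as deleted."""
    return {
        "docs": [
            {"_id": doc_id, "_rev": rev, "_deleted": True}
            for doc_id, rev in id_rev_pairs
        ]
    }


to_delete = [
    ("3559d9c81c785b6bfc27a34904018a42", "2-f19b6df3bd9015872a7dde724c922c83"),
    ("3559d9c81c785b6bfc27a3490401990e", "2-5eb075ac21567de210d0d080b4293f8a"),
]

payload = deletion_payload(to_delete)
body = json.dumps(payload)  # Python's True is serialised as JSON true
```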

We’ll continue in the next post.

You can view all posts related to data storage on this blog here.


About Andras Nemes
I'm a .NET/Java developer living and working in Stockholm, Sweden.

