ansaurus

Question

What data is actually stored in a B-tree database in CouchDB?

Answer 1

+1 A:

CouchDB does not store diffs. When you update a document, it appends the whole new document with a new _rev and the same _id as the old version. The old version is removed during compaction.
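
As a rough illustration of the point above, here is a minimal sketch of an append-only store that writes whole documents and drops superseded versions on compaction. The class `AppendOnlyStore`, its in-memory `log` list, and the integer `_rev` are assumptions made for brevity. This is not CouchDB's on-disk format, and real CouchDB revisions are strings of the form `N-hash`.

```python
class AppendOnlyStore:
    """Toy append-only document store (hypothetical; not CouchDB's real format)."""

    def __init__(self):
        self.log = []      # every version ever written, in write order
        self.latest = {}   # _id -> index in self.log of the newest version

    def save(self, doc_id, body):
        # Derive the next revision number from the current version, if any.
        # (Assumption: a plain integer stands in for CouchDB's "N-hash" _rev.)
        prev = self.latest.get(doc_id)
        rev = 1 if prev is None else self.log[prev]["_rev"] + 1
        # Append the whole new document, never a diff against the old one.
        self.log.append({"_id": doc_id, "_rev": rev, **body})
        self.latest[doc_id] = len(self.log) - 1
        return rev

    def get(self, doc_id):
        # Return the newest stored version of the document.
        return self.log[self.latest[doc_id]]

    def compact(self):
        # Copy only the live (newest) versions into a fresh log;
        # superseded versions are left behind and disappear.
        new_log = []
        new_latest = {}
        for doc_id, idx in self.latest.items():
            new_latest[doc_id] = len(new_log)
            new_log.append(self.log[idx])
        self.log, self.latest = new_log, new_latest
```

After two saves of the same `_id`, the log holds both full versions; after `compact()` only the newer one remains.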

titanoboa 2010-04-19 16:03:46

Yes, CouchDB doesn't store diffs. My question is: how does it store documents internally so that both append-only save operations *and* current-version retrievals work without locks?

Andrey Vlasovskikh 2010-04-19 21:29:47

Answer 2

+1 A:

The database file on disk is append-only; however, the B-tree is conceptually modified in-place. When you update a document:

1. Its leaf node is written (via append to the DB file)
2. Its parent node is re-written to reference the new leaf (via append, of course)
3. Repeat step 2 until you update the root node

When the root node is written, that is effectively when the newer revision is "committed." To find a document, you start at the end of the file, get the root node, and work down to your doc id. The latest revision will always be accessible this way.
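
The update and lookup steps described above can be sketched roughly as follows. To keep it short, this uses a plain binary search tree instead of a real B-tree, and a Python list stands in for the database file (the list index plays the role of a file offset). The class `AppendOnlyTree` and the record layout `(key, value, left_offset, right_offset)` are assumptions for illustration, not CouchDB's actual structures.

```python
class AppendOnlyTree:
    """Toy copy-on-write tree over an append-only 'file' (hypothetical sketch)."""

    def __init__(self):
        self.file = []  # stands in for the DB file; list index = offset

    def _append(self, node):
        # Records are only ever appended, never modified in place.
        self.file.append(node)
        return len(self.file) - 1

    def _root(self):
        # The most recently written record is always the current root.
        return len(self.file) - 1 if self.file else None

    def put(self, key, value):
        # Walk down from the current root, remembering the path taken.
        path = []  # (offset of ancestor, side we descended into)
        off = self._root()
        while off is not None:
            k, _, left, right = self.file[off]
            if key == k:
                break
            side = "L" if key < k else "R"
            path.append((off, side))
            off = left if side == "L" else right
        # Step 1: append the new node (keeping the children of the node it replaces).
        if off is None:
            new_off = self._append((key, value, None, None))
        else:
            _, _, left, right = self.file[off]
            new_off = self._append((key, value, left, right))
        # Steps 2-3: re-append each ancestor, bottom-up, ending with the root.
        for parent_off, side in reversed(path):
            k, v, left, right = self.file[parent_off]
            if side == "L":
                new_off = self._append((k, v, new_off, right))
            else:
                new_off = self._append((k, v, left, new_off))

    def get(self, key, root=None):
        # Start at the newest root (end of file) unless an older root is given.
        off = self._root() if root is None else root
        while off is not None:
            k, v, left, right = self.file[off]
            if key == k:
                return v
            off = left if key < k else right
        return None
```

In this sketch, a reader that remembered an older root offset before an update can keep using it and still sees a consistent older snapshot, because nothing it references is ever overwritten. That is one way to see why reads in such a design need no locks.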

jhs 2010-04-20 00:06:43

It's still unclear to me when the algorithm for determining the winning revision (http://books.couchdb.org/relax/reference/conflict-management) comes into play during the current document revision lookup. If the user is reading the document with the key ID1, then according to the scheme you've described he will get the **latest written** revision (thanks for your point on serializing writes using an Erlang process), **not the winning one**.

Andrey Vlasovskikh 2010-04-20 00:59:10

I guess I need to dig into the source code. It's quite manageable: about 18 KLOC.

Andrey Vlasovskikh 2010-04-20 01:39:43

The conflict management algorithm decides which order to store them in (i.e. does this one get rev 4 and that one get rev 5, or vice versa). A simple lookup by ID always fetches the latest revision stored. In this example, revision 5 would be the "winner." The application may want to merge the conflict more meaningfully by creating a revision 6 that is the sum of 4 and 5.

jhs 2010-04-20 02:54:15
