ansaurus

Question

Principles for Modeling CouchDB Documents

Answer 1

+4 A:

The book says, if I recall correctly, to denormalize until "it hurts", while keeping in mind the frequency with which your documents might be updated.

What rules/principles do you use to divide up your documents (relationships, etc)?

As a rule of thumb, I include all data that is needed to display a page regarding the item in question. In other words, everything you would print on a real-world piece of paper that you would hand to somebody. E.g. a stock quote document would include the name of the company, the exchange, the currency, in addition to the numbers; a contract document would include the names and addresses of the counterparties, all information on dates and signatories. But stock quotes from distinct dates would form separate documents, separate contracts would form separate documents.
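To make that rule of thumb concrete, here is a minimal sketch of two stock quote documents written as Python dictionaries. The field names, the `quote:SYMBOL:DATE` id scheme, and the company values are all assumptions made for illustration; nothing here is prescribed by CouchDB.

```python
import json

def make_quote_doc(symbol, date, close):
    """Build one self-contained quote document (hypothetical field names)."""
    return {
        "_id": f"quote:{symbol}:{date}",   # assumed id scheme: one document per symbol per date
        "type": "stock_quote",
        # Denormalized: the company details are repeated in every quote,
        # so a single read is enough to render the page for that quote.
        "company": {"name": "Example Corp", "exchange": "NYSE", "currency": "USD"},
        "date": date,
        "close": close,
    }

# Quotes from distinct dates are separate documents.
quotes = [
    make_quote_doc("EXM", "2009-10-06", 41.20),
    make_quote_doc("EXM", "2009-10-07", 42.05),
]

for doc in quotes:
    print(json.dumps(doc, indent=2))
```

The company block is duplicated on purpose. That duplication is the cost of denormalizing, and it stays cheap as long as those details change rarely.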

Is it okay to put the entire site into one document?

No, that would be silly, because:

you would have to read and write the whole site (the document) on each update, and that is very inefficient;
you would not benefit from any view caching.

Eero 2009-10-07 11:10:53

Thanks for getting into it with me a bit. I get the idea of "include all data that is needed to display a page regarding the item in question", but that is still very difficult to implement. A "page" could be a page of Comments, a page of Users, a page of Posts, or a page of Comments and Posts, etc. How would you divide them up then, principally? You could also have your Contract displayed with Users. I get the 'form-like' documents, that makes sense to keep them separate.

viatropos 2009-10-07 11:55:44

Answer 2

A:

Hi, I have been thinking about this problem for a while, and I think it is harder than it first seems.

In my application the data model is a process, and each process contains activities. Both processes and activities have properties. In my case there might be screens that show only the name of the process, but a process is nothing without its activities, so I guess I'll put the activities inside the process document rather than storing each activity as a separate document. The properties of both process and activity might be updated by different users at the same time, so I don't want to put them in the process document, because that would create far too many conflicts. I guess each property will be a separate document, but then I need to ensure the "referential integrity" of the properties myself, for example by deleting them when their activity is deleted.
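A rough sketch of that layout in Python, assuming hypothetical field names (`process_id`, `activity_id`), a made-up id scheme, and placeholder `_rev` values. CouchDB does not enforce references between documents, so the cleanup is done by the application: it removes the activity from the process document and builds deletion stubs (`"_deleted": true`) for the matching property documents, in the shape that the `_bulk_docs` endpoint accepts.

```python
def property_doc(process_id, activity_id, name, value, rev):
    """One property per document, so concurrent edits touch different documents."""
    return {
        "_id": f"prop:{process_id}:{activity_id}:{name}",  # assumed id scheme
        "_rev": rev,                                       # placeholder revision
        "type": "property",
        "process_id": process_id,
        "activity_id": activity_id,
        "name": name,
        "value": value,
    }

def remove_activity(process, activity_id):
    """Return a copy of the process document without the given activity."""
    remaining = [a for a in process["activities"] if a["id"] != activity_id]
    return {**process, "activities": remaining}

def deletion_stubs(property_docs, activity_id):
    """Build deletion stubs for every property that belongs to the activity."""
    return [
        {"_id": d["_id"], "_rev": d["_rev"], "_deleted": True}
        for d in property_docs
        if d["activity_id"] == activity_id
    ]

# Activities are embedded in the process document.
process = {
    "_id": "process:p1",
    "_rev": "3-aaa",
    "type": "process",
    "name": "Onboarding",
    "activities": [{"id": "a1", "name": "Collect forms"}, {"id": "a2", "name": "Review"}],
}

properties = [
    property_doc("p1", "a1", "owner", "alice", "1-bbb"),
    property_doc("p1", "a1", "deadline", "2010-10-01", "2-ccc"),
    property_doc("p1", "a2", "owner", "bob", "1-ddd"),
]

# Body for a single _bulk_docs request: updated process plus property deletions.
payload = {"docs": [remove_activity(process, "a1")] + deletion_stubs(properties, "a1")}
```

Sending `payload` to the database is left out here. A real application would also need to handle the case where some of the writes are rejected because of a revision conflict.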

I think drawing up general (or even specific) rules, like those in the books on relational database normalization, would be a good idea. Of course, we would also need to address replication, sharding, and anything else CouchDB supports that I'm not aware of yet.

Thank you, Ido.

Ido Ran 2010-09-12 13:55:37

Answer 3

+1 A:

Riak has a first-class concept called "Links", which essentially allows relation-like functionality in a key-value store.

https://wiki.basho.com/display/RIAK/Links

2010-10-01 07:08:01
