Hi,

I'm currently designing a database structure (it feels strange to avoid the term "schema" here) and have run into some issues. I'll motivate my question with a similar example.

Think of a database for blog entries. Each blog entry has 0..n comments, 0..m tags and 0..r user votes (up/down) for entries.

Of course I could put everything together in one big JSON object and have one common object per entry including comments, tags and votes as fields.

However, that's prone to conflicts when there are many concurrent changes (and in my real application there are). So I'm thinking about splitting my structure into several documents: one per entry, and for each entry one (collection) document for comments, one for tags and one for votes.

I'd also set up a fixed naming scheme. Blog entries get a UUID when created, and the documents for comments, tags and votes are named <blog-entry-uuid>:(tags|votes|comments). Thus, I can always build the IDs of the related documents just by knowing the ID of the blog post. When using views, I can use the power of view collation to "join", or take advantage of include_docs=true.
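As a minimal sketch of that naming scheme (the helper name relatedDocIds and the placeholder ID are hypothetical, only for illustration), the related IDs can be derived from the entry ID alone:

```javascript
// Hypothetical helper: derive the IDs of the per-entry collection documents
// from the blog entry's own ID, following <blog-entry-uuid>:(tags|votes|comments).
function relatedDocIds(entryId) {
  return {
    tags: entryId + ":tags",
    votes: entryId + ":votes",
    comments: entryId + ":comments"
  };
}

// Placeholder ID, for illustration only.
var ids = relatedDocIds("6e1295ed6c29495e54cc05947f18c8af");
console.log(ids.comments);
```

No lookup is needed to find the related documents, which is the main appeal of building IDs this way.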

By using update handlers, I can further minimize the chance of conflicts when adding a comment etc. The handler just takes the new comment and puts it into the <blog-entry-uuid>:comments document.
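A rough sketch of such a handler, written as a named function so it can be tried outside CouchDB. The name addComment, the comments field and the JSON request body are assumptions; in a design document the function would be stored under "updates".

```javascript
// Sketch of an update handler. CouchDB calls it with the existing document
// (or null if nothing is stored under the requested ID) and the request
// object, and expects [documentToSave, response] back.
function addComment(doc, req) {
  // Assumption: the request body is a JSON-encoded comment object.
  var comment = JSON.parse(req.body);

  if (!doc) {
    // First comment for this entry: create the collection document.
    // req.id is the document ID taken from the request URL.
    doc = { _id: req.id, comments: [] };
  }

  doc.comments.push(comment);
  return [doc, "comment added"];
}
```

The handler narrows the window for conflicts because the read and the write both happen on the server, but the write can still be rejected if another update lands in between, so the client should be ready to retry.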

Is this a feasible approach or are there better ways for joins? Is it bad practice to build IDs that way?

Thanks in advance

+1  A: 

From what I understand, Couch isn't really intended to handle 1:1 relationships between documents, where separate aspects of a single "thing" are stored in different documents. You can handle all of these without a dedicated collection document.

Each comment can probably have its own document. Christopher Lenz has a blog article describing how to do a join-like view in just this situation.
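A sketch of what such a join-like view could look like, with one document per comment. The field names type and post are assumptions, and emit plus runView are small stand-ins so the map function can be run outside CouchDB:

```javascript
var emit; // stand-in for CouchDB's global emit(); assigned in runView below

// Assumed document shapes (hypothetical field names):
//   post:    { _id: "...", type: "post" }
//   comment: { _id: "...", type: "comment", post: "<post _id>" }
function map(doc) {
  if (doc.type === "post") {
    emit([doc._id, 0], null);   // the post sorts first within its group
  } else if (doc.type === "comment") {
    emit([doc.post, 1], null);  // its comments follow directly after it
  }
}

// Runs map over all docs and sorts the rows. This ordering is simplified
// and only handles [string, number] keys; real view collation has more rules.
function runView(docs) {
  var rows = [];
  docs.forEach(function (doc) {
    emit = function (key, value) {
      rows.push({ id: doc._id, key: key, value: value });
    };
    map(doc);
  });
  return rows.sort(function (a, b) {
    if (a.key[0] !== b.key[0]) return a.key[0] < b.key[0] ? -1 : 1;
    return a.key[1] - b.key[1];
  });
}
```

Against a real view, querying with startkey=["<post id>"] and endkey=["<post id>", 2] should return one post followed by its comments in a single request.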

Tags and up/down votes, on the other hand, should probably stay inlined within their owners. Yes, you still have an edit-conflict issue, but those conflicts can probably be resolved easily in any event. I don't think it's worth complicating your database structure to that degree. (Though if auditing the up/down votes is important, you could have a separate doc for each vote and use a reduce query to find a document's final score.)

LeafStorm
As outlined in my question, auto-inclusion of referenced ids and view collation are powerful tools for handling relationships in CouchDB.
PartlyCloudy
+1  A: 

Hi. LeafStorm makes a good point. I wouldn't say that CouchDB absolutely does not support 1:1; you just have to understand the implications. Primarily, you cannot update both documents atomically in a single operation. Frankly, for many types of applications that feature is overhyped.

View collation is an excellent way to "join" related documents together. It is very powerful.

Something else to consider is the feature in CouchDB 0.11 and later that lets a view reference related documents by ID. Read about it in Jan's CouchDB 0.11 views blog post. Basically, when you emit a key/value, you can "trick" the include_docs=true feature into returning any document, not just the one you were processing when you called emit(). Compare:

function(doc) {
  // The old way. include_docs would include this document (doc).
  emit(some_key, some_value); 

  // The new way. include_docs will fetch some_other_doc_id and return that.
  emit(some_other_key, {"_id": some_other_doc_id});
}
jhs
+1  A: 

The most common way I have seen of handling document relationships, and more importantly of updating those documents when multiple users could write to them concurrently, is to link the documents with actions. For example, instead of thinking of a blog article's rating as a component of the article, think of each rating as a separate action and create a document for it. Then you use the power of Map/Reduce to aggregate all of those actions.

Rating action document:

{
  "docType": "rating",
  "of": {
    "id": "ID of the blog/video/comment document",
    "type": "blog/video/comment"
  },
  "by": "the user's document ID",
  "rating": 3
}

Map function:

function(doc) {
  // Only rating documents contribute; the key identifies the rated item.
  if (doc.docType === "rating") {
    emit([doc.of.id, doc.of.type], doc.rating);
  }
}

Reduce function:

function(keys, values, rereduce) {
  // Summing is safe on rereduce too: partial sums add up to the total.
  return sum(values);
}

You can now easily get the rating of specific items, types of items (e.g., top 10 comments by rating), or every single type of media in your application. And that's without some of the cooler features in CouchDB 0.11.x and the upcoming 1.0.
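To make the aggregation concrete, here is a small simulation in plain JavaScript of roughly what a grouped query over that view would return. The sum, emit and groupedQuery definitions are stand-ins written for this sketch, not CouchDB's own implementation, and the sample IDs used with it are made up.

```javascript
var emit; // stand-in for CouchDB's global emit(); assigned in groupedQuery

// Stand-in for the sum() helper available inside CouchDB view functions.
function sum(values) {
  return values.reduce(function (a, b) { return a + b; }, 0);
}

function map(doc) {
  if (doc.docType === "rating") {
    emit([doc.of.id, doc.of.type], doc.rating);
  }
}

function reduce(keys, values, rereduce) {
  return sum(values);
}

// Collects emitted values per distinct key, then reduces each group,
// which is roughly what querying the view with group=true does.
function groupedQuery(docs) {
  var groups = {};
  docs.forEach(function (doc) {
    emit = function (key, value) {
      var k = JSON.stringify(key);
      (groups[k] = groups[k] || []).push(value);
    };
    map(doc);
  });
  return Object.keys(groups).map(function (k) {
    return { key: JSON.parse(k), value: reduce(null, groups[k], false) };
  });
}
```

Because every rating is its own document, two users rating the same item never write to the same document, so they cannot conflict with each other.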

Of course you might be looking for a different answer - let me know if that's the case.

And lastly, I would argue that CouchDB can handle 1:1 relationships just fine (at least by my definition): there's nothing stopping you from including document A's _id in document B.

Sam Bisbee