Why should I use document based database like CouchDB instead of using relational database. Are there any typical kinds of applications or domains where the document based database is more suitable than the relational database?
views:
3194answers:
5CouchDB (from their website)
A document database server, accessible via a RESTful JSON API. Generally, relational databases aren't simply accessed via REST services, but require a much more complex SQL API. Often these API's (JDBC, ODBC, etc.) are quite complex. REST is quite simple.
Ad-hoc and schema-free with a flat address space. Relational databases have complex, fixed schema. You define tables, columns, indexes, sequences, views and other stuff. Couch doesn't require this level of complex, expensive, fragile advanced planning.
Distributed, featuring robust, incremental replication with bi-directional conflict detection and management. Some SQL commercial products offer this. Because of the SQL API and the fixed schemas, this is complex, difficult and expensive. For Couch, it appears simple and inexpensive.
Query-able and index-able, featuring a table oriented reporting engine that uses Javascript as a query language. So does SQL and relational databases. Nothing new here.
So. Why CouchDB?
- REST is simpler than JDBC or ODBC.
- No Schema is simpler than Schema.
- Distributed in a way that appears simple and inexpensive.
Probably you shouldn't :-)
The second most obvious answer is you should use it if your data isn't relational. This usually manifests itself in having no easy way to describe your data as a set of columns. A good example is a database where you actually store paper documents, e.g. by scanning office mail. The data is the scanned PDF and you have some meta data which always exists (scanned at, scanned by, type of document) and lots of possible metadata fields which exists sometime (customer number, supplier number, order number, keep on file until, OCRed fulltext, etc). Usually you do not know in advance which metadata filds you will add whithin the next two years. Things like CouchDB work much nicer for that kind of data than relational databases.
I also personally love the fact that I don't need any client libraries for CouchDB except an HTTP client, which is nowadays included in nearly every programming language.
The probably least obvious answer: If you feel no pain using a RDBMS, stay with it. If you always have to work around your RDBMS to get your job done, a document oriented database might be worth a look.
For a more elaborate list check this posting of Richard Jones.
Rapid application development comes to mind.
When I am constantly evolving my schema, I am constantly frustrated by having to maintain the schema in MySQL/SQLite. While I've not done too much with CouchDB yet, I do like how simple it is to evolve the schema during the RAD process.
A case where you might not want to use a non-relational database is when you have a lot of many-to-many relationships; I've yet to get my head around how to create good MapReduce functions around these kinds of relationships, particularly if you need to have metadata in the joining relationship. I'm not sure, but I don't think CouchDB Map functions can call their own queries on the database, since that could potentially cause infinite loops.
For stupidly storing and serving other-servers-data.
In the last couple of weeks I've been playing with a lifestream app that polls my feeds (delicious, flickr, github, twitter...) and stores them in couchdb. The beauty of couchdb is that it lets me keep the original data in its original structure with no overhead. I added a 'class' field to each document, storing the source server, and wrote a javascript render class for each source.
Generalizing, whenever your server communicates with another server a schema-less storage is best as you have no control over the schema. As a bonus, couchdb uses the native protocols of servers and clients - JSON for representation and HTTP REST for transport.
Use a document-based database when you do not need to store data in tables with uniform sized fields for each record. Instead, you have a need to store each record as a document that has certain characteristics. Any number of fields of any length can be dynamically added to a document at any time without the need to "modify the table" first. Fields in document-based can also contain multiple pieces of data.