I'm coming from a MySQL background, and I'm interested in document-oriented databases, specifically CouchDB. One of the things I'm interested in is data integrity. How do document-oriented databases handle this? For instance, in RDBMSes, there are ways to prevent duplicate records, or to guarantee that if you have one bit of information, you will also have another, or else none at all.

I guess more broadly, my question is: what types of problems are RDBMSes cut out for, compared to the problems that document-oriented databases are used for? I looked through some of the other Stack Overflow questions for an explanation, but didn't find any good ones.

Also, with my databases at work, I do a lot of reporting, with summing and averaging values, and historical trending. Is this something appropriate for document-oriented databases?

+1  A: 

Most document databases support only very limited integrity checks, or none at all. They rely on the application to ensure that the data is correct. I can tell you that this is how it is in CouchDB.
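To make "the application ensures the data is correct" a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `REQUIRED_FIELDS` schema, the `insert` helper, and the plain dict that stands in for the database. The idea is that the application checks required fields itself and derives the document ID from a natural key, so a second insert of the same record collides instead of creating a duplicate. CouchDB document IDs are unique within a database, so a similar trick should carry over, but treat this as a sketch rather than CouchDB's API.

```python
# Hypothetical application-level integrity checks.
# The dict `store` stands in for a document database.
REQUIRED_FIELDS = ("type", "email", "name")  # assumed schema, for illustration only


class IntegrityError(Exception):
    """Raised when a document fails an application-level check."""


def validate(doc):
    # "If you have one bit of information, you must have the others too."
    missing = [field for field in REQUIRED_FIELDS if field not in doc]
    if missing:
        raise IntegrityError("missing fields: " + ", ".join(missing))


def insert(store, doc):
    validate(doc)
    # Derive the ID from a natural key so duplicates map to the same ID.
    doc_id = "user:" + doc["email"].lower()
    if doc_id in store:
        raise IntegrityError("duplicate document: " + doc_id)
    store[doc_id] = dict(doc, _id=doc_id)
    return doc_id


store = {}
insert(store, {"type": "user", "email": "Ann@example.com", "name": "Ann"})
```

Calling `insert` again with the same email, or with a document that lacks `name`, raises `IntegrityError` instead of silently storing bad data. The point is only that these rules live in your code, not in the database.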

As for the second part: I think RDBMSes do very well at reporting and analyzing data. The fact that you can run complex queries on the data with joins, aggregations, functions, etc. makes an RDBMS a very powerful reporting tool. Document databases do really well at storing the 'live' application data. It is very easy to store and retrieve an object graph in a document database. The schema-free design makes it easy to extend the model for new application features. However, this only works if you can split your application data into nice documents. Otherwise you lose a lot of the elegance.

If you want to do mostly reporting, I would prefer an RDBMS. When you store lots of flat, simple records, it is very easy to do reporting on them, and the tooling is well suited to reporting. However, when you want to do reporting on complex structured data, you are probably still better off with a database design other than an RDBMS.

However, this doesn't mean you need to limit yourself to an RDBMS. You could combine the two technologies. Imagine blog software: you store the 'live' application data, like blog posts and comments, in the document database. Data for reporting, like click and login statistics, is stored in an RDBMS. See also Rob Conery's post.
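A rough sketch of that split, using only the Python standard library. The document side is simulated with a dict of JSON strings (a stand-in for a document database, not a real one), and the relational side uses an in-memory SQLite database. The post shape, the `clicks` table, and its column names are all made up for illustration.

```python
import json
import sqlite3

# Document side: one blog post with its comments nested inside it.
post = {
    "_id": "post:hello-world",
    "title": "Hello World",
    "comments": [
        {"author": "ann", "text": "Nice post"},
        {"author": "bob", "text": "Thanks"},
    ],
}
documents = {post["_id"]: json.dumps(post)}  # stand-in for a document database

# Relational side: flat click records that are easy to aggregate.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clicks (post_id TEXT, day TEXT, hits INTEGER)")
conn.executemany(
    "INSERT INTO clicks VALUES (?, ?, ?)",
    [
        ("post:hello-world", "2010-01-01", 10),
        ("post:hello-world", "2010-01-02", 30),
    ],
)

# Reporting query: sum and average of the flat records.
total, average = conn.execute(
    "SELECT SUM(hits), AVG(hits) FROM clicks WHERE post_id = ?",
    (post["_id"],),
).fetchone()

# The whole object graph comes back in one read, with no joins.
loaded = json.loads(documents["post:hello-world"])
print(loaded["title"], len(loaded["comments"]), total, average)
```

The post and its comments travel together as one document, while the statistics stay in flat rows where `SUM` and `AVG` are one query away.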

Gamlor