I'm reading the Beginning CouchDB book by Apress and there is a line that confuses me a bit:

Also important to note is that CouchDB will never overwrite existing documents, but rather it will append a new document to the database, with the latest revision gaining prominence and the other being stored for archival purposes.

Doesn't this mean that after a couple of updates, you would have a huge database? Thank you!

+3  A: 

The short answer is "not really, no".

In reality it depends on the average size of your documents and how many of them you have. That determines when you should run a compaction job on your database, which is the job that removes the old revisions from the database. Read more about compaction at http://wiki.apache.org/couchdb/Compaction
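For what it's worth, compaction is triggered over HTTP with a POST to the database's `_compact` endpoint. Here is a minimal sketch, not a definitive script: the server URL `http://localhost:5984` and the database name `mydb` are placeholders, the helper names are made up for illustration, and I'm assuming the server doesn't require credentials (most setups restrict compaction to admins, so you would need to add authentication).

```python
import json
import urllib.parse
import urllib.request


def build_compact_request(base_url, db_name):
    """Build the POST request that asks CouchDB to compact one database."""
    # Database names can contain characters such as '/', so escape them.
    quoted = urllib.parse.quote(db_name, safe="")
    url = f"{base_url.rstrip('/')}/{quoted}/_compact"
    # CouchDB expects a JSON content type on this endpoint; the body is empty.
    return urllib.request.Request(
        url,
        data=b"",
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def compact(base_url, db_name):
    """Send the request and return the decoded JSON reply from the server."""
    with urllib.request.urlopen(build_compact_request(base_url, db_name)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Placeholder server and database; adjust for your own setup.
    print(compact("http://localhost:5984", "mydb"))
```

The request returns right away and the compaction itself runs in the background on the server, so a successful reply only means the job was accepted.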

Another sysadmin point: try to schedule your compaction jobs when the database isn't under load. You care most about write load, because if writes arrive faster than compaction can keep up, the compaction job could (in theory) run forever and take the database down with it. That said, I've also seen some not-so-nice behavior when running compaction under a heavy read load. So, if you can live with compacting only once a day, do it at 3am with the rest of your system/database maintenance cron jobs.
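If you want the maintenance script itself to guard against running at a bad time, something like the following could work. It is only a sketch under assumptions: it reads the database info document (what `GET /mydb` returns), and it assumes a CouchDB version that reports `sizes.file` and `sizes.active` there (older releases used different field names). The quiet window of 2am to 5am and the 30% threshold are arbitrary values picked for illustration, and the function names are mine.

```python
def fragmentation(info):
    """Fraction of the database file that is not live data (0.0 to 1.0)."""
    sizes = info["sizes"]
    file_size = sizes["file"]
    if file_size == 0:
        return 0.0
    return (file_size - sizes["active"]) / file_size


def should_compact(info, hour, quiet_hours=range(2, 5), min_fragmentation=0.3):
    """Decide whether to start compaction, given db info and the current hour."""
    # Never start a second job while one is already running.
    if info.get("compact_running"):
        return False
    # Only compact inside the (assumed) low-traffic window.
    if hour not in quiet_hours:
        return False
    # Skip the work when there is little old data to reclaim.
    return fragmentation(info) >= min_fragmentation


if __name__ == "__main__":
    # Hand-written sample resembling a database info document.
    sample = {"compact_running": False, "sizes": {"file": 1000, "active": 400}}
    print(should_compact(sample, hour=3))
    print(should_compact(sample, hour=14))
```

Checking the hour in the script is a belt-and-braces measure; the cron schedule is still what actually decides when it runs.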

Oh, and possibly most importantly: if you're just starting to learn CouchDB, it's probably premature to worry about when to run your compaction jobs relative to your system's load. Premature optimization and all that; focus on other aspects for now.

Cheers.

Sam Bisbee