views:

1313

answers:

3

As far as I understand, CouchDB indexes are updated when a view is queried. Assuming there are more reads than writes, isn't this bad for scaling? How would I configure CouchDB to update indexes on writes, or better yet, on a schedule?

+15  A: 

CouchDB does regenerate views on update, but only on what has changed since the last read access to the view. Assuming your read volume greatly outweighs your write volume, this shouldn't be a problem.

When you're changing large numbers of documents at once this could lead to the possibility of the first read requests taking a noticeable amount of time. To alleviate this a few different possibilities have been suggested. Most rely on registering with CouchDB's update notifications and triggering reads automatically.

An example script for doing exactly that is available on the CouchDB wiki at [1].

[1] http://wiki.apache.org/couchdb/RegeneratingViewsOnUpdate

Paul J. Davis
+4  A: 

You can't and also, why would you want that?

Think about it like that:

  • When you import data into MySQL you can turn of indizes because it's more expensive to update the index for every row you insert, than it is to update the index for 100 writes (or however many rows you import) in a single run.
  • This is why CouchDB updates the index on read because it's less expensive to integrate those 100 changes at the same time, then each change when it's written.

This is one of the advantages of CouchDB! :) I am not saying that this is a CouchDB only feature, but it's just smart to do this on read.

One thing you could do is read with update=false, which is a dirty read and might not return what you expect. If you always do this, you could schedule a "regular" read through a cronjob and update your index with that. I just don't think it makes sense.

Till
+7  A: 

a) "Scaling" is such an overloaded term. What "kind" of scaling are you referring to? (Either way, I can't see how it affects you negatively).

b) Update on writes: Just query your view after a write. Note that adding a bunch of data to the index is more resource friendly (that not specific to CouchDB). So you might want to trigger your view every N writes.

c) Scheduled: Set up a cronjob that queries your view every M minutes.

d) Wait for CouchDB to evolve to provide you with the infrastructure that allows you to set this up with a configuration parameter.

e) (BEST OPTION). Get your hands dirty and help us out polishing CouchDB! Any contributions are highly appreciated.

d) RTFM (blink :)

Jan Lehnardt