I don't mean the view sources stored in _design docs (those replicate, since they're just docs). What I mean is: do the view results (the computed B-trees) replicate as well, or do only regular documents replicate (which is how I understand it right now)?

The problematic scenario is:
There's a spike in traffic and I want to bring up a temporary server and replicate a portion of the dataset onto it. The views for the docs to be replicated have already been computed on the old server, so they shouldn't need to be recomputed on the new one. I want those already-computed results to be transferred along with that portion of the docs.

Another scenario is to use a backend cluster to compute complex views, and then replicate those results onto a bunch of front-end servers that are actually hit by user requests.

+1  A: 

The computed result is not replicated.
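In practice that means the new server gets the docs (and the design doc, if you include it) and then builds its own index the first time the view is queried. Here's a minimal sketch of what that could look like against CouchDB's HTTP API, assuming a pull replication restricted by `doc_ids` and a `limit=0` view query to kick off the index build. The host names, database name, design doc name and view name are all made up for illustration, and the helper functions are hypothetical, not part of any CouchDB client library.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def replication_request(old_server, new_server, db, doc_ids):
    """Build a POST to /_replicate on the new server that pulls only doc_ids."""
    body = {
        "source": f"{old_server}/{db}",
        "target": f"{new_server}/{db}",
        # Include the design doc ID here if the view definition should travel too.
        "doc_ids": list(doc_ids),
        "create_target": True,
    }
    return Request(
        f"{new_server}/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def view_warmup_url(server, db, design_doc, view):
    """URL that queries the view without returning rows.

    Querying the view is what makes the server build its own index;
    the index from the old server is not copied over.
    """
    return f"{server}/{db}/_design/{quote(design_doc)}/_view/{quote(view)}?limit=0"


# Hypothetical servers and names, for illustration only.
req = replication_request(
    "http://old.example:5984",
    "http://temp.example:5984",
    "orders",
    ["order-1", "order-2", "_design/reports"],
)
warmup = view_warmup_url("http://temp.example:5984", "orders", "reports", "by_day")
```

You would send `req` with `urllib.request.urlopen(req)` and then fetch `warmup` once replication has finished, so the first real user request doesn't pay for the index build.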

Here are some additional thoughts though:

  • When you partition your data and bring up a second server with part of it, how do you distribute reads and writes and combine view results? This setup requires a proxy of some sort; I suggest you look into CouchDB-Lounge.

  • If you're doing master-master, you could keep the servers in sync using DRBD. It's been proven to work with MySQL master-master replication, and I don't see why it would not work here. This would also imply that the computed result is automatically in sync on both servers.

Let me know if this helps!

Till
+1  A: 

As Till said, the results are not replicated. To add some detail: you actually don't want them to be replicated. The general CouchDB paradigm to remember is that each installation is treated as an independent node, which is why _id, _rev, and sequence numbers are so important. This allows each node to work without taking any other node into consideration: if one of your nodes goes down, all of the others will continue to crank away without a care in the world.

Of course, this introduces new considerations around consistency that you might not be used to. For example, if you have multiple web servers that each have their own CouchDB node, and those nodes replicate between themselves so that each instance stays up to date, there will be a lag between the nodes. Here's an example flow:

  1. User writes a change to web server A.
  2. User makes a read request to web server B, because your load balancer decided that B was the better choice. The user gets their result.
  3. Web server A sends the updated doc to web server B via replication.

As you can see, the user got the previous version of their document because web server B didn't know about the change yet. There are a few ways to deal with this:

  • Use sticky sessions, so that all of a user's reads and writes go to the same server. This could just end up defeating your load balancer.
  • Move the CouchDB nodes off of the web servers and onto their own boxes. If you go with this, then you probably want to take a look at the couchdb-lounge project (http://tilgovi.github.com/couchdb-lounge/).
  • Ask whether your users really care if they get stale results. Your use case might be one where users won't notice that their results don't yet reflect the change they just made. Make sure the extra work actually buys you something noticeable.
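If it helps to see the lag and the sticky-session idea side by side, here's a toy, in-memory simulation. It is not CouchDB code: `Node`, `replicate_to` and `sticky_node` are hypothetical names invented for this sketch, and real replication handles revisions and conflicts that are ignored here.

```python
class Node:
    """Toy stand-in for one independent CouchDB node."""

    def __init__(self, name):
        self.name = name
        self.docs = {}
        self.pending = []  # local changes not yet sent to other nodes

    def write(self, doc_id, value):
        self.docs[doc_id] = value
        self.pending.append((doc_id, value))

    def read(self, doc_id):
        return self.docs.get(doc_id)

    def replicate_to(self, other):
        # Push every pending change; until this runs, `other` is stale.
        for doc_id, value in self.pending:
            other.docs[doc_id] = value
        self.pending.clear()


def sticky_node(user_id, nodes):
    """Always route the same user to the same node (deterministic hash)."""
    return nodes[sum(user_id.encode("utf-8")) % len(nodes)]


a, b = Node("A"), Node("B")
a.docs["profile"] = b.docs["profile"] = "old"

a.write("profile", "new")            # 1. write lands on A
stale = b.read("profile")            # 2. read hits B before replication
a.replicate_to(b)                    # 3. replication catches up
fresh = b.read("profile")

# With sticky routing, a user's write and follow-up read share one node.
nodes = [Node("A"), Node("B")]
home = sticky_node("alice", nodes)
home.write("profile", "new")
sticky_read = sticky_node("alice", nodes).read("profile")
```

Here `stale` is the old value and `fresh` is the new one, which is exactly the window described in the flow above, while `sticky_read` sees the new value immediately because the read goes back to the node that took the write.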

Cheers.

Sam Bisbee