Caching your data in your application code is generally a good idea for many reasons. We have been doing this for quite some time in our shared environment, which includes ColdFusion, .NET, and PHP. However, because we share this environment with many other development groups in our organization, there is significantly more downtime than we (or our customers) appreciate.

Therefore, our web admins are moving to implement a new environment. In this environment they are adding a QA level between the current Dev and Prod environments. Also, to increase uptime, they are clustering machines at both the QA and Prod level.

This is all great for many reasons. The one area where I see a problem is with caching. There will be two (or more, depending on the number of nodes) sets of cache. As you can imagine, this creates the following potential problem. The cache is the same on nodes A and B. User 1 updates data, and thus the cache is updated on node A. User 2 comes to query the data but is on node B, and therefore gets the old data.

Are there any best practices on how to handle this?
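
To make the scenario concrete, here is a minimal sketch in Python (chosen only for illustration; the same thing happens in ColdFusion, .NET, or PHP). The `Node` class and the dictionary standing in for the database are hypothetical, not part of our actual setup.

```python
class Node:
    """A web server node with its own in-process cache (hypothetical)."""

    def __init__(self, name, database):
        self.name = name
        self.database = database  # shared source of truth
        self.cache = {}           # local to this node only

    def read(self, key):
        # Serve from the local cache, falling back to the database on a miss.
        if key not in self.cache:
            self.cache[key] = self.database[key]
        return self.cache[key]

    def write(self, key, value):
        # Update the database and this node's cache; other nodes are not told.
        self.database[key] = value
        self.cache[key] = value


database = {"greeting": "hello"}
node_a = Node("A", database)
node_b = Node("B", database)

# Both nodes warm their caches with the same value.
node_a.read("greeting")
node_b.read("greeting")

# User 1 updates through node A.
node_a.write("greeting", "goodbye")

# User 2 reads through node B and still sees the old value.
fresh = node_a.read("greeting")
stale = node_b.read("greeting")
print(fresh, stale)
```

Node B keeps serving its own copy until something evicts or refreshes it, which is exactly the inconsistency I am worried about.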

Is there any type of change I can make in my code?

Are there server settings that can be implemented?

+2  A: 

The two basic approaches for content caching are 1) centralization and 2) replication.

Each can be implemented in various ways, and to various levels of complexity.

If you're talking about just a small group of web servers, then a simple centralization setup is what you want. I would recommend a memcached server per environment (which PHP supports). So in your model, both nodes A and B would use cached data from a new node: node C.
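
A minimal sketch of that layout, in Python purely for illustration. `SharedCache` is a hypothetical in-memory stand-in for the memcached server on node C, and `WebNode` and the dictionary database are likewise assumptions; a real setup would talk to memcached over the network through a client library.

```python
class SharedCache:
    """In-memory stand-in for a central cache server such as memcached."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        # Returns None on a miss.
        return self._store.get(key)

    def set(self, key, value):
        self._store[key] = value


class WebNode:
    """A web node that keeps no cache of its own and uses the shared one."""

    def __init__(self, name, cache, database):
        self.name = name
        self.cache = cache
        self.database = database

    def read(self, key):
        value = self.cache.get(key)
        if value is None:
            # Cache miss: load from the database and populate the shared cache.
            value = self.database[key]
            self.cache.set(key, value)
        return value

    def write(self, key, value):
        # Update the database, then the single shared cache entry.
        self.database[key] = value
        self.cache.set(key, value)


database = {"greeting": "hello"}
node_c = SharedCache()
node_a = WebNode("A", node_c, database)
node_b = WebNode("B", node_c, database)

node_b.read("greeting")              # warms the shared cache
node_a.write("greeting", "goodbye")  # update arrives via node A
seen_by_b = node_b.read("greeting")  # node B reads the same cache entry
print(seen_by_b)
```

Because there is only one copy of each cached item, an update made through either node is visible to the other on its next read.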

Replication is the more scalable solution, but it's also significantly more complicated to implement. You would need to hit a vast traffic volume to justify going this route (think Facebook, YouTube, Wikipedia), so I doubt you need to worry about it.

Peter Bailey
+1  A: 

When using ColdFusion Multiserver configuration with clustered ColdFusion instances, you can use Session replication in the cluster, but use it sparingly since cached data is constantly serialized and marshalled to the other servers in the cluster. You could serialize complex data (CFWDDX) and store it in a database, then store a primary key in the session scope to replicate where to find the record, and maybe a flag indicating the cached data has been changed which would cause the other servers to refresh their cache from the database.

Steven Erat
The point of the cache is so we don't have to make unnecessary calls to the database. Therefore, I don't think using CFWDDX and database storage would work well in this situation.
Jason
As long as you don't need to share cached data outside of ColdFusion (with .NET or PHP, etc.), then I think Steven's suggestion of using ColdFusion's built-in clustering is your best bet. It's baked in and fully supported - what more could you ask for?
Adam Tuttle
I'm looking for a solution that works for different languages. This *might* solve our problem in ColdFusion, but not in the other areas.
Jason
A: 

The point of WDDX-serializing the complex data for database storage is that it is a means of centralization, as suggested by Peter above. The servers would replicate a key that references the data, and a flag indicating whether it was changed. Then, upon a change, the other servers would query the modified data and cache it until the change flag was set again. Each server would query the database only once per change, and the data is centralized rather than held only in memory.
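
A rough sketch of that idea, in Python only for illustration. All class names here are hypothetical, and I have swapped the single changed flag for a per-key change counter, which is an assumption on my part: it lets each server work out for itself whether its copy is out of date. `ReplicatedState` stands in for the small amount of data the cluster would replicate.

```python
class Database:
    """Stand-in for the central database holding the serialized data."""

    def __init__(self):
        self.rows = {}
        self.query_count = 0  # counts reads so the savings are visible

    def load(self, key):
        self.query_count += 1
        return self.rows[key]

    def save(self, key, value):
        self.rows[key] = value


class ReplicatedState:
    """Stand-in for the replicated state: a change counter per key."""

    def __init__(self):
        self.versions = {}

    def mark_changed(self, key):
        self.versions[key] = self.versions.get(key, 0) + 1

    def version(self, key):
        return self.versions.get(key, 0)


class Node:
    def __init__(self, db, shared):
        self.db = db
        self.shared = shared
        self.cache = {}  # key -> (version, value), local to this node

    def read(self, key):
        current = self.shared.version(key)
        entry = self.cache.get(key)
        if entry is None or entry[0] != current:
            # Local copy is missing or outdated: query the database once.
            entry = (current, self.db.load(key))
            self.cache[key] = entry
        return entry[1]

    def write(self, key, value):
        self.db.save(key, value)
        self.shared.mark_changed(key)  # tells the other nodes to refresh


db = Database()
shared = ReplicatedState()
node_a, node_b = Node(db, shared), Node(db, shared)

node_a.write("report", "v1")
first = [node_b.read("report") for _ in range(3)][-1]   # one database query
node_a.write("report", "v2")
second = [node_b.read("report") for _ in range(3)][-1]  # one more query
print(first, second, db.query_count)
```

Node B reads the record six times here but only goes to the database twice, once per change.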

If the data is not so complex, then Session Replication can accomplish what you want without involving a database at all. Performance degrades as the number of servers in the cluster grows (because every member replicates data to every other member, the amount of replication grows roughly with the square of the cluster size) and as the quantity of data to be replicated grows.
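
As a back-of-the-envelope illustration of how replication traffic scales, here is a small Python calculation. It assumes every server originates the same number of updates and sends each one individually to every other member; the function name is hypothetical, and real clusters may batch updates or use multicast, so treat the numbers as an upper-bound sketch.

```python
def replication_messages(nodes: int, updates_per_node: int = 1) -> int:
    """Messages sent when each node pushes each update to every other node."""
    return updates_per_node * nodes * (nodes - 1)


for size in (2, 4, 8):
    print(size, "servers:", replication_messages(size), "messages per round")
```

Under these assumptions, doubling the cluster from 4 to 8 servers takes one round of updates from 12 messages to 56.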

Steven Erat
A: 

There is a ColdFusion implementation of the Java memcached client that would allow you to use memcached to do the caching easily and directly from ColdFusion.

http://memcached.riaforge.org/

I would cast a very strong vote in favor of memcached. Even if you had to run it on the same cluster of web servers, you would get the benefit of a single cache location for data elements, plus the redundancy of the cluster.

np0x
+1  A: 

If you are running ColdFusion 9, you can use the new built-in caching, which implements ehcache under the hood. In a clustered environment, it's very easy to set up a replicated cluster cache that uses RMI (literally just a few lines of XML). See my series on caching in ColdFusion 9 here:

http://www.brooks-bilson.com/blogs/rob/index.cfm/Caching

The post on setting up a clustered cache isn't done yet, but if you contact me directly, I can provide the config you'll need for your ehcache.xml file along with more specific instructions.

Rob Brooks-Bilson