views:

1721

answers:

9

What is the best strategy to refactor a Singleton object to a cluster environment?

We use a Singleton to cache some custom information from the database. It's mostly read-only but gets refreshed when a particular event occurs.

Now our application needs to be deployed in a clustered environment. By definition, each JVM will have its own Singleton instance. So the caches may get out of sync between the JVMs when a refresh event occurs on a single node and only that node's cache is refreshed.

What is the best way to keep the caches in sync?

Thanks.

Edit: The cache is mainly used to provide an autocomplete list to the UI (for performance reasons), and we use WebSphere. So any WebSphere-related tips are welcome.

+8  A: 

Replace your singleton cache with a distributed cache.

One such cache could be JBoss Infinispan but I'm sure that other distributed cache and grid technologies exist, including commercial ones which are probably more mature at this point.

For singleton objects in general, I'm not sure. I think I'd try to not have singletons in the first place.

Christian Vest Hansen
I've found that the simplest one (subjectively) to implement seems to be 'ehcache'.
Ryan Fernandes
+1  A: 

If possible, use your app server's support for this (some have it, some don't). For example, we use JBoss's support for an "HA Singleton", which is a service that only runs on the cluster master node. It's not perfect (you have to handle the case where occasionally it brain farts), but it's good enough.

Failing that, you may be able to engineer something using JGroups, which provides cluster node auto-discovery and negotiation, but it's non-trivial.

As a last resort, you can use database locking to manage cluster singletons, but that's seriously fragile. Not recommended.

As an alternative to a cluster singleton, you could use a distributed cache instead. I recommend JBossCache (which doesn't need JBoss app server to run) or EhCache (which now provides a distribution mechanism). You'll have to reengineer your cache to work in a distributed way (it won't magically just work), but it's probably going to be a better solution than a cluster singleton.

skaffman
+1  A: 

I'm with Mr. Vest Hansen on this one: move as far away from singletons as you possibly can. After being plagued with the nightmare that is SAAJ and JAXP and getting compatible versions working on JBoss, I'm done with singletons and factories. A SOAP message shouldn't need a factory to instantiate it.

Okay, rant over, what about memcache or something similar? What sort of affinity do you need for your cache? Is it bad if it's EVER out of date, or is there some flexibility in how out of date the data can get?

Chris Kaminski
We use it for an autocomplete list, so the users will not see the changes. Thx for your feedback.
lud0h
+2  A: 

The simplest approaches are:

  1. Add an expiry timer to your singleton cache so that every so often the cache gets purged and subsequent calls fetch the updated data from the source (e.g. a database).

  2. Implement a notification mechanism for the cache using something like a JMS topic/tibRV. Get each cache instance to subscribe and react to any change messages broadcast on this topic.
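
A minimal sketch of option 1, assuming the cached value can be rebuilt by a single loader call. The names `ExpiringCache`, `loader`, `ttlMillis` and `clock` are illustrative and not from any library; the clock is passed in only so that expiry can be exercised without sleeping.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Holds one value from a slow source and reloads it once it is older than ttlMillis. */
class ExpiringCache<T> {
    private final Supplier<T> loader;   // hypothetical: e.g. a DAO call that reads the database
    private final long ttlMillis;       // how stale the value is allowed to get
    private final LongSupplier clock;   // e.g. System::currentTimeMillis in production
    private T value;
    private long loadedAt;
    private boolean loaded;

    ExpiringCache(Supplier<T> loader, long ttlMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the cached value, reloading it first if it was never loaded or has expired. */
    synchronized T get() {
        long now = clock.getAsLong();
        if (!loaded || now - loadedAt >= ttlMillis) {
            value = loader.get();
            loadedAt = now;
            loaded = true;
        }
        return value;
    }
}
```

Each JVM keeps its own instance, so nodes can disagree for up to one expiry period; that is the trade-off this option accepts.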

pjp
Can you elaborate on 2? You mean JMS pub/subscribe model?
lud0h
Yes solution 2 is essentially a way of using a pub/sub mechanism for broadcasting changes to the individual cache instances. You'd need to create a JMS topic running on the application server that is subscribed to by each of the caches. When that data changes a message would need to be published to the topic. Each subscriber would then receive this message and update the local caches accordingly.
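
The shape of that subscriber can be sketched without a messaging provider. In the sketch below, `InvalidationTopic` is an in-process stand-in for the JMS topic and `LocalCache` plays the role of the per-JVM singleton; both names are made up for illustration. With real JMS, the body of `onMessage` would live in a `javax.jms.MessageListener` registered on a topic subscriber, and the message would travel between JVMs rather than within one.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** In-process stand-in for a topic: every subscriber sees every published message. */
class InvalidationTopic {
    private final List<Consumer<String>> subscribers = new CopyOnWriteArrayList<>();

    void subscribe(Consumer<String> subscriber) {
        subscribers.add(subscriber);
    }

    void publish(String cacheName) {
        for (Consumer<String> subscriber : subscribers) {
            subscriber.accept(cacheName);
        }
    }
}

/** One instance per JVM; drops its copy when told that the named cache has changed. */
class LocalCache {
    private final String name;
    private final Supplier<List<String>> loader; // hypothetical database read
    private List<String> entries;                // null means "reload on next read"

    LocalCache(String name, Supplier<List<String>> loader, InvalidationTopic topic) {
        this.name = name;
        this.loader = loader;
        topic.subscribe(this::onMessage);
    }

    /** Called for every change message; ignores messages about other caches. */
    synchronized void onMessage(String changedCache) {
        if (name.equals(changedCache)) {
            entries = null;
        }
    }

    /** Serves the local copy, reloading it from the source after an invalidation. */
    synchronized List<String> get() {
        if (entries == null) {
            entries = loader.get();
        }
        return entries;
    }
}
```

Whichever node handles the change event publishes the cache name after committing the update; every node, including the publisher, then reloads on its next read.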
pjp
If your data doesn't change very often then I'd go for option 1. I've worked on several systems using this approach for refreshing reference data. I believe that we used to refresh the caches around every 30 minutes. The refresh period you choose will obviously depend on how your reference data is being used.
pjp
2 sounds interesting, but how do you handle the case of having potentially different singleton states for some period of time between instances? i.e. It takes time to inform the subscribed singleton instances, during which time they could potentially have different state information (think cache).
Shane
+2  A: 

Or something like memcached

http://www.danga.com/memcached/

What is memcached? memcached is a high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load.

Danga Interactive developed memcached to enhance the speed of LiveJournal.com, a site which was already doing 20 million+ dynamic page views per day for 1 million users with a bunch of webservers and a bunch of database servers. memcached dropped the database load to almost nothing, yielding faster page load times for users, better resource utilization, and faster access to the databases on a memcache miss.

monkey_p
+1  A: 

There are several ways to handle this, depending on 1) how out of date the data can be, and 2) whether every instance needs to have the same values all of the time.

If you just need data that is reasonably up to date, but every JVM doesn't need to have matching data, you can just have every JVM refresh its data on the same schedule (e.g., every 30 seconds).

If the refresh needs to happen at about the same time, you can have one JVM send out a message to the rest of them saying "it's time to refresh now".

If every JVM always needs the same information, you need to do a sync, where the master says "refresh now", all of the caches block any new queries, refresh, and tell the master that they are done. When the master gets an answer back from every member of the cluster, it sends another message that says to proceed.
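
The last variant is a two-phase handshake. Here is a rough single-process sketch of it, in which plain method calls stand in for the cluster messages; `SyncedNode` and `Master` are invented names, and a real cluster would also need timeouts and handling for nodes that never answer.

```java
import java.util.ArrayList;
import java.util.List;

/** A cache node that makes readers wait while a cluster-wide refresh is in progress. */
class SyncedNode {
    private List<String> data = new ArrayList<>();
    private boolean blocked;

    /** Phase 1: stop serving, take the new data, and return true as the "done" answer. */
    synchronized boolean refresh(List<String> fresh) {
        blocked = true;
        data = new ArrayList<>(fresh);
        return true;
    }

    /** Phase 2: the master has heard from every node, so reads may resume. */
    synchronized void proceed() {
        blocked = false;
        notifyAll();
    }

    /** Waits until no refresh is in progress, then returns the current data. */
    synchronized List<String> get() {
        while (blocked) {
            try {
                wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for refresh", e);
            }
        }
        return data;
    }

    synchronized boolean isBlocked() {
        return blocked;
    }
}

/** Drives the handshake: "refresh now" to everyone, collect answers, then "proceed". */
class Master {
    private final List<SyncedNode> nodes;

    Master(List<SyncedNode> nodes) {
        this.nodes = nodes;
    }

    void refreshAll(List<String> fresh) {
        int answers = 0;
        for (SyncedNode node : nodes) {
            if (node.refresh(fresh)) {
                answers++;
            }
        }
        if (answers != nodes.size()) {
            throw new IllegalStateException("a node did not acknowledge the refresh");
        }
        for (SyncedNode node : nodes) {
            node.proceed();
        }
    }
}
```

Because a refreshed node stays blocked until the master says to proceed, no node serves the new data while another is still serving the old data. The price is that reads stall for the duration of the slowest node's refresh.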

KeithB
Every instance needs some data, otherwise the users will not see new changes. Can you elaborate a little more on keeping the JVMs in sync? What kinds of sub/notify mechanisms are available? Thx.
lud0h
^Every instance needs some data^ -> Every instance needs *same* data
lud0h
A: 

There are products that provide a distributed in-memory cache (such as memcache) that can help in this situation.

A better solution, if possible, may be to have the singletons not really be single: let the application tolerate separate instances (say, instances that all recognize when they need to be refreshed) without requiring them to be in sync across JVMs, since that requirement can turn your cache into a bottleneck.

Yishai
Yeah, the tricky part is "that all recognize when they need to be refreshed"... JMS needs a messaging provider, so it looks like RMI may be the only option. Any other ideas (other than JGroups/Terracotta and so on), i.e. without external dependencies?
lud0h
+2  A: 

You could use the DistributedMap that is built into WAS.

-Rick

Rick
Thanks for the link. Looks like it can be much simpler to set up and use than JMS.
lud0h
+1  A: 

I'm facing a similar situation, but I'm using Oracle's WebLogic and Coherence.

I'm working on a web application that uses a hashmap of cached data read from the database (text to show on web form labels). To accomplish this, the developers used a singleton instance where they stored all this information. This worked well in a single-server environment, but now we want to move to a clustered solution and I'm facing this issue with the singleton instance.

From what I've read so far, this is the best solution to accomplish what I want. I hope this helps you with your problem, too.

XpiritO