Is there a best practice for scalable HTTP session management?

Problem space:

  • Shopping-cart style use case: the user shops around the site and eventually checks out; the session must be preserved throughout.
  • Multiple data centers
  • Multiple web servers in each data center
  • Java, Linux

I know there are tons of ways of doing this, and I can always come up with my own specific solution, but I was wondering whether Stack Overflow's wisdom of the crowd can help me focus on best practices.

In general there seem to be a few approaches:

  • Don't keep sessions; always run stateless, religiously [doesn't work for me...]
  • Use J2EE, EJB and the rest of that gang
  • Use a database to store sessions. I suppose there are tools that make this easier, so I don't have to craft it all by myself
  • Use memcached to store sessions (or some other kind of intermediate, semi-persistent storage)
  • Use a key-value DB, which is "more persistent" than memcached
  • Use "client-side sessions", meaning all session info lives in hidden form fields and is passed back and forth between client and server. Nothing is stored on the server.

Any suggestions? Thanks

+2  A: 

You seem to have left vanilla replicated HTTP sessions off your list. Any servlet container worth its salt supports replication of sessions across the cluster. As long as the items you put into the session aren't huge and are serializable, it's very easy to make it work.

http://tomcat.apache.org/tomcat-6.0-doc/cluster-howto.html
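To make the "serializable" requirement concrete, here is a minimal sketch of a cart object that could live in a replicated session. The `ShoppingCart` and `ReplicationCheck` names are hypothetical, and the round trip only approximates what a container does when it copies an attribute to another node; it is a quick way to check that an attribute survives serialization, not a description of Tomcat's internals.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical session attribute; class and field names are illustrative only.
class ShoppingCart implements Serializable {
    private static final long serialVersionUID = 1L;

    // SKU -> quantity. LinkedHashMap is itself Serializable.
    private final Map<String, Integer> items = new LinkedHashMap<>();

    void add(String sku, int quantity) {
        items.merge(sku, quantity, Integer::sum);
    }

    int quantityOf(String sku) {
        return items.getOrDefault(sku, 0);
    }
}

class ReplicationCheck {
    // Serializes and then deserializes a value, roughly what happens when
    // a session attribute is shipped to another node in the cluster.
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            // A non-serializable field would surface here as NotSerializableException.
            throw new IllegalStateException("Attribute is not safely serializable", e);
        }
    }
}
```

In servlet code the cart would be stored with something like `session.setAttribute("cart", cart)`. For Tomcat, the web application also has to be marked `<distributable/>` in `web.xml` before its sessions are replicated.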

edit: It seems, however, that Tomcat session replication doesn't scale well to large clusters. For those, I would suggest using JBoss+Tomcat, which offers "buddy replication":

http://www.jboss.org/community/wiki/BuddyReplicationandSessionData

skaffman
Tomcat 6.0, right?
Ran
Inasmuch as it's the current version, yes. JBoss ships with an embedded version of Tomcat.
skaffman
+1  A: 

I would go with a standard distributed cache solution. It could be the one your application server provides, it could be memcached, it could be Terracotta. It probably doesn't matter too much which one you choose, as long as you are using something sufficiently popular (so you know most of the bugs have already been hunted down).
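Since the choice of cache is described as largely interchangeable, one way to keep it that way is to hide it behind a small interface. The sketch below is only an illustration: `SessionStore` and `InMemorySessionStore` are hypothetical names, session data is simplified to a string, and the in-memory map merely stands in for a real client for memcached, Terracotta or an application server cache.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical abstraction; not part of any servlet or cache API.
interface SessionStore {
    void put(String sessionId, String data, long ttlMillis);
    String get(String sessionId); // null if missing or expired
    void remove(String sessionId);
}

// In-memory stand-in for a real distributed cache, for illustration only.
class InMemorySessionStore implements SessionStore {
    private static final class Entry {
        final String data;
        final long expiresAt;

        Entry(String data, long expiresAt) {
            this.data = data;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();

    @Override
    public void put(String sessionId, String data, long ttlMillis) {
        entries.put(sessionId, new Entry(data, System.currentTimeMillis() + ttlMillis));
    }

    @Override
    public String get(String sessionId) {
        Entry entry = entries.get(sessionId);
        if (entry == null) {
            return null;
        }
        if (System.currentTimeMillis() >= entry.expiresAt) {
            // Drop the expired entry, but only if it has not been replaced meanwhile.
            entries.remove(sessionId, entry);
            return null;
        }
        return entry.data;
    }

    @Override
    public void remove(String sessionId) {
        entries.remove(sessionId);
    }
}
```

With this shape, swapping one cache for another means writing a new `SessionStore` implementation rather than touching the code that reads and writes the cart.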

As for your other ideas:

  • Don't keep sessions - as you said, not possible
  • Client-side sessions - too insecure. Suppose someone tampers with the cookie to put discounted prices in the shopping cart
  • Use a database - databases are usually the hardest bottleneck to solve, so don't put any more there than you absolutely have to.

Those are my 2 cents :)

Regarding multiple data centers - you will want the session to have some affinity to the data center it started in. I don't think there are any distributed cache solutions that work across different data centers.
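One possible way to get that affinity, sketched here under assumptions of my own, is to encode the originating data center in the session ID so that any front end can tell where the session lives. The `DataCenterAffinity` class and the `<random>.<dataCenter>` ID format are hypothetical, and data center names are assumed not to contain a dot.

```java
import java.util.UUID;

// Hypothetical helper; the "<random>.<dataCenter>" ID format is an assumption of this sketch.
class DataCenterAffinity {

    // Creates a session ID tagged with the data center that created it.
    static String newSessionId(String dataCenter) {
        return UUID.randomUUID().toString() + "." + dataCenter;
    }

    // Returns the data center encoded in the ID, or null if there is none.
    static String homeDataCenter(String sessionId) {
        int dot = sessionId.lastIndexOf('.');
        if (dot < 0 || dot == sessionId.length() - 1) {
            return null;
        }
        return sessionId.substring(dot + 1);
    }

    // True when the request can be served locally; false when it should be
    // routed to the session's home data center.
    static boolean isLocal(String sessionId, String localDataCenter) {
        return localDataCenter.equals(homeDataCenter(sessionId));
    }
}
```

A load balancer or servlet filter could call `isLocal` and forward or redirect the request when it returns false. How that routing is done, and what happens when the home data center is down, is left open here.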

Gregory Mostizky
I'm leaning towards this solution, I think. Just wanted to validate I'm heading in the right direction ;)
Ran