Hi. How does Digg, or any other high-traffic website, store user sessions? What do they use for storing them: the file system, a DB (which one?), memcache, or a combination?

Let's imagine a simple situation. A logged-in user has checked "Remember me" during login, and we've set a session cookie that expires in 1 year. Say we keep the session in memcache, but (in my version) we also keep a record of it in the DB. Only sessions of users who checked "Remember me" are stored in the DB. Is this the right way to store sessions? I mean for high-traffic websites, of course (with 2 or more application servers, 2 or more databases, memcache servers, etc.). For small websites, storing sessions the default way (in the file system) is OK.

I've tried searching Google, but failed to find any information about this. I've read some solutions in the book "Advanced PHP Programming", but the main emphasis there was on customizing the session storage handler.

Really hope to hear good ideas or links!

Thank you.

+3  A: 

They are most certainly using memcached or equivalents.

Alix Axel
Yeah, I understand, but what about long-term session storage? What about the situation when the user has checked the "Remember me" checkbox? It means that I should store this session not only in memcache (I can't keep user session data in memcache for weeks)... or maybe they are using another "remember me" mechanism? Thank you!
Kirzilla
You can store sessions in a database, or anywhere else, and use memcached on top of that. No one forbids you from doing it. What happens is that, let's say, when you "come back" you will be "remembered" by the database call (or the flat-file or Cassandra call): the cache will miss on that first request, but from the next request on, the cache will do the job. If you are fully "forgotten", both lookups will come back empty.
dimitris mistriotis
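To make the "cache on top of a database" idea above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: `TTLCache` is a tiny in-process stand-in for memcached, SQLite stands in for the real database, and the names `SessionStore`, `save`, `load` and the table layout are all hypothetical, not what Digg or anyone else actually runs.

```python
import sqlite3
import time


class TTLCache:
    """In-process stand-in for memcached (assumption: real code would use a memcached client)."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.time() >= expires_at:
            # Entry outlived its TTL: drop it and report a miss.
            del self._items[key]
            return None
        return value

    def set(self, key, value, ttl):
        self._items[key] = (value, time.time() + ttl)


class SessionStore:
    """Hypothetical read-through session store: cache first, database as fallback."""

    def __init__(self, cache, db, cache_ttl=1800):
        self.cache = cache
        self.db = db
        self.cache_ttl = cache_ttl
        db.execute(
            "CREATE TABLE IF NOT EXISTS sessions "
            "(sid TEXT PRIMARY KEY, data TEXT, expires_at REAL)"
        )

    def save(self, sid, data, remember_me=False, lifetime=365 * 24 * 3600):
        # Every session goes to the cache; only "Remember me" sessions are persisted.
        self.cache.set(sid, data, self.cache_ttl)
        if remember_me:
            self.db.execute(
                "INSERT OR REPLACE INTO sessions VALUES (?, ?, ?)",
                (sid, data, time.time() + lifetime),
            )
            self.db.commit()

    def load(self, sid):
        """Return (data, source), where source is 'cache', 'db' or 'miss'."""
        data = self.cache.get(sid)
        if data is not None:
            return data, "cache"
        row = self.db.execute(
            "SELECT data FROM sessions WHERE sid = ? AND expires_at > ?",
            (sid, time.time()),
        ).fetchone()
        if row is None:
            return None, "miss"
        # Cache miss but database hit: repopulate the cache for the next request.
        self.cache.set(sid, row[0], self.cache_ttl)
        return row[0], "db"
```

With this sketch, a returning "Remember me" user whose entry has fallen out of the cache is served once from the database and from the cache afterwards, while a session that was never persisted is simply gone once the cache drops it.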
This is exactly what I wanted to hear. Thank you!
Kirzilla
+4  A: 

In addition to Alix's answer, you may be interested in checking out this article:

A short excerpt:

What prompted the Memcached as sessions store:

Shortly after the rollout of Digg v3, the non-redundant MySQL session store hardware crashed. This led to a Digg outage. We had always planned that in such a case we would just roll a (trivial) change to put sessions into Memcached rather than MySQL to see how it fared.


So, before you were hitting the db every time for sessions?

Yes.

MySQL was plenty capable of keeping up with the inserts and selects done to deal with sessions. Our problem was actually with clearing out old sessions. The script to delete old sessions, despite being fairly sophisticated in its attempts to not overload the sessions database, still affected it.

We surmise that Memcached will remove expired sessions with less overhead than MySQL.


We used InnoDB for sessions [before memcached]. It wasn't table- or row-level locking. It was OS-level contention. Using Memcached in front of MySQL would've reduced the load and allowed the admin script to do its work, but that highlights the question: why even have MySQL behind memcached at all? We don't need or even want non-volatile sessions. (Important note to reader: you may need or want non-volatile sessions).

"Why even have MySQL behind memcached at all?"... "We don't need or even want non-volatile sessions".
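To illustrate the cleanup problem from the excerpt, here is a rough Python sketch contrasting the two ways of expiring sessions. It is not Digg's script: `purge_expired` is a hypothetical batched delete against an assumed `sessions (sid, data, expires_at)` table, with SQLite standing in for MySQL, and `LazyExpiringCache` is a toy model of a cache that checks an entry's expiry only when the entry is read, which is roughly how a TTL cache such as memcached avoids a separate purge job. The `now` parameters are passed in explicitly just to keep the example deterministic.

```python
import sqlite3


def purge_expired(db, now, batch_size=1000):
    """Delete expired sessions in small batches; return the number of rows removed.

    Assumes a table: sessions (sid TEXT PRIMARY KEY, data TEXT, expires_at REAL).
    """
    total = 0
    while True:
        cur = db.execute(
            "DELETE FROM sessions WHERE sid IN ("
            "SELECT sid FROM sessions WHERE expires_at <= ? LIMIT ?)",
            (now, batch_size),
        )
        db.commit()
        if cur.rowcount == 0:
            return total
        total += cur.rowcount


class LazyExpiringCache:
    """Toy TTL cache: an expired entry is dropped when someone tries to read it."""

    def __init__(self):
        self._items = {}

    def set(self, key, value, ttl, now):
        self._items[key] = (value, now + ttl)

    def get(self, key, now):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if now >= expires_at:
            # No background job needed: the stale entry is removed on access.
            del self._items[key]
            return None
        return value
```

The batched delete still has to scan and write to the sessions table on every run, which is the kind of load the excerpt complains about; the TTL cache spreads that work across ordinary reads.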

Daniel Vassallo
Yeah, thank you for this article... Right now I'm taking a look at the slideshow http://www.slideshare.net/folke/netlog-what-we-learned-about-scalability-high-availability-430211 (about the Netlog architecture). There are also some slides about it...
Kirzilla
+1 for link to Digg story
namespaceform