views:

986

answers:

3

When it comes to writing custom MySQL database-driven PHP session management for a VERY dynamic website, what is the best (fastest read/write access) structure for your session table?

Bad Example (Not Optimized):

CREATE TABLE `session` (
    `session_id` VARCHAR(32) NOT NULL,
    `session_data` TEXT NOT NULL,
    `t_created` DATETIME NOT NULL,
    `t_updated` DATETIME NOT NULL,
    PRIMARY KEY  (`session_id`)
) ENGINE=INNODB DEFAULT CHARSET=utf8;

I assume that using the Memory Engine would be better/faster, but I'm not sure. I can't think of a good way to explain everything in English, so I've made a list of requirements/details that I think are important:

Details:

  • Category: Optimization
  • Sub Category: MySQL Query Performance
  • Goal: Fastest Random Access Table Schema and Single Row Query
  • Common Uses: Custom Session Management, Temporary Storage
  • Operating System: *nix, more specifically: Centos 5+ (on x86_64)
  • Database: MySQL Version: 5+ (Community Version)

Outcomes:

  • SQL Query: Create Table
  • SQL Query: Select Single Row by Random Key (e.g. PHP session id)
  • SQL Query: Insert Single Row with Random Key (e.g. PHP session id)
  • SQL Query: Update Single Row by Random Key (e.g. session id)
  • SQL Query: Delete Multiple Rows by Timestamp (garbage collection, e.g. expired sessions)

Expected Row Lifespan (e.g. session durations):

  • 30%: 0s-30s
  • 20%: 30s-5m
  • 30%: 5m-1h
  • 20%: 1h-8h

Expected Row Count (e.g. active sessions):

  • Low: 128
  • Medium: 1024
  • High: 100000

If anyone can think of a better way to phrase all this, please feel free to edit.

A: 

Your intuition seems to be about right. I'd recommend creating the table as follows:

CREATE TABLE session (
  id CHAR(32) NOT NULL,
  data BLOB NOT NULL,
  t_created TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  t_updated TIMESTAMP,
  PRIMARY KEY (session_id),
  INDEX t_created(t_created),
  INDEX t_updated(t_updated)
)
ENGINE = MEMORY
CHARACTER SET utf8;

Notes:

  • id - CHAR is less expensive when you know the length of the contents
  • data - BLOB (Binary Large OBject) is more applicable here, as you're likely storing something other than TEXT.
  • t_created and t_updated - TIMESTAMP - calculations are faster, although you're limited to the time range 1901-2038 but that should be fine for this application.
  • INDEXes on t_created and t_updated are memory expensive and not entirely necessary, but they can really help performance when querying by these columns.
  • MEMORY tables, while incredibly fast, have their limitations. If your mysqld relaunches, all data is lost.

Side note: I'm not sure how you plan to garbage-collect your sessions, but if you're expecting 50% of your sessions to be under 5 minutes, how is an end-of-session defined? Must a user/client explicitly end their session (via logging out)? If you implicitly end sessions that quickly, your users may have a very rough time with your website.

yaauie
One note: `MEMORY` tables can't contain `BLOB` (or `TEXT`) columns
Otterfan
As others have pointed out, memory tables won't work. I recommend InnoDB, but beware:utf8 uses 3 bytes per character even if they are not unicode when storing an index in memory. Therefore, you should specify the character set for id as latin1.
Jay Paroline
+2  A: 

Have you considered use memcached or APC for the session data? These would almost certainly be much faster than any RDBMS solution.

Another suggestion if you are set on using MySQL: instead of the MEMORY storage engine, just enable plenty of memory to the various cache buffers. That way the data will be durable, transparently backed by disk storage, but available quickly when it's in use.

Bill Karwin
A: 

ENGINE=MEMORY doesn't support BLOB/TEXT columns.

Peter