In a project I'm working on, I have an object that is a sort of Collection with a database back end. The exact results this Collection returns depend on its configuration, which in turn depends on a number of user inputs. I would like to have an element on the page that contains the records in the Collection and can be updated dynamically through an AJAX request. My idea is to serialize() this object, store it in memcache, and include the memcache key as a parameter in my AJAX calls. I would then retrieve the string from memcache, unserialize() it, and retrieve the next set of records from the collection.

Is this a good way to achieve the kind of object persistence I need to make this work? I considered storing just the configuration, but I feel like this is a better "set it and forget it" solution in the face of future changes to the user controls. My main concern is that there might be some pitfall with serialize() that I'm not aware of that would make this solution fragile, unreliable, or slow. Do I need to be concerned in any of those regards?
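
For readers who want to see the round trip concretely, here is a minimal sketch of the approach described above. Everything in it is an assumption for illustration: `RecordCollection` is a hypothetical stand-in for the real Collection (it fabricates rows instead of querying a database), and a plain array stands in for memcache so the code runs without a server.

```php
<?php
// Hypothetical stand-in for the Collection; the real one would query a database.
class RecordCollection
{
    private $config;
    private $offset = 0;

    public function __construct(array $config)
    {
        $this->config = $config;
    }

    // Return the next page of (fake) records and advance the cursor.
    public function nextPage($size)
    {
        $page = array();
        for ($i = 0; $i < $size; $i++) {
            $page[] = $this->config['prefix'] . ($this->offset + $i);
        }
        $this->offset += $size;
        return $page;
    }
}

// Array-backed stand-in for memcache (assumption: a real client would replace this).
$cache = array();

// Initial page load: build the collection, serve the first page, store the object.
$collection = new RecordCollection(array('prefix' => 'row-'));
$first = $collection->nextPage(3);
$key = 'collection_' . md5(uniqid('', true));
$cache[$key] = serialize($collection);

// Later AJAX request: the key arrives as a request parameter.
$restored = unserialize($cache[$key]);
$second = $restored->nextPage(3);
$cache[$key] = serialize($restored); // write back so the advanced cursor persists

echo implode(', ', $second), "\n"; // row-3, row-4, row-5
```

Note that the object has to be written back after each page, otherwise the next request would unserialize the old cursor position and return the same records again.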

+1  A: 

serialize/unserialize works well enough with scalars, but can be more problematic when working with objects. I've had a couple of issues that highlight potential pitfalls.

If any of your object properties are resources, these can't be serialized. You'd need to use the magic __sleep and __wakeup methods to cleanly close the resource attribute and restore it again on unserialize.
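
As a rough illustration of the resource case, the sketch below uses a hypothetical `LogReader` class that holds a file handle. `__sleep` closes the handle and names only the serializable properties; `__wakeup` reopens it. The class name, the temp file, and the choice to reopen from a stored path are all assumptions made for the example.

```php
<?php
// Hypothetical class: holds a file handle (a resource), which cannot be serialized.
class LogReader
{
    private $path;
    private $handle;

    public function __construct($path)
    {
        $this->path = $path;
        $this->handle = fopen($path, 'r');
    }

    // Called by serialize(): close the handle and list only the
    // properties that are safe to store.
    public function __sleep()
    {
        if (is_resource($this->handle)) {
            fclose($this->handle);
        }
        $this->handle = null;
        return array('path');
    }

    // Called by unserialize(): reopen the resource from the stored path.
    public function __wakeup()
    {
        $this->handle = fopen($this->path, 'r');
    }

    public function firstLine()
    {
        rewind($this->handle);
        return rtrim(fgets($this->handle));
    }
}

$path = tempnam(sys_get_temp_dir(), 'log');
file_put_contents($path, "hello\nworld\n");

$reader = new LogReader($path);
$copy = unserialize(serialize($reader)); // $reader's own handle is now closed
$line = $copy->firstLine();
echo $line, "\n"; // hello

unset($copy);  // release the reopened handle before deleting the file
unlink($path);
```

With this design the original object is unusable after `serialize()` because its handle has been closed, and the restored copy starts from a fresh handle rather than the old read position. Whether that is acceptable depends on the resource.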

If your collection contains objects with cyclic references (e.g. a cellCollection object is an array of cell objects, each of which has an attribute pointing back to the parent cellCollection object) then these won't be cleanly restored on unserialize... each cell's parent object will actually be a clone of the original parent. Again, __sleep and __wakeup need to be used to restore the true relationships (not a trivial task).

Mark Baker
Thanks for the excellent advice. The objects have no circular references, and the current implementation does not use resources that need to be closed.
Jeremy DeGroot
@Mark I can't reproduce that anomalous behavior with cyclic references. See http://codepad.viper-7.com/LtLpkt
Artefacto
I'd add that you must make sure the definitions are available when you unserialize the graph.
Artefacto
@Artefacto: I read about that concern in the serialize() documentation on php.net. I knew it would have been technically feasible to store the object in $_SESSION, but the concern about definition availability was one of the factors that led to me choosing memcache.
Jeremy DeGroot
@jeremy - you realise that PHP serializes/unserializes whether you store an object in $_SESSION, APC or memcache... so the same potential issues with serialize/unserialize are present regardless of the storage location.
Mark Baker
@Artefacto - Probably my bad explanation re the cyclic references. I was serializing the individual children when I noticed the behaviour http://codepad.viper-7.com/CwiVCu
Mark Baker
@Mark OK so it's not really `serialize` not handling cyclic references, it's more that if you unserialize two different graphs, the objects won't be shared between the two.
Artefacto
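
The distinction drawn in the comments above can be sketched as follows. The `CellCollection` and `Cell` classes are hypothetical, modelled on the example in the answer: serializing the parent as one graph keeps the back-references intact, while serializing each child separately produces independent copies of the parent.

```php
<?php
// Hypothetical classes mirroring the cellCollection / cell example.
class CellCollection
{
    public $cells = array();
}

class Cell
{
    public $parent;

    public function __construct(CellCollection $parent)
    {
        $this->parent = $parent;
    }
}

$collection = new CellCollection();
$collection->cells[] = new Cell($collection);
$collection->cells[] = new Cell($collection);

// One graph: serialize the parent, and the cyclic references survive.
$whole = unserialize(serialize($collection));
var_dump($whole->cells[0]->parent === $whole);                   // bool(true)
var_dump($whole->cells[0]->parent === $whole->cells[1]->parent); // bool(true)

// Two graphs: serialize each child on its own, and each one
// comes back with its own separate copy of the parent.
$a = unserialize(serialize($collection->cells[0]));
$b = unserialize(serialize($collection->cells[1]));
var_dump($a->parent === $b->parent);                             // bool(false)
```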
@Mark Yes, I am aware of that but I believe (and please correct me if I'm wrong about any of this) that PHP will try to `unserialize()` any objects I store in `$_SESSION` on every page, regardless of whether it's needed. Memcache on the other hand will only `unserialize()` the object if I ask for it, so I only need to ensure the class definition is always available when I retrieve the object.
Jeremy DeGroot
+1  A: 

If the serialized objects are more than just raw query results pulled from the database, and have had a lot of processing applied to them, then what you are proposing is actually a very good optimization.

Two references in particular: http://code.google.com/p/memcached/wiki/FAQ#Cache_things_other_than_SQL_data! http://www.mysqlperformanceblog.com/2010/05/19/beyond-great-cache-hit-ratio/

Both promote using memcached as more than just a "row cache".

Morgan Tocker
We're talking about potentially thousands of results, being displayed maybe 10 at a time. Some processing would be done on them after retrieval.
Jeremy DeGroot
Sounds good! The more processing you can do before writing to memcached, and the fewer objects you retrieve to generate a page, the better performance will be.
Morgan Tocker