tags:

views: 1021

answers: 4

Memcached has length limits on keys (250 bytes?) and values (roughly 1 MB), as well as some (to my knowledge) not very well defined character restrictions on keys. What, in your opinion, is the best way to work around those? I use the Perl API Cache::Memcached.

What I currently do is store a special sentinel string ("parts:<number>") as the main key's value when the original value is too big; in that case I store <number> parts under keys named 1+<main key>, 2+<main key>, and so on. This seems "OK" (if messy) in some cases and not so good in others, and it has the intrinsic problem that some of the parts might be missing at any given time (so space is wasted keeping the others and time is wasted reading them).

As for the key limitations, one could probably hash the key and store the full key in the value (to work around collisions), but I haven't needed to do this yet.
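That idea can be sketched as follows (again with a dict standing in for the client, and all helper names hypothetical): hash over-long keys, and keep the full key next to the value so a collision shows up as a cache miss rather than wrong data.

```python
import hashlib

MAX_KEY = 250  # memcached's documented key length limit

def safe_key(key):
    # Short keys pass through untouched; long ones are hashed.
    if len(key) <= MAX_KEY:
        return key
    return "h:" + hashlib.sha1(key.encode()).hexdigest()

def set_checked(cache, key, value):
    # Store the full key alongside the value so a hash
    # collision can be detected on read.
    cache[safe_key(key)] = (key, value)

def get_checked(cache, key):
    entry = cache.get(safe_key(key))
    if entry is None:
        return None
    stored_key, value = entry
    return value if stored_key == key else None  # collision: treat as a miss
```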

Has anyone come up with a more elegant way, or even a Perl API that handles arbitrary data sizes (and key values) transparently? Has anyone hacked the memcached server to support arbitrary keys/values?

A: 

For values that were too large, instead of storing a standard value (which, when decoded, was always a dictionary) we stored a list of keys. We then read the data at each of those keys and reassembled the main value. I think we also hashed the keys when they were too long (which could happen in our dataset, though extremely rarely).

We did write all this code directly on top of the memcached client (we were using Python), so at a higher level it was all transparent.
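A rough sketch of that variant (hypothetical names, dict in place of the client): the main value is itself the list of chunk keys, rather than a count encoded in a sentinel string.

```python
import uuid

CHUNK = 8  # tiny for illustration; real code would use just under the item limit

def set_large(cache, key, value):
    # Store each chunk under its own unique key, and store the
    # list of chunk keys as the main value.
    chunk_keys = []
    for i in range(0, len(value), CHUNK):
        ck = "%s:%s" % (key, uuid.uuid4().hex)
        cache[ck] = value[i:i + CHUNK]
        chunk_keys.append(ck)
    cache[key] = chunk_keys

def get_large(cache, key):
    chunk_keys = cache.get(key)
    if chunk_keys is None:
        return None
    chunks = [cache.get(ck) for ck in chunk_keys]
    if any(c is None for c in chunks):   # a chunk was evicted
        return None
    return "".join(chunks)
```

Because the chunk keys are stored explicitly, there is no reserved sentinel value to collide with, at the cost of the main value no longer being the data itself.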

Kathy Van Stone
+1  A: 

The server does already allow you to specify whatever size you want:

-I            Override the size of each slab page. Adjusts max item size
              (default: 1mb, min: 1k, max: 128m)

However, most of the time, when people are wanting to cache larger objects, they're doing something wrong. Do you really need that much data in one cache key? Uncompressed?

If you have sufficiently large items, the benefit of low-latency access is dwarfed by the time it takes to actually transfer the data. Or you find that tossing everything into the same key means your frontend ends up doing a lot of work deserializing the whole thing just to get at the one bit of data it wants.

It depends on your needs, and I can't tell you what's best for you without knowing more of what you're doing. If you truly do need something bigger than 1MB, that's why we added -I, though.
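For example, raising the limit at startup looks like this (the -I flag is from the help text quoted above; -m sets the memory limit in megabytes, and the size suffixes follow the same k/m convention):

```shell
# Raise the max item size from the default 1 MB to 8 MB
memcached -I 8m -m 64
```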

Dustin
A: 

$key=abs(crc32($long_key))

This way you get a unique key for queries and other long keys that may differ only beyond the 250 characters memcached sees.
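In Python, the equivalent of the snippet above would be something like the following, masking to an unsigned 32-bit value (the mask replaces the abs() call, since zlib.crc32 can return a signed result on older Pythons):

```python
import zlib

def crc_key(long_key):
    # Collapse an arbitrarily long key into a short, stable,
    # purely numeric key via CRC32.
    return str(zlib.crc32(long_key.encode()) & 0xffffffff)
```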

Nir
A: 

$key=abs(crc32($long_key))

This way you get a unique key for queries and other long keys that may differ only beyond the 250 characters memcached sees.

Whoa... careful. Good advice, but it's missing an important caveat: that can cause collisions. Sure, it's highly improbable, but it only has to happen once to cause an earth-shattering bug. You will still probably want to store the long key in memcached and always double-check for collisions at that key. The best way to deal with them would be to store a simple list of long_key/value pairs.

peabody