I have a situation where I could really benefit from a system like memcached, but with the ability to store (for each key) a sorted list of elements, and to modify the list by adding values.

For example:

something.add_to_sorted_list( 'topics_list_sorted_by_title', 1234, 'some_title')
something.add_to_sorted_list( 'topics_list_sorted_by_title', 5436, 'zzz')
something.add_to_sorted_list( 'topics_list_sorted_by_title', 5623, 'aaa')

Which I then could use like this:

something.get_list_size( 'topics_list_sorted_by_title' )
// returns 3
something.get_list_elements( 'topics_list_sorted_by_title', 1, 10 )
// returns: 5623, 1234, 5436

The required system would allow me to easily get the item count of every list, and to fetch any number of values from the list, with the assumption that the values are kept sorted by the attached value.

I hope that the description is clear. And the question is relatively simple: is there any such system?

+6  A: 

Take a look at MongoDB. It uses memory-mapped files, so it is incredibly fast and should perform at a level comparable to memcached.

MongoDB is a schema-less database that should support what you're looking for (indexing/sorting).

Alan
As I understand the docs, sorting is done at retrieval time, which is not really helpful for me - even using memory-based storage, sorting 1 million elements with text values of up to 200 characters will be rather slow. But maybe I got it wrong - I will check the docs in depth.
depesz
+4  A: 

MongoDB will fit. What's important is that it has indexes, so you can add an index on title for the topics collection and then retrieve items sorted by that index:

db.topics.ensureIndex({"title": 1})
db.topics.find().sort({"title": 1})
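
For reference, the same thing from Python is just as short with a recent pymongo (a sketch; the database name and connection details are assumptions, only the topics collection and title field come from the question):

from pymongo import MongoClient, ASCENDING

client = MongoClient('localhost', 27017)      # assumes a local mongod
topics = client.mydb.topics                   # 'mydb' is a placeholder database name

topics.create_index([('title', ASCENDING)])   # same as ensureIndex({"title": 1})
for topic in topics.find().sort('title', ASCENDING):
    print(topic['_id'], topic.get('title'))
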
Alexander Azarov
A: 

Why not just store an array in memcached? At least in Python and PHP the memcached APIs support this (I think Python uses pickle, but I don't recall for sure).

If you need permanent data storage or backup, MemcacheDB uses the same API.

A basic pseudo-Python example:

# getting stored data
stored = cache.get(storedDataName)

# initialize it if you have not stored anything previously
if stored is None:
    stored = {}

# finding stored items
try:
    alreadyHaveItem = stored[itemKey]
except KeyError:
    print('no result in cache')

# adding new items
for item in newItemsDict:
    stored[item] = newItems[item]

# saving the results back to the cache
cache.set(storedDataName, stored, TTL)

Because it is hard to keep the list sorted when we have more than one process adding items to it.
depesz
Well, you could also maintain a list of dicts or even objects, or one or several sorted lists of keys for such objects, etc., and use insertion sort - I do this kind of stuff for user rankings. Actually, I have my multiple processes drop jobs into the same queue (which is ALSO stored in memcached) and then they are processed in order, effectively making one source for insertion. Anyway, I'm sure you know your problem space better than I do; I can't quite determine the complexity from what you wrote. Good luck.
Actually, you can just do the sort on the client side and CAS it in safely regardless of concurrency. There are a few strategies to do such a thing.
Dustin
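
A rough sketch of that client-side-sort-plus-CAS idea, assuming the python-memcached client (its gets/cas pair needs cache_cas=True; other clients such as pylibmc expose something similar). The key name comes from the question; everything else is illustrative:

import bisect
import memcache

mc = memcache.Client(['127.0.0.1:11211'], cache_cas=True)

def add_to_sorted_list(key, topic_id, title, retries=5):
    mc.add(key, [])                             # create the key if it does not exist yet
    for _ in range(retries):
        items = mc.gets(key) or []              # sorted list of (title, id) tuples
        bisect.insort(items, (title, topic_id)) # client-side sorted insert
        if mc.cas(key, items):                  # succeeds only if nobody else changed the key
            return True
    return False                                # gave up after too many concurrent updates

def get_list_size(key):
    return len(mc.get(key) or [])

def get_list_elements(key, start, count):
    # 1-based start, as in the question's example
    items = mc.get(key) or []
    return [topic_id for _, topic_id in items[start - 1:start - 1 + count]]

Each list stays a single memcached item, so this only scales to lists that comfortably fit in one value (1 MB by default).
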
+1  A: 

Redis supports lists and sets. You can disable disk saving and use it like memcache instead of going for MongoDB, which would save things to disk.
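
A minimal sketch of how the question's API could map onto a Redis sorted set, using a recent redis-py client: giving every member the same score makes equal-score members sort lexicographically, so encoding entries as "title:id" sorts them by title. The key name comes from the question; the encoding and client details are assumptions:

import redis

r = redis.Redis()                          # assumes Redis on localhost:6379

def add_to_sorted_list(key, topic_id, title):
    # constant score 0; members with equal scores are ordered lexicographically
    r.zadd(key, {'%s:%d' % (title, topic_id): 0})

def get_list_size(key):
    return r.zcard(key)

def get_list_elements(key, start, count):
    # 1-based start, as in the question's example; ZRANGE is 0-based and inclusive
    members = r.zrange(key, start - 1, start + count - 2)
    return [int(m.decode().rsplit(':', 1)[1]) for m in members]

add_to_sorted_list('topics_list_sorted_by_title', 1234, 'some_title')
add_to_sorted_list('topics_list_sorted_by_title', 5436, 'zzz')
add_to_sorted_list('topics_list_sorted_by_title', 5623, 'aaa')
print(get_list_size('topics_list_sorted_by_title'))             # 3
print(get_list_elements('topics_list_sorted_by_title', 1, 10))  # [5623, 1234, 5436]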

Kalpesh Patel