I want to implement lists and queues over Cassandra, Riak, or any other eventually consistent store. Is this possible and how could I do it?
I am looking for a general purpose algorithm.
I want to implement lists and queues over Cassandra, Riak, or any other eventually consistent store. Is this possible and how could I do it?
I am looking for a general purpose algorithm.
I don't fully understand. What lists/queues? You could create a (one/multiple) document that contains each one queue/list in it/them. Do you mean about queries or the like (what sounds a bit like SQL thinking)?
A very good article about modeling and not modeling things can be found here:
how to NOT do it http://ayende.com/Blog/archive/2010/04/20/that-no-sql-thing-the-relational-modeling-anti-pattern-in.asp
how to do it http://ayende.com/Blog/archive/2010/04/21/that-no-sql-thing-modeling-documents-in-a-document-database.aspx
if i understood you wrong pls clarify :)
You could go a different route and use Lucene for storing your "list", and just add a column in Lucene for the "conversation index" or whatever it is you are doing.
See the Lucandra project for more info. Also, the Sematext blog has a good writeup on it.
Disclaimer: I am not specifically familiar with Casandra or Riak. This applies to a general "database" (not necessarily relational, distributed, etc).
I am assuming that you are able to store and access "key-value pairs" (ie. a pair of values (a, b) where a is the key to the value b).
I'll also use this notation to represent some generalized "object" (data structure, object, dictionary, ..): Person[name:"John Doe" age:49].
Assuming you have a key-value pair (key, value) and an object Object[fields:values ...], a linked-list can be implemented in the database by
It could also be a could idea to store the first value of the linked-list in some special key-value pair like (first, ...).
In either case, it is possible to implement a linked list in-database.
When extracting a value from a key-value pair, to get the next value, simply find the "next" field of the Holder or Object to traverse the list to the next value, etc.
Example linked-list algorithms
Search:
def find(first, node):
if node = first[next]:
return first[next]
else:
find(first[next], node)
Search for predecessor:
def find_pred(first, node):
if node = first[next]:
return first
else:
find_pred(first[next], node)
Insertion before a particular node:
def insert_at_front(node, inserted_node):
find_pred(node)[next] = inserted_node
inserted_node[next] = node
In this case, a queue can simply be a linked list where two specific values are automatically known (probably stored in the database):
Note: these algorithms are deliberately simplified; they are not in any particular language, do not handle exceptions, end-of-list, etc, and should not be used for blah, blah, blah.... etc.
Have a look at the Cages project. It may be helpful for your use case. http://ria101.wordpress.com/2010/05/12/locking-and-transactions-over-cassandra-using-cages/
It basically uses a ZooKeeper cluster for locking so that you can have logical consistency in your data. Your application can hold a lock when it wants to modify a list so that it is not modified by other users concurrently.
Not sure about what operations you want to support, and not familiar with Riak et. al., but here is a possible implementation for CouchDB, another eventually consistent DB.
I am assuming map/reduce operations that return one or more key/value pairs, that results are returned by key sort order, and that querying on keys and key ranges are primitive operations. (CouchDB does this, don't know about the others.)
Assuming you want to support iteration, push and pop, you can have a document with
[123, "client1"]
will sort before [123, "client2"]
. (The specifics here will probably differ by DB, but you can use this trick even if keys are just strings.)The operations are