views:

163

answers:

5

I want to implement lists and queues over Cassandra, Riak, or any other eventually consistent store. Is this possible and how could I do it?

I am looking for a general purpose algorithm.

+2  A: 

I don't fully understand. What lists/queues do you mean? You could create one or more documents, each holding a queue/list. Or do you mean queries or the like (which sounds a bit like SQL thinking)?

Two very good articles about how to model (and how not to model) things can be found here:

how to NOT do it http://ayende.com/Blog/archive/2010/04/20/that-no-sql-thing-the-relational-modeling-anti-pattern-in.asp

how to do it http://ayende.com/Blog/archive/2010/04/21/that-no-sql-thing-modeling-documents-in-a-document-database.aspx

If I understood you wrong, please clarify :)

Eleasar
But then each document becomes massive. I'm talking about billions of items in each list/queue.
Zubair
Also, one of your links doesn't work.
Zubair
ok - with billions of items that approach is not going to work
Eleasar
+1  A: 

You could go a different route and use Lucene for storing your "list", and just add a column in Lucene for the "conversation index" or whatever it is you are doing.

See the Lucandra project for more info. Also, the Sematext blog has a good writeup on it.

GalacticJello
Lucandra looks quite nice, but doesn't work over Riak. I'm looking more for a description of the algorithms. I read the sematext blog too but it doesn't discuss them.
Zubair
Riak already has a key-value type index, could you use that to implement your list? It also has a lucene-like search syntax, so your code to access your list would be similar if you used Riak or Lucandra.
GalacticJello
+1  A: 

Disclaimer: I am not specifically familiar with Cassandra or Riak. This applies to a general "database" (not necessarily relational, distributed, etc.).

I am assuming that you are able to store and access "key-value pairs" (i.e. a pair of values (a, b) where a is the key to the value b).

I'll also use this notation to represent some generalized "object" (data structure, object, dictionary, ..): Person[name:"John Doe" age:49].

Implementation of a linked list

Assuming you have a key-value pair (key, value) and an object Object[fields:values ...], a linked-list can be implemented in the database by

  1. adding a "next" field to the object by defining Object[fields:values ... next] OR
  2. creating a new "holder object" defined as (key, value) = Holder[Object[fields:values ... ] next]

It could also be a good idea to store the first value of the linked list in some special key-value pair like (first, ...).

In either case, it is possible to implement a linked list in-database.

When you have extracted a value from a key-value pair, simply follow the "next" field of the Holder or Object to reach the next value, and so on, to traverse the list.

Example linked-list algorithms

Search:

def find(first, node):
    if node == first[next]:
        return first[next]
    else:
        return find(first[next], node)

Search for predecessor:

def find_pred(first, node):
    if node == first[next]:
        return first
    else:
        return find_pred(first[next], node)

Insertion before a particular node:

def insert_at_front(first, node, inserted_node):
    find_pred(first, node)[next] = inserted_node
    inserted_node[next] = node
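
To make the storage layer concrete, here is a minimal runnable sketch (in Python), assuming only get/put access by key; a plain dict stands in for Cassandra/Riak, and the key names "first" and "next" are illustrative, not part of any particular store's API:

# A singly linked list stored as key-value pairs.
# `store` stands in for any key-value store that supports get/put by key.
store = {}

def put_node(key, value, next_key=None):
    # Each node is a small record holding the value and the key of the next node.
    store[key] = {"value": value, "next": next_key}

def push_front(key, value):
    # The new node points at the current head; the special "first" key tracks the head.
    put_node(key, value, next_key=store.get("first"))
    store["first"] = key

def iterate():
    # Walk the chain of "next" keys starting from the special "first" key.
    key = store.get("first")
    while key is not None:
        node = store[key]
        yield node["value"]
        key = node["next"]

# Usage:
push_front("c", 3)
push_front("b", 2)
push_front("a", 1)
print(list(iterate()))  # prints [1, 2, 3]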

Implementation of a queue

In this case, a queue can simply be a linked list where two specific values are always known (and probably stored in the database):

  1. the first element (or head) that could be stored with the special key-value pair (head, ...)
  2. the last element (or tail) that could be stored with the special key-value pair (tail, ...)

Note: these algorithms are deliberately simplified; they are not in any particular language, do not handle exceptions, end-of-list conditions, etc., and should not be used as-is in production.
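
In the same spirit, here is a rough sketch of the queue variant, again with a dict standing in for the store; the special keys "head" and "tail" are arbitrary names, and (like the pseudocode above) it ignores concurrent writers entirely:

# A FIFO queue as a linked list with special "head" and "tail" entries.
store = {}

def enqueue(key, value):
    # The new node becomes the tail; link the previous tail (if any) to it.
    store[key] = {"value": value, "next": None}
    tail_key = store.get("tail")
    if tail_key is not None:
        store[tail_key]["next"] = key
    else:
        store["head"] = key          # queue was empty
    store["tail"] = key

def dequeue():
    # Remove and return the value at the head, advancing the "head" pointer.
    head_key = store.get("head")
    if head_key is None:
        return None                  # queue is empty
    node = store.pop(head_key)
    store["head"] = node["next"]
    if node["next"] is None:
        store["tail"] = None         # queue is now empty
    return node["value"]

# Usage:
enqueue("k1", "a")
enqueue("k2", "b")
print(dequeue(), dequeue(), dequeue())  # prints: a b None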

nickname
Best answer so far, but it still doesn't address the issues of an "eventually consistent" store. For example, what happens if two different clients add a new node to the end of the same queue at roughly the same time? Should the missing item be recovered later, or should it be lost?
Zubair
In theory, there is no way to guarantee almost anything efficiently with an eventually consistent store, since at any point a value could be modified by an unsynchronized node. The changes could be merged if you assigned certain nodes a "priority" for editing. For example, if node A and node B edit the same queue at the same time, and A has priority over B, then A's addition to the queue is added before B's.
nickname
+1  A: 

Have a look at the Cages project. It may be helpful for your use case. http://ria101.wordpress.com/2010/05/12/locking-and-transactions-over-cassandra-using-cages/

It basically uses a ZooKeeper cluster for locking so that you can have logical consistency in your data. Your application can hold a lock when it wants to modify a list so that it is not modified by other users concurrently.
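
Cages itself is a Java library, so purely as an illustration of the same lock-then-modify pattern, here is a sketch using the Python kazoo ZooKeeper client; the lock path, identifier, and modify_list() are hypothetical placeholders:

# Serialize updates to a shared list by taking a ZooKeeper lock first.
# Assumes a ZooKeeper ensemble reachable at 127.0.0.1:2181.
from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()

lock = zk.Lock("/locks/my_list", identifier="client-1")  # hypothetical lock path

def modify_list():
    # Placeholder for the actual Cassandra/Riak reads and writes that update the list.
    pass

with lock:             # blocks until this client holds the lock
    modify_list()      # only one client modifies the list at a time

zk.stop()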

Damodharan R
+1  A: 

Not sure which operations you want to support, and I'm not familiar with Riak et al., but here is a possible implementation for CouchDB, another eventually consistent DB.

I am assuming map/reduce operations that return one or more key/value pairs, that results are returned in key sort order, and that queries on keys and key ranges are primitive operations. (CouchDB does this; I don't know about the others.)

Assuming you want to support iteration, push and pop, you can have one document per list element, with

  • document ID: Could be a UUID or something else that makes sense, so each entry gets a unique ID and does not cause conflicts during merge.
  • rank key: Gives relative ordering. Could be a timestamp or a global counter, the point is that the sort order of the key is the same as the relative ordering within the list. When the rank key is different from the document ID, conflicts do not matter and you still get the list/queue in correct order by sorting on the key. If you want a unique ordering, append a client ID or some such to the key. Example key: [123, "client1"] will sort before [123, "client2"]. (The specifics here will probably differ by DB, but you can use this trick even if keys are just strings.)
  • value: the contents of this list element

The operations are

  • list/iterate: return all elements ordered by rank key, iterate on the client. If the dataset is massive, iterate over key subranges.
  • head, tail: queries that return the element with the smallest or largest rank key.
  • push: insert a new document with a rank key higher than the highest existing.
  • pop: retrieve and then delete the document with the lowest rank key.
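
To make this concrete, here is a rough sketch of push/iterate/head/pop over a generic key-ordered store; a plain dict plus sorted() stands in for CouchDB's views, the (counter, client id) rank key follows the description above, and all helper names are made up:

# Rank-key scheme over any store that can query documents in key order.
import itertools
import uuid

store = {}                       # doc_id -> {"rank": ..., "value": ...}
counter = itertools.count()      # stand-in for a timestamp or global counter
CLIENT_ID = "client1"            # disambiguates concurrent writers

def push(value):
    # The rank key (counter, client id) sorts after every existing key.
    doc_id = str(uuid.uuid4())   # unique ID, no conflicts during merge
    store[doc_id] = {"rank": (next(counter), CLIENT_ID), "value": value}
    return doc_id

def iterate():
    # Return all elements ordered by rank key.
    for _, doc in sorted(store.items(), key=lambda kv: kv[1]["rank"]):
        yield doc["value"]

def head():
    # The element with the smallest rank key (None if empty).
    doc_id = min(store, key=lambda d: store[d]["rank"], default=None)
    return store[doc_id]["value"] if doc_id else None

def pop():
    # Retrieve and then delete the document with the lowest rank key.
    doc_id = min(store, key=lambda d: store[d]["rank"], default=None)
    return store.pop(doc_id)["value"] if doc_id else None

# Usage:
push("a"); push("b"); push("c")
print(list(iterate()))   # prints ['a', 'b', 'c']
print(pop(), head())     # prints: a b
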
j-g-faustus
So map/reduce is needed first, I assume from your answer. However, is the sorting done within the map/reduce in this example?
Zubair
No, the map just extracts the keys from individual documents, and the reduce (for the head/tail queries) gets a list of keys and returns the highest or lowest. The actual sorting comes from the default result set ordering in CouchDB (result sets are sorted on keys, alphabetically for strings, numerically for numbers, etc.). It should be possible to implement something similar anywhere the DB either supplies a default ordering on keys or supports some form of "order by", and has some efficient way of querying on key ranges for the iteration.
j-g-faustus