views:

544

answers:

2

Skip lists (Pugh, 1990) provide sorted dictionaries with logarithmic-time operations like search trees but skip lists are much more amenable to concurrent updates.

Is it possible to create an efficient purely functional concurrent skip list? If not, is it possible to create any kind of efficient purely functional concurrent sorted dictionary?

+5  A: 

Not a skip list, but seems to match the problem description: Clojure's persistent red-black trees (see PersistentTreeMap.java). The source contains this notice:

/**
 * Persistent Red Black Tree
 * Note that instances of this class are constant values
 * i.e. add/remove etc return new values
 * <p/>
 * See Okasaki, Kahrs, Larsen et al
 */

These trees maintain the order of elements and are "persistent" in the sense in which Rich Hickey uses the word (immutable and able to maintain their performance guarantees as updated versions are constructed).

In case you want to play around with them, you can construct instances in Clojure code with the function sorted-map.

Michał Marczyk
+24  A: 

The property of skip lists that makes them good for concurrent updates (namely that most additions and subtractions are local) also makes them bad for immutability (namely that a lot of earlier items in the list point eventually to the later items, and would have to be changed).

Specifically, skip lists consist of structures that look like so:

NODE1 ---------------------> NODE2 ---------...
  |                           |
  V                           V
NODE1a --> NODE1b ---------> NODE2a --> NODE2b --> NODE2c --- ...

Now, if you have an update that, say, deletes NODE2b or NODE1b, you can take care of it very locally: you just point 2a to 2c or 1a to 2a respectively and you're done. Unfortunately, because the leaf nodes all point one to another, it's not a good structure for a functional (immutable) update.

Thus, tree structures are better for immutability (as the damage is always locally limited--just the node you care about and its direct parents up through the root of the tree).

Concurrent updates don't work well with immutable data structures. If you think about it, any functional solution has an update of A as f(A). If you want two updates, one given by f and one given by g, you pretty much have to do f(g(A)) or g(f(A)), or you have to intercept the requests and create a new operation h = f,g that you can apply all in one go (or you have to do various other highly clever stuff).

However, concurrent reads work fantastically well with immutable data structures since you are guaranteed to have no state change. If you don't assume that you can have a read/write loop that resolves before any other write can interrupt, then you never have to lock on read.

Thus, write-heavy data structures are probably better implemented mutably (and with something like a skip list where you only need to lock locally), while read-heavy data structures are probably better implemented immutably (where a tree is a more natural data structure).

Rex Kerr
Of course! How could I miss this? The fundamental difference between a skip list and a normal list is that the latter only allows consing (i.e. insertion at the front), whereas the *whole point* of the former is that it allows insertion anywhere. I agree: path copying on a tree looks much cheaper.
Jörg W Mittag