STM hash library for C (glib?)

Well, I think (and there are a number of study) that current STM isn't faster than lock-free and mutex-based code. It is obvious: STM requires on-line data conflict checking. However, such conflict checking in pure software requires very huge time overheads. Currently, only Sun's ROCK processor supports a limited form of STM (Best-effort HTM with STM) by hardware. No x86 CPUs currently support TM in hardware. In short, STM is just slow.

In my opinion, you'd better just using a concurrent hash table. For example, you can find concurrent_hash_map in Intel TBB. Here is the link of TBB Manual. Oh, but it's C++, not C. But, I do believe you can (though it might take significant work) translate C++-based such hash table to C code. Intel TBB is open source.

Also, I think such highly concurrent (usually implemented as lock-free) data structures are not always useful. In some workload pattern, using such data structures is not good. To be sure, I recommend you to write a small micro-benchmark for two versions of hash tables: (1) lock-free and (2) lock-based. Also, please keep in mind the workload patterns for such micro-benchmark should be close to real one. An example could be found in here.

I looked through those. But they're very low-level. The best one for integration is probably stmmap: http://github.com/skaphan/stmmap. But I'm looking for something that already includes the hashmap implementation. Unfortunately I can't be sure that what some random data-structure library does is safe to just use on top of the stm library... I would prefer something working on a higher level. (ready data-structure + stm solution)

viraptor 2009-11-08 20:31:54

I know they're sometimes slower. But they're also sometimes easier in a specific solution. I have a big hash that is often read, rarely written to. Also writes take a long time and need a lookup first. With a pool of readers and 1 writer it's a perfect case for STM. Also, you DO NOT need an on-line conflict checking. You can implement many STM-style solutions in a way that reader NEVER waits and NEVER blocks (and writer never waits, although might need to retry) using only the standard CMPXCHG. For me that's one synchronisation every 30 sec -vs- grabbing the read lock every 10ms.

viraptor 2009-11-08 21:35:40

I understand your problem. It's better to use "lock-free" data structure rather than STM-style. STM always means conflict checking. What about RCU? http://en.wikipedia.org/wiki/Read-copy-update

minjang 2009-11-08 21:38:55

PS. I know that concurrent_hash_map is supposed to be "lock-less", but they write: "Because having access to an element can block other threads, try to shorten the lifetime of the accessor or const_accessor." Not good enough for me in this case.

viraptor 2009-11-08 21:39:29

Minjang: That's what I meant (more or less ;) ). From what I understand an RCU trie-based immutable GC'd hash map pretty much meets the definition of an STM hash map (i.e. has exactly the same properties). Or am I missing something?

viraptor 2009-11-08 21:48:23

STM is just software-level implementation of TM. TM is a speculative approach to handle locks efficiently. By definition, TM is a generalization of lock-free data structures. So, STM hash map actually doesn't make sense to me, and I never heard such terms. It's just wait-free/lock-free. In current x86 architecture, there is no magic that you can do better than a conventional lock-free data structure. I believe RCU-like data structure is the best for read-intensive workloads.

minjang 2009-11-08 21:58:47

ansaurus

tags:

views:

answers:

STM hash library for C (glib?)

related questions