I'm interested in techniques people use to publish information and changes to data structures that are shared across multiple threads without losing much concurrency. In my experience the single-writer/multiple-reader case comes up quite often: a single thread updates an object, while multiple threads read from it and need to be informed of changes.

As a simple example, consider a hashtable (let's assume it is thread safe, whether through coarse-grained locking, fine-grained locking, low-lock techniques, etc.). Thread 1 is the only writer and is responsible for putting and removing entries. Other threads may wish to be informed when any key changes, when a certain key changes, or some variant of that. What exactly they subscribe to is not particularly important.

What techniques (I'd love suggestions for papers) would you use to make sure threads receive timely and correct change information?

+1  A: 

The classical way to do this is to use wait conditions (condition variables), but they block the waiting thread until an event arrives.
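As a rough illustration of the wait-condition approach, here is a minimal sketch in Python using `threading.Condition`. The class name `VersionedValue` and the version counter are assumptions made for this example, not something from the question; the counter is just one simple way to let a reader tell whether it has missed an update.

```python
import threading

class VersionedValue:
    """Hypothetical helper: one writer publishes, readers block until it changes."""

    def __init__(self, value=None):
        self._cond = threading.Condition()
        self._value = value
        self._version = 0  # bumped on every publish

    def publish(self, value):
        # Called by the single writer thread.
        with self._cond:
            self._value = value
            self._version += 1
            self._cond.notify_all()  # wake every waiting reader

    def wait_for_change(self, seen_version, timeout=None):
        """Block until the version differs from seen_version.

        Returns (version, value), or None if the timeout expires first.
        """
        with self._cond:
            changed = self._cond.wait_for(
                lambda: self._version != seen_version, timeout=timeout)
            if not changed:
                return None
            return self._version, self._value
```

A reader remembers the last version it saw and passes it back in, so an update that lands between two waits is not lost; the reader does, however, stay blocked inside `wait_for_change` until something happens or the timeout expires.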

Nowadays, I'd use a message queue for each worker thread, and updates to the central data structure would be messages posted into the worker thread's queue. That, too, uses a wait condition internally, but it's a higher-level interface than the naked wait condition, and there's good library support for this kind of programming in just about every programming language out there.
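A minimal sketch of the queue-per-worker idea, using Python's standard `queue.Queue`. The names `Publisher`, `add_worker_queue` and `worker`, the message tuple layout, and the `None` shutdown sentinel are all assumptions for illustration. The sketch also assumes every queue is registered before the writer starts publishing, so the list of queues needs no lock of its own.

```python
import queue
import threading

class Publisher:
    """Hypothetical single writer that mirrors each update into per-worker queues."""

    def __init__(self):
        self._table = {}
        self._queues = []

    def add_worker_queue(self):
        # Assumed to be called only during setup, before any put/remove.
        q = queue.Queue()
        self._queues.append(q)
        return q

    def put(self, key, value):
        self._table[key] = value
        for q in self._queues:
            q.put(("put", key, value))

    def remove(self, key):
        self._table.pop(key, None)
        for q in self._queues:
            q.put(("remove", key, None))

    def close(self):
        for q in self._queues:
            q.put(None)  # sentinel: tells the worker to stop


def worker(inbox, seen):
    """Drain messages in order; blocks in get() while the queue is empty."""
    while True:
        msg = inbox.get()
        if msg is None:
            break
        seen.append(msg)
```

Because every worker has its own queue, each one sees every update in the order the writer made it, and a slow worker only delays itself.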

A somewhat more hackish solution is to block the thread in a select() call on a socket, into which the informing thread writes a byte to wake up the sleeping thread. Using select() has the advantage that you can multiplex this kind of event with just about any other event source, so it is also applicable when the target thread is, e.g., the GUI thread.
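The select() trick might look roughly like this in Python, using `socket.socketpair()` and `select.select()` from the standard library. The `Waker` class is a hypothetical name for this example; the sleeping thread would pass it to `select()` alongside whatever other sockets it is watching.

```python
import select
import socket

class Waker:
    """Hypothetical helper: a socket pair used only to wake a select() loop."""

    def __init__(self):
        self._recv, self._send = socket.socketpair()
        self._recv.setblocking(False)

    def fileno(self):
        # Lets select.select() accept the Waker object directly.
        return self._recv.fileno()

    def wake(self):
        # Called by the informing thread; one byte is enough.
        self._send.send(b"\0")

    def drain(self):
        # Called by the woken thread to discard the pending wake-up bytes.
        try:
            while self._recv.recv(1024):
                pass
        except BlockingIOError:
            pass  # nothing left to read

    def close(self):
        self._recv.close()
        self._send.close()
```

The byte itself carries no information; after `select()` reports the `Waker` as readable, the thread drains it and then looks at the shared data structure (or a queue) to find out what changed.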

+1  A: 

You have actually asked about two distinct problems:

1. Multiple-reader/single-writer exclusion

2. Update notification (frequently called the observer pattern)

Multiple-reader/single-writer locks are a classic problem for which there is an enormous amount of literature. The real problem is that many of the classic solutions involve multiple mutexes and/or semaphores and can be pretty heavyweight, frequently involving 6 read-modify-write (RMW) cycles per simple write lock and 6 RMW cycles for the first read lock.

There are quite a few solutions to both problems. Their efficiency and practicality depend on the arrival rate of reads, the arrival rate of writes, the duration of a write, the duration of a read, whether updates can be batched, and whether you need to upgrade a read lock or downgrade a write lock (it sounds like you don't in your case).
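For reference, here is a minimal sketch of a multiple-reader/single-writer lock built on a single `threading.Condition`. It is an illustration rather than one of the classic algorithms referred to above: the `ReadWriteLock` name and its methods are assumptions, and this variant prefers readers, so a steady stream of readers can starve the writer.

```python
import threading

class ReadWriteLock:
    """Hypothetical reader-preferring lock: many readers or one writer."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0       # number of threads currently reading
        self._writing = False   # True while the writer holds the lock

    def acquire_read(self):
        with self._cond:
            while self._writing:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()  # a waiting writer may proceed

    def acquire_write(self):
        with self._cond:
            while self._writing or self._readers:
                self._cond.wait()
            self._writing = True

    def release_write(self):
        with self._cond:
            self._writing = False
            self._cond.notify_all()  # wake waiting readers and writers
```

Readers only wait while a write is in progress, and the writer waits until the last reader has left.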

For the example of a hash table you can use "medium-grained" locking, where the number of locks is O(n accessing threads), and just arbitrarily assign the locks to chunks of the hash table (with separate chaining, mind you; with open addressing, the lock for a chunk does not necessarily cover the slot that probing ends up in). Or you can use hopscotch hashing, a concurrent hash table algorithm.
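The medium-grained idea can be sketched as a separately chained table whose buckets share a small pool of locks. This is a simplified illustration under stated assumptions: the `StripedHashTable` name, the fixed bucket count (no resizing), and the modulo mapping of buckets to locks are all choices made for this example.

```python
import threading

class StripedHashTable:
    """Hypothetical fixed-size, separately chained table with a pool of locks."""

    def __init__(self, num_buckets=64, num_locks=8):
        # num_locks would be sized roughly to the number of accessing threads.
        self._buckets = [[] for _ in range(num_buckets)]
        self._locks = [threading.Lock() for _ in range(num_locks)]

    def _locate(self, key):
        index = hash(key) % len(self._buckets)
        # Each lock guards every bucket whose index maps onto it.
        return self._buckets[index], self._locks[index % len(self._locks)]

    def put(self, key, value):
        bucket, lock = self._locate(key)
        with lock:
            for i, (k, _) in enumerate(bucket):
                if k == key:
                    bucket[i] = (key, value)  # overwrite existing entry
                    return
            bucket.append((key, value))

    def get(self, key, default=None):
        bucket, lock = self._locate(key)
        with lock:
            for k, v in bucket:
                if k == key:
                    return v
            return default

    def remove(self, key):
        bucket, lock = self._locate(key)
        with lock:
            for i, (k, _) in enumerate(bucket):
                if k == key:
                    del bucket[i]
                    return True
            return False
```

With separate chaining a key's whole chain lives in one bucket, so holding that bucket's lock is enough. Threads working on buckets that map to different locks do not contend with each other.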

As far as notification is concerned, some typical questions are:

Must every reader thread see (and process) every notification?

Is the set of observers static or dynamic? I.e., can the notification "network" be static (defined at compile time)?

pgast
Hmm, I hadn't thought about batching updates before; I can see how that might be advantageous under certain circumstances. What benefit could you get from knowing whether the set of observers is static or dynamic?
Soonil
Unless you just blindly notify for any update, if the set of interest or the set of threads to be notified is dynamic, then you have to add subscribe/unsubscribe (identifier) to the notification part of whatever API you define. If it is static, you can just define at compile time all the synchronization objects you might need and how they are interconnected.
pgast
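For the dynamic case described in the comment above, the subscribe/unsubscribe part of such an API might be sketched like this. Everything here is an assumption for illustration: the `ChangeNotifier` name, the integer subscription identifiers, the `ANY_KEY` sentinel, and the choice to run callbacks on the writer's thread.

```python
import threading

class ChangeNotifier:
    """Hypothetical registry of observers keyed by subscription identifier."""

    ANY_KEY = object()  # sentinel: subscriber wants every key

    def __init__(self):
        self._lock = threading.Lock()
        self._subs = {}      # subscription id -> (key, callback)
        self._next_id = 0

    def subscribe(self, callback, key=ANY_KEY):
        with self._lock:
            sub_id = self._next_id
            self._next_id += 1
            self._subs[sub_id] = (key, callback)
            return sub_id

    def unsubscribe(self, sub_id):
        with self._lock:
            return self._subs.pop(sub_id, None) is not None

    def notify(self, key, value):
        # Snapshot the matching callbacks, then call them outside the lock
        # so a callback may itself subscribe or unsubscribe.
        with self._lock:
            targets = [cb for k, cb in self._subs.values()
                       if k is self.ANY_KEY or k == key]
        for cb in targets:
            cb(key, value)
```

In practice a callback would usually just hand the change to the observing thread, for example by posting to that thread's queue, rather than doing real work on the writer's thread.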