views:

180

answers:

6

Hi,

We are developing a Java application with several worker threads. These threads will have to deliver a lot of computation results to our UI thread. The order in which the results are delivered does not matter.

Right now, all threads simply push their results onto a synchronized Stack - but this means that every thread must wait for the other threads before results can be delivered.

Is there a data structure that supports simultaneous insertions with each insertion completing in constant time?

Thanks,

Martin

A: 

Take a look at java.util.concurrent.ConcurrentLinkedQueue, java.util.concurrent.ConcurrentHashMap or java.util.concurrent.ConcurrentSkipListSet. They might do what you need. ConcurrentSkipListSet, for instance, claims to have "expected average log(n) time cost for the contains, add and remove operations and their variants. Insertion, removal, and access operations safely execute concurrently by multiple threads."

Brian Clapper
+10  A: 

ConcurrentLinkedQueue is designed for high contention. Producers enqueue stuff on one end and consumers collect elements at the other end, so everything will be processed in the order it's added.

ArrayBlockingQueue is a better for lower contention, with lower space overhead.

Edit: Although that's not what you asked for. Simultaneuos inserts? You may want to give every thread one output queue (say, an ArrayBlockingQueue) and then have the UI thread poll the separate queues. However, I'd think you'll find one of the two above Queue implementations sufficient.

gustafc
+1: This is definitely a case for concurrent `Queue` implementations.
Andrzej Doyle
@Andrzej: to me it sounds more like a case of premature optimization.
Michael Borgwardt
@Michael not really - it is perfectly normal to specify a constant time insertion on a data structure you might be putting lots of data into, and if you don't use the concurrent queue from the library, you have more code to write and its harder to test in order to get something which has the correct behaviour. There's nothing in either the OP or this post that says anything about optimisation, just correctness within the requirements.
Pete Kirkham
@Pete: Constant time insertion is rather trivial to achieve for a queue, so that's not really an issue. I read the question as mainly being about concurrency. Using a synchronized collection is the trivial solution to that, and worrying about lock contention is what looks like premature optimization to me when the queue is just going to contain the *results* rather than being involved directly in the computation.
Michael Borgwardt
A: 

Two other patterns you might want to look at are

  • each thread has its own collection, when polled it returns the collection and creates a new one, so the collection only holds the pending items between polls. The thread needs to protect operations on its collection, but there is no contention between threads. This is blocking (each thread cannot add to its collection while the UI thread pulls updates from it), but can reduce contention (no contention between threads).

  • each thread has its own collection, and appends the results to a common queue which is protected using a Lock.tryLock(). The thread continues processing if it fails to acquire the lock. This makes it less likely that a thread will block waiting for the shared queue.

Pete Kirkham
Is there more to this answer?
Ben Voigt
@Ben I had switched from my IDE where tab+enter would indent and give a new line rather than submitting
Pete Kirkham
+3  A: 

Right now, all threads simply push their results onto a synchronized Stack - but this means that every thread must wait for the other threads before results can be delivered.

Do you have any evidence indicating that this is actually a problem? If the computation performed by those threads is even the least little bit complex (and you don't have literally millions of threads), then lock contention on the result stack is simply a non-issue because when any given thread delivers its results, all others are most likely busy doing their computations.

Michael Borgwardt
You are right, but since a vast number of results are being computed, with each computation being quite quick, I figured there would probably be collisions. Also, I simply became curious as to whether such a data collection exists :)
Martin Wiboe
@Martin: confirm, don't "figure". Look at your app in VisualVM's threads view and see how much time they actually spend waiting for locks. And if it really is significant, why not have each thread do compuations in larger bundles before submitting results? It many reduce overhead in other ways as well.
Michael Borgwardt
A: 

Well if you are looking for simultaneous (sounding like parallel) insertions, then any queue implementation cannot insert in parallel. It would be thread safe, but there is some serial work going on to ensure a mutual exclusion for each offer.

Parallel insertions are available with a ConcurrentHashMap or BlockingDeque. The ConcurrentHashMap employs lock striping that will allow for parallel puts into each independent entry (or more accurately segment). The BlockingDeque allows you to offerFirst/offerLast for parallel insertions (assuming enough space on the Deque.

John V.
+1  A: 

Take a step back and evaluate whether performance is the key design consideration here. Don't think, know: does profiling back it up?

If not, I'd say a bigger concern is clarity and readability of design, and not introducing new code to maintain. It just so happens that, if you're using Swing, there is a library for doing exactly what you're trying to do, called SwingWorker.

Mark Peters