Hi,

I'm trying to understand the purpose of BlockingCollection in the context of the new parallel stack in .NET 4.

The MSDN documentation says:

"BlockingCollection is used as a wrapper for an IProducerConsumerCollection instance, allowing removal attempts from the collection to block until data is available to be removed. Similarly, a BlockingCollection can be created to enforce an upper-bound on the number of data elements allowed in the IProducerConsumerCollection; addition attempts to the collection may then block until space is available to store the added items."

However, when I look at the implementations of IProducerConsumerCollection, such as ConcurrentQueue, I see that they are lock-free and thread-safe. So why is the locking mechanism that BlockingCollection provides needed? All the MSDN examples use these collections through the BlockingCollection wrapper. What problems arise from using the collections directly, and what benefit does BlockingCollection bring?

+3  A: 

Blocking until the operation can be performed is a convenience if you have nothing else to do anyway (or rather: cannot proceed until the operation has been performed).

If you have a non-blocking queue from which you want to read data, and there is no data at the moment, you have to periodically poll it, or wait on some semaphore, until there is data. If the queue blocks, that is already done automatically.
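To make the difference concrete, here is a minimal sketch that contrasts the two styles: polling a plain ConcurrentQueue versus calling Take() on a BlockingCollection that wraps one. The class name and the 10 ms polling interval are made up for illustration; the calls themselves (TryDequeue, Take, Add) are the standard .NET 4 ones.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

class PollingVersusBlocking   // hypothetical name, for illustration only
{
    static void Main()
    {
        // Non-blocking: the consumer has to poll and pick a sleep interval itself.
        var plainQueue = new ConcurrentQueue<int>();
        var polling = Task.Factory.StartNew(() =>
        {
            int item;
            while (!plainQueue.TryDequeue(out item))
            {
                Thread.Sleep(10); // arbitrary polling interval (assumption)
            }
            Console.WriteLine("polled: " + item);
        });
        plainQueue.Enqueue(1);
        polling.Wait();

        // Blocking: Take() waits inside the collection until an item arrives.
        var blockingQueue = new BlockingCollection<int>(new ConcurrentQueue<int>());
        var blocking = Task.Factory.StartNew(() =>
        {
            int item = blockingQueue.Take();
            Console.WriteLine("took: " + item);
        });
        blockingQueue.Add(2);
        blocking.Wait();
    }
}
```

Both halves end up with the item, but the second one contains no loop and no guess about how long to sleep.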

Similarly, if you try to add to a non-blocking queue that is full, the operation will just fail, and then you have to figure out what to do. The blocking queue will just wait until there is space.

If you have something clever to do instead of waiting (such as checking another queue for data, or raising a QueueTooFullException) then you want the non-blocking queue, but often that is not the case.

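Blocking queues usually also let you specify a timeout. BlockingCollection does this through its TryTake and TryAdd overloads, which accept a timeout and return false if it elapses.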
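As a rough sketch of the timeout variant in .NET, BlockingCollection has TryTake overloads that take either a TimeSpan or a millisecond count. The class name and the 100 ms value below are arbitrary choices for the example.

```csharp
using System;
using System.Collections.Concurrent;

class TimeoutExample   // hypothetical name, for illustration only
{
    static void Main()
    {
        var queue = new BlockingCollection<string>();
        string item;

        // The collection is empty, so this waits at most 100 ms (arbitrary) and gives up.
        if (queue.TryTake(out item, TimeSpan.FromMilliseconds(100)))
            Console.WriteLine("got " + item);
        else
            Console.WriteLine("timed out, doing something else");

        // With an item present, the same call returns immediately.
        queue.Add("hello");
        if (queue.TryTake(out item, 100))
            Console.WriteLine("got " + item);
    }
}
```

This sits between the two extremes: the thread sleeps while waiting, but still gets a chance to do something else if nothing shows up.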

Thilo
+3  A: 

The purpose of blocking is the blocking itself. Several threads can read from the collection, and if no data is available a reading thread simply stays blocked until new data arrives.

Also, with the ability to set a size limit, you can let the producer thread that is filling the collection just feed as much as it can into it. When the collection reaches the limit, the producer thread blocks until the consumer threads have made space for more data.

This way you can use the collection to throttle the throughput of data, without doing any checking yourself. Your threads just read and write all they can, and the collection takes care of keeping the threads working or sleeping as needed.
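A minimal sketch of that throttling, assuming a bounded capacity of 2 and ten integer items (both arbitrary, as is the class name): the producer simply calls Add in a loop, the consumer simply iterates, and the collection does the waiting on both sides.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class ThrottledPipeline   // hypothetical name, for illustration only
{
    static void Main()
    {
        // Bounded to 2 items; the capacity is an arbitrary choice for the example.
        var buffer = new BlockingCollection<int>(boundedCapacity: 2);

        var producer = Task.Factory.StartNew(() =>
        {
            for (int i = 0; i < 10; i++)
            {
                buffer.Add(i); // blocks while the buffer already holds 2 items
            }
            buffer.CompleteAdding(); // tells consumers that no more items will come
        });

        var consumer = Task.Factory.StartNew(() =>
        {
            int sum = 0;
            // Blocks while the buffer is empty; ends once adding is complete and the buffer is drained.
            foreach (int item in buffer.GetConsumingEnumerable())
            {
                sum += item;
            }
            Console.WriteLine("sum = " + sum); // 0 + 1 + ... + 9 = 45
        });

        Task.WaitAll(producer, consumer);
    }
}
```

Neither loop checks whether the buffer is full or empty, which is the point: the code looks almost like the single-threaded version.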

Guffa
The important part is "without doing any checking yourself". Both your producer and consumer code can be really simple, almost completely the same as for your non-parallel version and still you get the benefit of threads falling asleep if there's nothing (useful) to do for them.
VolkerK