+10  Q: 

Events vs. Yield

I have a multithreaded application that spawns threads for several hardware instruments. Each thread is basically an infinite loop (for the lifetime of the application) that polls the hardware for new data and raises an event (passing the data) each time it collects something new. There is a single listener class that consolidates all these instruments, performs some calculations, and fires a new event with the result.

However, since there is a single listener, I'm wondering whether it would be better to expose an IEnumerable<> method on these instruments and use yield return to hand back the data, instead of raising events.

I'd like to hear from anybody who knows the differences between these two approaches. In particular, I'm looking for the best reliability, the best ability to pause/cancel operation, the best behavior with threading, general safety, etc.

Also, with the second method, is it still possible to run the IEnumerable loop on a separate thread? Many of these instruments are somewhat CPU-bound, so keeping each one on its own thread is vital.
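
For concreteness, here is a minimal sketch of the yield-based alternative being asked about; the Instrument and Poll names and the double readings are placeholders, not anything from the question:

using System.Collections.Generic;

public class Instrument
{
    // The pull model: the loop below runs on whichever thread is iterating.
    public IEnumerable<double> Readings()
    {
        while (true)                   // for the lifetime of the application
        {
            double reading = Poll();   // blocks until the hardware has new data
            yield return reading;      // hands the value to the foreach that is pulling
        }
    }

    private double Poll() { /* talk to the hardware */ return 0.0; }
}

The single listener would then iterate with foreach (var r in instrument.Readings()) { ... }, and the polling runs on whichever thread executes that loop, which is exactly the threading concern raised above.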

+4  A: 

It's a close call, but I think I'd stick to the event model in this case, the main deciding factor being that future maintenance programmers are less likely to understand the yield concept. Also, yield means the code processing each hardware request runs on the same thread as the code generating the requests. That's bad, because it could mean your hardware has to wait on the consumer code.

And speaking of consumers, another option is a producer/consumer queue. Your instruments can all push into the same queue, and your single listener can then pop from it and do whatever it needs from there.

Joel Coehoorn
I'm not too concerned about maintainability. Since this is the mission-critical portion of the application, it is documented thoroughly enough that even non-programmers can find their way around. Thanks for answering the threading question. And good pointer about using a queue. I'm looking at the new `ConcurrentQueue<T>` class and may wind up going this route.
drharris
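
For what it's worth, a rough sketch of the queue approach suggested in this answer, using a BlockingCollection<T> backed by the ConcurrentQueue<T> mentioned in the comment; the InstrumentHub and Consolidate names and the double readings are made up for illustration:

using System.Collections.Concurrent;

class InstrumentHub
{
    // Thread-safe queue shared by all instrument threads and the single listener.
    private readonly BlockingCollection<double> _queue =
        new BlockingCollection<double>(new ConcurrentQueue<double>());

    // Called from each instrument thread whenever it has new data.
    public void Publish(double reading) => _queue.Add(reading);

    // Run on the single listener thread; blocks while the queue is empty.
    public void Listen()
    {
        foreach (var reading in _queue.GetConsumingEnumerable())
            Consolidate(reading);
    }

    // Call at shutdown; Listen() returns once the queue drains.
    public void Shutdown() => _queue.CompleteAdding();

    private void Consolidate(double reading) { /* calculations + downstream event */ }
}
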
+1  A: 

Stephen Toub blogs about a blocking queue which implements IEnumerable as an infinite loop using the yield keyword. Your worker threads could enqueue new data points as they appear and the calculation thread could dequeue them using a foreach loop with blocking semantics.

Brian Gideon
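
This is not Toub's code, but a minimal sketch of the kind of blocking queue the post describes, where the enumerator itself blocks until a worker thread enqueues something:

using System.Collections.Generic;
using System.Threading;

class BlockingQueue<T>
{
    private readonly Queue<T> _items = new Queue<T>();
    private readonly object _sync = new object();

    // Called by the worker (instrument) threads.
    public void Enqueue(T item)
    {
        lock (_sync)
        {
            _items.Enqueue(item);
            Monitor.Pulse(_sync);          // wake the waiting consumer
        }
    }

    // An "infinite" sequence: MoveNext blocks until an item is available.
    public IEnumerable<T> GetConsumingEnumerable()
    {
        while (true)
        {
            T item;
            lock (_sync)
            {
                while (_items.Count == 0)
                    Monitor.Wait(_sync);   // lock is released while waiting
                item = _items.Dequeue();
            }
            yield return item;             // hand it to the foreach loop
        }
    }
}

The calculation thread then just does foreach (var point in queue.GetConsumingEnumerable()) { ... } and sleeps whenever nothing is pending.
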
+2  A: 

There's a pretty fundamental difference: push vs. pull. The pull model (yield) is the harder one to implement from the instrument interface's point of view, because you'll have to store data until the client code is ready to pull it. When you push, the client may or may not store the data, as it deems necessary.

But most practical implementations in multi-threading scenarios need to deal with the overhead of the inevitable thread context switch that's required to present the data. And that's often done with pull, using a thread-safe bounded queue.

Hans Passant
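
As a small illustration of the bounded part, BlockingCollection<T> can be constructed with a capacity so that a fast producer blocks instead of building an unbounded backlog; the capacity of 1000 here is an arbitrary example:

using System.Collections.Concurrent;

// Producers block inside Add() once 1000 items are waiting,
// which throttles the hardware threads to the consumer's pace.
var queue = new BlockingCollection<double>(boundedCapacity: 1000);
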
+4  A: 

This sounds like a very good use case for the Reactive Extensions. There's a little bit of a learning curve to it but in a nutshell, IObservable is the dual of IEnumerable. Where IEnumerable requires you to pull from it, IObservable pushes its values to the observer. Pretty much any time you need to block in your enumerator, it's a good sign you should reverse the pattern and use a push model. Events are one way to go but IObservable has much more flexibility since it's composable and thread-aware.

instrument.DataEvents
          .Where(x => x.SomeProperty == something)
          .BufferWithTime( TimeSpan.FromSeconds(1) )
          .Subscribe( x => DoSomethingWith(x) );

In the above example, the DataEvents produced by the subject (instrument) whose SomeProperty matches are buffered into one-second batches, and DoSomethingWith(x) is called once per batch.

There's plenty more you could do, such as merging in the events produced by other subjects or directing the notifications onto the UI thread, etc. Unfortunately, documentation is currently pretty weak, but there's some good information on Matthew Podwysocki's blog. (Although his posts almost exclusively mention Reactive Extensions for JavaScript, it's pretty much all applicable to Reactive Extensions for .NET as well.)

Josh Einstein
I've used Rx on a new project, and it's wonderful. The problem in this case is that it will require a major overhaul of legacy code, in that it changes the whole way the problem is even approached. For the next major version, I'm already working on this, but it seems like a daunting task for a small(er) redesign.
drharris
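
On the legacy-code point, one incremental path (illustrative only, written against the current System.Reactive package; the DataReady event, DataEventArgs type, and Value property are made-up names) is to wrap the existing events rather than rewrite the instruments, since Rx can expose an ordinary .NET event as an IObservable:

using System;
using System.Reactive.Linq;

// Hypothetical instrument shape, just for illustration.
class DataEventArgs : EventArgs { public double Value { get; set; } }

class Instrument
{
    public event EventHandler<DataEventArgs> DataReady;
    public void Raise(double v) => DataReady?.Invoke(this, new DataEventArgs { Value = v });
}

static class RxBridge
{
    // Wraps the instrument's existing event as a stream of readings.
    public static IObservable<double> Readings(Instrument instrument) =>
        Observable.FromEventPattern<DataEventArgs>(
                h => instrument.DataReady += h,
                h => instrument.DataReady -= h)
            .Select(e => e.EventArgs.Value);
}

// Several instruments can then be merged into one stream for the single listener:
//     RxBridge.Readings(a).Merge(RxBridge.Readings(b)).Subscribe(x => Consolidate(x));
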
A: 

I don't think there's much difference performance-wise between the event and yield approaches. Yield is lazily evaluated, so it leaves an opportunity to signal the producing threads to stop. If your code is thoughtfully documented, then maintenance ought to be a wash, too.

My preference is a third option: use a callback method instead of an event (even though both involve delegates). Your producers invoke the callback each time they have data. Callbacks can return values, so your consumer can signal the producers to stop or continue each time they check in with data.

This approach can give you places to optimize performance if you have a high volume of data. In your callback, you lock on a neutral object and append the incoming data to a collection. The runtime internally uses a ready queue on the lock object, so this can serve as your queuing point.

This lets you choose a collection, such as a List<T> with predefined capacity, that is O(1) for appending. You can also double-buffer your consumer, with your callback appending to the "left" buffer while you consolidate from the "right" one, and so forth. This minimizes the amount of producer blocking and associated missed data, which is handy for bursty data. You can also readily measure high-water marks and processing rates as you vary the number of threads.

ebpower
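
A rough sketch of that callback/double-buffer idea; all the names here are made up for illustration and the readings are just doubles:

using System.Collections.Generic;

class Consolidator
{
    private readonly object _sync = new object();
    private List<double> _front = new List<double>(4096); // producers append here
    private List<double> _back  = new List<double>(4096); // consumer drains here
    private volatile bool _stopRequested;

    // Handed to each instrument thread as its callback.
    // Returning false tells the producer to stop.
    public bool OnData(double reading)
    {
        lock (_sync)
        {
            _front.Add(reading);       // O(1) append into a pre-sized list
        }
        return !_stopRequested;
    }

    // Called periodically on the consumer thread.
    public void Consolidate()
    {
        List<double> batch;
        lock (_sync)
        {
            batch = _front;            // swap buffers so producers are only
            _front = _back;            // blocked for the duration of the swap
            _back = batch;
        }

        foreach (var reading in batch)
            Process(reading);          // the actual calculation step

        batch.Clear();                 // ready to serve as the "front" buffer again
    }

    public void Stop() => _stopRequested = true;

    private void Process(double reading) { /* placeholder */ }
}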