I understand the basics of dataflow programming and have encountered it a bit in Clojure APIs, talks from Jonas Bonér, GPars in Groovy, etc. I know it's prevalent in languages like Io (although I have not studied Io).

What I am missing is a compelling reason to care about dataflow as a paradigm when building a concurrent program. Why would I use a dataflow model instead of a mutable state+threads+locks model (common in Java, C++, etc) or an actor model (common in Erlang or Scala) or something else?

In particular, while I know of library support in the languages above (and Scala and Ruby), I don't know of a single program or library that is a poster child user of this model. Who is using it? Why do they find it better than the other models I mentioned?

A: 

If you think about it, relational databases are the poster child. Think of any evaluation plan, where each operator processes streams of rows from other operators/tables and produces streams that are fed into other operators.
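To make the evaluation-plan idea concrete, here is a minimal sketch using Python generators. The operator names (`scan`, `select`, `hash_join`, `project`) and the sample tables are made up for illustration and do not come from any real database engine; each operator just consumes a stream of rows and produces another stream.

```python
# Hypothetical operators and data, for illustration only.
def scan(table):
    """Leaf operator: emit the rows of a table one at a time."""
    for row in table:
        yield row

def select(rows, predicate):
    """Filter operator: pass through only the rows that match."""
    for row in rows:
        if predicate(row):
            yield row

def hash_join(left, right, key):
    """Join operator: index the right input, then stream the left input."""
    index = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    for l in left:
        for r in index.get(l[key], []):
            yield {**l, **r}

def project(rows, columns):
    """Projection operator: keep only the requested columns."""
    for row in rows:
        yield {c: row[c] for c in columns}

users = [{"user_id": 1, "name": "ann"}, {"user_id": 2, "name": "bob"}]
orders = [
    {"user_id": 1, "total": 30},
    {"user_id": 2, "total": 5},
    {"user_id": 1, "total": 12},
]

# The plan is a graph of operators; nothing runs until rows are pulled.
plan = project(
    hash_join(select(scan(orders), lambda o: o["total"] > 10), scan(users), "user_id"),
    ["name", "total"],
)
result = list(plan)
```

Nothing here is concurrent, but because the operators only talk to each other through streams, a scheduler would be free to run them on different threads without changing the plan.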

Random image stolen off the web: [image: query evaluation plan diagram]

Dimitris Andreou
I think maybe you're right at a very high level of abstraction, but wrong in terms of what you'd actually use dataflow variables for. In particular, the diagram represents streams of tuples flowing through processing units where each node is performing operations on the data. That doesn't match my mental model of what a dataflow variable is or does. [I have written a relational query planner in the past, so maybe I just know too much about the actual details of what this diagram represents. :)]
Alex Miller
Ok, let me bite: what would you actually use dataflow variables for? What is your mental model of a dataflow variable? From a cursory look, e.g. http://www.slideshare.net/jboner/state-youre-doing-it-wrong-javaone-2009 slide 89, dataflow variables are similar to operators (minus the relational semantics of course!): they are fed with streams, they output streams, basically expressing the various data dependencies, and they let a scheduler deal with concurrent execution rather than the programmer. If you see something significantly more or new in dataflow programming, let me know (I don't yet).
Dimitris Andreou
+1  A: 

I have a good "poster child" (I like this term) for you. I assume you have never seen it before, but you've probably heard it.

I think almost all modern digital synthesizers and samplers have some kind of dataflow architecture inside. Let me tell you how they work.

I'm not sure if the Roland JV-1080 was the first, but it was the most famous synth with a 4-layer sound generator scheme. When you press a key on the keyboard, a Patch starts. It consists of 1 to 4 sound generators. A sound generator is a chain of components: oscillator, filter, envelope, amp. The JV-1080 can play 64 sound generators at a time. The output of the active sound generators goes into the effect configuration. The sound generator path is "hardwired"; you can only select the entry points into the effect bus and the amounts.
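The fixed oscillator, filter, envelope, amp chain can be sketched like this. It is only a toy model: the sine oscillator, the one-pole low-pass filter, the linear decay envelope and all parameter values are my own stand-ins, not how the JV-1080 actually computes its sound.

```python
import math

SAMPLE_RATE = 44100  # Hz; assumed value for this sketch

def oscillator(freq, n):
    """Sine oscillator: yields n samples at the given frequency."""
    for i in range(n):
        yield math.sin(2 * math.pi * freq * i / SAMPLE_RATE)

def lowpass(samples, alpha):
    """One-pole low-pass filter (a stand-in for a real synth filter)."""
    y = 0.0
    for x in samples:
        y += alpha * (x - y)
        yield y

def envelope(samples, n):
    """Linear decay envelope over n samples."""
    for i, x in enumerate(samples):
        yield x * (1.0 - i / n)

def amp(samples, gain):
    """Amplifier: scale every sample by a fixed gain."""
    for x in samples:
        yield x * gain

def sound_generator(freq, n, alpha=0.2, gain=0.5):
    """The hardwired path: oscillator -> filter -> envelope -> amp."""
    return amp(envelope(lowpass(oscillator(freq, n), alpha), n), gain)

def patch(freqs, n):
    """A patch mixes 1 to 4 sound generators sample by sample."""
    if not 1 <= len(freqs) <= 4:
        raise ValueError("a patch has 1 to 4 sound generators")
    layers = [sound_generator(f, n) for f in freqs]
    return [sum(frame) for frame in zip(*layers)]

out = patch([220.0, 440.0], 1000)
```

The point is that the wiring inside `sound_generator` is fixed; only the parameters change from patch to patch.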

The Roland JV-1080's effect bus has 4 entry points (dry, custom effect, chorus, reverb), plus the main output. The effect bus is fixed, but each effect's output is wired to every effect standing to its right, so you can "delete" a connection between them by setting its amount to zero.

The Alesis QS series (QuadraSynth, QS6-7-8-R and the x.1 versions) has nearly the same sound architecture, and the effect system is similar, except that you can choose one of 3 FX configurations. One FX config is for organs (the QS has an incredible Leslie emulation): Leslie, Chorus, Reverb; another FX config has two Reverbs. You have more freedom in how to use the horsepower of the gear.

These synths are great, but you will forget them once you meet the Clavia Nord Modular. It has no 4-layer architecture and no FX configs. It comes with a win32 program, a dataflow editor. There are various components (oscillators, filters, envelope generators, etc.), and you can draw your own configuration. You could draw a traditional 4-layer sound generator, but you can even draw a 99-layer one if you want. It simply rocks. (I have to say that DF is not everything: the Roland JV has a 44.1 kHz sample frequency, the QS has 48k, the Modular has 96k.)

Clavia has another line of synths: the Nord Lead. It has the Modular's engine inside (the parameters and the sound are the very same), but you can't use the dataflow editor for those models. They have a fixed path with plenty of parameters, but you can't change the routing. Also, there is a Nord Lead patch set for the Modular: all the patches look the same in the editor, only the parameters vary.

Here's a Modular patch example http://www.clavia.se/pictures/nordmodular/patchwindowlarge.jpg

If you are not satisfied by the synth example, say, because you're a C programmer, here is another one, which is more familiar:

make -j

It was a surprise to me that make is a dataflow system, so it can run "components" simultaneously, which means faster compiling on multi-core machines. Try it!
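Here is a rough sketch of the idea behind parallel builds, not of how make itself is implemented. It uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+) to find out which targets are ready, and a thread pool to run them. The target names and the `build` function are hypothetical placeholders.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical targets: each maps to the set of targets it depends on.
deps = {
    "app": {"main.o", "util.o"},
    "main.o": {"main.c"},
    "util.o": {"util.c"},
    "main.c": set(),
    "util.c": set(),
}

def build(target):
    """Stand-in for running a compiler or linker on one target."""
    return target

def run_parallel(deps, jobs=2):
    """Build every target, starting each one as soon as its inputs are done."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError if the graph has a cycle
    order = []        # completion order, for inspection
    pending = {}      # future -> target currently being built
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        while sorter.is_active():
            # Everything whose dependencies are finished can start now.
            for target in sorter.get_ready():
                pending[pool.submit(build, target)] = target
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for fut in done:
                target = pending.pop(fut)
                fut.result()  # re-raise any build error
                order.append(target)
                sorter.done(target)  # may unlock dependent targets
    return order

order = run_parallel(deps)
```

The programmer only declares the dependencies; the order of execution and the degree of parallelism are left to the scheduler, which is the part that `jobs` controls here.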

ern0
+1  A: 

I have a flawed example, too. It does not implement a clean actor model, and it has no concurrency issues, but it uses a DF architecture and is extremely popular: any spreadsheet software (e.g. MS Excel).

When you modify a cell, it sends a "recalculate" signal to the cells that reference it. And when you work with a sheet that grows bigger and bigger, you get the real taste of dataflow programming, because the focus of the work changes:

  • creating formulas loses its initial importance (you will find yourself just cloning the same 3-4 formulas),
  • the layout becomes more important: re-organizing references, splitting long formulas into shorter ones, hiding parameters, and finally forming a graph from the data.

If we realise that formulas are components and references are messages, we get the usual way of dataflow programming: first we create some components, then we build a dataflow graph with them. If components are too big, we split them into smaller ones. Finally, we pick a visualization component for an eye-candy presentation of the result.
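A toy version of the "formulas are components, references are messages" idea might look like this. The `Sheet` class and its method names are invented for this sketch and have nothing to do with how Excel works internally; it is deliberately naive (no cycle detection, and a cell may be recomputed more than once in a diamond-shaped graph).

```python
class Sheet:
    """Tiny spreadsheet: a cell holds a constant or a formula over other cells."""

    def __init__(self):
        self.values = {}      # cell name -> current value
        self.formulas = {}    # cell name -> (function, input cell names)
        self.dependents = {}  # cell name -> cells whose formulas reference it

    def set_value(self, name, value):
        """Store a constant and tell the referencing cells to recalculate."""
        self.values[name] = value
        self._notify(name)

    def set_formula(self, name, func, inputs):
        """Register a formula and wire it to the cells it references."""
        self.formulas[name] = (func, inputs)
        for src in inputs:
            self.dependents.setdefault(src, []).append(name)
        self._recalculate(name)

    def _recalculate(self, name):
        func, inputs = self.formulas[name]
        # Only fire once every referenced cell has a value.
        if all(i in self.values for i in inputs):
            self.values[name] = func(*(self.values[i] for i in inputs))
            self._notify(name)

    def _notify(self, name):
        # Send "recalculate" to every cell that references this one.
        for dep in self.dependents.get(name, []):
            self._recalculate(dep)

sheet = Sheet()
sheet.set_value("A1", 2)
sheet.set_value("A2", 3)
sheet.set_formula("A3", lambda a, b: a + b, ["A1", "A2"])
sheet.set_formula("A4", lambda s: s * 10, ["A3"])
before = sheet.values["A4"]  # 50
sheet.set_value("A1", 5)     # the change flows through A3 into A4
after = sheet.values["A4"]   # 80
```

Notice that once the cells are wired up, the only thing you do is change inputs; the graph decides what gets recomputed.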

ern0
A: 

Check this out: http://www.synthedit.com/

It's an audio-related framework and component set for VSTi. I haven't figured out exactly how it works, but it looks like the author releases the software with a standard set of his own components, and other folks can add theirs by compiling DLLs.

Also, I've just found out that a guy on the same mailing list as me has created a nice TB303 simulator (a famous vintage analog synth), and he built it using SynthEdit as a framework. So, as this shows, it can be used as a framework; there is no technical (or business-model) difficulty.

So it's worth a look; I've found great implementation practices browsing through the documentation. Although the site does not contain the word dataflow, and the documentation should be better edited, the spirit of the project is OK. There are a couple of "3rd party" component developers, too. Last but not least, it has a nice GUI frontend.

ern0