I read an article that claims that closures (or "blocks") are a useful weapon in the "War on Multicores", because

[...] they allow you to create units of work, which each have their own copy of the stack, and don’t step on each others toes as a result. What’s more, you can pass these units around like they are values, when in actual fact they contain a whole stack of values (pun intended), and executable code to perform some operation.

Now, I am not debating the usefulness of closures in general and possibly also for concurrent programming in a shared-memory model, but what's the difference with a thread that only acts on local data (or processes, or actors, or ...)?

Isn't a closure on its own as useful for concurrent programming as a thread without a scheduler?

And what about closures that have non-local side effects?

+1  A: 

Here is a nice definition of closures:

A "closure" is an expression (typically a function) that can have free variables together with an environment that binds those variables (that "closes" the expression).
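A minimal sketch of that definition in JavaScript (a language mentioned below), using a hypothetical counter factory: the returned function has a free variable `count`, and the environment created by each call to `makeCounter` is what "closes" it.

```javascript
// makeCounter returns a function whose free variable `count`
// is bound by the enclosing environment -- i.e. a closure.
function makeCounter() {
  let count = 0;            // the environment that binds the free variable
  return function () {      // the expression with a free variable
    count += 1;
    return count;
  };
}

const next = makeCounter();
console.log(next()); // 1
console.log(next()); // 2
```

Each call to `makeCounter` produces an independent environment, so two counters never interfere with each other.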

I think you are confusing definitions, as, in javascript for example, my closures may often have non-local side effects, as I am changing the DOM.

Closures are very useful, which is why C# added them to the language.

In functional programming languages, closures don't necessarily create OS threads, which carry a price in context switching, but can instead be run as lightweight processes. The framework, or compiler, has control over what to create to ensure that the processor is best utilized.

Whether you write with closures is less important than whether you use immutable data.

For example, if I have an application that has no global data, but every thread uses its own local copy, then it is up to the OS and the scheduler to determine which cores my application will use. Unfortunately, in C/C++ the compilers don't seem to know how to do that well, so by moving to FP we can use platforms such as Erlang, which have been dealing with distributed processing for a long time, and leverage their experience.

Actors, in something like Erlang, will have less overhead than a C/C++ thread, as the switching seems to be faster with actors.

James Black
Non-local side effects are just one issue I have with closures as 'the' useful structure for concurrency. So it is not a problem of my definition of a closure (which is the same as yours). I just seriously doubt that closures by themselves are useful for concurrent programming without additional constraints and mechanisms in place, which makes them no more (or less) suitable than other constructs for concurrency.
eljenso
There are problems that must be dealt with when using closures, as it is possible for the value you depend on to change, so you have to deal with that.
James Black
I think I mentioned that it takes more than closures to be useful, but closures can be useful to help simplify and generalize code, which is what the article that you referenced was showing, in their example. But, as with anything, you need to know what is going on, so you don't get hurt by the complications of the tool.
James Black
When you mention closures in c#, are you referring to lambda statements?
Robert Harvey
@Robert Harvey - Lambda expression can bind a variable, but that is not always the case, so, in my answer I was not referring to lambda expressions, but to closures, which can be implemented in C# by lambda expressions.
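The same distinction James draws for C# can be sketched in JavaScript: a lambda is only a closure when it actually has free variables to capture, and what it captures is the binding, not a snapshot of the value.

```javascript
// No free variables: a plain lambda, not a closure in the strict sense.
const square = x => x * x;

// Captures `factor` from its enclosing environment: a closure.
let factor = 3;
const scale = x => x * factor;

console.log(square(4)); // 16
console.log(scale(4));  // 12
factor = 10;
console.log(scale(4));  // 40 -- the binding was captured, not the value 3
```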
James Black
+3  A: 

The argument is that having closures in your programming language makes it easier to have some work done in another thread. I think the author should have mentioned the importance of higher-order functions in that argument.

My favorite introduction to higher-order functions is "Why functional programming matters", I won't try to present a bad replica here.

So using closures doesn't give you parallelism for free if you're going to execute closures in for loops, e.g.

for (int i = 0; i < numElements; i++) {
  result[i] = closure(inputs[i], i);
}

because the language can't tell if closure(a, b) somehow changes other values in the result or inputs arrays. But languages with higher-order functions like map specify that the function passed to map shouldn't look at or change other values in the inputs, and prevent it from affecting other results. So, code like the following, which is common in functional languages, can be parallelized for you, without you needing to create a pool of worker threads and hand off the closure to them:

results = map(closure, inputs, [0..numElements-1]);

In these languages, closures take away the pain of declaring a new function somewhere for short pieces of code. That makes it more fun to use higher-order functions.

The following Haskell code defines a function f that takes a list of numbers and returns a list where each input i is replaced with 2i+1. By saving you the hassle of declaring a named helper function to compute 2i+1, this is one line of code instead of two.

f nums = map (\i -> 2*i+1) nums
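The same idea written in JavaScript, where the closure passed to map is likewise defined inline:

```javascript
// Replace each input i with 2i+1, passing the closure straight to map.
const f = nums => nums.map(i => 2 * i + 1);

console.log(f([1, 2, 3])); // [3, 5, 7]
```

Because the mapped function looks only at its own element, nothing in this code would stop a runtime from evaluating the elements in parallel.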

Again, see "Why functional programming matters" for strong arguments as to how this scales up to real code bases.

Harold L
But still the bottom line is that closures are useful because they are closures (short and sweet) but not because they inherently offer some advantages when doing concurrent programming. There is no connection between closures and concurrency.
eljenso
A: 

I think the general idea is that "closures" in the language make it easier to move code around, and thus make it easier to move that code to a different thread. In C#, the compiler writes the hidden class and rewrites local variable references to refer to member data in that class. Sure, you can do this all on your own, but it's just messy.

Having such features in the language by no means solves all the multi-threading problems; but it does make one aspect easier.

Dan
+1  A: 

This quote from the article packs quite a few misunderstandings into one sentence fragment:

[...] they allow you to create units of work, which each have their own copy of the stack, and don’t step on each others toes as a result.

This is not generally true of languages with closures.

Firstly, more often they have references to the stack, not copies, for the sake of efficiency. And in the majority of languages, you can modify things through a reference. So in those languages, this feature does not provide a way of isolating units of work at all. If anything, it can make it more confusing.
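To see the point concretely: in JavaScript, two closures over the same variable share a single binding, so one can step on the other's toes.

```javascript
// Two closures capture the same `shared` binding by reference,
// not a private copy each -- mutations by one are visible to the other.
function makeSharedPair() {
  let shared = 0;
  const bump = () => { shared += 1; }; // writer
  const read = () => shared;           // reader
  return [bump, read];
}

const [bump, read] = makeSharedPair();
bump();
bump();
console.log(read()); // 2 -- the reader sees the writer's mutations
```

Hand `bump` and `read` to two different threads (or, in JS, two interleaved callbacks) and you have shared mutable state, not isolated units of work.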

Secondly, in most (sane) languages you can refer to anything that is lexically enclosing a local function. You cannot refer to just anything anywhere on the stack. For example, you cannot dig into the local variables of the function that called the function that called the function... etc. You can only access the variables/parameters of functions declared locally, whose text encloses the text in which the usage occurs. Also local variables and parameters ("on the stack") are not the only things that may be lexically enclosing a function. So "the stack" is the wrong concept to invoke here.

Java is one example of a language that can only take copies of local variables and parameters in its "anonymous inner class" closures. However, it will still take a reference in the case of the this reference of the outer class.

And in the case of closing over this, the inner class will now store an implicit reference to the outer class - nothing to do with the stack, really.

The situation is similar in C#, except local variables and parameters are captured by reference instead of being copied.

var counter = 0;
Repeat(10, () => counter++);

Suppose Repeat is a library function that starts another thread and will now call the Action lambda passed to it every 10 milliseconds. You can hopefully see how this is a very succinct way to create race conditions!
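JavaScript captures variables the same way, by reference to the binding, which is the root of the classic loop-variable pitfall — a sequential stand-in for the race above:

```javascript
// With `var` there is one binding of i for the whole loop,
// so every closure sees the final value.
const callbacks = [];
for (var i = 0; i < 3; i++) {
  callbacks.push(() => i);
}
console.log(callbacks.map(cb => cb())); // [3, 3, 3], not [0, 1, 2]

// `let` creates a fresh binding per iteration, so each closure
// captures its own value -- the "copy of the stack" behaviour.
const fixed = [];
for (let j = 0; j < 3; j++) {
  fixed.push(() => j);
}
console.log(fixed.map(cb => cb())); // [0, 1, 2]
```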

The only kind of language that would avoid this problem would be a pure functional language such as Haskell, but that would not be due to closures - clearly - but instead due to the fact that you can never modify any shared state. (And Haskell still wouldn't avoid the problem entirely; most real software has to interact with shared state external to the program at some point, and Haskell's standard library has a way of doing that).

Daniel Earwicker