Instead of adding the PLINQ extension method AsParallel() manually, couldn't the compiler figure this out for us automagically? Are there any examples where you specifically do not want parallelization, even if the code supports it?
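
For reference, this is the kind of manual opt-in I mean; a minimal sketch assuming .NET 4's PLINQ, with a made-up range and filter just for illustration:

using System;
using System.Linq;

class Example
{
    static void Main()
    {
        var numbers = Enumerable.Range(1, 10000);

        // Sequential LINQ query.
        var sequential = numbers.Where(n => n % 3 == 0).Sum();

        // The same query, explicitly opted in to PLINQ via AsParallel().
        // The question is whether the compiler could insert this call itself.
        var parallel = numbers.AsParallel().Where(n => n % 3 == 0).Sum();

        Console.WriteLine(sequential + " " + parallel);
    }
}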

+1  A: 

I do research in the area of automatic parallelization. This is a very tricky topic that is the subject of many Ph.D. dissertations. Static code analysis and automatic parallelization has been done with great success for languages such as Fortran, but there are limits to the compiler's ability to analyze inter-regional dependencies. Most people would not be willing to sacrifice the guarantee of code correctness in exchange for potential parallel performance gains, and so the compiler must be quite conservative in where it inserts parallel markers.

Bottom line: yes, a compiler can parallelize code. But a human can often parallelize it better, and having the compiler figure out where to put the markers can be very, very, very tricky. There are dozens of research papers available on the subject, such as the background work for the Mitosis parallelizing compiler or the D-MPI work.

Borealid
I guess the trickiness of the problem depends somewhat on the language in use. Would it be easier to parallelize functional languages such as F# because of their inherent immutability?
TT
@TT, F# is a bad example, because it is not "inherently immutable"; rather, immutability is the default (and of course it can call out to other CLR languages seamlessly). That said, there are definitely language design considerations like mutability (or the lack thereof) that can help make "magical" parallelization easier.
Logan Capaldo
Do languages that are inherently immutable exist?
TT
@TT: Of course. Haskell, Miranda, SASL, KRC, Hope, Clean, CAL, ATL, Guru, Agda, Amanda, Curry, Pure and many others.
Jörg W Mittag
+2  A: 

Auto parallelization is trickier than it might initially appear. It's that "if the code supports it" part that gets you. Consider something like

counter = 0

f(i) 
{
   counter = counter + 1
   return i + counter
}

result = map(f, {1,2,3,4})

If the compiler just decides to parallelize map here, you could get different results on each run of the program. Granted, it is obvious that f doesn't actually support being used in this manner because it mutates a global variable. However, if f is in a different assembly, the compiler can't know that it is unsafe to parallelize. It could introspect the assembly f lives in at runtime and decide then, but then it becomes a question of whether the introspection and the dynamic parallelization are fast enough not to negate the benefits of doing this in the first place. And sometimes introspection may not even be possible: f could be a P/Invoked function that is actually perfectly suited to parallelization, but since the runtime/compiler has no way of knowing that, it has to assume it can't be parallelized. This is just the tip of the iceberg.

In short, it is possible, but difficult, so the trade-off between the cost of implementing the magic and the benefits it would deliver was likely judged to skew too far in the wrong direction.
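
To make that concrete in C# terms, here is a rough PLINQ rendering of the pseudocode above (just a sketch assuming .NET 4 or later; the array and names are for demonstration only):

using System;
using System.Linq;

class SharedStateExample
{
    static int counter = 0;            // shared, mutable state

    static int F(int i)
    {
        counter = counter + 1;         // unsynchronized read-modify-write
        return i + counter;
    }

    static void Main()
    {
        // Sequential: deterministic, always 2, 4, 6, 8.
        var sequential = new[] { 1, 2, 3, 4 }.Select(F).ToArray();

        counter = 0;

        // Parallel: F races on counter, so both the values and their order
        // can differ from run to run -- exactly why a compiler can't just
        // insert AsParallel() blindly.
        var parallel = new[] { 1, 2, 3, 4 }.AsParallel().Select(F).ToArray();

        Console.WriteLine(string.Join(", ", parallel));
    }
}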

Logan Capaldo
+2  A: 

The good news is that compiler researchers say we are really close to having automatically parallelizing compilers. The bad news is that they have been saying that for fifty years.

The problem with impure languages such as C# is usually that there is not enough parallelism. In an impure language, it is very hard for a human and pretty much impossible for a program to figure out if and how two pieces of code interact with each other or with the environment.

In a pure, referentially transparent language, you have the opposite problem: everything is parallelizable, but it usually doesn't make sense, because scheduling a thread takes far longer than simply executing the damn thing.

For example, in a pure, referentially transparent, functional language, if I have something like this:

if a <= b + c
  foo
else
  bar
end

I could fire up five threads and compute a, b, c, foo and bar in parallel, then compute the +, then the <=, and lastly the if, which simply means throwing away the result of either foo or bar, both of which have already been computed. (Note how this depends on being functional: in an impure language, you cannot simply compute both branches of an if and then throw one away. What if both print something? How would you "unprint" that?)

But if a, b, c, foo and bar are really cheap, then the overhead of those five threads will be far greater than the computations themselves.
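
Purely to illustrate the speculation idea in C# terms, here is a sketch using the Task Parallel Library (Task.Run, .NET 4.5+); the constant-returning methods stand in for a, b, c, foo and bar, and the trick is only safe because they are side-effect free:

using System;
using System.Threading.Tasks;

class SpeculativeIf
{
    // Stand-ins for a, b, c, foo and bar. They are pure (no side effects),
    // which is the only reason evaluating both branches up front is safe.
    static int A() { return 1; }
    static int B() { return 2; }
    static int C() { return 3; }
    static int Foo() { return 42; }
    static int Bar() { return -42; }

    static void Main()
    {
        // Fire off all five computations in parallel.
        var a = Task.Run(() => A());
        var b = Task.Run(() => B());
        var c = Task.Run(() => C());
        var foo = Task.Run(() => Foo());
        var bar = Task.Run(() => Bar());

        // Compute the +, then the <=, then the if -- which just means
        // throwing away whichever of foo/bar is not needed. For work this
        // cheap, scheduling five tasks costs far more than the work itself.
        int result = a.Result <= b.Result + c.Result ? foo.Result : bar.Result;
        Console.WriteLine(result);
    }
}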

Jörg W Mittag
Anyway, I hope there will be intelligent compilers taking care of these problems, because as a programmer I should only have to specify what should be done, not worry about how (I have enough other problems to worry about :)). There's a chance this will happen now that multi-core machines are more and more common.
TT