Hi all,

I wonder if there is any reliable performance comparison between "modern" multithreading-specialized languages such as Scala and "classic" "lower-level" languages like C, C++ and Fortran using parallel libraries like MPI, POSIX threads or even OpenMP.

Any links and suggestions welcome.

A: 

I'd view such comparisons as a fraction. The numerator is a constant (around 0.00001, I believe). The denominator is the number of threads multiplied by the number of logical processors.

IOW, for a single thread on a single logical processor, the comparison has about one chance in 100,000 of meaning something. For a quad-core processor running an application with (say) 16 threads, you're down to about one chance in 6.4 million of a meaningful result.

In short, there are undoubtedly quite a few people working on it, but the chances of any of them producing even a single result that's useful and meaningful are still extremely low. Worse, even if one of those results really did mean something, it would be almost impossible to find, and even more difficult to verify to the point that you actually knew it meant something.

Jerry Coffin
*0.00001* Where do you come up with that number?
Paul Nathan
@Paul: Like 97.4% of all statistics, I made it up off the top of my head. Despite this, I'd challenge anybody to try to come up with a number that's more accurate. Regardless of the exact value, it starts out small, and adding more threads and processors reduces it quickly.
Jerry Coffin
So *you invented an answer*? Really? That's piss-poor practice.
Paul Nathan
@Paul: It's called injecting the most minute bit of humor into the answer -- in a way that anybody but a complete ass would find quite easy to recognize at that. Far from being poor practice, I take rather some pride in the degree of inventiveness shown in many of my answers. Anybody who can't (and doesn't) invent answers on a regular basis belongs somewhere else -- the very essence of programming is inventing answers.
Jerry Coffin
Well, Jerry, either you invent correct answers or you don't. You didn't. You pulled an answer out of thin air and made as if it was true. There was no humor implied or given.
Paul Nathan
@Paul: Even for the thoroughly humor impaired such as you appear to be, saying "0.00001, I believe" should be sufficient to indicate that it's barely possible that the number wasn't necessarily a solidly proven fact.
Jerry Coffin
@Jerry: +1 for the "injecting the most minute bit of humor" comment alone.
Billy ONeal
@Billy: It seemed to me that this was an "ASCII a silly question, get a silly ANSI" kind of situation...
Jerry Coffin
Possibly amusing, but it's also wrong. There are plenty of good ways to benchmark performance, and the MPI folks do it routinely, from latency measurements to efficiency with increasing number of processors to breakdowns of communication vs. computation for well-understood algorithms. It's not an easy question to answer, but questions of this sort are answerable and very worth knowing the answer to if you're planning on burning a million hours of CPU time.
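The "efficiency with increasing number of processors" measurement mentioned above can be made concrete with a small sketch. It uses the usual definitions of speedup, S(p) = T(1) / T(p), and parallel efficiency, E(p) = S(p) / p. The class name `ScalingSketch`, its methods, and the timings in `main` are hypothetical placeholders for illustration. They are not measurements and not part of MPI or any benchmark suite.

```java
import java.util.Locale;

// Hypothetical helper for illustration only.
final class ScalingSketch {
    // Speedup S(p) = T(1) / T(p): how much faster the run on p processors is
    // than the run on one processor.
    static double speedup(double t1Seconds, double tpSeconds) {
        return t1Seconds / tpSeconds;
    }

    // Parallel efficiency E(p) = S(p) / p: 1.0 would mean perfect scaling.
    static double efficiency(double t1Seconds, double tpSeconds, int processors) {
        return speedup(t1Seconds, tpSeconds) / processors;
    }

    public static void main(String[] args) {
        // Placeholder wall-clock times in seconds, NOT real measurements.
        int[] processors = {1, 2, 4, 8};
        double[] seconds = {100.0, 52.0, 28.0, 17.0};
        for (int i = 0; i < processors.length; i++) {
            System.out.printf(Locale.ROOT, "p=%d speedup=%.2f efficiency=%.2f%n",
                    processors[i],
                    speedup(seconds[0], seconds[i]),
                    efficiency(seconds[0], seconds[i], processors[i]));
        }
    }
}
```

Feeding in real timings for the same algorithm on each platform gives one efficiency curve per implementation, which is the kind of ballpark comparison being argued about here.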
Rex Kerr
@Rex: Yes, there are lots of ways to benchmark performance. The types of questions you've listed can even be answered with such benchmarks -- but his question is different from those, and isn't answerable. You can't compare Scala to Fortran -- at most, you can compare one particular implementation to another particular implementation. A meaningful answer requires a question that's well-defined and generally quite narrow. His is poorly defined and *extremely* broad, and lacks any meaningful answer.
Jerry Coffin
@Jerry: If you can in principle answer the appropriate pieces that make up an answer (e.g. the ones I listed), the whole question has a meaningful answer. "I don't know where to find the right numbers" is a different answer than "Everything even close to the topic is irrelevant."
Rex Kerr
@Rex: That's the problem: you can't answer (even in principle) the pieces that make up the answer. If (for example) you find that implementation X (using Scala) is 10% faster than implementation Y (using Fortran), you can only **speculate** that the difference you're seeing really has anything to do with Scala vs. Fortran. The next week, Y might be upgraded to be 20% faster than X -- but whether it is or not, it still tells you nothing about Scala vs. Fortran.
Jerry Coffin
@Jerry: It's not that mysterious. You code the same algorithm in both languages. You profile each, compare bottlenecks. Maybe you twiddle one or the other to get better performance. Either way, if you're competent, you'll get good ballpark numbers. Then adjust the task to vary just e.g. the communications volume. Measure again.
Rex Kerr
@Rex: At least to me it's mysterious how you think this constitutes a test of the language instead of a test of one specific implementation of the language.
Jerry Coffin
@Jerry: Care to write another version of Java + JVM from scratch? No? Then I think the specific implementation is important. You don't get vastly improved performance from a new C++ every few weeks, you know. These things are pretty stable.
Rex Kerr
@Rex: If there's any logic in your comment, I must be failing to grasp it. How does the fact that I'd consider it unethical to loose yet another Java compiler on the world change the fact that what you test is an implementation, not a language?
Jerry Coffin
@Jerry: The distinction between a language and the specific implementation you have access to is immaterial if you can't get access to any other implementation.
Rex Kerr
@Rex: Access to only a single implementation might restrict your interest to that implementation, but is completely immaterial to the distinction between the language and the implementation.
Jerry Coffin
@Jerry: If you assume that the best available implementation is really pathetic, I agree with you--it tells you very little. I think we'll just have to disagree on this one.
Rex Kerr
+1  A: 

Here's another non-answer: go to your local supercomputer centre and ask what fraction of the CPU load is used by each language you are interested in. This will only give you a proxy answer to your question: it will tell you which languages the people who care about high performance on such machines actually use for the kinds of problems they tackle. But it's as instructive as any other answer you are likely to get for such a broad question.

PS The answer will be that Fortran, C and C++ consume well in excess of 95% of the CPU cycles.

High Performance Mark
+3  A: 

Given that Java, and therefore Scala, can call external libraries, and given that those highly specialized external libraries will do most of the work, the performance will be much the same as long as the same libraries are used.

Other than that, any such comparison is essentially meaningless. Scala code runs on a virtual machine which has run-time optimization. That optimization can push long-running programs towards greater performance than programs compiled with those other languages -- or not. It depends on the specific program written in each language.
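To illustrate the run-time optimization point, here is a minimal sketch in Java (Scala runs on the same virtual machine). It times the same workload repeatedly inside one process. On a JVM with a just-in-time compiler the early rounds are often slower than the later ones, but whether and how much depends on the JVM, its settings, and the program, so no particular output is claimed. The class `WarmupSketch` and its workload are hypothetical and chosen only for illustration.

```java
// Hypothetical sketch: repeated timing of one workload within a single JVM run.
final class WarmupSketch {
    // Written to after each round so the workload's result is actually used
    // and the computation is less likely to be optimized away.
    static volatile long sink;

    // A small CPU-bound workload: sum of (i * i) mod 7 for i in [0, n).
    static long work(int n) {
        long acc = 0;
        for (int i = 0; i < n; i++) {
            acc += ((long) i * i) % 7;
        }
        return acc;
    }

    // Runs the workload `rounds` times and returns the elapsed
    // nanoseconds of each round, in order.
    static long[] timeRounds(int rounds, int n) {
        long[] nanos = new long[rounds];
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            sink = work(n);
            nanos[r] = System.nanoTime() - start;
        }
        return nanos;
    }

    public static void main(String[] args) {
        long[] nanos = timeRounds(10, 5_000_000);
        for (int r = 0; r < nanos.length; r++) {
            System.out.println("round " + r + ": " + (nanos[r] / 1_000) + " us");
        }
    }
}
```

A benchmark that reports only the first round and one that reports only the last can give different pictures of the same code, which is one reason a single number comparing two languages says so little.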

Daniel