views:

233

answers:

4

I hear about the manifold increase in productivity, while using certain languages (RoR). I have also heard about some VMs being more optimal than others (GHC?). Yet others are trying to optimize their language of choice by improving the underlying architecture (Unladen Swallow)

However, while reading a paper ("SSA is functional programming"), I had a question on whether a particular language, by the virtue of its syntax , can (someday) be the language with the best performance.

I guess what I am asking is that, whether a particular syntax, is THEORETICALLY the best geared one to produce the best machine code. I would be very interested in the underlying theory for any opinions - I was discussing this with some friends and we were knocking about ideas about the information content of a particular syntax.

Please do note I am talking about languages which have atleast first class functions - no ASM please.

+3  A: 

This is highly subjective

The syntax of a language is just a method of expressing desired semantics. It is the semantics that drive performance. The 'performance implications of syntax' is equal to the performance implications of semantics given that past syntax analysis the syntax is often irrelevant.

The performance implication of semantics comes down to the environment that those semantics are being run in. This is why we have a CPU and a GPU, because they each can perform the semantics of a given low-level language quicker.

There is not really an answer to this without explicitly stating the target environment. A cluster of machines will deal better with concurrent programs, and there are syntax that express concurrency such as Erlang.

What you should be focusing on perhaps is how a generalized virtual machine or environment can provide the best performance for a wide range of semantics. For example, if you ported the Erlang syntax to JVM, could the virtual machine recognize that the language was single-assignment and concurrent without the requirement of locks? Could it optimize for this? Stackless virtual-machines are a good example of attempts to make efficient a generalized environment dependent on required semantics.

Really the question is: can the environment be optimized for a class or constrained set of semantics, when the environment is by definition, general?

I would recommend learning a bit about compilers (and where syntax stops mattering) and then look at something like LLVM, then re-ask yourself the question. As to if function languages are more suited to performance depends on the environment the translation is being executed in (multi-core, distributed, small-embedded device).

Aiden Bell
I do not agree entirely, but I can see your point. For example, take a look at this paper (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.45.4503 ) which restricts the syntax of the language - so that certain optimizations can be applied. I completely understand your emphasis on semantics, but I am not worried about implementing syntax-A on virtualmachine-B. Let us say there was one holy VM and you had syntax-A, syntax-B... syntax-N. Which syntax could theoretically push the underlying VM to its peak?As to environment - I am seeking single core, desktop applications.
Sandeep
I think you might be missing my point. Image for a second that the CPU (x86) is a holy-vm, which language can be compiled into the most efficient machine code? Now imagine that the holy-vm is a Cell processor, what now? Given that this 'holy VM' does not exist, and all environments are built to do certain things well, the point is moot imho. But, I am prepared to accept I might well be missing the point myself.
Aiden Bell
The paper you cited seems to be about efficient translation of a syntax into SSA form. In this case, it is about the best syntax for translation into this form. I could design an intermediate form based on a stackless machine, then think about what syntax would best suit this. Syntax and semantics are about context. There is no single answer unless you specify a target architecture and language purpose.
Aiden Bell
A: 

Well GHC at least compiles Haskell code into C and then compiles that, so it can't possibly be faster than the same algorithm written in C, right? At the very best it can be just as efficient, and that's only with a good optimizing compiler. The code generated from the Haskell->C conversion is very inflexible, big blocks of code repeated for each instruction so it may very well be worse.

Now this being said, I think a language's libraries are what dictate what the language is best suited for more than the actual language itself (assuming it has acceptable performance), so this isn't the very best question to ask.

Blindy
GHC only compiles via C on architectures for which a native code generator hasn't been written yet, or if you use the flag -fvia-C. http://www.haskell.org/ghc/docs/latest/html/users_guide/options-phases.html
Nathan Sanders
Turns out you're right but taken from the same page you linked, "-fasmUse GHC's native code generator rather than compiling via C. This will compile faster (up to twice as fast), but may produce code that is slightly slower than compiling via C. -fasm is the default.". The last part seems to imply that ghc is still slower, even when compiled natively. My initial point stands I think.
Blindy
Just because a compiler compiles to a language doesn't mean it can't be faster than the underlying language. For example, the higher compiler might generate C that is unfeasible for any human to generate. This is apart from higher level optimizations that just aren't visible in C.
MichaelGG
+3  A: 

From a blog entry about c vs. ocaml:

The Objective-Caml bytecode interpreter was faster than the carefully hand-optimized C program! Why? Because the OCaml compiler could recognize that the arrays were completely independent - it didn't need to worry about one iteration of the loop stepping on the values used by another. The C compiler couldn't apply a lot of useful optimizations, because it couldn't be sure that they were valid.

It's not so much about syntax, though.

The MYYN
@MYYN, I like this example.
Aiden Bell
Very nice example. But the C99 spec added a new keyword, `restrict`, which would allow a C compiler to assume that the arrays are completely independent, so that a C99 compiler could generate code equally fast or faster than the OCaml. A great explanation is here: http://cellperformance.beyond3d.com/articles/2006/05/demystifying-the-restrict-keyword.html
steveha
+2  A: 
Norman Ramsey