views: 478
answers: 9
NOTE: Already made this a Wiki. I don't care what this question is tagged as, as long as there is a good discussion.

I've heard that since in pure functional programs there are no side effects and values don't mutate, it makes it easier for the compiler to make more runtime optimizations. To what extent is this true?

If this is true, my next concern is what loss of freedom we are trading for it. I mean, in languages like C/C++ the developer is totally in control and can tweak a lot of things; if we give that job away to the compiler, we lose that opportunity. The bright side is that even a non-expert programmer can write good code. Furthermore, these days, with so many layers of cache in the machine architecture, maybe even an expert cannot really do anything worthwhile by hand. So delegating this job to a compiler that knows more about the underlying architecture than the programmer does is a good idea.

What are your suggestions?

+10  A: 

Have you seen this "Good Math, Bad Math" blog post?

I program in C, assembly and OCaml. I sometimes look at the assembly generated by the C compiler, and it is dreadfully inefficient. Often I notice that a global variable is re-loaded again and again, when it's clear (to someone aware of the logic of the program) that it doesn't change during the execution of the function; but because the function manipulates pointers, the compiler cannot assume that the variable isn't touched. One simple way to fix this when you notice it is to copy the contents of the global into a local at the beginning of the function; that way you keep the portability of C but get assembly-like performance...
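
To make that concrete, here is a small, entirely made-up C sketch of the situation described above (the names are invented purely for illustration):

int threshold;                      /* global that the loop keeps reading */

void clamp(int *a, int n)
{
    for (int i = 0; i < n; i++)
        if (a[i] > threshold)       /* the store below might alias the global,
                                       so the compiler re-loads threshold here */
            a[i] = threshold;
}

void clamp_fast(int *a, int n)
{
    int t = threshold;              /* copy the global into a local once */
    for (int i = 0; i < n; i++)
        if (a[i] > t)               /* t provably cannot change in this loop */
            a[i] = t;
}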

But you shouldn't have to do this, and indeed, with a higher-level language you don't have to, because there are fewer pointers, casts and other low-level constructs that let you get close to the hardware when you want to, but actually get in the way of automatic optimization.

Pascal Cuoq
I think you are missing the point: he is talking about functional programs with no side effects. There the compiler knows that it is impossible for the variable to change and will therefore optimize for that. I'm certain that doesn't apply to C (unless you use const?)
henrikh
@henrikh Hmm, what? He is talking about OCaml. Search for "OCaml" in the linked web page (pay attention to the benchmark). He doesn't say a word about functional languages with no side effects; he is talking about OCaml, Fortran and Common Lisp as enabling more optimizations than C. If you think that purity is what enables optimizations, with all due respect, I think you have missed the post's point (and mine).
Pascal Cuoq
I'm talking about ajay who asked the question about side-effects. But I misread your answer, sorry. I completely agree, more control is not the same as better performance.
henrikh
+4  A: 

Check out the Haskell vs C comparison here:

http://www.haskell.org/haskellwiki/Introduction#Quicksort_in_Haskell

You will note that the Haskell version is much shorter. However, the page goes on to say that while the C version is arguably more complex, it sorts in place, whereas the Haskell version must allocate much more memory to work.

So, if you want the ability to do extreme fine-tuning of the performance, the C algorithm is the better approach.
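
For illustration, here is a rough sketch of the kind of in-place C quicksort that page describes (a generic reconstruction, not the exact code from the wiki):

static void swap(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Sorts a[lo..hi] in place; no allocation beyond the call stack. */
void quicksort(int a[], int lo, int hi)
{
    if (lo >= hi)
        return;
    int pivot = a[hi], i = lo;
    for (int j = lo; j < hi; j++)
        if (a[j] < pivot)
            swap(&a[i++], &a[j]);
    swap(&a[i], &a[hi]);             /* pivot ends up at index i */
    quicksort(a, lo, i - 1);
    quicksort(a, i + 1, hi);
}

Every element is moved within the single array the caller passed in, which is exactly the fine-grained control over memory that the shorter Haskell version gives up.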

Robert Harvey
+5  A: 

Just to be clear: just because a compiler for a functional language can optimize better doesn't mean that it actually does. Any crazy optimization, by definition, leads to unexpected behavior at runtime, which almost certainly leads to bugs. So, to put a big misconception to rest: in theory Haskell programs can be parallelized 'for free' by the compiler, but in reality they aren't, nor should they be.

I think the days of compiler optimizations being king are over. When clock speed was the main bottleneck for performance, optimizations like loop unrolling were essential. However, for the vast majority of applications the problem isn't raw CPU speed but other factors such as network bandwidth, disk I/O, and the speed of the algorithm you are using.

To my knowledge there is no compiler optimization that will reimplement your code to use a parallel design pattern, for example. In other words, if you are using bubble sort with every compiler optimization you can think of and I'm using quicksort, the smart money is on the quicksort implementation in most situations.

There is a lot of value to functional programming, but "because they run faster" shouldn't be a consideration.

Chris Smith
Your answer makes a big deal of parallelization and I'm not sure why because the topic was not mentioned at all in the question. Higher-level (not only functional) languages also have more potential for being automatically vectorized, for instance.
Pascal Cuoq
Parallelism is an important factor in optimizing for modern CPUs, which tend to have more and more cores.
Peter Recore
@Peter Really?! My point was that Chris's message seemed to be that a high-level language may not parallelize programs automatically. Well, true, but no widespread language does this at all (and certainly not C). So we can consider it a draw for automatic parallelization and look at other criteria. Actually, some compilers already create and join threads automatically; they are just not widespread yet -- and they are for high-level languages, not for C.
Pascal Cuoq
A: 

As a general rule, compiler optimization makes a difference only in code that:

  1. The compiler actually sees. If most of the time the bottom of the call stack is inside code that the compiler never sees, it can optimize all it wants, but it will have no effect.

  2. Consumes a significant fraction (say 10% or more) of the run time during which the user or others are actually waiting for the program. If nobody's waiting for it, the optimization will never be noticed.

  3. Contains no function calls. If the code being compiled contains function calls, the program counter is almost always elsewhere.

So you'll never notice it unless you write heavy number-crunching algorithms that work on lots of data without calling out to code you don't compile. The minute you put a string-compare into a sort algorithm, for example, optimization becomes academic.
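
As a small, made-up C illustration of that last point -- once the work happens inside library code such as qsort and strcmp, which the optimizer compiling your file never sees, tuning your own code buys almost nothing:

#include <stdlib.h>
#include <string.h>

/* Comparator for qsort: nearly all of the run time goes into strcmp. */
static int by_name(const void *a, const void *b)
{
    return strcmp(*(char *const *)a, *(char *const *)b);
}

void sort_names(char **names, size_t n)
{
    /* qsort lives in libc, so the compiler never sees its inner loop;
       the program counter is almost always there or inside strcmp. */
    qsort(names, n, sizeof *names, by_name);
}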

Mike Dunlavey
+3  A: 

I think you should be talking about programming models rather than compiler optimizations. Imperative/procedural and functional each have areas where they shine. I will give you two examples:

Imperative - procedural

Erlang (a functional language) took the approach of shared state with its built-in Mnesia database -- this is inherently non-functional, since an update transaction locks a database resource, gets a value, changes it, and writes it back. They did this for performance, because they recognized that speed is important in this area and that without it Erlang would be useless for the problems it was trying to solve (can you imagine writing an entire database file out each time you make a change?) =D.

Pure Functional

Functional vs. non-functional in terms of the performance allowed by the model is a funny subject area. Taking the functional viewpoint: Erlang can saturate the cores the machine is running on when problems are inherently concurrency-oriented. Some tests showed the YAWS web server handling 100k connections where Apache fell over at 4k :D Ejabberd can also handle MUCH more load than traditional message switches. What I am trying to say is that a pure functional language can have a runtime engine that massively parallelizes the application across a large number of cores. You can't easily do this with imperative procedural code.

Different models are suitable for different problems. I personally think you can optimize procedural code much better than functional code; functional code still ends up executing as procedural code, and it is often interpreted :P I have studied compiler theory, and I honestly can't think of any mind-blowing code transformations I would apply to functional code before executing it. You have tail recursion and lazy evaluation, but surely those are features of the programming model.

Hassan Syed
+5  A: 

"there are no side effects and values don't mutate, it makes it easier for the compiler to make more runtime optimizations"

In a sense, yes - until you realize that side-effects and mutable values are optimizations themselves.

ima
A sufficiently smart compiler could potentially deduce that a return-an-altered-copy function is being called such that the original value will never be used again (due to going out of scope, for instance), and mutate the value as an optimization. The converse, deducing when a potentially mutable value definitely won't change, is *much* harder and likely to be intractable for all but very simple programs.
camccann
Wrong. I won't even go into details; just take the classic "immutable quicksort", pretend to be a compiler, and try to optimize it into something even remotely close to an efficient quicksort implementation. And if you come back and say it was easy, we'll know you cheated.
ima
By "classic immutable quicksort" do you mean the version operating on a list instead of an indexed array? That's really a different algorithm entirely; you might as well ask the compiler to optimize a bubble sort into a quicksort. If you instead take a language with immutable arrays and a "swap" function that returns an array identical to an existing one except with two elements exchanged, a straightforward translation of the in-place quicksort could be easily demonstrated as allowing safe mutating optimizations.
camccann
Just like I expected, you are trying to cheat. What you are saying is that if you write an imperative program and then translate it to a functional language in the most straightforward way, it would be easy to optimize it back to imperative. Correct, but not really relevant.
ima
Okay, so writing functionally pure code using immutable data structures, in a way that the compiler can optimize, is cheating and not relevant, even though you can do so without losing any of the benefits of functional purity and immutability? Is that actually, seriously what you're claiming?
camccann
"without losing any of the benefits" parts is where cheating kicks in. Well, it's clear now that you are pretty good at fooling yourself, so I'm out.
ima
I'd rather not be "fooling myself" if I can avoid it, and I'd like to give your argument proper consideration, but it's hard to understand your point when you opt to insult me instead of actually explaining it. If you can't spare the time to elaborate that's fine, but please just say so...
camccann
@ima: Unless your contention is that functional programs cannot use arrays, I don't get what you're trying to say either.
Chuck
I am saying that the brevity and power of functional programming only come into play when you use a specific set of data structures, algorithms and code organization, which lack performance compared to those enabled by imperative languages. A functional program can easily be optimized to imperative levels of performance only when it was written with this optimization in mind -- and that loses all the brevity and power and produces disguised imperative code with clumsy syntax.
ima
OK, that was my last attempt; I'm not really trying to win anyone over.
ima
There are, in fact, languages that can see that a value is no longer being pointed to and, instead of treating it as copy-on-write, will modify the original value and return that as the "new" value. An example of this is Tcl. I can only assume the same can be (and likely is somewhere) done for functional languages.
RHSeeger
A: 

When it comes to C or C++, there's a lot of compiler optimization involved, and the compiler might decide to throw away just about any of the programmer's nice little ideas, since it can determine much better which optimization to use at a specific place (such as inlining) and takes into account things like instruction pipelines. Hardware architecture has become far too complex for people to program the hardware directly; nowadays even simple compilers employ significant amounts of optimization. Using funky tricks makes code less readable and harder for the compiler to optimize, because manual optimization reduces the semantic information available to it. So even if you write C or C++, your choice is either to shoot yourself in the foot or to let the compiler do the optimization. That's what most of the 9,513,227 lines of code in GCC are about.

I think it is very important for software to be verifiable, robust, maintainable, flexible and reusable/extensible. The more expressive and concise a language is, the easier it is for developers to meet these criteria.

There's one last, yet important, criterion for software: efficiency. The important question is how you measure efficiency; I would measure it in raw speed and scalability. These two are not mutually exclusive, but with a language that is tightly coupled to concepts of parallel/distributed computing and has language constructs designed for that purpose, it is much easier to develop such algorithms. Of course the thing would run faster if you did all the dispatching yourself, but that's error-prone and time-consuming.

You should also consider that compiler optimization has only a linear effect: if your algorithms or your system design don't scale well, it won't solve the problem for you. If they do scale well, you can solve performance issues by throwing in more hardware, which is always cheaper than development time.

In the end, the choice you make is either to reinvent the wheel so you have the optimal wheel for the road you want to take, or to take an existing wheel and care more about choosing the right road than about the speed at which you go.

If you write software, the only important decision to take in advance is design. Then you implement it, and once it is implemented you will be able to determine the actual bottlenecks and optimize there. Usually more than 90% of your code runs less than 10% of the time; optimization is only important for the remaining 10%. To determine where that is, you need working software that you can profile -- and you need it quickly, so you can reconsider your design before writing something that is super optimized but crap otherwise.

back2dos
+1  A: 

This is just an anecdotal contribution:

I taught myself (some) Clojure using the math problems from Project Euler. I solved some problems using lazy sequences.

I found that, when I applied type hints, performance for long calculations (where startup doesn't matter) was usually comparable with hand-tuned Java.

This is almost to be expected: Clojure compiles ahead-of-time to JVM bytecode, and that runs with the same optimizing runtime as Java code.

What amazed me for a while was that some calculations were occupying multiple cores, although I hadn't (knowingly) programmed in any parallelization. It seems that the runtime is able to take the opportunity to parallelize lazy sequence generation and consumption.

My conclusion: Clean functional programming need not be slower than imperative programming, and may sometimes find unexpected avenues of optimization or parallelization.

As someone who hates to think about concurrency because it makes my brain hurt, I consider "automatic parallelization" a priceless benefit!

Carl Smotricz
@Carl, can you tell me which particular Project Euler problem that was? I would also like to try it out and see it for myself. And yes, Project Euler seems to be a good place to practice Clojure too.
ajay
Sorry, it's been a while. But I only did 25 or so problems, starting from 1. I think it involved prime numbers, but I'm not sure.
Carl Smotricz
+1  A: 

Here's an example in Clojure -- iterating over a list of things using some function.

(map some-fn big-list)

What's cool is that you can parallelize it (multiple cores, not multiple machines) with one character:

(pmap some-fn big-list)

The language/compiler machinery that makes pmap work would not be possible without pure functions and immutable data structures.

This is still an experimental feature. But functional, side-effect-free programming does make multi-core execution much easier.

Stuart Sierra