From time to time I read that Fortran is or can be faster than C for heavy calculations. Is that really true? I must admit that I hardly know Fortran, but the Fortran code I have seen so far did not show that the language has features that C doesn't have.

If it is true, please tell me why. Please don't tell me what languages or libs are good for number crunching, I don't intend to write an app or lib to do that, I'm just curious.

+3  A: 

Any speed differences between Fortran and C will be more a function of compiler optimizations and the underlying math library used by the particular compiler. There is nothing intrinsic to Fortran that would make it faster than C.

Anyway, a good programmer can write Fortran in any language.

Kluge
@Scott Clawson: you got -1'd and I don't know why. +1'd to remedy this. However, something to take into account is Fortran has been around longer than a lot of our parents have been. Lots of time spent optimizing compiler output :D
sixlettervariables
@sixlettervariables - good on you sir! Beat me to it. Sometimes SO is strange...
freespace
I agree. I had just posted a very similar answer in parallel.
Tall Jeff
Pointer alias issue in C has been raised by others, but there are several methods the programmer can use on modern compilers to deal with that, so I still agree.
Tall Jeff
+1  A: 

This is more than somewhat subjective, because it gets into the quality of compilers and such more than anything else. However, to more directly answer your question, speaking from a language/compiler standpoint there is nothing about Fortran over C that is going to make it inherently faster or better than C. If you are doing heavy math operations, it will come down to the quality of the compiler, the skill of the programmer in each language and the intrinsic math support libraries that support those operations to ultimately determine which is going to be faster for a given implementation.

EDIT: Other people such as @Nils have raised the good point about the difference in the use of pointers in C and the possibility of aliasing, which perhaps makes the most naive implementations slower in C. However, there are ways to deal with that in C99, via compiler optimization flags and/or in how the C is actually written. This is well covered in @Nils's answer and the comments that follow it.

Tall Jeff
It sounds like a benchmark test of an algorithm. Which takes less time, FORTRAN or C? Doesn't sound subjective to me. Perhaps I'm missing something.
S.Lott
Disagree. You are comparing the compilers, not the languages. I think the original question is whether there is anything about the LANGUAGE that makes it inherently better. Other answers here are getting into some of the subtle, questionable differences, but I think we mostly agree they are in the noise.
Tall Jeff
This isn't O(n) analysis of algorithms. It's performance. I don't see how performance can be a hypothetical, implementation-independent concept. Guess I'm missing something.
S.Lott
+54  A: 

The languages have similar feature sets. The performance difference comes from the fact that Fortran has different and stronger aliasing rules for memory pointers.

This allows the compiler to generate more efficient code. Take a look at this little example in C:

void transform (float *output, float * input, float * matrix, int n)
{
  int i;
  for (i=0; i<n; i++)
  {
    float x = input[i*2+0];
    float y = input[i*2+1];
    output[i*2+0] = matrix[0] * x + matrix[1] * y;
    output[i*2+1] = matrix[2] * x + matrix[3] * y;
  }
}

This function would run slower than the Fortran counterpart. Why so? If you write values into the output array, you may change the values of matrix. After all, the pointers could overlap and point to the same chunk of memory. The C compiler is forced to reload the four matrix values from memory for all computations.

In Fortran the compiler will load the matrix values once and store them in registers. It can do so because Fortran's rules let the compiler assume that array arguments do not overlap in memory.

Fortunately the restrict keyword has been introduced in the C99 standard to solve that problem. It's well supported in most C++ compilers these days as well (as an extension, typically spelled __restrict). That keyword allows you to promise the compiler that a pointer does not alias any other pointer.

If you use it you will get the same speed from C and Fortran.
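As a minimal sketch of what that might look like, here is the same loop with every pointer restrict-qualified. The name transform_restrict and the added const qualifiers are illustrative choices, not part of the original example.

```c
/* Same loop as in the example above, but each pointer is
   restrict-qualified (C99). The caller promises that the three
   arrays do not overlap; breaking that promise is undefined behavior. */
void transform_restrict(float *restrict output,
                        const float *restrict input,
                        const float *restrict matrix,
                        int n)
{
  int i;
  for (i = 0; i < n; i++)
  {
    float x = input[i*2+0];
    float y = input[i*2+1];
    /* The compiler may now keep matrix[0..3] in registers
       across iterations instead of reloading them. */
    output[i*2+0] = matrix[0] * x + matrix[1] * y;
    output[i*2+1] = matrix[2] * x + matrix[3] * y;
  }
}
```

Calling it with an output buffer that overlaps input or matrix is exactly the case the keyword rules out, so in-place use would need the unqualified version.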

Nils Pipenbrinck
Aside from the restrict keyword, most compilers have had command line optimization flags to instruct the compiler to not worry about aliasing. Since doing actual aliasing is pretty rare, I consider that level of optimization the starting point default.
Tall Jeff
Additionally, without the flag or the new keyword, in C just assign pointer based operations through an automatic variable. This hints to the compiler that aliasing can be ignored and while it looks like more code, you'll actually end up with the fastest possible result through the optimizer
Tall Jeff
"rare" may or may not be a good reason to have flaky nondeterministic bugs. I've seen it in cases where a function is used to process data either in-place or out-of-place.
Jamie
All true and valid, jeff. However, I don't consider the "assume no aliasing"-switch safe. It can break code inherited from other projects in so subtle ways that I'd rather not use it. I've become a restrict-nazi for that reason :-)
Nils Pipenbrinck
Other team members that aren't yet that experienced are a risk as well if you force no aliasing. They may write perfectly fine code that does rely on aliasing to work.
Nils Pipenbrinck
Fortran is also able to automatically split calculations like this over multiple cores/CPUs because it knows that the parts of an array cannot overlap.
Martin Beckett
@mgb: with the appropriate markup in place to enable these features.
sixlettervariables
To my second point, you don't have to use the no alias compiler switch. Just write the code so that the pointer based loads are assigned into an automatic variable first and then work with the automatics from there. It will look more verbose, but it will optimize down perfectly by the compiler.
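A rough sketch of that style, applied to the transform example from the answer (the name transform_locals is made up for illustration): the four coefficients are copied into automatic variables before the loop, so stores through output cannot change them.

```c
/* Variant of the transform loop that reads the matrix once into
   automatic variables. Locals whose address is never taken cannot be
   modified by stores through output, so no reload is needed. */
void transform_locals(float *output, const float *input,
                      const float *matrix, int n)
{
  const float m0 = matrix[0], m1 = matrix[1];
  const float m2 = matrix[2], m3 = matrix[3];
  int i;
  for (i = 0; i < n; i++)
  {
    float x = input[i*2+0];
    float y = input[i*2+1];
    output[i*2+0] = m0 * x + m1 * y;
    output[i*2+1] = m2 * x + m3 * y;
  }
}
```

Note that this is not strictly equivalent to the original: if output really did overlap matrix, the original would see the updated coefficients and this version would not.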
Tall Jeff
Jeff, I agree on the auto variables and I think it's also good practice to write code that way. It not only hints the compiler, it makes clear that you don't assume aliasing as well. Nice to have if you have to touch the code a year after writing.
Nils Pipenbrinck
I love these kinds of discussions - most people (in general) don't appreciate the subtle issues on this kind of stuff. Good exchange of thoughts - Thanks!
Tall Jeff
A good example is the mere existence of memcpy() vs. memmove(). Unlike memcpy(), memmove() copes with overlapping areas, therefore memcpy() can be faster than memmove(). This issue was sufficient reason for somebody to include two functions instead of one in the standard library.
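To make the distinction concrete, a small sketch (the helper names shift_right_by_one and copy_disjoint are hypothetical): the first copies within a single buffer, so the areas overlap and memmove() is required; the second copies between separate buffers, where memcpy() is allowed.

```c
#include <stddef.h>
#include <string.h>

/* Copy buf[0..n-1] to buf[1..n]. Source and destination overlap,
   so memmove() is required; memcpy() here would be undefined behavior.
   buf must have room for at least n + 1 bytes. */
void shift_right_by_one(char *buf, size_t n)
{
  memmove(buf + 1, buf, n);
}

/* Copy between two buffers the caller knows to be distinct,
   which is the situation memcpy() is specified for. */
void copy_disjoint(char *dst, const char *src, size_t n)
{
  memcpy(dst, src, n);
}
```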
J.F. Sebastian
Memcpy() vs. Memmove() is NOT a good example really. Different algorithms being used. memcpy() is a simpler (trivial) algorithm BECAUSE by definition it assumes the areas do not overlap. memmove() has to do more checks and a more complicated order of copy ops BECAUSE the areas MAY overlap.
Tall Jeff
I think that was Sebastian's point - because of the complexity/flexibility of C memory handling, even something as simple as moving memory is tricky.
Martin Beckett
My point is that to copy an array in C one can use either memcpy() or memmove(). The choice depends on context. In Fortran one does not have to make such a choice, due to the stronger aliasing rules.
J.F. Sebastian
I'd like to see some benchmark evidence that Fortran is faster than C. Don't suppose you have any refs?
Mitch Wheat
You didn't mention the other thing in your C code that is slower: the exit condition in the C version has to be rechecked every iteration. In your example that could be optimized out, but it couldn't if there was something in the loop potentially modifying n (or i).
T.E.D.
ted, if the compiler can figure out how to put it outside the loop, it is allowed to do so, because of the as-if rule, whether or not the n is modified doesn't matter
Johannes Schaub - litb
Just to be contrary, let me point out that if routine A contains a loop, and routine B is called inside the loop, B probably takes most of the cycles. Optimizing the loop in A buys you very little.
Mike Dunlavey
As I recall, the EQUIVALENCE keyword allowed you to do some kinds of array overlaps. Would that have an effect on the aliasing in a Fortran program?
EvilTeach
A: 

I haven't heard that Fortran is significantly faster than C, but it is conceivable that in certain cases it would be faster. And the key is not in the language features that are present, but in those that are (usually) absent.

An example is C pointers. C pointers are used pretty much everywhere, but the problem with pointers is that the compiler usually can't tell if they're pointing to different parts of the same array.

For example if you wrote a strcpy routine that looked like this:

/* Return type added; the copy stops after the terminating '\0'. */
void strcpy(char *d, const char *s)
{
  while ((*d++ = *s++) != '\0')
    ;
}

The compiler has to work under the assumption that the d and s might be overlapping arrays. So it can't perform an optimization that would produce different results when the arrays overlap. As you'd expect, this considerably restricts the kind of optimizations that can be performed.
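To see why overlap matters, consider this sketch of a plain forward copy with an explicit length (copy_forward is a made-up name). If d points one byte past s, each store feeds the next load, so the first character gets smeared across the destination. A version that loaded several bytes at once before storing would give a different answer, which is why the compiler cannot make that change without knowing the arrays are disjoint.

```c
#include <stddef.h>

/* Forward copy, one byte at a time. When d and s overlap with
   d == s + 1, every iteration reads the byte written by the
   previous one. */
void copy_forward(char *d, const char *s, size_t n)
{
  size_t i;
  for (i = 0; i < n; i++)
    d[i] = s[i];
}
```

For example, with the buffer "abcde", calling copy_forward(buf + 1, buf, 4) leaves "aaaaa" in the buffer, not "aabcd".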

[I should note that C99 has a "restrict" keyword that explicitly tells the compiler that the pointers don't overlap. Also note that Fortran has pointers too, with semantics different from those of C, but they aren't ubiquitous as in C.]

But coming back to the C vs. Fortran issue, it is conceivable that a Fortran compiler is able to perform some optimizations that might not be possible for a (straightforwardly written) C program. So I wouldn't be too surprised by the claim. However, I do expect that the performance difference wouldn't be all that much. [~5-10%]

Pramod
+2  A: 

Generally FORTRAN is slower than C. This is true for almost everything. I've used FORTRAN on and off since the '70's. (Really.)

However, starting in the 90's FORTRAN has evolved to include specific language constructs that can be optimized into inherently parallel algorithms that can really scream on a multi-core processor. For example, automatic Vectorizing allows multiple processors to handle each element in a vector of data concurrently. 16 processors -- 16 element vector -- processing takes 1/16th the time.

In C, you have to manage your own threads and design your algorithm carefully for multi-processing, and then use a bunch of API calls to make sure that the parallelism happens properly.

In FORTRAN, you only have to design your algorithm carefully for multi-processing. The compiler and run-time can handle the rest for you.

You can read a little about High Performance Fortran, but you find a lot of dead links. You're better off reading about Parallel Programming (like OpenMP.org) and how FORTRAN supports that.
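For comparison, here is roughly what the OpenMP route looks like from the C side. This is only a sketch: the function name saxpy is an illustrative choice, and it assumes a compiler with OpenMP support switched on (for GCC that is the -fopenmp option). Without OpenMP the pragma is ignored and the loop simply runs serially with the same result.

```c
/* y[i] = a * x[i] + y[i] for i in [0, n).
   With OpenMP enabled, the iterations are divided among threads;
   the loop variable i is made private to each thread automatically.
   x and y are assumed not to overlap. */
void saxpy(float a, const float *x, float *y, int n)
{
  int i;
  #pragma omp parallel for
  for (i = 0; i < n; i++)
    y[i] = a * x[i] + y[i];
}
```

The iterations are independent, which is what makes the single pragma sufficient here; loops with cross-iteration dependencies need more care.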

S.Lott
@S.Lott: I couldn't imagine how awful C code would have to look to do as good as simply written Fortran for most of the codes we have here...and I'm a C programmer. You'll get better performance out of simpler code in Fortran. Not that you or I couldn't find a counterexample. :D
sixlettervariables
No compiler is going to spread a computation on 16 elements of a vector to 16 different CPUs. That'd be hundreds of times slower...
Greg Rogers
@sixlettervariales: Yup. Fortran does its thing well. C does something else well.
S.Lott
@Greg Rogers: You'll have to take your issue up with Fortran Vectorization people, not me. I'm just reporting what I read. http://www.polyhedron.com/absoftlinux
S.Lott
+4  A: 

There is nothing about the languages Fortran and C which makes one faster than the other for specific purposes. There are things about specific compilers for each of these languages which make some favorable for certain tasks more than others.

For many years, Fortran compilers existed which could do black magic to your numeric routines, making many important computations insanely fast. The contemporary C compilers couldn't do it as well. As a result, a number of great libraries of code grew in Fortran. If you want to use these well tested, mature, wonderful libraries, you break out the Fortran compiler.

My informal observations show that these days people code their heavy computational stuff in any old language, and if it takes a while they find time on some cheap compute cluster. Moore's Law makes fools of us all.

jfm3
*Almost* upmodded this. The problem is that Fortran *does* have some inherent advantages. However, you are quite correct that the important thing to look at is the compiler, not the language.
T.E.D.
+5  A: 

There are several reasons why Fortran could be faster. However the amount they matter is so inconsequential or can be worked around anyways, that it shouldn't matter. The main reason to use Fortran nowadays is maintaining or extending legacy applications.

  • PURE and ELEMENTAL keywords on functions. These are functions that have no side effects. This allows optimizations in certain cases where the compiler knows the same function will be called with the same values. Note: GCC implements "pure" as an extension to the language. Other compilers may as well. Inter-module analysis can also perform this optimization but it is difficult.

  • Standard set of functions that deal with arrays, not individual elements. Stuff like sin(), log(), sqrt() take arrays instead of scalars. This makes it easier to optimize the routine. Auto-vectorization gives the same benefits in most cases if these functions are inline or builtins.

  • Builtin complex type. In theory this could allow the compiler to reorder or eliminate certain instructions in certain cases, but likely you'd see the same benefit with the struct { double re, im; }; idiom used in C. It makes for faster development though as operators work on complex types in fortran.
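As an illustration of the first bullet, this is approximately how the GCC "pure" extension is spelled in C. It is a sketch only: __attribute__((pure)) is a GCC/Clang extension rather than standard C, and the function names are invented for the example.

```c
/* GCC/Clang extension: promises that the function has no side
   effects and that its result depends only on its arguments and
   on memory it reads. */
__attribute__((pure))
int sum_of_squares(const int *v, int n)
{
  int i, s = 0;
  for (i = 0; i < n; i++)
    s += v[i] * v[i];
  return s;
}

/* Because sum_of_squares is declared pure and nothing is written
   between the two calls, the compiler is permitted to evaluate it
   once and reuse the result. */
int twice_sum_of_squares(const int *v, int n)
{
  return sum_of_squares(v, n) + sum_of_squares(v, n);
}
```

Whether a given compiler actually merges the two calls depends on the optimization level; the attribute only grants permission.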

Greg Rogers
+5  A: 

I think the key point in favor of Fortran is that it is a language slightly more suited for expressing vector- and array-based math. The pointer analysis issue pointed out above is real in practice, since portable code cannot really assume that you can tell a compiler something. There is ALWAYS an advantage to expressing computations in a manner closer to how the domain looks. C does not really have arrays at all, if you look closely, just something that kind of behaves like them. Fortran has real arrays, which makes it easier to compile for certain types of algorithms, especially for parallel machines.

Deep down in things like run-time system and calling conventions, C and modern Fortran are sufficiently similar that it is hard to see what would make a difference. Note that C here is really base C: C++ is a totally different issue with very different performance characteristics.

jakobengblom2
+20  A: 

Yes, in 1980; in 2008? depends

When I started programming professionally the speed dominance of Fortran was just being challenged. I remember reading about it in Dr. Dobbs and telling the older programmers about the article--they laughed.

So I have two views about this, theoretical and practical. In theory Fortran today has no intrinsic advantage over C/C++ or even any language that allows assembly code. In practice Fortran today still enjoys the benefits of a history and culture built around optimization of numerical code.

Up until and including Fortran 77, language design considerations had optimization as a main focus. Due to the state of compiler theory and technology, this often meant restricting features and capability in order to give the compiler the best shot at optimizing the code. A good analogy is to think of Fortran 77 as a professional race car that sacrifices features for speed. These days compilers have gotten better across all languages and features for programmer productivity are more valued. However, there are still places where the people are mainly concerned with speed in scientific computing; these people most likely have inherited code, training and culture from people who themselves were Fortran programmers.

When one starts talking about optimization of code there are many issues and the best way to get a feel for this is to lurk where people are whose job it is to have fast numerical code. But keep in mind that such critically sensitive code is usually a small fraction of the overall lines of code and very specialized: A lot of Fortran code is just as "inefficient" as a lot of other code in other languages and optimization should not even be a primary concern of such code.

A wonderful place to start in learning about the history and culture of Fortran is wikipedia. The Fortran Wikipedia entry is superb and I very much appreciate those who have taken the time and effort to make it of value for the Fortran community.

(A shortened version of this answer would have been a comment in the excellent thread started by Nils but I don't have the karma to do that. Actually, I probably wouldn't have written anything at all but for that this thread has actual information content and sharing as opposed to flame wars and language bigotry, which is my main experience with this subject. I was overwhelmed and had to share the love.)

jaredor
+3  A: 

There is another item where Fortran is different from C, and potentially faster. Fortran has better optimization rules than C. In Fortran, the evaluation order of an expression is not defined, which allows the compiler to optimize it; if one wants to force a certain order, one has to use parentheses. In C the order is much stricter, but with "-fast" options the rules are more relaxed and "(...)" are also ignored. I think Fortran has a way which lies nicely in the middle. (Well, IEEE makes life more difficult, as certain evaluation-order changes require that no overflows occur, which either has to be ignored or hampers the evaluation.)
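A small sketch of why evaluation order matters for floating point at all (the function names and the sample values are chosen just for illustration): addition of doubles is not associative, so regrouping a sum can change the result.

```c
/* Sum three doubles grouped to the left: (a + b) + c. */
double sum_left(double a, double b, double c)
{
  return (a + b) + c;
}

/* Sum the same three doubles grouped to the right: a + (b + c). */
double sum_right(double a, double b, double c)
{
  return a + (b + c);
}
```

With a = 1e20, b = -1e20 and c = 1.0, the left grouping yields 1.0 while the right grouping yields 0.0, because 1.0 is lost when added to -1e20. This assumes IEEE double arithmetic and no relaxed floating-point options; options that allow reassociation may make the two functions behave identically.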

Another area of smarter rules is complex numbers. Not only did it take until C99 for C to have them, the rules governing them are also better in Fortran. Since the Fortran library of gfortran is partially written in C but implements the Fortran semantics, GCC gained the option (which can also be used with "normal" C programs):

-fcx-fortran-rules Complex multiplication and division follow Fortran rules. Range reduction is done as part of complex division, but there is no checking whether the result of a complex multiplication or division is "NaN + I*NaN", with an attempt to rescue the situation in that case.

The alias rules mentioned above are another bonus, as are (at least in principle) the whole-array operations, which, if taken properly into account by the compiler's optimizer, can lead to faster code. On the downside, certain operations take more time: e.g. if one assigns to an allocatable array, lots of checks are necessary (reallocate? [Fortran 2003 feature], does the array have strides, etc.), which make the simple operation more complex behind the scenes and thus slower, but make the language more powerful. On the other hand, array operations with flexible bounds and strides make it easier to write code, and the compiler is usually better at optimizing code than a user.

In total, I think both C and Fortran are about equally fast; the choice should depend more on which language one likes better, or on whether the whole-array operations of Fortran and its better portability are more useful than the better interfacing to system and graphical-user-interface libraries in C.

A: 

I compare the speed of Fortran, C, and C++ with the classic Levine-Callahan-Dongarra benchmark from netlib. The multiple-language version, with OpenMP, is at http://sites.google.com/site/tprincesite/levine-callahan-dongarra-vectors. The C is uglier, as it began with automatic translation, plus insertion of restrict and pragmas for certain compilers. C++ is just C with STL templates where applicable. To my view, the STL is a mixed bag as to whether it improves maintainability.

There is only minimal exercise of automatic function in-lining to see to what extent it improves optimization, since the examples are based on traditional Fortran practice, where little reliance is placed on in-lining.

The C/C++ compiler which has by far the most widespread usage lacks auto-vectorization, on which these benchmarks rely heavily.

Re the post which came just before this: there are a couple of examples where parentheses are used in Fortran to dictate the faster or more accurate order of evaluation. Known C compilers don't have options to observe the parentheses without disabling more important optimizations.

+2  A: 

There is no such thing as one language being faster than another, so the proper answer is no.

What you really have to ask is "is code compiled with Fortran compiler X faster than equivalent code compiled with C compiler Y?" The answer to that question of course depends on which two compilers you pick.

Another question one could ask would be along the lines of "Given the same amount of effort put into optimizing in their compilers, which compiler would produce faster code?" The answer to this would be Fortran. Fortran compilers have certain advantages:

  • Fortran had to compete with Assembly back in the day when some vowed never to use compilers, so it was designed for speed. C was designed to be flexible.
  • Fortran's niche has been number crunching. In this domain code is never fast enough. So there's always been a lot of pressure to keep the language efficient.
  • Most of the research in compiler optimizations is done by people interested in speeding up Fortran number crunching code, so optimizing Fortran code is a much better known problem than optimizing any other compiled language, and new innovations show up in Fortran compilers first.
  • Biggie: C encourages much more pointer use than Fortran. This drastically increases the potential scope of any data item in a C program, which makes them far harder to optimize. Note that Ada is also way better than C in this realm, and is a much more modern OO language than the commonly found Fortran77. If you want an OO language that can generate faster code than C, this is an option for you.
  • Due again to its number-crunching niche, the customers of Fortran compilers tend to care more about optimization than the customers of C compilers.

However, there is nothing stopping someone from putting a ton of effort into their C compiler's optimization, and making it generate better code than their platform's Fortran compiler. In fact, the larger sales generated by C compilers make this scenario quite feasible.

T.E.D.
+1  A: 

Two answers:

  1. Since all compilers just generate assembly language code, the question is which compiler generates better code. For some types of looping algorithms over arrays, some Fortran compilers may squeeze out a few cycles.

  2. It doesn't really matter, because compiler optimization only helps in code where the PC actually spends time. If you've got subroutine A spending all its time calling subroutine B, you can optimize A all you want, and never see a difference. Most apps I've seen spend the bulk of their time in library routines that almost never get compiled, so the point is moot.

Fortran annoys me because it scrambles the code trying to "optimize" it, which has no effect due to point 2, but it does make it really hard to debug.

Mike Dunlavey
+2  A: 

To some extent Fortran has been designed with compiler optimization in mind. The language supports whole-array operations where compilers can exploit parallelism (especially on multi-core processors). For example,

Dense matrix multiplication is simply: matmul(a,b)

L2 norm of a vector x is: sqrt(sum(x**2))
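For contrast, a sketch of what the one-line matmul(a,b) has to be spelled out as in plain C, under the assumptions that the matrices are square, stored row-major in flat arrays, and that the result does not overlap the inputs (matmul_square is a hypothetical name):

```c
/* c = a * b for n x n matrices stored row-major in flat arrays.
   c must not overlap a or b. */
void matmul_square(int n, const double *a, const double *b, double *c)
{
  int i, j, k;
  for (i = 0; i < n; i++)
  {
    for (j = 0; j < n; j++)
    {
      double sum = 0.0;
      for (k = 0; k < n; k++)
        sum += a[i*n + k] * b[k*n + j];
      c[i*n + j] = sum;
    }
  }
}
```

In the Fortran version the compiler sees the whole operation at once and is free to pick the loop order, blocking or a library call; here it has to recover that intent from three nested loops.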

Moreover, statements such as FORALL and PURE & ELEMENTAL procedures further help to optimize code. Even pointers in Fortran aren't as flexible as in C, for this simple reason.

The upcoming Fortran standard (2008) has co-arrays, which allow you to easily write parallel code. G95 (open source) and compilers from CRAY already support it.

So yes Fortran can be fast simply because compilers can optimize/parallelize it better than C/C++. But again like everything else in life there are good compilers and bad compilers.

Just read about the ELEMENTAL keyword and... wow.
Jeffrey Hantin
+2  A: 

Note that you don't need to write your program in Fortran if all you want to do is call some Fortran libraries. One can easily call Fortran code from C; all you need to remember is name mangling, passing every variable by reference, and the different matrix ordering.
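A sketch of those three points from the C side. Everything named here is hypothetical: scale_matrix_ stands in for a Fortran subroutine SCALE_MATRIX(A, M, N, FACTOR), and the trailing underscore reflects a common (not universal) mangling convention. The body is written in C only so the sketch is self-contained; in real use it would be an extern declaration resolved by the Fortran library.

```c
#include <stddef.h>

/* Element (i, j), zero-based, of an m-row matrix stored
   column-major, which is the layout Fortran uses. */
#define COLMAJOR(a, m, i, j) \
  ((a)[(size_t)(i) + (size_t)(j) * (size_t)(m)])

/* C stand-in with a Fortran-style interface: lower-case name with
   a trailing underscore, and every argument passed by pointer,
   including the scalars. */
void scale_matrix_(double *a, const int *m, const int *n,
                   const double *factor)
{
  int i, j;
  for (j = 0; j < *n; j++)      /* columns outermost: contiguous in memory */
    for (i = 0; i < *m; i++)
      COLMAJOR(a, *m, i, j) *= *factor;
}
```

A caller would keep the dimensions and the factor in variables and pass their addresses, e.g. scale_matrix_(a, &m, &n, &factor), and index the result with COLMAJOR rather than the usual row-major a[i*n + j].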

quant_dev
Right. Also, strings are a little weird.
Mike Dunlavey
+2  A: 

Most of the posts already present compelling arguments, so I will just add the proverbial 2 cents to a different aspect.

Whether Fortran is faster or slower in terms of processing power can have its importance in the end, but if it takes 5 times more time to develop something in Fortran because:

  • it lacks any good library for tasks different from pure number crunching
  • it lacks any decent tool for documentation and unit testing
  • it's a language with very low expressivity, skyrocketing the number of lines of code.
  • it has a very poor handling of strings
  • it has an inane amount of issues among different compilers and architectures driving you crazy.
  • it has a very poor IO strategy (READ/WRITE of sequential files. Yes, random access files exist but did you ever see them used?)
  • it does not encourage good development practices, modularization.
  • effective lack of a fully standard, fully compliant opensource compiler (both gfortran and g95 do not support everything)
  • very poor interoperability with C (mangling: one underscore, two underscores, no underscore, in general one underscore but two if there's another underscore; and let's not delve into COMMON blocks...)

Then the issue is irrelevant. If something is slow, most of the time you cannot improve it beyond a given limit. If you want something faster, change the algorithm. In the end, computer time is cheap. Human time is not. Value the choice that reduces human time. If it increases computer time, it's cost effective anyway.

Stefano Borini
A: 

Faster code is not really up to the language but to the compiler. You can see this with the MS VB "compiler", which generates bloated, slower and redundant object code that is tied together inside an ".exe", while PowerBasic generates much better code. Object code made by C and C++ compilers is generated in several phases (at least 2), but by design most Fortran compilers have at least 5 phases, including high-level optimizations, so by design Fortran will always have the capability to generate highly optimized code. So in the end it is the compiler, not the language, you should ask about. The best compiler I know is the Intel Fortran Compiler, because you can get it on Linux and Windows and you can use VS as the IDE. If you're looking for a cheap, tight compiler you can always rely on OpenWatcom.

More info about this: http://ed-thelen.org/1401Project/1401-IBM-Systems-Journal-FORTRAN.html

JPerez45
A: 

I was doing some extensive mathematics with FORTRAN and C for a couple of years. From my own experience I can tell that FORTRAN is sometimes really better than C, not for its speed (one can make C perform as fast as FORTRAN by using an appropriate coding style) but rather because of very well optimized libraries like LAPACK, and because of great parallelization. In my opinion, FORTRAN is really awkward to work with, and its advantages are not good enough to cancel that drawback, so now I am using C+GSL to do calculations.

corydalus