+3  Q: 

Increasing Speed

A broad question I know but:

Does anyone have general tips on increasing execution speed in Fortran programs?

+11  A: 

A broad answer for the broad question:

while (the speed is not satisfied)
    use a profiler to find the bottleneck
    optimize that part of the code
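
When no profiler is at hand, a rough first pass is to time sections by hand. The sketch below is only an illustration of that idea, not a substitute for a real profiler (for example, compiling with gfortran's `-pg` flag and reading the result with `gprof`). The program name, array size and the two "sections" are made up for the example.

```fortran
program time_sections
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none
  integer, parameter :: n = 2000000        ! arbitrary problem size for the demo
  real(real64), allocatable :: x(:)
  real(real64) :: total
  integer(int64) :: t0, t1, t2, rate
  integer :: i

  allocate(x(n))
  call system_clock(count_rate=rate)       ! clock ticks per second

  call system_clock(t0)
  do i = 1, n                              ! section A: fill the array
    x(i) = sqrt(real(i, real64))
  end do
  call system_clock(t1)
  total = sum(x)                           ! section B: reduce the array
  call system_clock(t2)

  ! Sanity check so the work cannot be silently skipped
  if (total <= 0.0_real64) error stop 'unexpected checksum'

  print '(a,f10.6,a)', 'fill:   ', real(t1 - t0, real64) / real(rate, real64), ' s'
  print '(a,f10.6,a)', 'reduce: ', real(t2 - t1, real64) / real(rate, real64), ' s'
  print '(a,es14.6)',  'checksum: ', total
end program time_sections
```

Whichever section dominates is the one to look at first; the loop in the pseudocode above then repeats.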
pierr
A: 

Nothing Fortran-specific, beyond:

  • optimize your algorithms
  • optimize data access patterns
  • use a state-of-the-art compiler, e.g. one that supports OpenMP
  • consider moving highly critical code to an environment that gives you more options, e.g. to C/C++ code, to take advantage of thread parallelization and SIMD instructions
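
As one concrete case of "data access patterns": Fortran stores arrays in column-major order, so the first index should normally vary fastest in the innermost loop. The following is a minimal sketch with made-up names and sizes. Note that an optimising compiler may interchange the loops itself, in which case the two timings can come out nearly identical.

```fortran
program loop_order
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none
  integer, parameter :: n = 2000           ! arbitrary size for the demo
  real(real64), allocatable :: a(:,:)
  real(real64) :: s_fast, s_slow
  integer(int64) :: t0, t1, t2, rate
  integer :: i, j

  allocate(a(n, n))
  a = 1.0_real64
  call system_clock(count_rate=rate)

  ! Contiguous access: the first index i varies fastest in the inner loop
  s_fast = 0.0_real64
  call system_clock(t0)
  do j = 1, n
    do i = 1, n
      s_fast = s_fast + a(i, j)
    end do
  end do
  call system_clock(t1)

  ! Strided access: the inner loop jumps n elements through memory each step
  s_slow = 0.0_real64
  do i = 1, n
    do j = 1, n
      s_slow = s_slow + a(i, j)
    end do
  end do
  call system_clock(t2)

  ! Both loop orders must compute the same sum
  if (s_fast /= s_slow) error stop 'sums differ'

  print '(a,f10.6,a)', 'inner loop over i: ', real(t1 - t0, real64) / real(rate, real64), ' s'
  print '(a,f10.6,a)', 'inner loop over j: ', real(t2 - t1, real64) / real(rate, real64), ' s'
end program loop_order
```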

There is also some material available: googling "fortran optimization", for example, turns up this (PDF) and this. However, be careful with older literature and its assumptions: not long ago, optimization guides for many platforms (rightly) assumed that memory was scarce, memory access was cheap and instructions were expensive. Not so anymore.

peterchen
+1  A: 

This is a very broad field, but ...

  • If you're doing matrix arithmetic, consider looking into off-the-shelf libraries for this. They are probably faster and some support multithreading, which will give you a performance boost on multiprocessor machines.

  • Profiling, as pierr suggests. This will tell you where your program is actually spending its time. Knowing this lets you focus your attention on the bits that actually need tuning.

  • Cache line and word alignment plus optimising chunks to fit into processor caches. These are viewed as more germane to C programming as it's easier to control this sort of thing with C. However, the same issues can cause problems with FORTRAN programs for much the same reasons.

    The cache miss penalty on a modern CPU is very large and optimising for cache usage can make an order-of-magnitude difference in some cases. If you identify this as an issue you may want to re-write the core computation in C to give you more fine-grained control over the data structures.

  • If you are REALLY CPU bound you may get some mileage from techniques like GPU programming.
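
To illustrate the first bullet without needing an external library installed, the sketch below uses the `matmul` intrinsic as a stand-in for a library call and checks it against hand-written loops. In real code the equivalent library routine would typically be BLAS `dgemm`, which has to be linked in separately. Whether `matmul` itself is backed by an optimised library depends on the compiler and its options, so check the manual. The program name and matrix size are invented for the example.

```fortran
program matmul_vs_loops
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none
  integer, parameter :: n = 200            ! arbitrary size for the demo
  real(real64), allocatable :: a(:,:), b(:,:), c_loop(:,:), c_lib(:,:)
  integer :: i, j, k

  allocate(a(n, n), b(n, n), c_loop(n, n), c_lib(n, n))
  call random_number(a)
  call random_number(b)

  ! Hand-written product, with the first index i innermost for contiguous access
  c_loop = 0.0_real64
  do j = 1, n
    do k = 1, n
      do i = 1, n
        c_loop(i, j) = c_loop(i, j) + a(i, k) * b(k, j)
      end do
    end do
  end do

  ! The same product in one call; the implementation is left to the compiler/library
  c_lib = matmul(a, b)

  ! Results agree up to floating-point rounding
  if (maxval(abs(c_lib - c_loop)) > 1.0e-9_real64) error stop 'products differ'
  print *, 'matmul and the explicit loops agree'
end program matmul_vs_loops
```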

ConcernedOfTunbridgeWells
Using pre-existing, optimized libraries for doing linear algebra/matrix operations can make an amazing difference.
Tim Whitcomb
+4  A: 

As others have suggested, profile your code before thinking of modifying it.

BUT the single best thing you can do is to read the compiler manual closely, line by line, word by word, and pay careful attention to all the options it gives you. In my experience (I have a lot of experience in HPC for computational electromagnetics, not that you should believe what you read here!) you get the most bang for your buck in performance optimisation from intelligent use of the compiler.

Once you've exhausted the possibilities of the compiler (and, as one of the other respondents suggested, make sure you have a good compiler; they're not expensive, and I get roughly a 40% decrease in execution time for most programs going from g95 to a paid-for compiler), you should NOT start doing things like:

-- loop unrolling;

-- instruction re-ordering;

-- function inlining;

-- other stuff which we used to do all the time way back in the day.

Most of this code-tweaking stuff is now done, better than us carbon-based life-forms can do it, by good optimising compilers.

If you must tinker, tinker with memory access -- for example tile your access to arrays to take advantage of cache. If you do this, parameterise your tile sizes (etc) so that when, next year, or the year after, you move it to a different architecture you only have to tweak a few parameters rather than modify the code again.
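
A minimal sketch of what tiling with a parameterised tile size can look like, using a matrix transpose as the example. The names and the value `tile = 32` are assumptions for illustration only; the right tile size depends on the machine's cache and has to be found by measurement.

```fortran
program tiled_transpose
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none
  integer, parameter :: n = 1000           ! arbitrary size; need not be a multiple of tile
  integer, parameter :: tile = 32          ! hypothetical starting value; tune per machine
  real(real64), allocatable :: a(:,:), b(:,:)
  integer :: i, j, ii, jj

  allocate(a(n, n), b(n, n))
  call random_number(a)

  ! Walk the matrix in tile x tile blocks so each block of a and b stays in cache
  do jj = 1, n, tile
    do ii = 1, n, tile
      do j = jj, min(jj + tile - 1, n)     ! min() handles the ragged last block
        do i = ii, min(ii + tile - 1, n)
          b(j, i) = a(i, j)
        end do
      end do
    end do
  end do

  ! Check against the intrinsic
  if (any(b /= transpose(a))) error stop 'tiled transpose is wrong'
  print *, 'tiled transpose matches transpose(a)'
end program tiled_transpose
```

Moving to a different architecture then means changing the one `tile` parameter and re-timing, not rewriting the loops.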

Finally, have fun: optimising the performance of Fortran programs is a great way to spend your working day!

Regards

Mark.

High Performance Mark
+1  A: 

Just because no one mentioned it:

  • Buy a faster machine

(Please, don't hit me :-) ...)

MartinStettner
Good answer Martin, that's another cost-effective approach! Given what we HPC experts cost per day, USD (or EUR or GBP) 3000 is small change, and it's a fixed cost.
High Performance Mark
Apropos of that, I once did a rundown on high-spec workstation kit on SO (Mostly I get machines like this for fast I/O but they tend to have quite fast CPUs as well ;). It lives at http://stackoverflow.com/questions/403084/optimal-off-the-shelf-development-machine/403558#403558
ConcernedOfTunbridgeWells
A: 

The Fortran code I'm familiar with is very different from code in other languages. In other languages, data structure is much more dominant, along with layers of abstractions, deep call stacks, and slowdown caused by excess calls.

Fortran, on the other hand, tends to get used for math-heavy algorithms, with big arrays and not much calling depth. In these, cache-locality issues loom much larger, as do algorithm issues. For example, I work a lot with non-linear mixed-effect modeling, and issues like tolerances and the choice between forward-difference, central-difference and analytic gradients are crucial. The choice of ODE solving method, such as Runge-Kutta, implicit methods, the matrix exponential, or a closed-form solution, makes a huge difference.
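
To make the gradient point concrete, here is a small sketch comparing forward and central differences against an analytic derivative. The test function `exp`, the point `x = 1` and the step `h` are arbitrary choices for illustration, not taken from any real model. For a smooth function the forward-difference error shrinks like h and the central-difference error like h squared, at the price of one extra function evaluation per parameter.

```fortran
program gradient_compare
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none
  real(real64), parameter :: x = 1.0_real64
  real(real64), parameter :: h = 1.0e-4_real64   ! arbitrary step for the demo
  real(real64) :: exact, fwd, ctr

  exact = exp(x)                                 ! analytic derivative of exp is exp
  fwd = (f(x + h) - f(x)) / h                    ! forward difference
  ctr = (f(x + h) - f(x - h)) / (2.0_real64 * h) ! central difference

  ! For this smooth function and step size, central should be the more accurate
  if (abs(ctr - exact) >= abs(fwd - exact)) error stop 'central not more accurate'

  print '(a,es12.4)', 'forward-difference error: ', abs(fwd - exact)
  print '(a,es12.4)', 'central-difference error: ', abs(ctr - exact)

contains

  ! Stand-in for an expensive model evaluation
  pure function f(t) result(y)
    real(real64), intent(in) :: t
    real(real64) :: y
    y = exp(t)
  end function f

end program gradient_compare
```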

Also, if you can (by sampling) identify sections of code that are true hot-spots (i.e. where the program counter spends a large fraction of time without calling subroutines) and that are in code you actually compile (not in a 3rd-party library), then turning up the compiler optimization there will make a difference.

Personally, I don't care for the kinds of optimization Fortran compilers typically do, scrambling the code to shave cycles in code that uses less than 1% time fraction, while making it very difficult to debug.

Mike Dunlavey