I have a lot of code that I need to optimize to make it run faster. I used opreport to find out where the code spends most of its time. I use the following command to get the statistics:

opreport -g -l -d

Suggestions for getting better statistics with different flags are appreciated, for example reporting per line number instead of per function.

Most of the issues I "think" I see involve:

  • pointers, multidimensional arrays
  • multiplications
  • loops

I want the compiler to optimize the code better, so I am trying to help it. I factored some code blocks into functions and used the restrict keyword to tell the compiler that my pointer arrays don't overlap.
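A minimal sketch of the kind of function I mean. The name `scale_add` and its signature are made up for illustration and are not my real code; it assumes a C99 compiler:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: out[i] = a[i] * k + b[i].
 * `restrict` is a promise to the compiler that out, a and b never
 * overlap, so it is free to keep values in registers and to
 * vectorize the loop. Breaking that promise is undefined behavior. */
void scale_add(size_t n, double *restrict out,
               const double *restrict a,
               const double *restrict b, double k)
{
    assert(out != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] * k + b[i];
}
```

Whether this actually changes the generated code depends on the compiler and the optimization level, so it is worth comparing the profile before and after.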

So I am looking for (a) common C constructs that can make code run longer and (b) ways to help the compiler optimize code.

Thanks

+2  A: 

In general: In my experience, 90% of the time, repairing silly mistakes (like unintentionally copying instead of passing references) and tweaking algorithms and data structures is what makes the significant difference. Thinking about low-level optimizations is only useful once you have fully dealt with those 90%.
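As a rough illustration of the "unintentional copy" kind of mistake, here is a sketch in C. The `Record` type and both function names are hypothetical, invented only for this example:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record type, big enough that copying it is not free
 * (roughly 32 KB where double is 8 bytes). */
typedef struct {
    double samples[4096];
    size_t count;
} Record;

/* Pass by value: the caller may have to copy the whole struct
 * for every call, unless the compiler manages to inline it away. */
double first_sample_by_value(Record r)
{
    return r.count > 0 ? r.samples[0] : 0.0;
}

/* Pass by pointer-to-const: only an address is passed. */
double first_sample_by_pointer(const Record *r)
{
    assert(r != NULL);
    return r->count > 0 ? r->samples[0] : 0.0;
}
```

Both functions return the same value; the difference only shows up in how much data moves per call, which is exactly the sort of thing a profiler tends to surface in a hot loop.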

In your case: If opreport (which I don't know, BTW) told you where your application spends its time, you'll have to optimize those bits. If you need help with doing so, you might have to post somewhat concrete examples.

sbi
I can't post examples, as I will get sued, fired, enjailed and burnt :)
vehomzzz
@enigma: Your employment agreement really stinks. Mine only allows for suing. What you might need to do is concoct examples similar to the code that's showing up as taking too long.
Michael Kohne
I mean the code is conceptually correct, it just needs a little "optimization polishing"... I like how much they pay me
vehomzzz
So far, nobody has offered me enough money to have me burnt. `:)` Otherwise I second Michael: Show us something and tell us what's slow about it. It doesn't have to be the original code.
sbi
+3  A: 

The biggest thing I know in C++ is to be careful about the methods that you call. In C++ (and any OO language) it's pretty trivial to hide a LOT of processing behind a very small interface.

This is especially important when dealing with overloaded operators - depending on the library these can be BIG time sinks, and look like nothing at all in the code.

Michael Kohne
Oh yeah, that is true!
avp
+4  A: 

Hi

Here's a contentious argument -- if there are 'common C constructs that can make code run longer' (and I'm sure you are right to think that there are such constructs) then I would expect a good optimising compiler to, well, optimise for them. You don't reveal which compiler(s) you are using, and I'm not a C/C++ programmer, so it's difficult for me to suggest any particular compiler flags or tricks to try.

The only concrete advice I would offer is this: study the output of your profiling tool(s) very carefully and only spend your time optimising those parts of the program where it's worth the effort.

Regards

Mark

High Performance Mark
True. If we could give general advice on how to speed up code in general, then we could patch the compiler just to do it for you. We can however give general advice on how to figure out specific ways to speed up specific code...
Steve Jessop
+4  A: 

Beware of the reports from profiling tools; they can be misleading. For instance, consider an application that does a large number of string comparisons and not much else. A report is going to tell you that you spend >90% of your time in string comparison functions. So naturally, you decide to implement an optimized version of that code, only to find out that the profiler tells you that you are still spending 90% of your time there (because that is all your program does...).

You need intimate knowledge of your application to interpret what a profiler tells you; otherwise you might be wasting effort.

Compilers today do a fairly good job of optimizing (especially with extra flags as options). It is unlikely that you will benefit from anything at a language level (i.e. how you handle arrays) - you will probably have to read/write asm if you want to hand tune things.

ezpz
+1. One of the first sections of Programming Pearls talks about "back of the envelope calculations". The idea is that you should use rough calculations to understand how expensive a given algorithm should be. Then, as you say, you know whether that 90% is what you'd expect from your code.
Richard Corden
And of course, if the reason for the 90% string comparisons is the use of `std::set<string>` instead of `std::tr1::unordered_set<string>`, improving string comparisons is the wrong solution. A profiler doesn't choose the right algorithm for you. It merely gives you facts.
MSalters
+1  A: 

Well, there are two approaches to optimization.
First, one can optimize the architecture. You know, binary search instead of the bubble one, and so on :) But seriously, I hope this point is clear.
The second is technical: one uses a profiler in some form and looks for the bottlenecks. Once the bottlenecks are found, only they need optimizing.

It is well known that premature optimization is the root of all evil, so don't worry about minor tweaks like types, loops, virtual-or-not and so on. Mostly they matter little while taking a lot of time. IMHO, the psychological impact is much higher here than the real one.
Also, you can talk to game developers: they are real professionals at this, and I guess they would repeat my words: optimize only what needs optimizing, and then optimize the architecture of the problematic block.
P.S. Also, CPUs differ, and your optimization can be just a waste of time on some of them...

avp
A: 

Memory allocation can be slow in C++. Make sure you are looping over your arrays properly for cache coherence. If you have int arr[100][1000] then you want to loop like this:

for(int i = 0; i < 100; i++) for(int j = 0; j < 1000; j++) arr[i][j]

If you put the i loop inside the j loop instead, your CPU's cache isn't being used properly, because consecutive accesses are then 1000 ints apart in memory. If you are doing a large number of multiplies, try to introduce temporaries to minimize the number of multiplies you do. The compiler should do common subexpression elimination, but sometimes it helps to do it explicitly.
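A rough sketch of both points, assuming the 100x1000 array from above. The function names are made up for illustration, and how much either change helps depends on your compiler, flags and CPU:

```c
#include <assert.h>
#include <stddef.h>

#define ROWS 100
#define COLS 1000

/* Row-major traversal: the inner loop walks adjacent ints. */
long sum_row_major(int arr[ROWS][COLS])
{
    assert(arr != NULL);
    long total = 0;
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            total += arr[i][j];
    return total;
}

/* Column-major traversal: same result, but every inner step
 * jumps COLS ints ahead, which is much harder on the cache. */
long sum_col_major(int arr[ROWS][COLS])
{
    assert(arr != NULL);
    long total = 0;
    for (int j = 0; j < COLS; j++)
        for (int i = 0; i < ROWS; i++)
            total += arr[i][j];
    return total;
}

/* Explicit temporary: i * scale * offset does not depend on j,
 * so it is computed once per row instead of once per element. */
void fill_scaled(int arr[ROWS][COLS], int scale, int offset)
{
    assert(arr != NULL);
    for (int i = 0; i < ROWS; i++) {
        int row_base = i * scale * offset;
        for (int j = 0; j < COLS; j++)
            arr[i][j] = row_base + j;
    }
}
```

Timing `sum_row_major` against `sum_col_major` on the same data is a simple way to see whether traversal order matters for your array sizes.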

stonemetal
FYI cache coherence does not typically mean limiting cache thrashing.
Dan O
A: 

In my experience, it doesn't have much to do with what language or compiler you are using, and it has everything to do with how the code is designed and organized.

If you think only small optimizations are possible, and that it really depends on the compiler, you might find this surprising.

This is the method I use, and an explanation of why it works.

Mike Dunlavey