I have created an application executable. It works, but it runs too slowly, a lot slower than needed. I would like to make it faster. What can I do to optimize it?

+2  A: 
  • Set a performance target, and make sure you can test whether you have reached it, or how far away from it you are.

  • Measure (profile) and optimize based on what you see.

  • Optimize only the parts of the program that account for a significant share of the time.

  • Prefer high-level optimization: using a better algorithm most often brings better results than rewriting critical parts using "assembly"/"SSE"/"multicore"/other platform-specific low-level optimizations (a concrete sketch follows this list).

  • When using low-level optimizations, make sure they are correct and do not break the code under any circumstances.

  • Be sure you understand how your target platform works: which operations are slow, and what the common platform-specific ways to improve performance are.

  • Know what "Premature optimization is the root of all evil" means.
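
To make the algorithm-level point concrete, here is a minimal C++ sketch (the membership-counting task and every name in it are invented for illustration). Swapping a repeated linear scan for a hash-set lookup drops the cost from O(n*m) to roughly O(n+m), a gain that no amount of assembly tuning over the naive loop will match:

    #include <algorithm>
    #include <cstddef>
    #include <iostream>
    #include <string>
    #include <unordered_set>
    #include <vector>

    // Naive: a linear scan of `known` for every query -- O(n*m) comparisons.
    std::size_t countMatchesNaive(const std::vector<std::string>& queries,
                                  const std::vector<std::string>& known) {
        std::size_t hits = 0;
        for (const auto& q : queries)
            if (std::find(known.begin(), known.end(), q) != known.end())
                ++hits;
        return hits;
    }

    // Better algorithm: build a hash set once, then do O(1) average lookups.
    std::size_t countMatchesFast(const std::vector<std::string>& queries,
                                 const std::vector<std::string>& known) {
        const std::unordered_set<std::string> index(known.begin(), known.end());
        std::size_t hits = 0;
        for (const auto& q : queries)
            hits += index.count(q);
        return hits;
    }

    int main() {
        const std::vector<std::string> known{"a", "b", "c"};
        const std::vector<std::string> queries{"b", "x", "c"};
        std::cout << countMatchesNaive(queries, known) << " "
                  << countMatchesFast(queries, known) << "\n";  // prints "2 2"
    }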

Suma
voted down as appears you have just posted a question in order to provide a stock answer.
workmad3
voted up as providing your own answer doesn't deserve a down vote.
Keith
@workmad3: Posting a question just to answer it is perfectly acceptable here on SO. In this case the question was way too broad though.
Chris Upchurch
I think this is a common question which is interesting to many programmers. My intention in asking it is to make sure even obvious questions are covered by this site, which I think can especially help newbie programmers. I intend to post more questions like this, and to answer them.
Suma
Too common an answer and question.
JasonMichael
Fair enough. Down-vote removed :)
workmad3
Common questions are fine, but you need to provide enough detail to make a question answerable. As it says in the FAQ, questions should be "detailed and specific". This question is neither.
Chris Upchurch
The question is broad, and the answer obvious. BUT, the obvious answer is not very good. Everybody knows it, but they still make code that runs like molasses. I've got answers for this elsewhere.
Mike Dunlavey
+1  A: 

Identify high-level issues that may be causing poor performance.

Is the application heavily I/O bound? Or is it processing vast amounts of data?
Or searching huge data structures?

Spending some time on analysis can save profiling time, or at least give you a good sense of where to start.
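
For instance, a crude way to put a number on the I/O question before reaching for a profiler is to time the read phase and the compute phase separately. A minimal C++ sketch (the file name and the checksum "work" are placeholders):

    #include <chrono>
    #include <cstddef>
    #include <fstream>
    #include <iostream>
    #include <sstream>
    #include <string>

    int main() {
        using Clock = std::chrono::steady_clock;
        using ms = std::chrono::milliseconds;

        const auto t0 = Clock::now();
        std::ifstream in("input.dat", std::ios::binary);  // placeholder file
        std::ostringstream buf;
        buf << in.rdbuf();                  // phase 1: pure I/O
        const std::string data = buf.str();
        const auto t1 = Clock::now();

        std::size_t checksum = 0;           // phase 2: pure computation
        for (unsigned char c : data) checksum += c;
        const auto t2 = Clock::now();

        std::cout << "I/O:     " << std::chrono::duration_cast<ms>(t1 - t0).count() << " ms\n"
                  << "Compute: " << std::chrono::duration_cast<ms>(t2 - t1).count() << " ms\n"
                  << "(checksum " << checksum << ")\n";
    }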

glamdringlfo
+4  A: 

As this is about optimisation, the rote answer needs to be rolled out:

Rules of Optimisation
1. Don't Optimise
2. (Experts only) Don't optimise yet
workmad3
comment to self... code sections provide funky colouring to quotations :)
workmad3
Aside from the string-coloring from your contractions, I think it gave good emphasis =)
Branan
The problem with the rote answer is, practically everyone will claim they follow it (and preach it), but software still contains big performance problems. There's a reality disconnect.
Mike Dunlavey
+3  A: 

Have you considered a code profiler tool? In .NET, for example, you could look at RedGate Ants.

Thomas Wagner
+5  A: 

It is difficult to give general advice about improving the performance of a program, because most performance enhancements are very application specific, and it's not always easy to see how to apply the general advice that you're likely to get.

However, I'll give it a shot. The first task you face is determining what acceptable performance is. I don't suggest that you spend a lot of time on this, because if it turns out you set the wrong goal you can always change it, but you do need some sort of objective goal in mind before you begin.

Second, you need to determine where the slowness is. It's always a good idea to start with a look at the environment. Does your program have a large memory footprint that causes the computer to start swapping when it runs? If so, adding physical memory might help. If your CPU is pegged when you run the program, then a faster processor might be in order, or you might try to figure out how to split the work across multiple processes in parallel, to take advantage of multiple CPU cores. If you're doing file processing of any sort where the file is stored on a remote file system, then you need to not do that, because it's going to be slow as a dog. You should also explore additional compiler optimization options, or maybe a better-quality compiler. I have found that gcc's optimization is nearly useless, but that using -fomit-frame-pointer causes a large jump in the performance of compiled programs.
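
As a sketch of the multiple-cores suggestion (not a recipe; whether the split pays for the threading overhead is exactly the kind of thing you have to measure), here is the shape it often takes in C++, using std::async to halve a summation:

    #include <future>
    #include <iostream>
    #include <numeric>
    #include <vector>

    // Split an accumulation across two tasks; link with -pthread on Linux.
    long long parallelSum(const std::vector<int>& v) {
        const auto mid = v.begin() + v.size() / 2;
        auto lower = std::async(std::launch::async, [&v, mid] {
            return std::accumulate(v.begin(), mid, 0LL);
        });
        const long long upper = std::accumulate(mid, v.end(), 0LL);
        return lower.get() + upper;  // waits for the other half
    }

    int main() {
        const std::vector<int> v(10000000, 1);
        std::cout << parallelSum(v) << "\n";  // prints 10000000
    }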

Then you need to look at what your program is doing when it does whatever you think is slow. You don't want a flat profile, because that only tells you how much of a given time period was spent in the various parts of your program; what you want instead is an annotated call graph, which tells you how long each step took. Look for the calls that take a long time and drill down into them, trying to figure out how to make them faster.

My experience suggests that there are two types of things that take a long time in a computer program. The first is stuff that simply takes a long time: calculating pi to a billion digits is going to take a long time, even on a fast computer. The second is stuff that takes a long time because it wastes time doing unnecessary things: invoking copy constructors through a series of nested calls is a huge waste when you could just pass a const reference around and accomplish the same thing by copying pointers. The functional programming guys have a leg up here, with their lazy evaluation and memoized results.
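
The copy-constructor point in a nutshell (a toy Record type, made up for the example):

    #include <cstddef>
    #include <iostream>
    #include <string>
    #include <vector>

    struct Record {
        std::vector<std::string> fields;  // expensive to copy
    };

    // Wasteful: every call copy-constructs the whole Record and each string
    // in it; a chain of nested calls multiplies that cost.
    std::size_t fieldCountByValue(Record r) { return r.fields.size(); }

    // Cheap: a const reference passes what amounts to a pointer; no copy.
    std::size_t fieldCountByRef(const Record& r) { return r.fields.size(); }

    int main() {
        const Record r{{std::string(1000, 'x'), std::string(1000, 'y')}};
        std::cout << fieldCountByValue(r) << " " << fieldCountByRef(r) << "\n";
    }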

To deal with a performance issue on something that takes a long time, you need to not do it in the middle of something whose performance is critical. You have three choices: do the hard work at some time when nobody will notice how long it takes, reduce the amount of work by simplifying the problem, or choose another algorithm.

You can't always not do the work, but you can often not do it when time is critical. If you need pi to a billion digits, you might not calculate it every time you need it. Instead, you might calculate it once (say, at compile time), store the value, and then use the stored value whenever you need it. Some database-backed Web sites build static versions of their pages and serve those up instead of building the pages every time someone loads them. Mainframe programs for things like online credit-card processing keep the processing program queued up and waiting for requests to come in, rather than waiting for a request before starting the program, because loading and linking take longer than processing the credit-card request. These are techniques for shifting around work that you can't avoid.
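
A minimal sketch of the calculate-once idea in C++ (expensiveCalc is a stand-in for the real long-running computation, and the wrapper is single-threaded; a real one would need locking):

    #include <cmath>
    #include <iostream>
    #include <map>

    // Stand-in for the work that takes a long time.
    double expensiveCalc(int n) {
        double x = 0.0;
        for (int i = 1; i <= 10000000; ++i) x += std::sin(i * 1e-7 * n);
        return x;
    }

    // "Calculate it once and store the value": a memoizing wrapper.
    double cachedCalc(int n) {
        static std::map<int, double> cache;        // survives across calls
        const auto it = cache.find(n);
        if (it != cache.end()) return it->second;  // hit: no recomputation
        return cache[n] = expensiveCalc(n);        // miss: pay the cost once
    }

    int main() {
        std::cout << cachedCalc(3) << "\n";  // slow: does the work
        std::cout << cachedCalc(3) << "\n";  // fast: served from the cache
    }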

Reducing the effort of the hard thing is another approach. If you're calculating pi to a billion digits because you want to find the area of a circle, you might observe that what you're finding the area of is an approximation to a circle that's 100 pixels wide, so a billion digits is overkill and 22/7 or one of the other rational approximations to pi is perfectly acceptable. One of the things Google does is not try very hard to return all the data on a Web search. That makes their Web searches much faster, at the cost of some matching records not being displayed. After all, who's going to notice that a particular site is missing from page 3278 of a popular search?
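
To put numbers on that: for a circle 100 pixels wide (r = 50), pi * r^2 with pi = 3.1415927 gives about 7853.98 square pixels, while 22/7 = 3.1428571 gives about 7857.14. That's a difference of roughly 3 square pixels out of nearly 7854, an error of about 0.04%, far below anything a rasterized display can show.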

Choosing another algorithm is the classic method of achieving large performance gains. If you're calculating a billion digits of pi to determine the area of that aforementioned circle-like object, then you could use a circle-drawing algorithm to draw it and count the interior pixels to find the area. If scanning through a file to find various interesting sections takes too long, then you can build an index of the file and use that to go right to where the interesting bits are. If sorting a list of items to remove duplicates takes too long, then use some other mechanism to make sure duplicates never get in, and don't sort at all. It is often helpful in this case to ask a larger question than the immediately obvious one.
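
The file-index idea, as a rough C++ sketch (the "== name" section-header format is an assumption invented for the example):

    #include <fstream>
    #include <iostream>
    #include <string>
    #include <unordered_map>

    // One pass records the byte offset of every section header; later lookups
    // seek straight to a section instead of rescanning the whole file.
    std::unordered_map<std::string, std::streampos>
    buildIndex(std::ifstream& file) {
        std::unordered_map<std::string, std::streampos> index;
        std::string line;
        std::streampos pos = file.tellg();
        while (std::getline(file, line)) {
            if (line.rfind("== ", 0) == 0)   // the line starts a section
                index[line.substr(3)] = pos;
            pos = file.tellg();
        }
        return index;
    }

    int main() {
        std::ifstream file("data.txt");      // placeholder file name
        const auto index = buildIndex(file);
        file.clear();                        // reset the EOF state
        const auto it = index.find("interesting");
        if (it != index.end()) {
            file.seekg(it->second);          // one seek, no rescan
            std::string header;
            std::getline(file, header);
            std::cout << header << "\n";
        }
    }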

Eliminating wasted effort is usually best done by following the best practices for a language. It looks a lot like tweaking, so I try to apply those best practices while writing the program rather than as a separate step. On the other hand, it isn't tweaking, because it's about getting on the same page as the compiler rather than trying to trick the compiler into producing a faster program, which is what tweaking is. You do have to have a clear idea of what your compiler does with your code, though, and you need to understand the whys of things. Either that, or just do things the way the rest of the user community does; it probably won't be too bad.

Tweaking is trying to trick the compiler into producing faster code. It's always a dumb idea, so don't do it.

Jonathan Guthrie
Excellent article, wish I could vote up multiple times. :)
Georg
+2  A: 

The number one rule of optimization is: the best optimization usually lies in correct design.

So, not knowing anything about your application I would suggest:

  1. Go back to the basics and see if you can find a cheaper algorithm / implementation.
  2. With the final result, see if the performance can be improved by looking at the problem from a different point of view and finding a less straightforward data structure that will work faster, with fewer memory accesses.
  3. Finally, go over each of the functions in the code and do the actual optimization.

    -- Look for heavy 'if' statements within loops (see the sketch after this list).

    -- Remove recursion where possible, and eliminate obsolete function calls.

    -- Look for bottleneck functions (use a profiler for that), and try to reduce their heaviest parts.
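
A minimal sketch of the loop point (log and process are placeholders): the 'verbose' flag never changes inside the loop, so the test can be hoisted out of it, leaving one branch in total instead of one per element.

    #include <iostream>
    #include <vector>

    void log(int x)     { std::cout << "log " << x << "\n"; }
    void process(int x) { (void)x; /* real work would go here */ }

    void processAll(const std::vector<int>& data, bool verbose) {
        // Before: for (int x : data) { if (verbose) log(x); process(x); }
        if (verbose) {
            for (const int x : data) { log(x); process(x); }
        } else {
            for (const int x : data) { process(x); }
        }
    }

    int main() {
        processAll({1, 2, 3}, true);
    }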

Hope that helped

Adi
A: 

Actually, Suma, I like your question, because it's important and there are general answers that can make us all better programmers (well, most of us, anyway).

I agree with most of the answers.

I have a favorite general method that I "asked" in another question, and it is contrarian. The reaction was like a buzz-saw. Nevertheless, I've relied on it, in various forms, for 30 years, and so have others, quietly.

It is a polarizing concept, though, because people either like it or hate the idea, mostly the latter, so I won't go into it here.

Let me give a friendly rejoinder to each of your points:

  • ''Set a performance target.'' That's OK, but it doesn't really affect my process of tuning, except maybe to tell me when I've done enough.

  • ''Measure (profile) and optimize based on what you see.'' My method of "profiling" gives very imprecise measurement but very precise diagnosis, so I don't even think of it as measurement. Rather, I think of it as asking what needs to be fixed. Then I optimize, after which a stopwatch suffices to measure the speedup (a minimal stopwatch sketch follows this list).

  • ''Prefer high vs. low level optimization.'' I would prefer to say, locate the most costly code points, and prioritize fixing them by how hard they are to fix vs. how much I would gain (roughly).

  • ''Be sure you understand your target platform.'' Amen.

  • ''Know what premature optimization means.'' This takes experience. To me, it means keeping the code simple and not fixing any performance problem until you actually have one. At the same time, every programmer will claim they do this. Nevertheless, complicated data structures and event-driven architectures are often put in place whose implicit justification is performance, and then, when you optimize, you find out the performance problem is caused by that very architecture.

  • And I would add a point: repeat the process until you're really satisfied with the results.
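
For completeness, the stopwatch from the second bullet as a minimal C++ sketch (workload stands in for whichever code path was just optimized):

    #include <chrono>
    #include <iostream>

    void workload() {  // stand-in for the code under test
        volatile double x = 0;
        for (int i = 0; i < 10000000; ++i) x = x + i * 1e-9;
    }

    int main() {
        const auto start = std::chrono::steady_clock::now();
        workload();
        const auto stop = std::chrono::steady_clock::now();
        std::cout << std::chrono::duration<double>(stop - start).count() << " s\n";
    }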

Mike Dunlavey