This is a canonical question, meant only half-way as a joke. Hopefully, you'll find this before posting any question of the form:

Is something faster than something else?

Hopefully, the answers below will educate you and prevent you from asking any such questions.

Make sure you answer it, bods. I've posted my own thoughts but I've been wrong before (as my wife will attest to frequently, and at the drop of a hat :-).

+31  A: 

There are several mantras that I tend to follow (not dogmatically, of course, since that's the sign of a weak mind).

1/ Measure, don't guess.

Don't presume to think that you know how computers or compilers or even cars work. They've all long since moved beyond the simplicity of the Apple II or my old 1966 Holden HR station wagon (I miss being able to fix my own car without buying some maintenance computer from Toyota).

If you really want to know how fast something is, you have to measure it. Do you think Ferrari put stuff in their Formula 1 cars just on the off-chance it might make them go faster? No (well, I hope not), they design and test and measure and throw away quite a lot of ideas that don't make the grade.

Profile your options then choose the best - that's one of the differences between software engineering and cowboy coding. Document your choices, perhaps even leaving the slower options in your code (unused) if anything ever changes in the future.
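
To make the "measure" point concrete, here is a minimal timing sketch; methodA, methodB, and nsPerCall are placeholder names invented purely for this illustration, using C++'s std::chrono:

#include <chrono>
#include <cstdio>

// Placeholder implementations - substitute the two approaches you
// actually want to compare.
static long methodA() { long s = 0; for (int i = 0; i < 1000; ++i) s += i;     return s; }
static long methodB() { long s = 0; for (int i = 0; i < 1000; ++i) s += i * 2; return s; }

// Average nanoseconds per call over many iterations, so that clock
// resolution and per-call overhead don't swamp the result.
template <typename Fn>
static double nsPerCall(Fn fn, int iterations = 100000) {
    long sink = 0;                                  // keep results "live" so the
    auto start = std::chrono::steady_clock::now();  // optimizer can't drop the work
    for (int i = 0; i < iterations; ++i)
        sink += fn();
    auto stop = std::chrono::steady_clock::now();
    std::printf("(checksum %ld) ", sink);
    return std::chrono::duration<double, std::nano>(stop - start).count() / iterations;
}

int main() {
    std::printf("A: %.1f ns/call\n", nsPerCall(methodA));
    std::printf("B: %.1f ns/call\n", nsPerCall(methodB));
}

Even numbers from a harness like this are only indicative - optimization flags, warm-up, and the data you feed in all move them - which is exactly why you measure on your own workload rather than guess.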

2/ Get it working first, then get it working fast.

Don't optimize something until you're almost certain that it's functionally stable. Otherwise the optimization is wasted when you have to re-engineer it.

3/ Target your optimizations.

There's absolutely no point in getting a 100% performance boost in a piece of code that runs once a month and takes 7 seconds.

Find out where your program is spending most of its time and target that. If you can find a piece of code where you're spending half of the time and you can increase its speed by 10%, that's equivalent to getting a 500% boost for a piece of code where you're only spending 1% of the time.
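
Rough numbers make the point (a back-of-the-envelope, assuming a 100-second run): if the hot piece accounts for 50 of those seconds, speeding it up by 10% gives back somewhere around five seconds overall. A piece that accounts for only 1 second can never give back more than that single second, no matter how extravagantly you optimize it, which is why you go hunting for the 50-second piece first.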

4/ The user is all that matters. Period.

This is an extension of point 3. Why are you writing software? You might be writing it for pure pleasure but most business applications (and plenty of non-business ones) have users.

And they don't care if your application delivers email twice as fast or how long a background DB task takes.

But, if you can speed up the interface so that what used to take 3 seconds now takes one second, they'll elevate you to godhood.

Some of the best improvements I've made have been pseudo-improvements - the task as a whole takes the same amount of time but it looks quicker to the user.

This is where splash screens came from: they made it look like the application was doing something sooner. Initialization wasn't any faster, and the user still had to wait for it to finish, but the user saw the splash screen and thought: "Ah, the application has started".

paxdiablo
I would add 0/ Have a performance goal. Without a performance goal, you can measure all you want, and it will tell you *how* slow or fast you are, but if you don't have a goal, you will never know whether you are *too* slow or fast *enough*.
Jörg W Mittag
@Jorg, don't comment, put in an answer. It could be an easy 15 rep points :-)
paxdiablo
Or else edit the answer. I myself would not mind if Pax gets the credit. After all, he's worked out a good answer to a general question that comes up all too often. To me, Pax is due some 'good behavior' rep points.
David Rodríguez - dribeas
Then again, I don't much care about rep points for myself; as Krugman said: "We all want power, we all want success, but the ultimate reward is the simple joy of understanding."
David Rodríguez - dribeas
All good. I would only quibble with the "measure" part. Measuring is fine for quantifying results, but I think it's a blunt diagnostic tool, like trying to find a leak in a boat by seeing which way it leans. Sampling the call stack pinpoints performance problems.
Mike Dunlavey
About point 2: IMHO, optimization has to be in the very structure of the code, so getting it working first and then adding an optimization layer will never be as efficient, simply because the root code is not optimal.
Lazer
for point 3: shouldn't that 400% boost really be 40%?
Wallacoloo
@Wallacoloo, yes, I think you're right but what's an order of magnitude among friends? :-) I can't believe no-one's picked that up before. Fixed it (hopefully).
paxdiablo
A: 

This question is almost impossible to answer.

http://stackoverflow.com/questions/688325/how-can-adding-code-to-a-loop-make-it-faster

Mark Ransom
+7  A: 
  1. What are we talking about? Two different algorithms can differ markedly in speed, but two ways of writing a loop are unlikely to.

(Unfortunately, it's sometimes possible to introduce algorithmic differences through simple coding choices. Compare these loops processing C-style strings:

for (i = 0; i < strlen(field); ++i)
for (i = 0; field[i] != '\0'; ++i)

where the first one can force a full traversal of "field" on every iteration, just to recalculate strlen(). The first one is O(n^2), the second is O(n); a variant that caches the length once is sketched after this list.)

  2. Is there some absolute reason to prefer one coding style? Consider pre- and post-increment operators in C++. The pre-increment operators are about as intuitive, writeable, and readable as the post-increment operators. On some occasions, they will be more efficient than their post-increment counterparts, and won't be slower. It's best to get into the habit of writing pre-increment rather than post-increment.

  3. Is there a performance issue? Most (not all) software that is run on personal computers is fast enough already. Server software is a different matter: if you can make software run 10% faster, a data center might get by with 10% fewer servers, which is a real savings. Widely used software such as libraries is also different, as the next developer may want to use it in a performance-critical application. There's no point in worrying about method A vs. method B if there is no real-world difference.

  4. Is the performance issue something that faster code execution will help? Frequently, performance problems are related to network issues, or disk issues, or other such things. In this case, method A vs. method B will buy nothing.

  5. How often will method A or method B be executed? Most software has hot spots, in which small segments of the code will take a lot of the execution time, while most of the rest of the code takes fairly little. There is no point in micro-optimizing code that isn't executed a lot. Note that humans are very bad at picking out these hot spots, and humans who think they are good at it are still very bad at it. These hot spots can be found reliably by profiling the code, and in no other way.

  6. Assuming it matters, which is faster? In 1975, it was usually possible to understand exactly how code would be executed, and give a simple general answer. Nowadays, processors operate in very complicated ways, and it's difficult to tell which method is faster by inspection. Performance is often dominated by access time, which translates into page faults and cache misses, so locality of code and data is often the most important thing for performance. The only reliable way to tell if method A or method B is faster is to test them against each other.
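
Picking up the strlen() example from point 1: if you genuinely need the length (so the early-exit form isn't an option), the usual way back to O(n) is to compute the length once, outside the loop. A rough sketch - count_chars is a made-up illustrative function, not code from the answer above:

#include <cstring>
#include <cstddef>

// Same O(n) behaviour as the second loop above: the length is
// computed once rather than on every iteration.
std::size_t count_chars(const char* field) {
    std::size_t len = std::strlen(field);   // one scan of the string
    std::size_t count = 0;
    for (std::size_t i = 0; i < len; ++i)
        ++count;                            // stand-in for real per-character work
    return count;
}

(This is only safe as long as nothing inside the loop changes the length of field, which is exactly the sort of assumption worth writing down in a comment.)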

David Thornley
Sorry, dead horse time again. Regarding your first line of code above, and point 5, that's an example of a separation between the performance bug and the hot spot. gprof will show a hot spot in strlen, but that's not what needs to be fixed. It may or may not show, depending on where else strlen is called from, a high measure on the "for" statement. However, if that line of code accounts for (say) 75% of time, then stack samples (75% of them) will nail it.
Mike Dunlavey
... In my experience, _most_ of the time, hot spots and performance bugs are not the same thing. It depends on the definition of "hot spot". If it includes addresses that are on the call stack a high % of time, then it is a far more useful concept.
Mike Dunlavey
Last time I used gprof, I got two sets of numbers for functions: amount of time spent in that specific function, and amount of time spent in that function and its downstream callees, with immediate downstream breakdowns. It does take the call stack into account.
David Thornley
Right. The "inclusive" time (if it handles recursion right) should be the same as the fraction of stack samples containing that function. Now, if only it did the same thing for call sites, it could pinpoint that call to strlen.
Mike Dunlavey
+4  A: 

Good answers and non-answers.

I would only add my pet peeves. These are great downvote-bait because they question common beliefs:

  • being penny-wise-pound-foolish
    Talking about whether ++i or i++ is faster is (most of the time) like asking if a car is made faster by having 4 lug-nuts per wheel rather than 5. Not to deny that there may be a level at which this makes a noticeable difference.

  • talking about "better algorithms"
    Certainly a bad algorithm can make a program slow, but it's far from the only way. In my experience the biggest one is galloping generality. It is perfectly fine to say "Let's use our standard template libraries, with layer upon layer of bags, collections, iterators, and whatnot." As long as the job being done by the program only takes a fraction of a second, it's mainly an issue of style. But when the data load gets big, and that CPU racehorse has a 1000 km to run, suddenly you find you're paying big time for all those lovely abstractions. Then the whole thing gets tossed to somebody like me to straighten it out, and with luck a lesson is learned:

    Computers execute instructions, not abstractions.

  • talking about "measuring"
    Measuring is fine for showing that there may be performance problems. For precisely locating such problems it is a fuzzy microscope.

    Sure it can be done, by progressively narrowing the search from upper-level routines down to lower-level ones, without losing track of what it should be doing as opposed to what it is. Or look at how many times functions are invoked and try to decide if it is too many or not. Either way, it's educated guessing.

    Suppose the program is 10 times slower than necessary. That means 90 out of every 100 cycles are being spent unnecessarily. If I take a random-time sample of the call stack, chances are 90% that it will be on one of those cycles.

    I can tell if the cycle is necessary by examining every call on the call stack. If they are all really necessary, then the cycle is necessary. If any one of them can be eliminated, that cycle (and others like it) will be saved.

    I take multiple samples, just to be sure the problems I see are real. In fact, if I see a particular call on multiple samples, then the time I will save by eliminating it is approximately the percentage of samples it is on (a worked example follows this list).

    The key difference between this and measuring is that I'm locating the problem with precision (because it's on the call stack) and determining its necessity unambiguously. To precisely quantify the problem is a separate issue from locating it so it can be fixed.

    To give a medical analogy, it is like the difference between finding tumors by X-ray, versus measuring body parts.

    There is more detail here.
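
To put numbers on that sampling argument: suppose 10 random-time stack samples are taken and a particular call shows up on 6 of them. The estimate is that the call accounts for roughly 60% of the run time, so eliminating it should save on that order. With so few samples the estimate is crude, but it doesn't need to be precise: anything that appears on a large fraction of even a handful of samples is, with high probability, big enough to be worth chasing.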

Mike Dunlavey
Sorry, but your last point is utter nonsense. The point of measuring/profiling is to get hard data rather than doing guesswork. It has nothing whatsoever to do with impressing anyone. And your "method" is nothing but a crude manual form of profiling - have you ever actually used a real profiler? It sounds like you haven't.
Michael Borgwardt
Remember, _for finding problems_ (not for quantifying performance), the task is to pinpoint the exact lines and instructions creating the cost. One might like to know the cost to 3 decimal places, but what one Really needs to know is _where is it exactly_, so it can be fixed. Otherwise one is only making informed guesses.
Mike Dunlavey
That's what the profiler is for: to find out what code is taking the most time. Then you measure to see if what you did helped, hurt, or did nothing. You are correct that precise measures are unimportant, but if you don't profile you'll mess with the wrong thing, and if you don't measure you're guessing about the results.
David Thornley
Hi David: It's a common misconception among people who haven't tried it that sparse sampling will find the wrong thing, but if it shows something on multiple samples, and you eliminate it, it _will_ speed up. You can also be sure that any significant problem will be shown, and without a large number of samples. What's more, any problem you miss will be found the 2nd or 3rd time you try it. This idea often stems from a preconception of what the problem is.
Mike Dunlavey
... measuring is fine to see how much was saved, but it does not pinpoint the problem. Profilers (some) could do this - but they don't. It's not a hard concept. I built one once: http://stackoverflow.com/questions/368831/how-to-write-a-profiler/738647#738647. It did the job, but I still didn't think it beat the manual method, because in the manual method you can examine context.
Mike Dunlavey
Sparse sampling *is* profiling, however you do it, and a good profiler like gprof will give you all the context you need. Measuring is necessary because you need at least a halfway decent idea of how much you've improved or degraded performance with your approach. I think we're arguing about technique here, rather than anything basic.
David Thornley
+1  A: 

Modification to Pax's answer item #1:

1a. Measurement is crucial, but overly simple measurement can be a trap.

Whether A or B is faster will often depend on the data on which each will operate.

If this is the case for your particular comparison, then the revised question is: "Under what circumstances is method A faster than method B?" At that point some thought must be devoted not only to algorithmic efficiency and boundary cases, but to the distribution of data sets that actual usage of the methods will bring.

John Pirie
A: 

I think there's a valid argument for habitual optimisation, which is a special, desirable case of premature optimisation. This is where, if there are two or more ways of doing some common operation, you find out whether one of them is always more efficient/correct and make a habit of using that one. While this is indeed premature optimisation, the cost/benefit ratio is much better, as it's a think-about-it-once, benefit-forever system.

A perfect example is pre-increment versus post-increment in C(++). In some cases on some compilers, a post-increment will be slightly slower than a pre-increment. Has replacing a post-increment with a pre-increment ever significantly improved the performance of a piece of software? Probably not. But if you get into the habit of always using a pre-increment (except where post-incrementing is actually needed, of course), you can be sure that your code never wastes cycles on unnecessary post-increments, and it costs you nothing.
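
Where a difference exists at all, it comes from the copy that post-increment is obliged to make. A rough sketch with a hand-rolled iterator-like type (Cursor is invented purely for illustration, not from any particular library):

struct Cursor {
    int pos = 0;

    // Pre-increment: advance in place and hand back *this - no copy.
    Cursor& operator++() {
        ++pos;
        return *this;
    }

    // Post-increment: must squirrel away a copy of the old state to
    // return, even if the caller immediately throws that copy away.
    Cursor operator++(int) {
        Cursor old = *this;
        ++pos;
        return old;
    }
};

For a plain int the compiler will almost certainly erase the difference, and for something this small it probably will too; the habit pays off (or at least never costs anything) when the type is heavier and the copy isn't free.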

In Qt, it is more efficient to do someString.isEmpty() than someString == "". Does this ever make a big difference? Probably not, but if you make a habit of using the former, then you're getting a tiny performance boost in several places, and it costs you nothing, not even a conscious thought. There are plenty of examples of this in lots of languages/libraries.
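
In code the habit is just as mechanical; a tiny sketch (the greet function and its context are invented for illustration):

#include <QString>

void greet(const QString& someString) {
    if (someString.isEmpty()) {   // checks the stored length directly
        // handle the empty case
    }
    // rather than: if (someString == "") { ... }, which goes through
    // a character comparison against the literal
}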

Of course premature optimisation is dangerous, but most of the danger comes from wasted developer time and sacrifices to code readability. But if you learn a small, clear-cut, readable optimisation that can be used frequently, why wouldn't you?

Parker