tags:

views:

78

answers:

5

This is a dangerous question, so let me try to phrase it correctly. Premature optimization is the root of all evil, but if you know you need it, there is a basic set of rules that should be considered. This set is what I'm wondering about.

For instance, imagine you got a list of a few thousand items. How do you look up an item with a specific, unique ID? Of course, you simply use a Dictionary to map the ID to the item.

And if you know that there is a setting stored in a database that is required all the time, you simply cache it instead of issuing a database request hundred times a second.

Or even something as simple as using a release instead of a debug build in prod.

I guess there are a few even more basic ideas.

I am specifically not looking for "don't do it, for experts: don't do it yet" or "use a profiler" answers, but for really simple, general hints. If you feel this is an argumentative question, you probably misunderstood my intention.

I am also not looking for concrete advice in any of my projects nor any sophisticated low level tricks. Think of it as an overview of how to avoid the most important performance mistakes you made as a very beginner.

Edit: This might be a good description of what I am looking for: Create a presentation (not a practical example) of common optimization rules for people who have a basic technical understanding (let's say they got a CS degree) but for some reason never wrote a single line of code. Point out the most important aspects. Pseudocode is fine. Do not assume specific languages or even architectures.

A: 

Are your algorithms correct for the situation or are there better ones available?

graham.reeds
+4  A: 

Two rules:

  • Use the right data structures.

  • Use the right algorithms.

I think that covers it.

anon
It indeed seems to cover the topic, but I'd love a _little_ more detail :)
mafutrct
+2  A: 
  • Minimize the number of network roundtrips
  • Minimize the number of harddisk seeks

These are several orders of magnitude slower than anything else your program is likely to do, so avoiding them can be very important indeed. Typical methods to achieve this are:

  • Caching
  • Increasing the granularity of network and HD accesses

For example, B-Trees are absolutely ubiquitous in DB systems because the reduce the granularity of HD access for on-disk index lookups.

Michael Borgwardt
I'd add something like "unnecessary" or "Cache the results of frequent xxx"?
mafutrct
+1  A: 

I think something extremely important is to be very carefully on all code that is frequently executed. This is normally the code in critical inner loops.

Rule 1: Know this code

For this code avoid all overhead. Small differences in runtime can make a big impact on the overall performance. E.g. if you implement an image filter a difference of 0.001ms per pixel will make a difference in 1s in the filter runtime on a image with size 1000x1000 (which is not big).

Things to avoid/do in inner loops are:

  • don't go through interfaces (e.g DB queries, RPC calls etc)
  • don't jump around in the RAM, try to access it linearly
  • if you have to read from disk then read large chunks outside the inner loop (paging)
  • avoid virtual function calls
  • avoid function calls / use inline functions
  • use float instead of double if possible
  • avoid numerical casts if possible
  • use ++a instead of a++ in C++
  • iterate directly on pointers if possible

The second general advice: Each layer/interface costs, try to avoid large stacks of different technologies, the system will spend more time in data transformation then in doing the actual job, keep things simple.

And as the others said, use the right algorithm, try to optimize the algorithm complexity first before you optimize the algorithm implementation.

Arno
While things like ++a vs a++ are way (_way!_) to specific, I upvoted for the inner loop and large stack ideas.
mafutrct
+1  A: 

I know you're looking for specific coding hints, but those are easy to find: cacheing, loop unrolling, code hoisting, data & code locality, blah, blah...

The biggest hint of all is don't use them.

Would it help to make this point if I said "This is the secret that the almighty Powers That Be don't want you to know!!"? Pick your Powers: Microsoft, Google, Sun, etc. etc.

Don't Use Them

Until you know, with dead certainty, what the problems are, and then the coding hints are obvious.

Here's an example where many coding tricks were used, but the heart and soul of the exercise is not the coding techniques, but the diagnostic technique.

Mike Dunlavey
The question excluded both the need for optimization and too specific tricks outlined in the link. I was really just asking for the most basic ideas at all.
mafutrct
@mafutrct: The most basic idea of all is this: manually interrupt (pause) the program while it is running, and examine what it is doing and why. The call stack, and its state in general will tell what it is doing and why. (There are some types of code where this does not work, but they are few.) Do this several times. The points to optimize will be very obvious, and the % of times they appear is the % of time they could save. It's very simple. When you know where the problem is, your brain will figure out how to optimize it, guaranteed.
Mike Dunlavey
@mafutrct: If you browse around on SO, you will find a small number of people who have discovered this. I discovered it about 1978 on a minicomputer, the RDS-500. I've used it on 8080, 8086, 8088, 68K, Vax, x86, Unix, Windows, all languages including Java and C#. It never fails. People flounder around and bluster with profilers, while randomly halting is low-tech but nails the problems immediately every time. People need to know this because, darn it, it works. So the secret of optimization is not particular coding tricks, it is *how to find the problems.*
Mike Dunlavey
Your explanation is indeed valid and very interesting, but it still misses the point of the question. (edited question to include better example)
mafutrct
@mafutrct: If you think I'm not answering the question, that's OK. Simple as it sounds, this idea is very hard to communicate. I've had former clients call me and practically beg me to give them some hints of how to optimize their systems. I tell them I can guess about as well as anyone - badly. I tell them it's not about coding, it's about diagnosis. I tell them precisely how, but it's not what they want to hear.
Mike Dunlavey