views:

1995

answers:

24

Could someone point me to an article, or write some tips right here, about C++ programming habits that are generally valid (no real drawbacks) and improve performance? I do not mean programming patterns and algorithm complexity - I mean small things like how you define your functions, things to do/avoid in loops, what to allocate on the stack, what on the heap, and so on.

It's not about making a particular piece of software faster, and it's not about how to create a clean software design, but rather about programming habits that - if you always apply them - make your code a little bit faster rather than a little bit slower.

Thanks :)

+13  A: 

One of the good starting points is Herb Sutter's Guru of the Week series, and the Exceptional C++ books that grew out of it.

Nikolai N Fetissov
+2  A: 

Here's a list I've referred to in the past - http://www.devx.com/cplus/Article/16328/0/page/1. Beyond that, Googling c++ performance tips yields quite a bit.

Tim Hardy
+21  A: 

A number of the tips in Effective C++, More Effective C++, Effective STL and C++ Coding Standards are along this line.

A simple example of such a tip: use preincrement (++i) rather than postincrement (i++) when possible. This is especially important with iterators, as postincrement involves copying the iterator. Your optimizer may be able to undo this, but it isn't any extra work to write preincrement instead, so why take the risk?
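
To make the habit concrete, here is a minimal sketch of a typical pre-C++11 loop; the function name and the use of a vector of ints are invented for illustration:

#include <vector>

void count_up(const std::vector<int>& v)
{
    int total = 0;
    // ++it simply advances the iterator; it++ would have to construct and
    // return a copy of the old iterator that is then thrown away.
    for (std::vector<int>::const_iterator it = v.begin(); it != v.end(); ++it)
        total += *it;
    (void)total; // the total isn't used further in this sketch
}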

Laurence Gonsalves
I've heard some refer to the practice of writing less efficient code that isn't any easier to read/write as "premature pessimization".
Laurence Gonsalves
I'm not sure why so many people recommend preincrementing for optimization. It's simple, sure, but it's also a micro-optimization that only becomes noticeable if you are doing huge amounts of looping. The other downside to preincrementing is that it can also occasionally lead to logic errors in expressions that are difficult to find, because they would not generate an error or a warning.
KingRadical
I use it not because it's faster, but because I find it more expressive. "Increment i" is written "++i". Postincrementing is just as likely to lead to logic errors as preincrementing - the problem is if you use the wrong one, you get the wrong answer.
Steve Jessop
that's exactly the type of tips i meant - thanks for that. I didn't know about this yet - and since it doesn't hurt to write ++i instead of i++ as long as it's logically correct i try to make this a habit
genesys
This tip cannot be overemphasized enough. In my 'for' loops, I used to use i++. After some thought, I said to myself, "Self, does this not add overhead by creating a temporary int?" And of course, you can guess my answer to myself.
George Edison
@George, actually, for primitive types I believe it is guaranteed not to create a temporary. Even if it isn't guaranteed by the standard, in practice every compiler you're likely to encounter will make the optimization for primitives (it's merely a matter of where it puts the increment instruction in the output machine code)
rmeador
I personally HATE seeing ++i. For me, I like seeing the object that is being modified to the left of the operator modifying it. On modern compilers it's going to make no difference in any case.
Billy ONeal
@George as rmeador says, primitives do not create a copy when using postincrement/decrement over pre. Generally, if you are in doubt, try looking at the assembly the compiler generates; it should help answer your question. Furthermore, the compiler is an extremely advanced tool created by some of the people who know the most about the language and optimizations, so trust that it can optimize simple things like this.
TommyA
Compilers can optimize `i++` to be just as efficient as `++i` for primitives, but not for iterators, and consistency is good.
Pavel Minaev
@BillyONeal, try reading pre-increment like `++i` this way: "increment i", then compare it to the `i++`: "i increment". Which sounds more adequate? Hope now you hate that form of increment less. :)
Dmitry
I use pre-increment in C++ in these types of for loops (for various reasons, consistency with what others expect actually being more relevant than the tiny/negligible performance difference); however, I think how Go makes it post-fix and a *statement* (therefore, no need for the pointless debate *and* get the better syntax) is an interesting take on the matter.
Roger Pate
Well, I'd say no need for debate and you get the worse syntax, but yes, if it's not an expression then there's no performance issue to argue about. Likewise in x86 assembler it's `inc eax`, which is also a "statement" not an expression ;-)
Steve Jessop
@Roger Pate: Lots of problems in C and related languages would have been avoided by making certain constructs into statements - specifically, anything that alters a value.
David Thornley
@David Thornley: agreed. It wouldn't surprise me if 90% or more C/C++ programmers aren't even aware of the concept of "sequence points", which suggests that it was a bad idea to make side-effects in expressions so prevalent. I remember when I used to read comp.lang.c (way back when) one of the most common questions was "what does i = i++ mean?". I swear that some variant of that was asked at least once per week. (The answer is that it's undefined as i is being mutated twice with no intervening sequence point.)
Laurence Gonsalves
Steve: True, better is subjective, but `eggs->and.bacon(are).yummy[3]->food++` is considerably more obvious for me. -- Laurence: my understanding is sequence points are disappearing in 0x in favor of "is-sequenced-before" semantics, to throw another wrench in the works.
Roger Pate
Why didn't they call it ++C hehehe :D
Partial
@Roger: OK, although if I was using pre-increment with that I'd always put in the unnecessary brackets anyway: `++(eggs ... food)`, which I think makes it obvious. If you can't remember what you're doing from one end of a code line to the other, the code lines are too long, and your example is deliberately borderline even without an increment. @Partial: on those grounds it should be called (C+1). It doesn't modify C every time you use C++ ;-)
Steve Jessop
A: 

Prefer to use pre-increment.

With int/pointer types it makes no difference.
But with class types the standard way of implementing post-increment requires the creation of a new (temporary) object.

So prefer pre-increment, just in case the types are changed at a later point.
Then you will not need to modify the code to cope.
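
As a sketch of why class types pay for post-increment, here are the canonical operator overloads (Counter is an invented example type, not something from the question):

class Counter
{
public:
    Counter() : value_(0) {}

    // pre-increment: just modify and return *this
    Counter& operator++()
    {
        ++value_;
        return *this;
    }

    // post-increment: must construct and return a copy of the old state
    Counter operator++(int)
    {
        Counter old(*this);   // the extra object mentioned above
        ++value_;
        return old;
    }

private:
    int value_;
};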

Martin York
pre-incrementing is a micro-optimization that won't show much change unless you have loops that run millions of times.
KingRadical
`++i` makes a difference if only a semantic one when `i` is an `iterator`
Gregory Pakosz
@KingRadical: Some people have loops that do run millions of times, and in any case pre-increment is just as easy to write as post-increment, works a little better in some cases, and never works worse.
David Thornley
@King: This is not a macro optimization. It is preferred style in C++ (because it will reduce code modifications when types change). Read any of Scott Meyers' books. But yes, you are correct (as I also pointed out in the answer above) in that unless the object is heavy or the loop runs a lot it will not make much difference. The point is that when the object is changed from something light to something heavy you will not need to refactor the code.
Martin York
also, post incrementing when you don't need the old value is a semantic error. It tells the compiler (and future maintainers) that you need the old value for something. If you don't need that old value, don't tell everyone you do.
Bill
In the best case the difference between `i++` and `++i` is the position of a single instruction. A good compiler shouldn't generate additional overhead.
envalid
@envalid: You are of course assuming int/pointer and not the situation where ++ is an overloaded operator and thus in reality a method call.
Martin York
+2  A: 

I got into the habit of writing ++i rather than i++; not that it brings any performance boost when i is an int, but things are different when i is an iterator, which might have a complex implementation.

Then, let's say you come from the C programming language: lose the habit of declaring all your variables at the beginning of the function. Declare your variables when they are needed in the function flow, since the function might contain early return statements that are reached before some of the variables initialized at the beginning are ever used.
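
Here is a minimal sketch of that idea; Widget and its constructor cost are invented stand-ins for any non-trivial object:

struct Widget
{
    Widget() { /* possibly expensive setup */ }
    void doWork() {}
};

// C-style habit: the Widget is constructed even when the early return is taken.
void process_old_style(bool nothingToDo)
{
    Widget w;
    if (nothingToDo)
        return;        // w's construction was wasted
    w.doWork();
}

// Declare at the point of use: nothing is constructed on the early-return path.
void process(bool nothingToDo)
{
    if (nothingToDo)
        return;
    Widget w;
    w.doWork();
}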

Apart from that, another resource is C++ Coding Standards: 101 Rules, Guidelines, and Best Practices by Herb Sutter (him again) and Andrei Alexandrescu.

There is also a more recent edition of Scott Meyers' Effective C++: Effective C++: 55 specific ways to improve your programs and designs.

Finally, I would like to mention Tony Albrecht's Pitfalls of Object Oriented Programming presentation: not that it contains rules of thumb you can follow blindly but it's a very interesting read.

Gregory Pakosz
Having many returns in one function can be problematic...
Partial
maybe I chose a poor example, but the point is to declare variables and initialize them only at the point when they are used, not 200 lines above -- now you can argue that 200 lines long functions can be problematic ;)
Gregory Pakosz
I'd say for the most cases that it doesn't matter. Generally many of these small optimizations are more destructive in the development process than they are useful. The modern compilers can easily find out how to delay initialization of variables until they are used. The most important thing is to keep the code clean and simple (readable).
TommyA
"The modern compilers can easily find out how to delay initialization of variables until they are used." - compilers cannot rearrange initializations when the latter cause observable side effects. One particular case of an observable side effect is memory allocation using `new` (consider overloaded operator etc), so if a constructor of some class calls `new` at any point, it usually won't be delay-initialized.
Pavel Minaev
+17  A: 

If I understand you correctly, you're asking about avoiding Premature Pessimization, a good complement to avoiding Premature Optimization. Based on my experience, the #1 thing is to avoid copying large objects whenever possible. This includes:

  • pass objects by (const) reference to functions
  • return objects by (const) reference whenever practical
  • make sure you declare a reference variable when you need it

This last bullet requires some explanation. I can't tell you how many times I've seen this:

class Foo
{
    const BigObject & bar();
};

// ... somewhere in code ...
BigObject obj = foo.bar();  // OOPS!  This creates a copy!

The right way is:

const BigObject &obj = foo.bar();  // does not create a copy

These guidelines apply to anything larger than a smart pointer or a built-in type. Also, I highly recommend investing time learning to profile your code. A good profiling tool will help catch wasteful operations.

Kristo
Have you heard of copy elision?
Inverse
so what's the right way to do it without creating a copy? (in your example)
genesys
@Inverse: I've heard of it, but I wouldn't expect it to happen between compilation units.
Mike Seymour
@Mike: RVO doesn't require inlining (after all, it's semantically equivalent to passing a pointer to target location from the caller, and doing a placement new on that in the callee), and thus isn't in any way affected by translation unit boundaries.
Pavel Minaev
@genesys, the right way is to not declare an object when you only need the reference. It's a simple oversight but it adds up fast when it shows up in frequently called code.
Kristo
@genesys: just turn obj into a reference. Deciding whether to use a reference or not depends only on whether you need to modify the object that bar() is returning or you just need to see the state of the BigObject. If you only need read permissions, make sure the reference is **const**. If you need to modify a copy of what bar() returns, don't use a reference.
the_drow
why is it necessary to make sure the reference is const?
genesys
@genesys, the references don't have to be `const`. It's just that they usually are, so that's how I chose to write the example code. Since you mentioned real-time code, I would place bigger emphasis on **profiling** than on worrying about the language lawyering going on in the comments here.
Kristo
of course - i don't expect to get any significant performance increase from applying any of those tips. i just think a serious programmer should - wherever it has no drawback - write code in the most performant way possible
genesys
+2  A: 

I would suggest reading chapter II ("Performance") of "Programming Pearls" by Jon Bentley. It's not C++ specific, but those techniques can be applied in C or C++ as well. The website contains only parts of the book; I recommend reading the whole book.

Doc Brown
+3  A: 

Avoid iterating over the same dataset multiple times as much as possible.

KingRadical
+10  A: 

The "Optimizing Software in C++" by Agner Fog is generally one of the best references for optimization techniques both simple, but definitely also more advanced. One other great advantage is that it is free to read on his website. (See link in his name for his website, and link on paper title for pdf).

Edit: Also remember that 90% (or more) of the time is spent in 10% (or less) of the code. So in general, optimizing code is really about pinpointing your bottlenecks. Furthermore, it is important and useful to know that modern compilers will do optimization much better than most coders, especially micro-optimizations such as delaying initialization of variables, etc. Compilers are often extremely good at optimizing, so spend your time writing stable, reliable and simple code.

I'd argue that it pays to focus more on the choice of algorithm than on micro optimizations, at least for the most part.

TommyA
From my experience, more effort should be spent in a program's correctness and robustness than performance. A high performance program that often locks up or crashes is of little value compared to a slower program that rarely locks up or crashes.
Thomas Matthews
You should aspire to both correctness and high performance. People always pit the two against each other as if you can't have both, but you can and should.
Dan Olson
@Dan I completely agree with you. But I've seen lots of code where people in good faith have tried to optimize the code, and where the only result has been making it more difficult for the compiler to optimize the code. I think the one place where you can really boost your program's performance is by choosing your design and algorithm carefully. Of course, a profiler is needed to identify the 10% of the code that takes 90% of the time, and to find out if it can be optimized further (and how).
TommyA
Your program should be adequately correct (if it's completely correct, it's either way expensive or trivial). It should have enough performance on what your target audience would consider a low-end system. You should have the wisdom not to mess up things for the compiler.
David Thornley
+5  A: 

Here is a nice article on the topic: How To Go Slow

Nemanja Trifunovic
+12  A: 

Use functors (classes with operator() implemented) instead of function pointers. The compiler has an easier job inlining the former. That's why C++'s std::sort tends to perform better (when given a functor) than C's qsort.
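
A minimal sketch of the difference; the descending-order comparison and the function names are invented for illustration:

#include <algorithm>
#include <cstdlib>
#include <vector>

// C style: qsort receives a function pointer and must call through it for every comparison.
int compare_desc(const void* a, const void* b)
{
    int lhs = *static_cast<const int*>(a);
    int rhs = *static_cast<const int*>(b);
    if (lhs > rhs) return -1;
    if (lhs < rhs) return  1;
    return 0;
}

// C++ style: a functor whose operator() body the compiler can see and inline.
struct DescendingOrder
{
    bool operator()(int lhs, int rhs) const { return lhs > rhs; }
};

void sort_descending(std::vector<int>& v)
{
    // std::qsort(&v[0], v.size(), sizeof(int), compare_desc);  // C way, indirect calls
    std::sort(v.begin(), v.end(), DescendingOrder());           // comparison typically inlined
}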

Fred
`std::sort` to a large extent also does better because it can also avoid the need for extra indirection (passing addresses of two objects being compared as `void*`, rather than objects themselves, even where a plain copy is cheaper).
Pavel Minaev
A: 

The best way to improve these skills is reading books and articles, but I can help you with some tips:

  • 1- Accept objects by reference and primitive or pointer types by value, but use an object pointer if a function stores a reference or a pointer to the object.
  • 2- Don't use MACROS to declare constants -> use static const.
  • 3- Always implement a virtual destructor if your class may be subclassed.
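
A minimal sketch of tips 2 and 3; MaxUsers, Base and Derived are invented names:

// Tip 2: prefer a typed constant to a macro.
// #define MAX_USERS 64             // avoid
static const int MaxUsers = 64;     // typed, scoped, and visible in the debugger

// Tip 3: a virtual destructor, so that deleting through a base pointer
// also destroys the derived part.
class Base
{
public:
    virtual ~Base() {}
};

class Derived : public Base
{
public:
    ~Derived() { /* release Derived's resources */ }
};

void destroy_through_base()
{
    Base* p = new Derived;
    delete p;   // calls ~Derived() and then ~Base() because ~Base() is virtual
}
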
Drewen
why not use macros? i thought those are a compile-time thing and should therefore be the fastest, no? and what's the idea behind the virtual destructor? this will automatically add a vft pointer even if it's not needed, no?
genesys
There are good reasons to avoid macros and implement virtual destructors when you need them, but they have nothing to do with performance.
Mike Seymour
1- A "#define is expanded by the preprocessor while constants are processed by the compiler and are included in the symbolic information used to debug the code.2- If your class will be subclassed declare a destructor virtual to properly destroy derived objects thought base pointers.
Drewen
2 and 3 look like good ideas, but I don't see how they relate to performance.
Max Lybbert
A: 
  • Avoid multiple inheritance.
  • Use virtuals when they are necessary, not just for fun
  • Use templated collection classes only when it's a pain not to
Alex Brown
Multiple inheritance should not necessarily be avoided. That is a belief that comes from Java, C# and probably some other languages that have the philosophy of limiting what a programmer may do for the sake of safety. If you know what you are doing in C++, there is nothing wrong with it.
Partial
why avoid multiple inheritance? and what's the problem with templated collections (except for the data overhead compared to a simple array)?
genesys
Multiple inheritance can have a small impact on performance, as the `this` pointer sometimes needs to be adjusted to point to different base classes, and member functions are sometimes called through wrappers. But it's certainly not enough to warrant influencing how you design your class hierarchy.
Mike Seymour
+4  A: 

Templates! Using templates can reduce the amount of code because you can have a class or function/method that can be reusable with many datatypes.

Consider the following:

#include <string>
using std::basic_string;

template <class T>
void CreateString(basic_string<T> s)
{
    //...
}

The basic_string could be instantiated with char, wchar_t, or another character type.

Templates can also be used for a panoply of different things, such as traits, class specialization, or even to pass an int value to a class!

Partial
Templates may also cause code-bloat, since they are code stencils. The compiler will generate similar code for each data type.
Thomas Matthews
and code bloat can cause cache misses, which is the chief way that it interferes with performance.
rmeador
"Templates may also cause code-bloat ..." - The same can be said about non-template code. In both cases it all depends on how you use/implement the code in question.
Void
@Thomas Matthews: Using templates will for sure take more time at compilation but will be faster at execution.
Partial
+6  A: 

A few of my pet peeves:

  1. Don't declare (actually, define) object variables before their use/initialization (as in C). This necessitates that both the constructor AND the assignment operator will run, which for complex objects could be costly.
  2. Prefer pre-increment to post-increment. This will only matter for iterators and user-defined types with overloaded operators.
  3. Use the smallest primitive types possible. Don't use a long int to store a value in the range 0..5. This will reduce overall memory usage, improving locality and thus overall performance.
  4. Use heap memory (dynamic allocation) only when necessary. Many C++ programmers use the heap by default. Dynamic allocations and deallocations are expensive.
  5. Minimize the use of temporaries (particularly with string-based processing). Stroustrup presents a good technique for defining the logical equivalent of ternary and higher-order arithmetic operators in "The C++ Programming Language."
  6. Know your compiler/linker options. Also know which ones cause non-standard behavior. These dramatically affect runtime performance.
  7. Know the performance/functionality tradeoffs of the STL containers (e.g., don't frequently insert into a vector, use a list).
  8. Don't initialize variables, objects, containers, etc., when they are going to be unconditionally assigned to.
  9. Consider the order of evaluation of compound conditionals. E.g., given if (a && b), if b is more likely to be false, place it first to save the evaluation of a.
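
A minimal sketch of tip 9; the two predicates are invented placeholders:

bool usuallyFalseCheck() { return false; }  // stands in for a test that rarely passes
bool expensiveCheck()    { return true;  }  // stands in for a costly test

void handle()
{
    // Put the test that is most likely to be false (and cheap) first, so that
    // && short-circuits and the expensive test is usually skipped entirely.
    if (usuallyFalseCheck() && expensiveCheck())
    {
        // ...
    }
}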

There are many other "bad habits" which I won't mention, because in practice, modern compilers/optimizers will eliminate the ill effects (for example, return value optimization vs. pass-by-reference, loop unwinding, etc.).

I disagree with #8 on the grounds of software maintenance. Just because something was originally unconditionally assigned to doesn't mean it always will be. For the vast majority of cases, initializing your variables is a very good idea.
luke
I disagree with #3. If a processor's native word is 32 bits, it is optimized to handle 32 bits; whether you use all of them or less. Some processors may choke or gag when trying to access variables of less size. For example, an ARM processor in 8 | 32 bit mode (8-bit chars, 32 bit integers) has a hard time with 16-bit quantities. So a short int (16 bits) actually may slow down the processor compared to an int (32-bits). Use integers and only short's when necessary.
Thomas Matthews
I can't speak to ARM processors, but on x86s, you'll only run into a problem like that if your variable spans a word boundary in memory. This is more likely to happen when you use variables that aren't the word size. Of course, most compilers will align variables in memory to the word size, thus avoiding the issue (and potentially using the same amount of memory as it would have if you'd just used word-sized variables in the first place).
rmeador
+1 for mostly good advice, though I also disagree with #3. Although you'll need to go back in time at least 10 years to find an ARM that doesn't handle 16-bit quantities.
Mike Seymour
@Mike: But all arithmetic is done on int-sized objects. So anything that is smaller needs to be converted into an int. Normally I would not expect a major (or any) hit because of this, but on some systems this may take two instructions rather than one to load a value and seat it correctly because of memory alignment (i.e. registers can only read 32-bit quantities aligned to 32-bit boundaries, thus loading a 16-bit value aligned to a 16-bit boundary is a load followed by a shift). Yes, rare, but a potential. Thus unless you have a real specific situation, prefer int for normal/small-range numbers.
Martin York
@Martin: you're right - that's why I disagree with #3.
Mike Seymour
for #8, maybe provide a copy constructor for that class and initialize that object using the copy constructor, so assigning the object won't be necessary. Additionally, your hint #8 infringes on the RAII paradigm.
smerlin
#3 is just wrong. In almost every case, smaller units than ints are going to hurt performance. Otherwise, good list.
jalf
My motivation for #3: I've worked with data-intensive applications where every floating point value was stored as a double regardless of its possible magnitude. In this case, I suppose two memory reads would be necessary to get a 64-bit double, where a 32-bit float would have sufficed. My primary concern was overall run-time memory usage/footprint. Even in the integer case, and I understand the counter-arguments, if arithmetic operations on these values are seldom performed, the savings on page faults would far outweigh the extra arithmetic instructions.
@luke: since the OP asked about overall performance considerations, I'm assuming that performance is somewhat important. Almost any practice that increases performance will in some way hurt one of the other -ilities. I tend to lean against making concessions for amateur programmers. You shouldn't be modifying any use of a variable that could leave it uninitialized without first confirming whether that would be incorrect.
+3  A: 

Unless you're really sure another container type is better, use `std::vector`. Even if `std::deque`, `std::list`, `std::map` etc. seem like more convenient choices, a vector outperforms them in both memory usage and element access/iteration times.

Also, prefer using a container's member algorithms (e.g. `map.equal_range(...)`) instead of their global counterparts (`std::equal_range(begin(), end(), ...)`).
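
A minimal sketch of why the member version matters, using an invented phone-book map:

#include <map>
#include <string>

void lookup(const std::map<std::string, int>& phoneBook)
{
    // The member find() walks the map's internal tree: O(log n).
    std::map<std::string, int>::const_iterator hit = phoneBook.find("Alice");

    // A global algorithm such as std::find or std::equal_range only sees a pair
    // of iterators, so it cannot exploit the tree and degrades to a linear walk.
    if (hit != phoneBook.end())
    {
        // use hit->second ...
    }
}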

Viktor Sehr
+4  A: 

I like this question because it is asking for some "good habits". I have found that certain things that are desirable in programming are initially a chore, but become acceptable and even easy once they become habits.

An example is always using smart pointers instead of raw pointers to control heap memory lifetime. Another, related of course, is developing the habit of always using RAII for resource acquisition and release. Another is always using exceptions for error handling. These three tend to simplify code, thereby making it smaller and so faster, as well as easier to understand.

You could also make getters and setters implicitly inline; always make full use of initializer lists in constructors; and always use find and the other related functions that are provided in the std library, instead of crafting your own loops.

Not specifically C++, but it is often worthwhile to avoid data copying. In long-running programs with a lot of memory allocation it can be worthwhile to consider memory allocation as a major part of the design, so that the memory you use comes from pools that are reused, although this is not necessarily a common enough thing to be considered worthy of forming a habit.

One more thing - do not copy code from one place to another if you need the functionality - use a function. This keeps code size small and makes it easier to optimize all the places that use this functionality.

Permaquid
A good answer, but could use some paragraph breaks :)
Thomas
I'll have to try to cultivate that habit.
Permaquid
+1  A: 

Lots of good suggestions here already.

One of the best ways to get into good habits is to force them on yourself. For this I love PC-Lint. PC-Lint will actually enforce Scott Meyers' Effective C++ and More Effective C++ rules. Also, obeying Lint rules tends to lead to easier-to-maintain, less error-prone, and cleaner code. Just don't go too crazy when you realize Lint will often generate more output than you have source code; I once worked on a project with 150MB of source code and 1.8GB of Lint messages.

Chris
+1  A: 

This page sums up all you have to know about optimization in C++ (be it while or after writing software). It's really good advice and is very clear - and it can be used as a useful reminder during the optimization phase of a project.

It's a bit old, so you also have to know which optimizations are already done by your compiler (like NRVO).

Other than that, reading the Effective C++, More Effective C++, Effective STL and C++ Coding Standards that have already been cited is important too, because it explains a lot of things about what occurs in the language and in the STL, allowing you to better optimize your specific case by using a better understanding of what's happening exactly.

Klaim
+4  A: 

Use the right container

Sequence containers

  • Do not use vector for data of unknown size if you are going to keep adding data to it. If you are going to repeatedly call push_back(), either use reserve() (see the sketch at the end of this section) or use a deque instead.
  • If you are going to be adding/removing data in the middle of the container, list is probably the right choice.
  • If you are going to be adding/removing data from both ends of the container, deque is probably the right choice.
  • If you need to access the nth element of the container, list is probably the wrong choice.
  • If you need to both access the nth element of the container and add/remove elements in the middle, benchmark all three containers.
  • If you have C++0x capability and are using a list but you never move backwards through the list, you may find forward_list more to your liking. It won't be faster but it will take up less space.

Note that this advice becomes more applicable the larger the container. For smaller containers, vector may always be the right choice simply because of the lower constant factors. When in doubt, benchmark.
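
A minimal sketch of the reserve() advice from the first bullet (numItems is an invented parameter):

#include <cstddef>
#include <vector>

void fill(std::vector<int>& v, std::size_t numItems)
{
    v.reserve(numItems);                   // one allocation up front...
    for (std::size_t i = 0; i < numItems; ++i)
        v.push_back(static_cast<int>(i));  // ...so no reallocation or copying during the loop
}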

Associative containers

  • If you do not have TR1, C++0x, or a vendor-specific unordered_foo/hash_foo, there isn't a ton of choice. Use whichever of the four containers is appropriate to your needs.
  • If you do have an unordered_foo, use it instead of the ordered version if you do not care about the order of the elements and you have a good hash function for the type.

Use exceptions judiciously

  • Don't use exceptions in your normal code path. Save them for when you actually have an exceptional circumstance.

Love templates

  • Templates will cost you at compile time and space-wise, but the performance gains can be amazing if you have calculations that would otherwise be performed at run-time; sometimes even something so subtle as a downcast.
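
As a sketch of moving a calculation to compile time, here is the classic factorial template (pre-C++11 style; not taken from the answer, just an illustration):

template <unsigned N>
struct Factorial
{
    static const unsigned long value = N * Factorial<N - 1>::value;
};

template <>
struct Factorial<0>
{
    static const unsigned long value = 1;
};

// Factorial<10>::value (3628800) is computed entirely by the compiler;
// no multiplication happens at run time.
const unsigned long fact10 = Factorial<10>::value;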

Avoid dynamic_cast

  • dynamic_cast is sometimes the only choice for doing something, but oftentimes the use of dynamic_cast can be eliminated by improving design.
  • Don't replace dynamic_cast with a typeid followed by a static_cast either.
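
A minimal sketch of eliminating a dynamic_cast by improving the design; Shape, Circle and draw() are invented names:

#include <iostream>

// Instead of testing the concrete type at run time, e.g.
//   if (Circle* c = dynamic_cast<Circle*>(shape)) { /* draw a circle */ }
// push the varying behaviour into a virtual function:
class Shape
{
public:
    virtual ~Shape() {}
    virtual void draw() const = 0;   // each concrete shape knows how to draw itself
};

class Circle : public Shape
{
public:
    virtual void draw() const { std::cout << "circle\n"; }
};

void render(const Shape& s)
{
    s.draw();   // no dynamic_cast needed; the virtual call dispatches correctly
}
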
coppro
I agree with everything except the first point. `vector` grows exponentially, so the cost of not calling `reserve` before multiple calls to `push_back` is marginal. (It's still better to call `reserve` if you know the new size, but don't avoid using `vector` just because you don't). Also, `deque` won't be any better in this case.
Mike Seymour
The unordered associative containers are not necessarily faster. If you have to rehash a lot (your profiler can tell you), they can be significantly slower.
Kristo
@Mike Seymour: That is true of most implementations but not necessarily the case. While it's true that a `vector` insert is probably faster than a `deque` insert if you don't reallocate, the cost of reallocating a large `vector` is immense. Worse, it's concentrated at one time, so a very large `vector` may create a noticeable lag depending on the application.
coppro
I'd like to add to the first point: if you're accessing data more often than you add data to a container, std::vector is probably the right choice, even if deque, set, map or list seems more convenient.
Viktor Sehr
A: 

Why has nobody mentioned it so far? Why is everyone into poor little ++i?

One of the best little things you can easily do to not pessimize your code:

Effective C++ by Scott Meyers, Item 20:

Prefer pass-by-reference-to-const to pass-by-value

Example:

// this is a better option
void some_function(const std::string &str);
// than this:
void some_function(std::string str);

In the case of a short std::string you might not win much, but when passing big objects like that, it can save you quite a lot of computing power, as you avoid redundant copying. It can also save you from a bug or two if you forgot to implement your copy constructor.

Dmitry
Oh, stupid me, Kristo has recommended it already.
Dmitry
Copy Elision...
Inverse
+5  A: 

It seems from your question that you already know about the "premature optimization is evil" philosophy, so I won't preach about that. :)

Modern compilers are already pretty smart at micro-optimizing things for you. If you try too hard, you can often make things slower than the original straight-forward code.

For small "optimizations" you can do safely without thinking, and which doesn't affect much the readability/maintability of the code, check out the "Premature Pessimization" section of the book C++ Coding Standards by Sutter & Alexandrescu.

For more optimization techniques, check out Efficient C++ by Bulka & Mayhew. Only use when justified by profiling!

For good general C++ programming practices, check out:

  • C++ Coding Standards by Sutter & Alexandrescu (must have, IMHO)
  • Effective C++/STL series by Scott Meyers
  • Exceptional C++ series by Herb Sutter

Off the top of my head, one good general performance practice is to pass heavyweight objects by reference, instead of by copy. For example:

#include <vector>

// Not a good idea: a whole other temporary copy of the (potentially big) vector will be created.
int sum(std::vector<int> v)
{
   // sum all values of v
   int total = 0;
   for (std::vector<int>::const_iterator it = v.begin(); it != v.end(); ++it)
      total += *it;
   return total;
}

// Better: the vector is passed by constant reference, so no copy is made.
int sum(const std::vector<int>& v)
{
   // v is immutable ("read-only") in this context
   // sum all values of v
   int total = 0;
   for (std::vector<int>::const_iterator it = v.begin(); it != v.end(); ++it)
      total += *it;
   return total;
}

For a small object like a complex number or 2-dimensional (x, y) point, the function will likely run faster with the object passed by copy.

When it comes to fixed-size, medium-weight objects, it's not so clear if the function will run faster with a copy or a reference to the object. Only profiling will tell. I usually just pass by const reference (if the function doesn't need a local copy) and only worry about it if profiling tells me to.

Some will say that you can inline small class methods without thinking. This may give you a runtime performance boost, but it may also lengthen your compile time if there is a heavy amount of inlining. If a class method is part of a library API, it might be better not to inline it, no matter how small it is. This is because the implementation of inline functions has to be visible to other modules/classes. If you change something in that inline function/method, then other modules that reference it need to be re-compiled.
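
For example, a getter defined inside the class body is implicitly inline (Point is an invented class, just a sketch of the idea):

class Point
{
public:
    Point(int x, int y) : x_(x), y_(y) {}

    // Defined inside the class definition, so implicitly inline: callers see the
    // body, and the compiler can usually avoid a function call entirely.
    int x() const { return x_; }
    int y() const { return y_; }

private:
    int x_;
    int y_;
};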

When I first started to program, I would try to micro-optimize everything (that was the electrical engineer in me). What a waste of time!

If you're into embedded systems, then things change and you can't take memory for granted. But that's another whole can of worms.

Emile Cormier
Wow, I took too long to write that. It's aleady been covered by everyone!
Emile Cormier
@Emile Cormier This is called the "Fastest Gun in the West" (see http://meta.stackoverflow.com/questions/9731/fastest-gun-in-the-west-problem ). Here's how to beat it (Jon Skeet uses this methodology). Type up your basic, short answer as fast as possible and submit. This puts your answer up there and you are immediately eligible for votes. Then, go back and make edit(s) to your post until you are fully satisfied.
rlb.usa
@Emile_Cormier ( Non-SO discussion : http://cantgrokwontgrok.blogspot.com/2008/09/stackoverflow-crackoverflow-or.html scroll to "Bob Munden Effect") Random-Aside : Yea you go girl! Show them men females can program too. 8- ]
rlb.usa
@rlb: Ahem... Emile is a French **masculine** name. The last 'e' is silent. Not to be confused with Emilie. :-)
Emile Cormier
+1  A: 
  1. Avoid memory fragmentation.
  2. Aligned memory.
  3. SIMD instructions.
  4. Lockless multithreading.
  5. Use proper acceleration trees, such as kd-trees, cover trees, octrees, quadtrees, etc.
  5a. Define these in ways that allow for the first three (i.e. make nodes all in one block).
  6. Inlining. The lowest-hanging but quite delicious fruit.

The performance boosts you can get this way are astonishing. For me it was a factor of 1500 for a computation-heavy app - not over brute force, but over similar data structures written in a major software package.

I'd not bother with stuff like preincrement over post. That only gives savings in certain (unimportant) cases, and most of what's mentioned is similar stuff that might scrape out an extra 1% here and there once in a while, but usually isn't worth the bother.

Charles Eli Cheese