On the other hand I've often heard
(and read) people arguing that OO
programming is slower by design than
non-OO programming. And I've heard
that especially about C++.
What I wonder is what makes them say
so, as it is so different from my
personal experience.
... what kind of OO practices could lead to slower C++ code than equivalent programs written using C.
- encapsulation
- virtual dispatch
There are two levels at which I'd like to address this question: first, OO vs non-OO; second, a more general discussion of C vs C++ performance and how it shapes opinions. OO may simply be a tangible thing for C programmers to point at when complaining of C++'s performance.
OO specifically deals with encapsulation, run-time polymorphism (virtual dispatch), and inheritance.
Of these, encapsulation can adversely affect performance by insisting on discrete operations that each preserve class invariants, whereas a lack of encapsulation lets client code affect an object in a more direct way that may be optimised in light of several operations that will be performed before the state again meets the invariants. For example, OO programming is more likely to result in unnecessary default initialisation of variables that will later be over-written before being read. The issues mirror those in the OS world: monolithic OS kernels yield higher performance than modules arranged around a micro-kernel, although the latter is typically more stable.
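To make the invariant point concrete, here is a minimal sketch. The class SortedInts and both build functions are hypothetical names invented for this illustration. The encapsulated version restores its "always sorted" invariant on every insert, while the direct version lets the data be unsorted for a while and sorts once at the end.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical class, for illustration only.
class SortedInts {
public:
    // Every call leaves the container sorted: O(n) element shifting per insert.
    void insert(int v) {
        items_.insert(std::upper_bound(items_.begin(), items_.end(), v), v);
        assert(std::is_sorted(items_.begin(), items_.end()));  // the invariant
    }
    const std::vector<int>& items() const { return items_; }
private:
    std::vector<int> items_;  // invariant: always in ascending order
};

// Encapsulated route: the invariant holds after each individual operation.
std::vector<int> build_encapsulated(const std::vector<int>& input) {
    SortedInts s;
    for (int v : input) s.insert(v);
    return s.items();
}

// Direct route: the data is temporarily unsorted, then sorted once.
std::vector<int> build_direct(const std::vector<int>& input) {
    std::vector<int> out(input);
    std::sort(out.begin(), out.end());
    return out;
}
```

Both functions return the same result. Which one is faster in practice depends on the data and should be measured, but the direct route does O(n log n) work overall where repeated inserts can do O(n^2) element moves.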
By writing user-defined types (objects) supporting value semantics, programmers are more likely to carelessly copy values around and create temporaries. In C programming the language doesn't provide user-defined operators etc. that so easily lead to the creation of temporary objects.
In C++, implicit constructors and conversion operators also create objects that may typically be avoided in equivalent C code. For example...
void fn(const std::string& s);
...is a convenient interface as you can call it without having to use .c_str() on a std::string, and you can also pass a const char* and have a temporary created. Often, C++ programmers won't bother to create a second void fn(const char*) unless it's really obvious - or proven by profiling - to be significant. All these little things add up though and contribute to the general impression of C++ as being wasteful.
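A rough sketch of that hidden temporary follows. Text is a hypothetical stand-in for std::string, invented here so the construction can be counted; fn_by_ref and fn_by_ptr are likewise made-up names (in real code the second would more likely be an overload of the first).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

int g_text_constructions = 0;  // counts how many Text objects get built

// Hypothetical string-like type with an implicit converting constructor.
struct Text {
    const char* data;
    std::size_t size;
    Text(const char* s) : data(s), size(0) {  // deliberately not 'explicit'
        assert(s != nullptr);
        size = std::strlen(s);
        ++g_text_constructions;
    }
};

// Convenient interface: accepts a Text, or anything convertible to one.
std::size_t fn_by_ref(const Text& t) { return t.size; }

// Dedicated interface: no object is created for a raw C string.
std::size_t fn_by_ptr(const char* s) { return std::strlen(s); }
```

Calling fn_by_ref("abc") silently constructs one Text temporary, while fn_by_ptr("abc") constructs none. With a real std::string the temporary may also allocate from the heap, depending on the string length and the library's small-string optimisation.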
Virtual dispatch forces out-of-line function calls, which - for trivially simple function bodies - can be an order of magnitude slower than inlined calls. Their speed is comparable to explicitly using pointers to functions, but that can still be slower than code that uses switch statements, cascading ifs etc., as is more common in non-OO code. The OO code is more maintainable though.
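As a small sketch of the two styles (Shape, Triangle, Square, Kind and both functions are hypothetical names for this illustration): the first function makes one virtual call per element, the second switches on a tag so the compiler can see and inline every branch.

```cpp
#include <cassert>
#include <vector>

// OO style: behaviour selected through the vtable at run time.
struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const = 0;
};
struct Triangle : Shape { int sides() const override { return 3; } };
struct Square   : Shape { int sides() const override { return 4; } };

int total_sides_virtual(const std::vector<const Shape*>& shapes) {
    int total = 0;
    for (const Shape* s : shapes) {
        assert(s != nullptr);
        total += s->sides();  // out-of-line call unless the compiler can devirtualise
    }
    return total;
}

// Non-OO style: a type tag plus a switch, all visible at the call site.
enum class Kind { Triangle, Square };

int total_sides_switch(const std::vector<Kind>& kinds) {
    int total = 0;
    for (Kind k : kinds) {
        switch (k) {
            case Kind::Triangle: total += 3; break;
            case Kind::Square:   total += 4; break;
        }
    }
    return total;
}
```

Adding a new shape to the virtual version means adding one class. In the switch version, every switch over Kind has to be found and updated, which is the maintainability cost mentioned above.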
C++'s OO features also encompass constructors and operators that most naturally use exceptions to report issues (the former have no return value, and setting a state for subsequent testing invites the ignoring of errors; the latter typically need, when successful, to return a reference to the current or a resultant object). Exceptions may or may not be less efficient than return codes depending on the hardware, compiler, situation and frequency of throws etc., but with the older compilers that were around when many of the people you say disparage C++'s performance and OO generally formed their opinions, exceptions often carried a noticeable cost.
Considering the bigger picture of C++ vs C performance and not just OO, C++ provides higher level facilities in the STL and other libraries that provide general purpose facilities. In many specific, limited uses a hand-crafted solution may outperform the general solution, although typically only by a small amount. C++ addresses this better than any other language I'm aware of by supporting templated algorithms that are instantiated for each type to which they're applied, allowing inlining, traits-driven, compile-time sizeof and other optimisations. Still, a std::string uses the heap whereas C programs go further out of their way to avoid or minimise that, and C-style heap allocation offers realloc which can substantially outperform a secondary new/copy/delete cycle. Low-cost operations like maintaining container sizes are adopted as standard practice in the STL, but may be unnecessary in equivalent C code. STL heap memory usage tends to be dynamically scaled to run-time usage, whereas C programmers tend to put more effort into avoiding the need for that and may reap consequent performance gains (often at the cost of arbitrary limitations on program capabilities... how many standard UNIX utils written in C have arbitrary hard-limits on line-size etc. on Solaris etc. - extra effort had to be spent to systematically remove such limits from GNU utils).
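The realloc point can be sketched as follows. The helper names grow_c and grow_cpp are hypothetical, and both assume the new element count is at least the old one.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>

// C style: realloc may be able to extend the block in place; if it cannot,
// it allocates elsewhere and moves the bytes itself. On failure it returns
// nullptr and the old block stays valid.
int* grow_c(int* block, std::size_t new_count) {
    return static_cast<int*>(std::realloc(block, new_count * sizeof(int)));
}

// new[]/delete[] style: always allocate a fresh block, copy, then release.
int* grow_cpp(int* block, std::size_t old_count, std::size_t new_count) {
    assert(new_count >= old_count);  // assumption of this sketch
    int* bigger = new int[new_count];
    std::copy(block, block + old_count, bigger);
    delete[] block;
    return bigger;
}
```

Whether realloc actually avoids the copy depends on the allocator and on what happens to sit after the block in memory, so any gain is something to measure rather than assume. Note too that realloc is only safe for memory obtained from malloc/calloc/realloc and for types that can be copied byte-wise.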
Crucially, if you pick up any single C library or function that provides equivalent general-purpose higher-level functionality, it's overwhelmingly likely to perform worse than the C++ equivalents (due to lack of templates). Consider the common UNIX C qsort() and bsearch().
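For instance, sorting the same ints both ways looks roughly like this (compare_ints and the two wrapper functions are hypothetical names for this sketch). qsort only sees untyped bytes and must call the comparator through a function pointer for every comparison, while std::sort is instantiated for int and can inline the comparison.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// Comparator in the shape qsort requires: untyped pointers in, int out.
int compare_ints(const void* a, const void* b) {
    assert(a != nullptr && b != nullptr);
    const int x = *static_cast<const int*>(a);
    const int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);  // avoids the overflow risk of 'x - y'
}

// C library route: element size and comparator are run-time arguments.
void sort_with_qsort(std::vector<int>& v) {
    if (!v.empty()) {
        std::qsort(v.data(), v.size(), sizeof(int), compare_ints);
    }
}

// C++ route: the template is instantiated for int iterators and operator<.
void sort_with_std_sort(std::vector<int>& v) {
    std::sort(v.begin(), v.end());
}
```

Both produce the same ordering; the difference is in how much the compiler knows at the call site.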
Design patterns do tie into this. C++ programmers may be helped towards higher-level conceptualisation of their problems by having more middle-ground provided by the STL (compared to C programmers). Naturally, they are more likely to use that to adopt more formalised, standardised, generic, reusable etc. solutions to common issues at the level of abstraction that allows. Again, generalised solutions stacked on top of generalised solutions tend to perform worse than monolithic integrated solutions. Nobody makes you use these higher-level general-purpose building blocks... if they're inappropriate to your performance needs then don't. But, normally they're fine and time's better spent profiling afterwards and making a few targeted tweaks.
Again, productivity is a core issue here. C++ lets fewer people get more done (compared to C), while still being able to go as low as necessary to get as much or more performance when needed. In some senses, C++ is a (reputational) victim of its own success: "C++ takes too long to compile"... mainly because it scales to tens of millions of lines of code that the complainant can't dream of in their newly-beloved Ruby interpreter (though compilation dependencies do need to be actively managed). "C++ is too slow handling associative containers of strings", because something someone spent 2 seconds implementing in C++ is being contrasted with months of work on a container hand-optimised for the string content involved.
There - I've succeeded in writing an answer as rambling as the question ;-P.