Profiling my C++ code with gprof, I discovered that a significant portion of my time is spent calling one virtual method over and over. The method itself is short and could probably be inlined if it wasn't virtual.

What are some ways I could speed this up short of rewriting it all to not be virtual?

+9  A: 

Are you sure the time is all call-related? Could the cost be in the function itself? If so, inlining might make the function vanish from your profiler, but you won't see much speed-up.

Assuming it really is the overhead of making so many virtual calls there's a limit to what you can do without making things non-virtual.

If the call has early-outs for things like time/flags then I'll often use a two-level approach. The checking is inlined with a non-virtual call, with the class-specific behavior only called if necessary.

E.g.

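class Foo
{
public:
  virtual ~Foo() {}

  // Non-virtual and defined in-class, so the cheap check can be inlined.
  void update()
  {
    if (can_early_out)
      return;

    updateImpl();  // virtual call only when real work is needed
  }

protected:
  bool can_early_out = false;  // assumed flag, set by whatever decides no update is needed

  virtual void updateImpl() = 0;
};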
Andrew Grant
+6  A: 

Is the time being spent in the actual function call, or in the function itself?

A virtual function call is noticeably slower than a non-virtual call, because the virtual call requires an extra dereference. (Google for 'vtable' if you want to read all the hairy details.) (Update: It turns out the Wikipedia article isn't bad on this.)

"Noticeably" here, though, means a couple of instructions. If it's consuming a significant part of the total computation, including time spent in the called function, that sounds like a marvelous place to consider unvirtualizing and inlining.

But in something close to 20 years of C++, I don't think I've ever seen that really happen. I'd love to see the code.

Charlie Martin
+1 for talking about the whole issue here, including your experience. I also have trouble believing the call overhead is the real issue.
dwc
+5  A: 

If the virtual calling really is the bottleneck give CRTP a try.
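
For reference, a minimal sketch of CRTP (the Curiously Recurring Template Pattern). The names Shape, Square, Rect, areaImpl and totalArea are made up for illustration and have nothing to do with the asker's code. The base class takes the derived class as a template parameter, so the call to the derived implementation is resolved at compile time and can be inlined:

```cpp
#include <vector>

// Hypothetical example types; not taken from the original question.
// The base class knows its derived type statically, so no vtable is involved.
template <typename Derived>
class Shape {
public:
    double area() const {
        // Resolved at compile time; the compiler is free to inline areaImpl().
        return static_cast<const Derived*>(this)->areaImpl();
    }
};

class Square : public Shape<Square> {
public:
    explicit Square(double side) : side_(side) {}
    double areaImpl() const { return side_ * side_; }
private:
    double side_;
};

class Rect : public Shape<Rect> {
public:
    Rect(double w, double h) : w_(w), h_(h) {}
    double areaImpl() const { return w_ * h_; }
private:
    double w_;
    double h_;
};

// Code that uses the shapes has to be a template over the concrete type.
template <typename T>
double totalArea(const std::vector<T>& shapes) {
    double sum = 0.0;
    for (const T& s : shapes)
        sum += s.area();
    return sum;
}
```

The trade-off is that Shape<Square> and Shape<Rect> are unrelated types, so you can no longer keep a mixed collection behind a single base pointer. This only helps when the concrete type is known where the hot loop runs.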

DaClown
+4  A: 

Please be aware that "virtual" and "inline" are not opposites -- a method can be both. The compiler will happily inline a virtual function if it can determine the type of the object at compile time:

#include <cstdlib>
#include <iostream>
using namespace std;

struct B {
    virtual int f() { return 42; }
};

struct D : public B {
    virtual int f() { return 43; }
};

int main(int argc, char **argv) {
    B b;
    cout << b.f() << endl;   // Object type is known at compile time: this call can be inlined

    D d;
    cout << d.f() << endl;   // Object type is known at compile time: this call can be inlined

    B& rb = rand() ? b : d;
    cout << rb.f() << endl;  // Must use virtual dispatch (i.e. NOT inlined)
    return 0;
}

[UPDATE: Made certain rb's true dynamic object type cannot be known at compile time -- thanks to MSalters]

If the type of the object can be determined at compile time but the function is not inlineable (e.g. it is large or is defined outside of the class definition), it will be called non-virtually.

j_random_hacker
While correct, this is a bit of a contrived situation that in real life virtually (haha!) never occurs. If it did, 'f' would not even need to be virtual in the above code.
Andrew Grant
B // and it's obvious why f needs to be virtual.
MSalters
@Andrew: I disagree -- my point is that it's possible to make a method virtual, enabling the flexibility that goes along with that, without sacrificing the speed available from inlining whenever possible.
j_random_hacker
@MSalters: Excellent suggestion, I've updated the post accordingly.
j_random_hacker
+1  A: 

It's sometimes instructive to consider how you'd write the code in good old 'C' if you didn't have C++'s syntactic sugar available. Sometimes the answer isn't using an indirect call. See this answer for an example.
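
As a rough illustration of that idea (not the linked answer's code; ShapeKind, Shape, area and totalArea are hypothetical names), a C-style version stores a type tag in a plain struct and dispatches with a switch. Every branch is visible to the compiler, so there is no indirect call and the whole thing can be inlined into the loop:

```cpp
#include <cstddef>

// Hypothetical example types; not taken from the original question.
enum ShapeKind { KIND_SQUARE, KIND_TRIANGLE };

struct Shape {
    ShapeKind kind;  // type tag used in place of a vtable pointer
    double size;     // side length (square) or leg length (right isosceles triangle)
};

// Direct dispatch on the tag: no function pointer, no virtual call.
inline double area(const Shape& s) {
    switch (s.kind) {
        case KIND_SQUARE:   return s.size * s.size;
        case KIND_TRIANGLE: return 0.5 * s.size * s.size;
    }
    return 0.0;  // only reached if the tag holds an invalid value
}

inline double totalArea(const Shape* shapes, std::size_t count) {
    double sum = 0.0;
    for (std::size_t i = 0; i < count; ++i)
        sum += area(shapes[i]);
    return sum;
}
```

The cost is that adding a new kind means touching every switch, which is exactly the maintenance burden virtual functions were meant to remove.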

timday
+1. This solution is reasonable because in this case the asker has actually identified the cause of slowness to be this function call -- but in general, don't "optimise" with a switch until you're certain of where your code is spending its time.
j_random_hacker
+1  A: 

You might be able to get slightly better performance from the virtual call by changing the calling convention. The old Borland compiler had a __fastcall convention which passed arguments in CPU registers instead of on the stack.

If you're stuck with the virtual call and those few operations really count, then check your compiler documentation for supported calling conventions.

veefu