I am writing some code for handling data. There are a number of groups of processing functions that can be chosen by the user that are then applied to the dataset. I would like to implement all these groups in separate places, but since they all take the same parameters and all do similar things I would like for them to have a common interface.

Being a good little C++ programmer, my first thought was to simply use polymorphism: create some abstract class with the desired interface and then derive each set of processing objects from it. My hopes were quickly dashed, however, when I thought of another wrinkle. These datasets are enormous, resulting in the functions in question being called literally billions of times. While dynamic lookup is fairly cheap, as I understand it, it is a good deal slower than a standard function call.

My current idea to combat this is to use function pointers, in a manner something like this:

void dataProcessFunc1(mpz_class &input){...}
void dataProcessFunc2(mpz_class &input){...}
...
class DataProcessInterface
{
    ...
    // pointer types must take mpz_class& to match the functions above
    void (*func1)(mpz_class &);
    void (*func2)(mpz_class &);
    ...
};

With some sort of constructor or something for setting up the pointers to point at the right things.

So I guess my question is this: Is this a good method? Is there another way? Or should I just learn to stop worrying and love the dynamic lookup?

+7  A: 

A virtual function call is a function call via a pointer. The overhead is generally about the same as an explicit function call via a pointer. In other words, your idea is likely to gain very little (quite possibly nothing at all).

My immediate reaction would be to start with virtual functions, and only worry about something else when/if a profiler shows that the overhead of virtual calls is becoming significant.

When/if that occurs, another possibility would be to define the interface in a class template, then put the various implementations of that interface into specializations of the template. This normally eliminates all run-time overhead (though it's often a fair amount of extra work).

Jerry Coffin
The extra bulkiness of the template solution can make it perform worse in some cases, because of 1) cache constraints and 2) branch-target prediction for indirect calls in modern processors, which was designed precisely to speed up the execution of interpreted or object-oriented code and which lessens the overhead of the dynamic solution(s).
Pascal Cuoq
Using virtual functions sounds reasonable (it certainly is easier). I know that virtual function calls are calls via a pointer, but I was worried about the fact that it has to look at a table each time the call is made to determine which subclass's virtual function to invoke. Is this not true, or does it just count for essentially nothing?
James Matta
@Pascal: Yup, that's part of why I advised trying virtual functions first. Another part is, of course, that they're easy to understand.
Jerry Coffin
@James Matta: The standard doesn't guarantee how they work, but normally each object will have a vtable pointer. At first use, that'll typically be loaded into a register. Calling a virtual function is then an indexed indirect reference from there (i.e. it calls a function whose address is at a specified offset from where that points). As Pascal pointed out, most recent processors include hardware specifically to predict such branches well, so their overhead is usually pretty low.
Jerry Coffin
+2  A: 

The abstract interface approach is certainly the cleanest from a coding point of view, and much preferable to obfuscating your code with function pointers, which is really programming C in C++.

Have you actually determined that you have a performance issue with the interface approach?

It's best to write readable and maintainable code first, and only optimise if you need to.

LeopardSkinPillBoxHat
+3  A: 

I don't agree with one answer above that says a template-based solution could have worse overhead or run time. In fact, template-based solutions allow you to write faster code by removing the need for virtual functions or calls through pointers (I agree, though, that using those mechanisms still does not impose a significant overhead.)

Suppose that you configure your processing interface using a series of "traits", that is, processing parts or functions that a client can configure to tune the processing interface. Imagine, as an example, a class with three parameterizations of processing:

template <typename Proc1, typename Proc2 = do_nothing, typename Proc3 = do_nothing>
struct ProcessingInterface
{
    static void process(mpz_class& element) {
        Proc1::process(element);
        Proc2::process(element);
        Proc3::process(element);
    }
};

If a client has different "processors", each with a static function "process" that knows how to process an element, you can write a class like this to "combine" those three processings. Note that the default do_nothing class has an empty process method:

class do_nothing
{
public:
    static void process(mpz_class&) {}
};

These calls have no overhead. They are normal calls, and a client can configure a processing using ProcessingInterface<Facet1, Facet2>::process(data);.

This is only applicable if you know the different "facets" or "processors" at compile time, which seems to be the case with your first example.

Note also that you can write a more complicated class by using metaprogramming facilities such as the boost.mpl library, to include more classes, iterate through them, etc.

Diego Sevilla
These are my thoughts as well. Just a clarification: you don't need to know exactly which facets will be used at compile time, you just need to know all of the potential facets. I think this is what you meant, but it wasn't quite clear. So you can put the various calls to ProcessingInterface<> in a big switch that only gets executed once.
KeithB
There is no "answer above". By default, answers are sorted by rep. Referring to an "above answer" is just going to confuse readers. If you're going to refer to another answer, you should do it by name. :)
jalf
+1  A: 

These datasets are enormous, resulting in the functions in question being called literally billions of times. While dynamic lookup is fairly cheap, as I understand it, it is a good deal slower than a standard function call.

Billions of times in how much time? If your app runs for an hour, a billion function calls is nothing, and won't make a dent in performance. But if the entire dataset is processed in 100 ms, a billion function calls are a significant source of overhead. Simply talking about how many times a function is called is meaningless; what matters performance-wise is how often it is called: the number of calls per unit time.

If this is actually a performance issue at all, I'd go with a template approach. The user isn't going to choose the operations anew before each call; he's going to make the decision once, and then all the billions of calls follow from it.

Simply define a class for each group of functions that the user can choose, make sure they expose the same interface (possibly using CRTP to streamline the process a bit and easily factor out common code), and then depending on the user's choice of strategy, pass the appropriate class to the (templated) function which is responsible for doing all the processing.

But as the other answers have said, this might not be a performance bottleneck at all. Don't waste your time trying to optimize code that turns out not to matter.

jalf
Perhaps the figure was provided to show the processing intensity, asking for a solution that simply executes in the least amount of time rather than within a specific time? I would guess that with such bulk processing, the goal is usually to minimize running time rather than to meet a deadline.
Ioan
There is no deadline time. I was stating that the datasets contain billions of individual points of data and that the functions had to be called on each point.
James Matta
@James: Yeah, but my point is that while "billions of function calls" sounds like a lot, it's only really worth worrying about if those billions of calls have to execute in a short time. If the entire process takes an hour to run, those billions of function lookups won't even show up in a profiler. If the entire process runs in 2 seconds, then billions of virtual function lookups are going to be a major source of overhead.
jalf