views: 73
answers: 5

Hi!

This is an odd problem, so I have to provide a bit of background. I have a C++ project that I'm working on that I want to clean up a bit. The main issue I'm dealing with, the one that makes me want to barf, is the massive abuse of preprocessor macros in a core component of the project. There's a file with a bunch of #defines that are commented/uncommented before compiling, in order to toggle the use of different algorithms. I'd much rather select an algorithm with command-line arguments than recompile every time we want to try a different one. The problem is that there are so many #ifdefs interwoven throughout the code that it seems impossible to simply pull out the code for each algorithm.

I've been told that the reasoning behind this is that this is supposed to be a real-time system that will be dealing with millisecond units of time, and the code in this component is called so many times that having an if check would adversely affect our performance. If you want to try another algorithm, you have to set the appropriate flags and recompile so that performance is optimized.

So my question for you all is this:

Is there any way that I can get rid of these macros and instead use command-line arguments without a heavy hit to performance and without reworking the logic and the code?

One of the options I was considering was to build a compiled version of this component for each possible combination of algorithms, and then pick the version matching the provided command-line arguments. But according to my friend, the number of combinations is just too large for this to be feasible (with n independent toggles, that's up to 2^n builds). I haven't worked out the numbers myself, but I'll take his word for it considering how much work he put into this code.

Thanks in advance for any advice!

+1  A: 

Have you profiled the code in question? Assuming an if statement slows down a program sounds like premature optimization, which is a code smell.

Sam Miller
No, we haven't profiled it too deeply. We have monitored how often the code is called during a run and how well it performs under our worst-case conditions, so I could probably use that as a benchmark and try to run through those same test conditions with at least a few of the #ifdefs replaced. It's way more than just one if statement, but you're right: we won't know for sure without truly profiling it...
MKA
+1  A: 

Ah, the atrocities that can be wrought by people being creative with CPP.

You need, first of all, to decide how badly you want to do this. In C++, the right way to handle a situation like this is to build a collection of classes that represent the places where there is some difference, and then hide the differences behind an interface, so that you have, say

DifferentialEquationIntegrator <:
     Runge-Kutta Integrator
     Euler's Method Integrator

and so on (read the <: as an inheritance arrow, or "provides-satisfies" -- A <: B means "A provides a behavioral specification that B satisfies").
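
For concreteness, a minimal sketch of that shape (the class names and the one-dimensional signature are hypothetical, just to show the structure):

// Callers depend only on this interface, never on a particular algorithm.
struct DifferentialEquationIntegrator {
    virtual ~DifferentialEquationIntegrator() {}
    // Advance y by one timestep dt for dy/dt = f(y); signature is illustrative.
    virtual double step( double (*f)(double), double y, double dt ) const = 0;
};

struct EulerIntegrator : public DifferentialEquationIntegrator {
    double step( double (*f)(double), double y, double dt ) const {
        return y + dt * f( y );  // forward Euler
    }
};

struct RungeKuttaIntegrator : public DifferentialEquationIntegrator {
    double step( double (*f)(double), double y, double dt ) const {
        double k1 = f( y );
        double k2 = f( y + 0.5 * dt * k1 );
        double k3 = f( y + 0.5 * dt * k2 );
        double k4 = f( y + dt * k3 );
        return y + dt / 6.0 * ( k1 + 2.0 * k2 + 2.0 * k3 + k4 );  // classic RK4
    }
};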

It's essentially a mechanical translation to go from the scheme you describe to a "right" scheme, but it will be hairy and extended, and will feel a whole lot like you're rewriting the code.

If you don't want to do this, you will need to analyze the existing code. I'm out of touch with the current state of code analysis tools, but there are many vendors and no few open-source tools. They can help quite a bit.

Yet another option is to use the preprocessor to help. You can generally run just the preprocessor against the code and capture the generated source; in gcc, for example, the -E flag does this. The result, for historical reasons, retains all the newlines the source had (which makes wc -l give sensible answers), so if you mean to read it, run the output through indent(1) or something similar.

If you do this with all the different sets of flags, you will learn a good bit about the shared code via some diffs.
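
For example, assuming the variants are selected with -D flags (the flag names below are made up):

g++ -E -DUSE_EULER solver.cpp | indent > euler.i
g++ -E -DUSE_RUNGE solver.cpp | indent > runge.i
diff euler.i runge.i   # the diff shows exactly what the flags change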

The fourth option is to build your own tool that breaks the stuff down into fragments and helps you re-arrange them. This should make a good PhD project for someone.

Charlie Martin
At first, I had hoped we'd be able to extract most of the logic and abstract the differences into separate implementations of an interface, similar to what you first mentioned. But after seeing the code and understanding how interleaved everything is, I can understand my teammates' hesitation to do so. Still, I'm hoping we can manage to do it the right way. I like the idea of diffing the -E'd files; that and the code analysis tools are definitely immediate options. And I like the idea of working on that last option for my PhD. :)
MKA
+1  A: 

There is a pattern to deal with this kind of issue: Strategy.

You choose the strategy (algorithm) you wish to use ONCE and then pass the object around. This of course requires a Factory to build the right Strategy object.
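
A minimal sketch of that shape, assuming the strategy name arrives as a command-line string (all names hypothetical):

#include <stdexcept>
#include <string>

// Strategy interface: chosen once, then only a virtual call per use.
struct Solver {
    virtual ~Solver() {}
    virtual void solve() = 0;  // a real signature would take the problem state
};

struct EulerSolver : public Solver { void solve() { /* ... */ } };
struct RungeKuttaSolver : public Solver { void solve() { /* ... */ } };

// Factory: the string comparison happens exactly once, at startup.
// The caller owns the returned object.
Solver *makeSolver( const std::string &name )
{
    if ( name == "euler" ) return new EulerSolver;
    if ( name == "rk4" )   return new RungeKuttaSolver;
    throw std::runtime_error( "unknown solver: " + name );
}

After that, the hot path pays only for one virtual call per invocation, not for re-parsing the choice.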

As for the interleaved code, common code should be factored out.

Anyway, if you're unsure and fear you might break something, start by writing a bunch of unit tests before attempting to change the program. That way, if anything breaks during the refactoring, it will hopefully be caught by a test. Also, try to refactor little by little (file by file, for example, or whatever unit makes sense).
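
Even bare-bones characterization tests are enough to start with: record what the current build produces, then assert that the refactored build reproduces it. A sketch (the function under test is hypothetical and would be linked in from the existing component):

#include <cassert>
#include <cmath>

// Hypothetical entry point into the component being refactored;
// its definition lives in the existing code.
double runSolverStep( double state, double dt );

int main()
{
    // "expected" is whatever the CURRENT build produces for these inputs;
    // record it once, then the refactored build must match it.
    const double expected = 0.0;  // placeholder: capture from the old build
    assert( std::fabs( runSolverStep( 1.0, 0.001 ) - expected ) < 1e-12 );
    return 0;
}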

Matthieu M.
+1  A: 

If I were in need of optional algorithms selected at runtime, I would use some kind of lookup table. My algorithms would be functions with the same signature, so that I can create an array of pointers to functions and invoke them by index in the array (the index can be parsed from a command-line parameter). There would be no performance penalty from virtual calls (as in many design patterns) and no penalty from "if"s (as in choosing algorithms manually).

Some code:

#include <cstdlib>

// params passed to every solver (contents elided):
struct SolverParams { /* ... */ };

// type of my funcs:
typedef void (*SolverFunc)( const SolverParams &sp );

// implementations for the algorithms:
void EulerSolver( const SolverParams &sp ) { /* ... */ }

void RhungeSolver( const SolverParams &sp ) { /* ... */ }

// my array of solvers:
static SolverFunc s_solvers [] = { EulerSolver, RhungeSolver };

// parsing command line params:
int main( int argc, char** argv )
{
    // choose the solver ONCE, by index; real code must range-check this
    int solverIndex = ( argc > 1 ) ? std::atoi( argv[1] ) : 0;
    SolverParams params;
    s_solvers[solverIndex]( params );
    return 0;
}

Well, the code is C-style rather than C++-style, but the idea is worth considering. P.S. I am not sure whether the example is syntactically correct, sorry =)

SadSido
+2  A: 

this is supposed to be a real-time system that will be dealing with millisecond units of time

That's a real problem.

[...] that having an if check would adversely affect our performance.

That's not a good reason.

If your code had been benchmarked and optimized as a result (as it should have been), that argument would apply. But I can't imagine any scenario where you would obtain a significant performance gain by replacing ifs with #defines (unless the ifs were comparing string contents, doing a sequential search, or something similarly disastrous for performance).

Because of this, I'm willing to bet that the decision to use macros was made at design time, which would probably make it a case of premature optimization ("premature optimization is the root of all macro-definitions" :D)

Is there any way that I can get rid of these macros and instead use command-line arguments without a heavy hit to performance and without reworking the logic and the code?

Yes.

Here are some possible steps (there are other solutions, but this one does not use ifs at all):

  1. Define a benchmark on your code and run it (store the results)

  2. Locate one area of the code that's implemented in terms of more than one possible #define.

  3. Move the defines behind functions with a common interface.

  4. At run-time, compare a parameter to a constant and pass a pointer to the chosen function to the client code (see the sketch after this list).

    Things to avoid:

    • performing the comparison more than once; after the comparison you should have a chosen function pointer; that function pointer should be passed around, not your parameter.
    • performing the comparison using strings (or char* or anything that's not a number). Comparing strings - or doing any comparison not done in constant time - is disastrous for performance-critical code. Instead of comparing the parameter value with a chain of ifs, consider using a switch statement.
    • passing large structures as parameters to your strategy functions. Passing should be done by (const) references or by pointers.
  5. Call the strategy code through the function pointer instead of directly.

  6. Repeat the benchmark from step 1 and compare performance.
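
A rough sketch of steps 3-5, assuming the algorithm is picked by an integer ID parsed from the command line (all names are made up):

#include <cstdlib>

struct State { /* problem data, elided */ };

// Step 3: the former #ifdef branches, behind a common signature.
void stepAlgoA( State &s ) { /* was: #ifdef ALGO_A ... */ }
void stepAlgoB( State &s ) { /* was: #ifdef ALGO_B ... */ }

typedef void (*StepFn)( State &s );

// Step 4: compare the parameter to constants exactly once, via a switch.
StepFn chooseStep( int algoId )
{
    switch ( algoId ) {
        case 0:  return stepAlgoA;
        case 1:  return stepAlgoB;
        default: return stepAlgoA;  // or report the error
    }
}

int main( int argc, char **argv )
{
    State s;
    StepFn step = chooseStep( argc > 1 ? std::atoi( argv[1] ) : 0 );

    // Step 5: the hot loop only ever sees the chosen function pointer.
    for ( int i = 0; i < 1000000; ++i )
        step( s );
    return 0;
}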

At this point you should have a strong case to present to your boss/manager:

  • you can make the code run as fast (adding the cost of a function call to your performance-critical code shouldn't matter much - at the assembly level, a call through a function pointer is little more than passing a few pointers and one call instruction - I think). You can show it runs as fast using your benchmark results.

  • your code will be easier to maintain (more modular, separating functional blocks behind interfaces, centralizing change, and so on)

  • your code should be easier to extend (same reasons as above)

  • you should not have to recompile your codebase any longer just to switch algorithms.

  • you got rid of a big problem (caused by premature optimization).

  • you can continue to refactor the code base (and get rid of more macros) as development/maintenance goes on in other areas, with virtually no changes in functional behavior.

utnapistim