views:

235

answers:

7

Hi,

Suppose I have such an if/else-if chain:

if( x.GetId() == 1 )
{
}
else if( x.GetId() == 2 )
{
}
// ... 50 more else if statements

What I wonder is, if I keep a map, will it be any better in terms of performance? (assuming keys are integers)

+5  A: 

why dont you use a a switch ?

swich(x.GetId())
{
  case 1: /* do work */ break; // From the most used case
  case 2: /* do work */ break; 
  case ...: // To the less used case
}

EDIT:

Put the most frequently used case in the top of the switch (This can have some performance issue if x.GetId is generally equal to 50)

Phong
Just for info: Some compilers will recognize a "dense" set of cases, i.e. a (mostly) consecutive bunch of integers, and optimize the switch into an indexed jump. Thus, often "hand optimizing" the frequent cases is superfluous.
Carl Smotricz
+2  A: 

switch is the best thing I think

skwllsp
AFAIK, switch statements are not significantly better than if/else in terms of performance, am I right?
perezvon
Who knows, why don't you profile it and see? Guessing has no merit. If you told us what you are doing (the big picture), we could give better advice. That said, switch statements tend to perform better, for certain numbers of case, IIRC.
GMan
@perezvon: No. `switch` usually creates a jump table if the bounds is not large, giving O(1) performance.
KennyTM
Whether there is a performance benefit or not, given the example code, a `switch` will probably be easier to read and maintain than an `if/else` chain.
Dennis Zickefoose
The original code has a `getId()` happening in each of the 50 branches of the `if`; unless that's optimized away (and that's only possible if the compiler can recognize that `getId()` will return the same value on each call), an average of 24 unnecessary calls to `getId()` will have a significant impact on performance. Thus, `switch` is *certainly* better in performance.
Carl Smotricz
A: 

The better solution would be a switch statement. This will allow you to check the value of x.GetId() just once, rather than (on average) 25 times as your code is doing now.

If you want to get fancy, you can use a data structure containing pointers to functions that handle whatever it is that's in the braces. If your ID values are consecutive (i.e. numbers between 1 and 50) then an array of function pointers would be best. If they are spread out, then a map would be more appropriate.

Carl Smotricz
+9  A: 

Maps (usually) are implemented using red-black trees which gives O(log N) lookups as the tree is constantly kept in balance. Your linear list of if statements will be O(N) worst case. So, yes a map would be significantly faster for lookup.

Many people are recommending using a switch statement, which may not be faster for you, depending on your actual if statements. A compiler can sometimes optimize switch by using a jump table which would be O(1), but this is only possible for values that an undefined criteria; hence this behavior can be somewhat nondeterministic. Though there is a great article with a few tips on optimizing switch statements here Optimizing C and C++ Code.

You technically could even formulate a balanced tree manually, this works best for static data and I happened to just recently create a function to quickly find which bit was set in a byte (This was used in an embedded application on an I/O pin interrupt and had to be quick when 99% of the time only 1 bit would be set in the byte):

unsigned char single_bit_index(unsigned char bit) {
    // Hard-coded balanced tree lookup
    if(bit > 0x08)
        if(bit > 0x20)
            if(bit == 0x40)
                return 6;
            else
                return 7;
        else
            if(bit == 0x10)
                return 4;
            else
                return 5;
    else
        if(bit > 0x02)
            if(bit == 0x04)
                return 2;
            else
                return 3;
        else
            if(bit == 0x01)
                return 0;
            else
                return 1;
}

This gives a constant lookup in 3 steps for any of the 8 values which gives me very deterministic performance, a linear search -- given random data -- would average 4 step lookups, with a best-case of 1 and worst-case of 8 steps.

This is a good example of a range that a compiler would probably not optimize to a jump table since the 8 values I am searching for are so far apart: 1, 2, 4, 8, 16, 32, 64, and 128. It would have to create a very sparse 128 position table with only 8 elements containing a target, which on a PC with a ton of RAM might not be a big deal, but on a microcontroller it'd be killer.

joshperry
Even if a switch can't make it a table, I think the compiler can automatically order comparisons to turn it into a binary search for you.
visitor
Can being the key word, of course the compiler _can_ do all kinds of stuff, but if it's not in the standard and/or your compiler isn't standards compliant then you'll have to live with the nondeterministic nature of "can", "might", or "may". One build will run quickly, you make one change and your next build is slow, you then have to figure out why all of a sudden your switch decomposed into if/else equivalent instead of a BST or jump table. A map-based-jump-table will be more readable, deterministic, extensible and faster in most cases.
joshperry
A: 

The answer, as with most performance related questions, is maybe.

If the IDs are in a fortunate range, a switch might become a jump-table, providing constant time lookups to all IDs. You won't get much better than this, short of redesigning. Alternatively, if the IDs are consecutive but you don't get a jump-table out of the compiler, you can force the issue by filling an array with function pointers.

[from here on out, switch refers to a generic if/else chain]

A map provides worst-case logarithmic lookup for any given ID, while a switch can only guarantee linear. However, if the IDs are not random, sorting the switch cases by usage might ensure the worst-case scenario is sufficiently rare that this doesn't matter.

A map will incur some initial overhead when loading the IDs and associating them with the functions, and then incur a the overhead of calling a function pointer every time you access an ID. A switch incurs additional overhead when writing the routine, and possibly significant overhead when debugging it.

Redesigning might allow you to avoid the question all together. No matter how you implement it, this smells like trouble. I can't help but think there's a better way to handle this.

Dennis Zickefoose
A: 

If I really had a potential switch of fifty possibilities, I'd definitely think about a vector of pointers to functions.

#include <cstdio>
#include <cstdlib>
#include <ctime>

const unsigned int Max = 4;

void f1();
void f2();
void f3();
void f4();
void (*vp[Max])();

int main()
{
    vp[ 0 ] = f1;
    vp[ 1 ] = f2;
    vp[ 2 ] = f3;
    vp[ 3 ] = f4;

    srand( std::time( NULL ) );
    vp[( rand() % Max )]();
}

void f1()
{
    std::printf( "Hello from f1!\n" );
}

void f2()
{
    std::printf( "Hello from f2!\n" );
}

void f3()
{
    std::printf( "Hello from f3!\n" );
}

void f4()
{
    std::printf( "Hello from f4!\n" );
}
Baltasarq
quite a nice design: I would statically initialized the function vector vp.
Phong
in some case, you may need to have managed the case were (x.getID is not between 0 and Max) It also required more memory than needed (eg: we want to check the case for just x=1-50, and x=100 ... or any other pattern.
Phong
If you're comparing to a map, even with empty spaces you can save memory. In addition to the function pointers, it stores the keys and whatever miscellaneous data it needs to maintain the tree [a color and three pointers, probably]. Lots of variables.
Dennis Zickefoose
Thank you for your comments. Yes, if the options are not contiguous then this is not a neat solution, but it can still work if you are able to convert the x.getId() to the domain 0..Max-1 with a more or less simple function (if there are "random" gaps then you will have to balance the advantage of direct access against unused memory). About limits, you can create an inline function in order to access the vector. That will obtain the desired result without (the minimum) performance penalties.
Baltasarq
A: 

There are a lot of suggestions involving switch-case. In terms of efficiency, this might be better, might be the same. Won't be worse.

But if you're just setting/returning a value or name based on the ID, then YES. A map is exactly what you need. STL containers are optimised, and if you think you can optimise better, then you are either incredibly smart or staggeringly dumb.

e.g A single call using a std::map called mymap,

thisvar = mymap[x.getID()];

is much better than 50 of these

if(x.getID() == ...){thisvar = ...;}

because it's more efficient as the number of IDs increases. If you're interested in why, search for a good primer on data structures.

But what I'd really look at here is maintenance/fixing time. If you need to change the name of the variable, or change from using getID() or getName(), or make any kind of minor change, you've got to do it FIFTY TIMES in your example. And you need a new line every time you add an ID.

The map reduces that to one code change NO MATTER HOW MANY IDs YOU HAVE.

That said, if you're actually carrying out different actions for each ID, a switch-case might be better. With switch-case rather than if statements, you can improve performance and readability. See here: http://stackoverflow.com/questions/97987/switch-vs-if-else I'd avoid pointers to functions unless you're very clear on how they'd improve your code, because if you're not 100% certain what you're doing, the syntax can be messed up, and it's overkill for anything you'd feasibly use a map for.

Basically, I'd be interested in the problem you're trying to solve. You might be better off with a map or a switch-case, but if you think you can use a map, that is ABSOLUTELY what you should be using instead.

deworde