ansaurus

Question

C++ Long switch statement or look up with a map?

Answer 1

A:

When I see a case statement (of most any size) I wonder if polymorphism is a better approach. You have to maintain (sub)classes, but you encapsulate the behaviour and you're not maintaining sizeable and error-prone (through omission etc) case statements.

Brian Agnew 2010-03-17 20:46:41

In this case, the values are literally integers. They're stored in a file and need to be translated. No polymorphism; just looking for the best way to do the translation.

Rachel 2010-03-17 20:48:12

Answer 2

A:

I say map if all you are doing is assigning a value. My reason is that it is extensible without changing code, which your switch statement is not.

btw, how about an enum?

Simon 2010-03-17 20:47:25

Can't do an enum to translate ints to ints.

Rachel 2010-03-17 20:50:34

Answer 3

+6 A:

If your codes are contiguous enough and their range permit you would be much better of with old-fashioned straightforward array, something like

int lookup[] = {-1, 10, 15, -1 222};

then the switch statement can be rewritten as simply as

value = lookup[code];

all other options introduce additional cost to some extent.

Oleg Zhylin 2010-03-17 20:48:09

They're not contiguous. That's part of the problem. I'm trying to efficiently translate a random list of ints into another rather random list of ints.

Rachel 2010-03-17 20:51:57

+1. That is sometimes called a "table" approach

Nemanja Trifunovic 2010-03-17 20:52:12

@Rachel: Are both random lists known in compile time? If yes and the discontinuity is not big, the lookup table is still a good way.

KennyTM 2010-03-17 20:55:02

@KennyTM - Yes; it is.

Rachel 2010-03-17 20:58:07

This could be hard to maintain if the list of codes is somewhat long.Say at one point you are told that the value for code 145 has been changed to 8 and you now need to update the your source.

James Dean 2010-03-17 20:59:44

@James - but you could always have a pre-compile step that generates the lookup table source file from a more friendly format (maybe a JSON object?)

therefromhere 2010-03-17 21:31:19

Answer 4

A:

I think the generated code of the switch-case structure can grow quite large, if the amount of "codes" becomes large, in which case I think the stl::map is more appropriate.

S.C. Madsen 2010-03-17 20:50:11

Answer 5

+2 A:

The switch statement would be faster but if this is not in your applications performance bottleneck you should not really care about that.

Go for what makes your code easier to maintain over the long run.

Your sample is too short to make any meaningful call in that regard.

James Dean 2010-03-17 20:51:04

Answer 6

+5 A:

Personally, I would use the map, as its use implies a data lookup - using a switch usually indicates a difference in program behavior. Furthermore modifying the data mapping is easier with a map than with a switch.

If performance is a real issue, profiling is the only way to get a usable answer. A switch may not be faster if branch mispredictions happen often enough.

Another approach to think about this is if it wouldn't make more sense to combine the code and the associated value into a datastructure, especially if the range of codes and values is static:

struct Code { int code; int value; };

Code c = ...

std::cout << "Code " << c.code << ", value " << c.value << std::end;

Lars 2010-03-17 20:54:14

I like the point you made about maps implying data lookup and switches indicating difference in behavior.

Rachel 2010-03-18 00:21:51

Answer 7

+3 A:

It rather depends on what the codes are and how many there are. Good compilers have various tricks they use to optimise switch statements, some of which they won't employ with straight if/then statements. Most are bright enough to do simple maths or use lookup/jump tables for case 0, 1, 2 or case 3, 6, 9 for example.

Of course some don't, and many are easily foiled by unusual or irregular sets of values. Also if code for handling several cases looks remarkably similar, cut and paste can lead to maintenance issues. If you have many codes but they can be divided algorithmically into groups, you might consider several/nested switch statements, for example rather than:

switch (code) {
    case 0x0001: ...
    case 0x0002: ...
    ...
    case 0x8001: ...
    case 0x8002: ...
    ...
}

You might use:

if (code & 0x8000) {
    code &= ~0x8000;
    switch (code) {
        case 0x0001: ... // actually 0x8001
        case 0x0002: ... // actually 0x8002
        ...
    }
}
else {
    switch (code) {
        case 0x0001: ...
        case 0x0002: ...
        ...
    }
}

Many language interpreters decode opcodes this way, since a single byte opcode may have additional information packed into various bits, and transcribing all possible combinations and their handlers would be repetitious and fragile. On the other hand, excessive bit mangling can defeat any optimisation by the compiler and be counter-productive.

Unless you're sure this is a real performance bottleneck I'd avoid premature optimisation: do it whichever way strikes you as reasonably robust and quick to implement. As and if your application is running too slowly, profile it and optimise accordingly.

El Zorko 2010-03-17 21:16:34

Answer 8

A:

Read the integers into an array/vector
sort the array/vector
use bsearch on the underlying array

EvilTeach 2010-03-17 21:19:57

Why not just use a std::map? The performance guarantees are the same and the code's simpler.

Peter 2010-03-17 22:24:05

Smaller memory foot print == better locality == better performance

EvilTeach 2010-03-18 13:38:22

Answer 9

+2 A:

I'm partial to lookup tables myself, because unusually long switch statements seem to me to confuse a separation between code and data. I also think tables lend themselves better to future changes and maintenance.

All IMHO, of course.

Ben Collins 2010-03-17 21:21:56

Answer 10

+1 A:

If you can use tr1 you could use unordered_map to hash the values (hashing ints can be really fast too), which should make most lookups constant time.

However unless you have profiling data to indicate that this is a bottleneck in your program, code it in the approach that makes the most sense from a design standpoint.

Mark B 2010-03-17 21:25:25

Answer 11

A:

I suggest using a static, constant, table of pairs. This is another form of the look up table:

struct Table_Entry
{
    int code;
    int value;
};

static const Table_Entry  lookup_table[] =
{
  {1, 10},
  {2, 15},
  {3, 13},
};

static const unsigned int NUM_TABLE_ENTRIES =
    sizeof(lookup_table) / sizeof(lookup_table[0]);

A benefit to this is that the table is generated at compile time, unlike a std::map which must be initialized during run-time. If the quantities are large, you could use std::lower_bound to find the entry, provided the table is ordered.

Another benefit is this technique is data driven. The data can change without changes to the search engine. Changes to code or process may require serious regression testing but data changes may not; YMMV.

This is similar to what a compiler might generate.

Thomas Matthews 2010-03-18 00:09:39

Answer 12

A:

Normally I'd suggest a map or lookup array or maybe even some hybrid lookup monster (assuming that you're optimizing for speed or code size rather than readability, of course), but it is worth noting that the new versions of GCC are smart enough to replace switch/case assignments like this to lookup tables. While this isn't so great for totally sparse keys it may be if you're keys are in clumps like: 0, 1, 2, 3, 4, 5, 11, 12, 13, 14, 15, 193, 194, 195, 196, 197, 198...

Of course if you could possibly fond some arithmatic to do the conversion, even better (usually). Never overlook shifting, swapping endianness, or more traditional math when doing something like this.

nategoose 2010-03-24 22:16:27

ansaurus

tags:

views:

answers:

C++ Long switch statement or look up with a map?

related questions