tags:

views:

177

answers:

9

What can I do to improve large switches and if-elses speed manually? I will probably need some kind of hash or lookup table.

I'm working with gcc and C code, I doubt that gcc has any inbuilt optimizations for this.

Edit: My switch code is what every switch looks like, do something based on if a particular int is some value. My if-elses look like this:

if( !strcmp( "val1", str ) )
foo();
else if( !strcmp( "val2", str ) )
foo2();
...

I also have ifs that do this

if( struct.member1 != NULL )
foo();
if( struct.member2 != NULL )
foo2();

EDIT2: Thank you everyone. I'm not sure which one I should pick as an answer, because a lot of these answers have valid points and valuable insights. Unfortunately, I have to pick just one. But thanks all! In the end, using a perfect hash table seems the best way to get O(n) time on the access for both ifs and switches.

+1  A: 

I'm not sure what are you looking for, but branch prediction with gcc is discussed in this question

Drakosha
No, I'm not looking for probability based optimizations like that one. I want to improve the general time complexity
jetru
+1  A: 

It has. Just see the generated code. At least it optimizes switches.

You may use hash table to optimize your code, but I'm sure that GCC does the same for you.

Another thing is if-else's, when they contain some complex boolean expressions. I will not answer this part of question here.

Lavir the Whiolet
Please read the edit. How can I use a hash table for code like that?
jetru
@jetru: you can use a hashtable that maps a string (char[]) to function pointers
Lie Ryan
@jetru: also, for the second case, you're screwed up with linear search; a probability-based optimization (Profile-Guided Optimization) is probably the best you can do
Lie Ryan
@jetru: Strings are easily hashable. There are a bunch of algorithms, just google them (use Java's one, for example). What about second case - I don't know, but probably there also is some hash algorithm for that.
Lavir the Whiolet
+1  A: 

It really depends on the code base you are working with and whether it is open to further/better modularization. Otherwise, if nothing else I can recommend this.

If there are more common cases than others (one or two things happen more than the rest), place them at the beginning of the switch/if/else, that way in the more common cases your program will only make that first one or two comparisons and short circuit its path. Generally a good idea on its own for any code IMHO.

gnomed
+1  A: 

It depends much on the strings that you are comparing. You could do a switch on some characteristics of the strings:

  • If you know that they differ pretty well in the 4th position, you could do a switch on str[3] and only then do the strcmp.
  • Or look at some sort on checksum and switch.

But all of this is quite handcrafted, you definitely should check the assembler that gcc produces.

Jens Gustedt
+2  A: 

To use a hash table:

  1. Pick a hash function. This one is a biggie. There are tradeoffs between speed, the quality of the hash, and the size of the output. Encryption algorithms can make good hash functions. The hash function performs some computation using all the bits of your input value to return some output value with a smaller number of bits.
  2. So the hash function takes a string and returns an integer between 0 and N .
  3. Now you can look up a pointer to a function in a table of size N.
  4. Each entry in the table will be a linked list (or some other searchable data structure) because of the chance of collision, that is two strings that map to the same hash value.

E.g.

lets say hash(char*) returns a value between 0 and 3.
hash("val1") returns 2
hash("val2") returns 0
hash("val3") also returns 0
hash("val4") returns 1

Now your hash table looks something like:

table[0] ("val2",foo2) ("val3", foo3)
table[1] ("val4",foo4)
table[2] ("val1",foo1)
table[3] <empty>

I hope you can see how the cost of doing matching using a hash table is bound by the time it takes to calculate the hash function and the small time it takes to search the entry in the hash table. If the hash table is large enough most hash table entries would have very few items.

Guy
+2  A: 

For strings, if you have a small finite number of possible strings, use a perfect hash and switch on the result. With only 30 or so strings, finding a perfect hash should be pretty easy. If you also need to validate the input, you'll have to do a single strcmp in each case, but that's pretty cheap.

Beyond that, just let the compiler optimize your switches. Only do anything fancier if you've done sufficient testing to know the time spent here is performance-critical.

R..
+1  A: 

A hash table would be ideal for speeding up a bunch of string compares. You might want to look into a string library that does not use nul terminated strings like the C stdlib does. Lots of string manipulation in C involves a lot of "look through the string for the nul, then do your operation". A string library like SafeStr keeps info about the length of the strings so there's no need to burn time to scan for nuls, especially for strings with unequal lengths

JimR
+1  A: 

(I'm quoting some from my prior research I've written on this topic)
The specINT2006 benchmark, 458.sjeng, which implements a chess simulator, uses many switch statements to process the different chess pieces. Each statement is in a form like:

switch (board[from]) {  
case (wpawn): ...  
case (wknight): ... 

Which the compiler (gcc) generates as an instruction sequence similar to the following:

40752b: mov -0x28(%rbp),%eax
40752e: mov 0x4238(,%rax,8),%rax
407536: jmpq *%rax

This assembly acts as a lookup table. You can speed up the compiled code further by splitting your switch ... case into multiple switch statements. You'll want to keep the case values consecutive and put the most frequent cases into different switch statements. This particularly improves the indirect branch prediction.

I'll leave the remainder of your questions to others.

Brian