views:

761

answers:

6

I'm thinking about the tokenizer here.
Each token calls a different function inside the parser.
What is more efficient:

  • A map of std::functions/boost::functions
  • A switch case

I thank everyone in advance for their answers.

+8  A: 

The STL map that comes with Visual Studio 2008 will give you O(log n) for each function call, since it hides a tree structure beneath. With a modern compiler (depending on the implementation), a switch statement will give you O(1): the compiler translates it into some kind of lookup table. So in general, switch is faster.

However, consider the following facts:

The difference between a map and a switch is this: a map can be built dynamically, while a switch can't; and a map can take any arbitrary type as a key, while a switch is limited to C++ integral types (char, int, enum, etc.).

By the way, you can use a hash map to achieve nearly O(1) dispatching (though, depending on the hash table implementation, it can degrade to O(n) in the worst case). Even so, a switch will still be faster.

Edit

I am writing the following just for fun and for the sake of the discussion.

I can suggest a nice optimization for you, but it depends on the nature of your language and on whether you can predict how the language will be used.

When you write the code, you divide your tokens into two groups: one of very frequently used tokens and one of rarely used tokens, and you sort the frequent group by frequency. For the frequent tokens you write an if-else chain with the most frequent one tested first; for the rare tokens you write a switch statement.

The idea is to use CPU branch prediction to avoid another level of indirection (assuming the condition check in the if statement is nearly free). In most cases the CPU will pick the correct branch without any indirection; there will be a few cases, however, where the branch goes to the wrong place. Depending on the nature of your language, this may statistically give better performance.

Edit: Due to some comments below, I changed the sentence claiming that compilers will always translate a switch into a LUT.

The compiler may convert it to a lookup table, but there is no requirement that it do so. If it doesn't, it will be O(N).
anon
All parsers I've seen use a switch case. Is there a good reason for that?
the_drow
Like yossi1981 said, a switch statement is in most cases much faster. So if you have the choice (usually a parser, or more precisely the tokenizer/lexer, follows a fixed syntax, not one made up at runtime), you should prefer a switch.
Idan K
I don't understand how you can answer that :) O(log(n)) is still slower than a lookup table. If you don't trust your compiler enough to transform a switch into a lookup table, then use a lookup table directly: it's quite easy to implement. But using a map really looks like significant overhead here.
NicDumZ
@ "you can use a hash map to achieve nearly O(1) dispatching, [...] but it can sometimes be O(n) at worst case". True. Cost is not evolving with the size. But that cost is really high, compared to the cost of an indirect jump. And by the way @Neil I can't cite any decent compiler which is stupidly testing linearly all the switch values. It's not a language requirement, but all non-experimental compilers do it :))
NicDumZ
Sorry, no offense here, but the "prioritized if-else" construct to enhance cache prediction is just plain wrong. 1) You have no idea how your compiler will mingle with the branches here; things can go unexpectedly. 2) Most caches implement LRU. In the ABABABA scenario, cache prediction will **always** get it wrong: your approach only adds overhead with those conditional tests. If you want performance, use a lookup table, but don't try to outrun your compiler by hand :)
NicDumZ
You didn't understand what I wrote :) I was talking about ***Branch*** prediction and not ***Cache*** Prediction.
+2  A: 

What is your definition of "efficient"? If you mean faster, then you probably should profile some test code for a definite answer. If you're after flexible and easy-to-extend code though, then do yourself a favor and use the map approach. Everything else is just premature optimization...

milan1612
+1  A: 

You don't say what type your tokens are. If they are not of an integral or enum type, you don't have a choice: switches only work with integral types.

anon
They are an enum
the_drow
+1  A: 

Like yossi1981 said, a switch could be optimized into a fast lookup table, but there is no guarantee: every compiler has its own algorithm for deciding whether to implement the switch as consecutive ifs, as a fast lookup table, or as a combination of both.

To get a fast switch, your values should meet the following rule: they should be consecutive, e.g. 0, 1, 2, 3, 4. You can leave some values out, but things like 0, 1, 2, 34, 43 are extremely unlikely to be optimized.

The question really is: is performance of such significance in your application? And wouldn't a map that loads its values dynamically from a file be more readable and maintainable than a huge switch statement spanning multiple pages of code?

codymanix
Yes it is. It's a scripting language.
the_drow
A: 

The C++ standard says nothing about the performance of its requirements, only that the functionality should be there.

These sort of questions about which is better or faster or more efficient are meaningless unless you state which implementation you're talking about. For example, the string handling in a certain version of a certain implementation of JavaScript was atrocious, but you can't extrapolate that to being a feature of the relevant standard.

I would even go so far as to say it doesn't matter regardless of the implementation since the functionality provided by switch and std::map is different (although there's overlap).

These sort of micro-optimizations are almost never necessary, in my opinion.

paxdiablo
+9  A: 

Hello!

I would suggest reading switch() vs. lookup table? from Joel on Software. Particularly, this response is interesting:

" Prime example of people wasting time trying to optimize the least significant thing."

Yes and no. In a VM, you typically call tiny functions that each do very little. It's not the call/return that hurts you so much as the preamble and clean-up routine of each function, which is often a significant percentage of the execution time. This has been researched to death, especially by people who've implemented threaded interpreters.

In virtual machines, lookup tables storing computed addresses to call are usually preferred to switches (direct threading, or "labels as values": the code jumps directly to the label address stored in the lookup table). That's because, under certain conditions, it reduces branch misprediction, which is extremely expensive on long-pipelined CPUs (it forces a pipeline flush). It does, however, make the code less portable.

This issue has been discussed extensively in the VM community; I would suggest looking for scholarly papers in this field if you want to read more about it. Ertl & Gregg wrote a great article on the topic in 2001, The Behavior of Efficient Virtual Machine Interpreters on Modern Architectures.

But as mentioned, I'm pretty sure these details are not relevant for your code. They are small details, and you should not focus on them too much. The Python interpreter uses switches because its authors think it makes the code more readable. Why don't you pick the approach you're most comfortable with? The performance impact will be rather small; you'd better focus on code readability for now ;)

Edit: If it matters, a hash table will always be slower than a lookup table. With a lookup table, you use enum values as your "keys", and the value is retrieved with a single indirect jump: a single assembly operation, O(1). A hash table lookup first requires computing a hash and then retrieving the value, which is far more expensive.

Using an array where the function addresses are stored, indexed by values of an enum, is good; using a hash table to do the same adds significant overhead.

To sum up, we have:

  • cost(Hash_table) >> cost(direct_lookup_table)
  • cost(direct_lookup_table) ~= cost(switch) if your compiler translates switches into lookup tables.
  • cost(switch) >> cost(direct_lookup_table) (O(N) vs O(1)) if your compiler does not translate the switch and uses conditionals instead, but I can't think of any compiler that does this.
  • But inlined direct threading makes the code less readable.
NicDumZ
Great answer. I'm all out of upvotes for today. :) Tomorrow you'll get one.
the_drow