views: 230
answers: 4

Hello everyone,

I have an idea of how to improve performance with dynamic code generation, but I'm not sure of the best way to approach the problem.

Suppose I have a class


class Calculator
{
  int Value1;
  int Value2;
  //.......... 
  int ValueN;

  void DoCalc()
  {
    if (Value1 > 0)
    {
      DoValue1RelatedStuff();    
    }
    if (Value2 > 0)
    {
      DoValue2RelatedStuff();    
    }
    //....
    //....
    //....
    if (ValueN > 0)
    {
      DoValueNRelatedStuff();    
    }
  }
}

The DoCalc method is at the lowest level and is called many times during the calculation. Another important aspect is that the Value fields are set only at the beginning and do not change during the calculation, so many of the ifs in DoCalc are unnecessary, because many of the values are 0. I was hoping that dynamic code generation could help improve performance here.

For instance if I create a method


  void DoCalc_Specific()
  {
    const int Value1 = 0;
    const int Value2 = 0;
    const int ValueN = 1;

    if (Value1 > 0)
    {
      DoValue1RelatedStuff();    
    }
    if (Value2 > 0)
    {
      DoValue2RelatedStuff();    
    }
    //....
    //....
    //....
    if (ValueN > 0)
    {
      DoValueNRelatedStuff();    
    }
  }

and compile it with optimizations switched on, the C# compiler is smart enough to eliminate the dead branches and keep only the necessary code. So I would like to create such a method at run time, based on the actual values of the fields, and use the generated method during the calculations.

I guess I could use expression trees for that, but expression trees work only with simple lambda expressions, so I cannot use statements like if or while inside the function body. In that case I would need to restructure the method appropriately.
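For what it's worth, the branch elimination doesn't have to happen inside the tree at all: since the values are known before the tree is built, the builder can simply skip the inactive terms. A minimal sketch along the lines of the test code below (the `x * 0.01` terms mirror WithIf; `Specializer` and `Build` are made-up names):

```csharp
using System;
using System.Linq.Expressions;

class Specializer
{
    // Builds a delegate equivalent to the calculation, but containing
    // only the terms whose corresponding value is known to be positive.
    public static Func<int, int, double> Build(int value1, int value2)
    {
        var x = Expression.Parameter(typeof(int), "x");
        var y = Expression.Parameter(typeof(int), "y");

        // Start with the constant 0.0 and add only the active terms;
        // the ifs run once here, at build time, not inside the delegate.
        Expression body = Expression.Constant(0.0);
        if (value1 > 0)
            body = Expression.Add(body,
                Expression.Multiply(Expression.Convert(x, typeof(double)),
                                    Expression.Constant(0.01)));
        if (value2 > 0)
            body = Expression.Add(body,
                Expression.Multiply(Expression.Convert(y, typeof(double)),
                                    Expression.Constant(0.01)));

        // Compile once at setup; the branches no longer exist at run time.
        return Expression.Lambda<Func<int, int, double>>(body, x, y).Compile();
    }
}
```

A delegate built with `Specializer.Build(0, 2)` contains only the y term, so the x branch is gone entirely from the compiled code.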

Another possibility is to create the necessary code as a string and compile it dynamically. But it would be much better for me if I could take the existing method and modify it accordingly.
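For the string route, the CodeDOM compiler that ships with .NET can compile a generated source string in memory. A rough sketch, again specializing the two-term formula from the test code (`Gen`, `Calc`, and `CompileSpecialized` are illustrative names):

```csharp
using System;
using System.CodeDom.Compiler;
using Microsoft.CSharp;

class SourceSpecializer
{
    // Builds the source for a specialized Calc method, compiles it
    // in memory, and returns a delegate to the compiled method.
    public static Func<int, int, double> CompileSpecialized(int value1, int value2)
    {
        var src = "public static class Gen { public static double Calc(int x, int y) { double r = 0.0; ";
        if (value1 > 0) src += "r += x * 0.01; ";
        if (value2 > 0) src += "r += y * 0.01; ";
        src += "return r; } }";

        var options = new CompilerParameters { GenerateInMemory = true };
        var results = new CSharpCodeProvider().CompileAssemblyFromSource(options, src);
        if (results.Errors.HasErrors)
            throw new InvalidOperationException(results.Errors[0].ErrorText);

        var method = results.CompiledAssembly.GetType("Gen").GetMethod("Calc");
        return (Func<int, int, double>)Delegate.CreateDelegate(
            typeof(Func<int, int, double>), method);
    }
}
```

The compilation cost is paid once per combination of values, which fits the stated setup (the values are fixed at the beginning of the calculation).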

There's also Reflection.Emit, but I'd rather avoid it, as it would be very difficult to maintain.
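For comparison, here is roughly what the same specialization looks like with Reflection.Emit's DynamicMethod; the opcode-level detail is exactly the maintenance burden mentioned above (names are illustrative):

```csharp
using System;
using System.Reflection.Emit;

class EmitSpecializer
{
    // Emits IL for "0.0 [+ x*0.01] [+ y*0.01]" with only the active terms.
    public static Func<int, int, double> Build(int value1, int value2)
    {
        var dm = new DynamicMethod("CalcSpecialized", typeof(double),
                                   new[] { typeof(int), typeof(int) });
        var il = dm.GetILGenerator();

        il.Emit(OpCodes.Ldc_R8, 0.0);          // push the accumulator
        if (value1 > 0)
        {
            il.Emit(OpCodes.Ldarg_0);          // push x
            il.Emit(OpCodes.Conv_R8);          // int -> double
            il.Emit(OpCodes.Ldc_R8, 0.01);
            il.Emit(OpCodes.Mul);
            il.Emit(OpCodes.Add);
        }
        if (value2 > 0)
        {
            il.Emit(OpCodes.Ldarg_1);          // push y
            il.Emit(OpCodes.Conv_R8);
            il.Emit(OpCodes.Ldc_R8, 0.01);
            il.Emit(OpCodes.Mul);
            il.Emit(OpCodes.Add);
        }
        il.Emit(OpCodes.Ret);                  // return the accumulated double

        return (Func<int, int, double>)dm.CreateDelegate(typeof(Func<int, int, double>));
    }
}
```

It works, and DynamicMethod is cheap to emit and collectible, but every change to the formula means revisiting a stack of opcodes by hand.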

BTW, I'm not restricted to C#, so I'm open to suggestions of programming languages best suited for this kind of problem (except LISP, for a couple of reasons).

One important clarification: DoValue1RelatedStuff() is not a method call in my algorithm. It's just some formula-based calculation, and it's pretty fast. I should have written it like this:


if (Value1 > 0)
{
  // Do Value1 Related Stuff
}

I have run some performance tests, and with two ifs, one of which is disabled, the specialized method is about two times faster than the one with the redundant if.

Here's the code I used for testing:


    using System;

    public class Program
    {
        static void Main(string[] args)
        {
            int x = 0, y = 2;

            var if_st = DateTime.Now.Ticks;
            for (var i = 0; i < 10000000; i++)
            {
                WithIf(x, y);
            }
            var if_et = DateTime.Now.Ticks - if_st;
            Console.WriteLine(if_et.ToString());

            var noif_st = DateTime.Now.Ticks;
            for (var i = 0; i < 10000000; i++)
            {
                Without(x, y);
            }
            var noif_et = DateTime.Now.Ticks - noif_st;
            Console.WriteLine(noif_et.ToString());

            Console.ReadLine();
        }

        // Both conditions are checked on every iteration, even though
        // x is always 0 here, so the first branch never does any work.
        static double WithIf(int x, int y)
        {
            var result = 0.0;
            for (var i = 0; i < 100; i++)
            {
                if (x > 0)
                {
                    result += x * 0.01;
                }
                if (y > 0)
                {
                    result += y * 0.01;
                }
            }
            return result;
        }

        // The specialized version: the x branch is removed entirely,
        // because x is known to be 0 in this test.
        static double Without(int x, int y)
        {
            var result = 0.0;
            for (var i = 0; i < 100; i++)
            {
                result += y * 0.01;
            }
            return result;
        }
    }
+1  A: 

It would surprise me to find a scenario where the overhead of evaluating the if statements is worth the effort of dynamically emitting code.

Modern CPUs support branch prediction and branch predication, which makes the overhead of branches in small segments of code approach zero.

Have you tried to benchmark two hand-coded versions of the code, one that has all the if-statements in place but provides zero values for most, and one that removes all of those same if branches?

Eric J.
I agree, N would have to be in the hundreds, if not thousands, and most of the cases would need to be omitted in practice, for it to be a really significant fraction of runtime.
Barry Kelly
I have not done much at the ASM/IL level for several years, but I believe current Intel processors will determine the correct branch before the IP (instruction pointer) gets there. Does anyone know the details?
Eric J.
I am no expert on state-of-the-art branch prediction, and a lot happens between the C# code and the processor executing a branch, but I am quite sure that a modern processor will mispredict a branch at most once if its condition is constant (unless the number of conditions is very large and the branch prediction cache cannot store all the predictions).
Daniel Brückner
+2  A: 

I would usually not even think about such an optimization. How much work does DoValueXRelatedStuff() do? More than 10 to 50 processor cycles? Yes? That means you are going to build quite a complex system to save less than 10% execution time (and this seems quite optimistic to me). This can easily go down to less than 1%.

Is there no room for other optimizations? Better algorithms? And do you really need to eliminate single branches taking only a single processor cycle (if the branch prediction is correct)? Yes? Shouldn't you think about writing your code in assembler or something else more machine-specific instead of using .NET?

Could you give the order of N, the complexity of a typical method, and the ratio of expressions usually evaluating to true?

Daniel Brückner
DoValueXRelatedStuff() is actually not a method call in my algorithm. It's just some expression calculation, so it is quite fast. Basically it's just a couple of additions, multiplications and probably division.
Max
And what is the order of N? And the ratio of true to false expressions?
Daniel Brückner
Right now N is 8. In a typical situation true to false ratio is 3/5. But I need to change N to maybe 20 in the next version.
Max
Even in the case of 20 I tend to believe that you will not gain that much speed-up. Could you profile it, as Eric J. suggested? I am really curious about the result. Or even better, could you give a more complete code fragment with two or three complete DoValueXRelatedStuff() fragments? I would like to do some analysis, too.
Daniel Brückner
I tested it as Eric J. suggested, and I've added the code for this test. Unfortunately I cannot post the complete code, as it is written in Pascal and works with some specific data structures.
Max
But the test code gives an idea of how the production code works.
Max
+1  A: 

If you are really into code optimisation - before you do anything - run the profiler! It will show you where the bottleneck is and which areas are worth optimising.

Also - if the language choice is not limited (except for LISP) then nothing will beat assembler in terms of performance ;)

I remember achieving some performance magic by rewriting some inner functions (like the one you have) using assembler.

DmitryK
Writing this in assembler would be impractical. First of all, the code is quite complicated and it would take too much time to rewrite it in asm. And second, I cannot do this kind of dynamic compilation with assembler.
Max
A: 

Before you do anything, do you actually have a problem?

i.e. does it run long enough to bother you?

If so, find out what is actually taking the time, not what you guess. This is the quick, dirty, and highly effective method I use to see where the time goes.

Now, you are talking about interpreting versus compiling. Interpreted code is typically 1-2 orders of magnitude slower than compiled code. The reason is that interpreters are continually figuring out what to do next, and then forgetting, while compiled code just knows.

If you are in this situation, then it may make sense to pay the price of translating so as to get the speed of compiled code.

Mike Dunlavey