views: 230
answers: 4

Hello everyone,

I have an idea of how to improve performance with dynamic code generation, but I'm not sure of the best way to approach the problem.

Suppose I have a class


class Calculator
{
  int Value1;
  int Value2;
  //.......... 
  int ValueN;

  void DoCalc()
  {
    if (Value1 > 0)
    {
      DoValue1RelatedStuff();    
    }
    if (Value2 > 0)
    {
      DoValue2RelatedStuff();    
    }
    //....
    //....
    //....
    if (ValueN > 0)
    {
      DoValueNRelatedStuff();    
    }
  }
}

The DoCalc method is at the lowest level and is called many times during the calculation. Another important aspect is that the Value fields are set only at the beginning and do not change during the calculation, so many of the ifs in DoCalc are unnecessary, because many of the values are 0. I was hoping that dynamic code generation could help improve performance here.

For instance if I create a method


  void DoCalc_Specific()
  {
    const int Value1 = 0;
    const int Value2 = 0;
    const int ValueN = 1;

    if (Value1 > 0)
    {
      DoValue1RelatedStuff();    
    }
    if (Value2 > 0)
    {
      DoValue2RelatedStuff();    
    }
    //....
    //....
    //....
    if (ValueN > 0)
    {
      DoValueNRelatedStuff();    
    }
  }

and compile it with optimizations switched on, the C# compiler is smart enough to eliminate the dead branches and keep only the necessary code. So I would like to create such a method at run time, based on the actual values of the fields, and use the generated method during the calculations.

I guess I could use expression trees for that, but expression trees work only with simple lambda expressions, so I cannot use statements like if or while inside the function body. In that case I would need to restructure the method appropriately.
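For what it's worth, the branch elimination doesn't have to happen inside the tree at all: since the values are known before the tree is built, the builder can simply skip the inactive terms. A minimal sketch along the lines of the test code below (the `x * 0.01` terms mirror WithIf; `Specializer` and `Build` are made-up names):

```csharp
using System;
using System.Linq.Expressions;

class Specializer
{
    // Builds a delegate equivalent to the calculation, but containing
    // only the terms whose corresponding value is known to be positive.
    public static Func<int, int, double> Build(int value1, int value2)
    {
        var x = Expression.Parameter(typeof(int), "x");
        var y = Expression.Parameter(typeof(int), "y");

        // Start with the constant 0.0 and add only the active terms;
        // the ifs run once here, at build time, not inside the delegate.
        Expression body = Expression.Constant(0.0);
        if (value1 > 0)
            body = Expression.Add(body,
                Expression.Multiply(Expression.Convert(x, typeof(double)),
                                    Expression.Constant(0.01)));
        if (value2 > 0)
            body = Expression.Add(body,
                Expression.Multiply(Expression.Convert(y, typeof(double)),
                                    Expression.Constant(0.01)));

        // Compile once at setup; the branches no longer exist at run time.
        return Expression.Lambda<Func<int, int, double>>(body, x, y).Compile();
    }
}
```

A delegate built with `Specializer.Build(0, 2)` contains only the y term, so the x branch is gone entirely from the compiled code.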

Another possibility is to create the necessary code as a string and compile it dynamically. But it would be much better for me if I could take the existing method and modify it accordingly.
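For the string route, the CodeDOM compiler that ships with .NET can compile a generated source string in memory. A rough sketch, again specializing the two-term formula from the test code (`Gen`, `Calc`, and `CompileSpecialized` are illustrative names):

```csharp
using System;
using System.CodeDom.Compiler;
using Microsoft.CSharp;

class SourceSpecializer
{
    // Builds the source for a specialized Calc method, compiles it
    // in memory, and returns a delegate to the compiled method.
    public static Func<int, int, double> CompileSpecialized(int value1, int value2)
    {
        var src = "public static class Gen { public static double Calc(int x, int y) { double r = 0.0; ";
        if (value1 > 0) src += "r += x * 0.01; ";
        if (value2 > 0) src += "r += y * 0.01; ";
        src += "return r; } }";

        var options = new CompilerParameters { GenerateInMemory = true };
        var results = new CSharpCodeProvider().CompileAssemblyFromSource(options, src);
        if (results.Errors.HasErrors)
            throw new InvalidOperationException(results.Errors[0].ErrorText);

        var method = results.CompiledAssembly.GetType("Gen").GetMethod("Calc");
        return (Func<int, int, double>)Delegate.CreateDelegate(
            typeof(Func<int, int, double>), method);
    }
}
```

The compilation cost is paid once per combination of values, which fits the stated setup (the values are fixed at the beginning of the calculation).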

There's also Reflection.Emit, but I'd rather avoid it, as it would be very difficult to maintain.
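For comparison, here is roughly what the same specialization looks like with Reflection.Emit's DynamicMethod; the opcode-level detail is exactly the maintenance burden mentioned above (names are illustrative):

```csharp
using System;
using System.Reflection.Emit;

class EmitSpecializer
{
    // Emits IL for "0.0 [+ x*0.01] [+ y*0.01]" with only the active terms.
    public static Func<int, int, double> Build(int value1, int value2)
    {
        var dm = new DynamicMethod("CalcSpecialized", typeof(double),
                                   new[] { typeof(int), typeof(int) });
        var il = dm.GetILGenerator();

        il.Emit(OpCodes.Ldc_R8, 0.0);          // push the accumulator
        if (value1 > 0)
        {
            il.Emit(OpCodes.Ldarg_0);          // push x
            il.Emit(OpCodes.Conv_R8);          // int -> double
            il.Emit(OpCodes.Ldc_R8, 0.01);
            il.Emit(OpCodes.Mul);
            il.Emit(OpCodes.Add);
        }
        if (value2 > 0)
        {
            il.Emit(OpCodes.Ldarg_1);          // push y
            il.Emit(OpCodes.Conv_R8);
            il.Emit(OpCodes.Ldc_R8, 0.01);
            il.Emit(OpCodes.Mul);
            il.Emit(OpCodes.Add);
        }
        il.Emit(OpCodes.Ret);                  // return the accumulated double

        return (Func<int, int, double>)dm.CreateDelegate(typeof(Func<int, int, double>));
    }
}
```

It works, and DynamicMethod is cheap to emit and collectible, but every change to the formula means revisiting a stack of opcodes by hand.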

BTW, I'm not restricted to C#, so I'm open to suggestions of programming languages best suited for this kind of problem (except LISP, for a couple of reasons).

One important clarification: DoValue1RelatedStuff() is not a method call in my algorithm. It's just some formula-based calculation, and it's pretty fast. I should have written it like this:


if (Value1 > 0)
{
  // Do Value1 Related Stuff
}

I have run some performance tests, and with two ifs, one of which is disabled, the specialized method is about two times faster than the one with the redundant if.

Here's the code I used for testing:


    using System;

    public class Program
    {
        static void Main(string[] args)
        {
            int x = 0, y = 2;

            var if_st = DateTime.Now.Ticks;
            for (var i = 0; i < 10000000; i++)
            {
                WithIf(x, y);
            }
            var if_et = DateTime.Now.Ticks - if_st;
            Console.WriteLine(if_et.ToString());

            var noif_st = DateTime.Now.Ticks;
            for (var i = 0; i < 10000000; i++)
            {
                Without(x, y);
            }
            var noif_et = DateTime.Now.Ticks - noif_st;
            Console.WriteLine(noif_et.ToString());

            Console.ReadLine();
        }

        // Both conditions are checked on every iteration, even though
        // x is always 0 here, so the first branch never does any work.
        static double WithIf(int x, int y)
        {
            var result = 0.0;
            for (var i = 0; i < 100; i++)
            {
                if (x > 0)
                {
                    result += x * 0.01;
                }
                if (y > 0)
                {
                    result += y * 0.01;
                }
            }
            return result;
        }

        // The specialized version: the x branch is removed entirely,
        // because x is known to be 0 in this test.
        static double Without(int x, int y)
        {
            var result = 0.0;
            for (var i = 0; i < 100; i++)
            {
                result += y * 0.01;
            }
            return result;
        }
    }
+1  A: 

It would surprise me to find a scenario where the overhead of evaluating the if statements is worth the effort of dynamically emitting code.

Modern CPUs support branch prediction and branch predication, which makes the overhead of branches in small segments of code approach zero.

Have you tried to benchmark two hand-coded versions of the code, one that has all the if-statements in place but provides zero values for most, and one that removes all of those same if branches?

Eric J.
I agree, N would have to be in the hundreds, if not thousands, and most of the cases would need to be omitted in practice, for it to be a really significant fraction of runtime.
Barry Kelly
I have not done much at the ASM/IL level for several years, but I believe current Intel processors will determine the correct branch before the IP (instruction pointer) gets there. Does anyone know the details?
Eric J.
I am no expert on state-of-the-art branch prediction, and a lot happens between the C# code and the processor executing a branch, but I am quite sure that a modern processor will mispredict a branch at most once if its condition is constant (unless the number of conditions is very large and the branch prediction cache cannot store all the predictions).
Daniel Brückner
+2  A: 

I would usually not even think about such an optimization. How much work does DoValueXRelatedStuff() do? More than 10 to 50 processor cycles? Yes? That means you are going to build quite a complex system to save less than 10% execution time (and this seems quite optimistic to me). This can easily go down to less than 1%.

Is there no room for other optimizations? Better algorithms? And do you really need to eliminate single branches taking only a single processor cycle (if the branch prediction is correct)? Yes? Shouldn't you think about writing your code in assembler or something else more machine-specific instead of using .NET?

Could you give the order of N, the complexity of a typical method, and the ratio of expressions usually evaluating to true?

Daniel Brückner
DoValueXRelatedStuff() is actually not a method call in my algorithm. It's just some expression calculation, so it is quite fast. Basically it's just a couple of additions, multiplications and probably division.
Max
And what is the order of N? And the ratio of true to false expressions?
Daniel Brückner
Right now N is 8. In a typical situation true to false ratio is 3/5. But I need to change N to maybe 20 in the next version.
Max
Even in the case of 20 I tend to believe that you will not gain that much speed-up. Could you profile it, as Eric J. suggested? I am really curious about the result. Or even better, could you give a more complete code fragment with two or three complete DoValueXRelatedStuff() fragments? I would like to do some analysis, too.
Daniel Brückner
I tested it as Eric J. suggested, and I've added the code for this test. Unfortunately I cannot post the complete code, as it is written in Pascal and works with some specific data structures.
Max
But the test code gives an idea of how the production code works.
Max
+1  A: 

If you are really into code optimisation - before you do anything - run the profiler! It will show you where the bottleneck is and which areas are worth optimising.

Also - if the language choice is not limited (except for LISP) then nothing will beat assembler in terms of performance ;)

I remember achieving some performance magic by rewriting some inner functions (like the one you have) using assembler.

DmitryK
Writing this in assembler would be impractical. First of all, the code is quite complicated and it would take too much time to rewrite it in asm. And second, I cannot do this kind of dynamic compilation with assembler.
Max
A: 

Before you do anything, do you actually have a problem?

i.e. does it run long enough to bother you?

If so, find out what is actually taking the time, not what you guess. This is the quick, dirty, and highly effective method I use to see where the time goes.

Now, you are talking about interpreting versus compiling. Interpreted code is typically 1-2 orders of magnitude slower than compiled code. The reason is that interpreters are continually figuring out what to do next, and then forgetting, while compiled code just knows.

If you are in this situation, then it may make sense to pay the price of translating so as to get the speed of compiled code.

Mike Dunlavey