A: 

Did you use a profiler like AQTime to see where the bottleneck is? Sometimes it's a trivial thing when translating native to managed code. On the other hand, because managed code is slower than native code in some scenarios, you might want to try unsafe code instead.

OregonGhost
+3  A: 

This is most likely due to the JIT compiler generating code that's not as efficient as the code generated by the native compiler.

Profiling the code should probably be your next step if you care about the 20% decrease in performance, or you might consider using an off-the-shelf optimized library.

Ori Pessach
+19  A: 

Just looking at this code, based on my experience I'd expect a fairly significant slowdown going from C++ to C#.

One major issue you're going to face in a naive port of a routine like this to C# is that C# is going to add bounds checking on every array access here. Since you're never looping through the arrays in a way that will get optimized (see this question for details), just about every array access is going to receive bounds checking.
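
To make the distinction concrete, here is a minimal sketch (invented method names, not code from the question): the JIT eliminates the check only when it can prove the index stays below array.Length, and strided FFT-style indexing defeats that proof.

    static double SumSequential(double[] data)
    {
        double sum = 0;
        // The JIT proves i < data.Length on every iteration,
        // so the per-element bounds check is eliminated.
        for (int i = 0; i < data.Length; i++)
            sum += data[i];
        return sum;
    }

    static double SumStrided(double[] data, int n, int stride)
    {
        double sum = 0;
        // i * stride is not provably in range, so every access
        // keeps its bounds check -- the situation an FFT's
        // butterfly indexing puts you in.
        for (int i = 0; i < n; i++)
            sum += data[i * stride];
        return sum;
    }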

In addition, this port is pretty close to a 1->1 mapping from C. If you run this through a good .NET profiler, you'll probably find some great spots that can be optimized to get this back to near C++ speed with one or two tweaks (that's nearly always been my experience in porting routines like this).

If you want to get this to be at nearly identical speed, though, you'll probably need to convert this to unsafe code and use pointer manipulation instead of indexing into the arrays directly. This will eliminate all of the bounds checking issues and get your speed back.
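
Roughly like this (a sketch of the idea on an invented scaling pass, not the question's routine; it needs the /unsafe compiler switch):

    static unsafe void ScaleInPlace(double[] data, double factor)
    {
        // Pin the array so the GC can't move it while we hold a pointer.
        fixed (double* p = data)
        {
            double* cur = p;
            double* end = p + data.Length;
            // Pointer walk: no per-element bounds checks are emitted.
            while (cur < end)
                *cur++ *= factor;
        }
    }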


Edit: I see one more huge difference, which may be the reason your C# unsafe code is running slower.

Check out this page about C# compared to C++, in particular:

"The long type: In C#, the long type is 64 bits, while in C++, it is 32 bits."

You should convert the C# version to use int, not long. In C#, long is a 64-bit type. This may actually have a profound effect on your pointer manipulation, because I believe you are inadvertently adding a long->int conversion (with overflow checking) on every pointer access.

Also, while you're at it, you may want to try wrapping the entire function in an unchecked block. C++ isn't doing the overflow checking you're getting in C#.
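
Put together, the shape of the fix looks roughly like this (a sketch, not the poster's actual routine):

    static unsafe double SumUnsafe(double* p, int n)
    {
        double sum = 0;
        // unchecked matches the C++ semantics: no overflow checks
        // on the integer arithmetic in this block.
        unchecked
        {
            // An int index; with a long index, every p[i] would
            // involve 64-bit offset arithmetic and a conversion.
            for (int i = 0; i < n; i++)
                sum += p[i];
        }
        return sum;
    }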

Reed Copsey
Rico Mariani about bounds checking: http://blogs.msdn.com/ricom/archive/2006/07/12/663642.aspx
VVS
@David: I love Rico Mariani's blog, but in this case, it's not a fair comparison. He was comparing the speed reduction in a typical GUI application, not a pure number-crunching situation. Even then, he was seeing a 3% drop overall from array bounds checking (0.6% if you remove the lowest 10%). This routine is nearly all array access, though, so the number will not be diluted by the other computations. I'd expect much higher than the 3% drop.
Reed Copsey
I converted my code to an unsafe method that uses pointers to manipulate the data (see my latest edit in the body of the question.) The results were not what I expected. Is there something wrong with my unsafe implementation? I can post the new code if needed.
Robert H.
@Robert Hamilton: See my edit. I believe the "long" declaration is changing the behavior here.
Reed Copsey
@Reed Copsey: Good call! Pointer manipulation via long integers was the problem. I changed the unsafe methods to use ints and it's running really fast now. I updated the post with the new profiler results - impressive.
Robert H.
Glad we figured it out... this matches my experience in the past. :)
Reed Copsey
+1  A: 

Considering that the managed code does bounds checking on the index of every array access, which the unmanaged code doesn't do, I would say that the difference is smaller than I expected.

If you change the arrays to pointers in the managed code too (as that is what they really are in the unmanaged code), I would expect them to perform about the same.

Guffa
note that there are cases where the array bounds checking can be optimized away (such as when limit is explicitly < array.Length), but not here
Lucas
+3  A: 

The native compiler can do much deeper and heavier optimizations than a JIT compiler, like vectorization, interprocedural analysis, etc. And FFTs can get great speedups with vectorization.
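
Nothing in .NET exposed SIMD directly at the time; for comparison, much later .NET versions added System.Numerics.Vector<T>. A minimal sketch of what vectorized code looks like (an elementwise add processed several lanes per iteration):

    using System.Numerics;

    static void AddArrays(float[] a, float[] b, float[] result)
    {
        int lanes = Vector<float>.Count;   // e.g. 4 or 8 floats per register
        int i = 0;
        for (; i <= a.Length - lanes; i += lanes)
        {
            var va = new Vector<float>(a, i);
            var vb = new Vector<float>(b, i);
            (va + vb).CopyTo(result, i);   // one SIMD add per iteration
        }
        for (; i < a.Length; i++)          // scalar tail
            result[i] = a[i] + b[i];
    }

This is the kind of transform a vectorizing native compiler applies automatically to the inner loops of an FFT.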

Bruno Coutinho
+1  A: 

I just ran the code he posted with int instead of long and it did not really make a difference. I know other people have had better luck with FFT in .NET, showing that .NET can reach or exceed the performance of C++ even with FFT math.

So my answer is that either the poster's code is more optimized (for C) than the one in the link, or it is less optimized for C# than the one in the article I linked.

I performed two sets of tests on two machines with .NET 2.0. One machine had XP SP2 and a single 850 MHz Pentium III processor with 512 MB of RAM. The other machine had build 5321 of Vista and a single 2 GHz Mobile Pentium 4 processor with 1 GB of RAM. In each case I calculated the average of 100 separate FFT calculations on 2^17 (131072) data values. From these values I calculated the standard error from the standard deviation.
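
A timing loop of the kind described might look like this (a sketch; computeFft stands in for whichever implementation is under test and is not a real API):

    using System;
    using System.Diagnostics;

    static void Benchmark(Action computeFft, int runs)
    {
        var times = new double[runs];
        for (int i = 0; i < runs; i++)
        {
            var sw = Stopwatch.StartNew();
            computeFft();
            sw.Stop();
            times[i] = sw.Elapsed.TotalMilliseconds;
        }
        double mean = 0;
        foreach (var t in times) mean += t;
        mean /= runs;
        // Standard error = standard deviation / sqrt(N).
        double variance = 0;
        foreach (var t in times) variance += (t - mean) * (t - mean);
        double stdErr = Math.Sqrt(variance / (runs - 1)) / Math.Sqrt(runs);
        Console.WriteLine("{0:F2} ms ± {1:F2}", mean, stdErr);
    }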

The results are shown in ms. The results for the Pentium III machine are:

                 Not Optimized   Optimized For Space   Optimized For Speed
    Unmanaged    92.88 ± 0.09    88.23 ± 0.09          68.48 ± 0.03
    Managed C++  72.89 ± 0.03    72.26 ± 0.04          71.35 ± 0.06
    C++/CLI      73.00 ± 0.05    72.32 ± 0.03          71.44 ± 0.04
    C# Managed   72.21 ± 0.04    --                    69.97 ± 0.08

The results for the Mobile Pentium 4 are:

                 Not Optimized   Optimized For Space   Optimized For Speed
    Unmanaged    45.2 ± 0.1      30.04 ± 0.04          23.06 ± 0.04
    Managed C++  23.5 ± 0.1      23.17 ± 0.08          23.36 ± 0.07
    C++/CLI      23.5 ± 0.1      23.11 ± 0.07          23.80 ± 0.05
    C# Managed   23.7 ± 0.1      --                    22.78 ± 0.03
JasonRShaver
I cannot seem to duplicate this. Average for native code: 0.007396 seconds ± 0.000017; average for managed C# code: 0.008422 seconds ± 0.000027. I'm still seeing an 18-19% speed advantage for the native code on my machine (his code, not mine). I'm going to put it on my laptop to see if there is any difference there.
Robert H.
Makes me wonder what effect improvements in OS and processor design have had on those numbers in general.
JasonRShaver
A: 

Because the C# .NET compiler is not the best at producing efficient code, and the overall design of the language works against it. By the way, F# has much better performance than C# in math.

Rinat Abdullin