I am considering porting a small portion of the code in a C# project of mine to C/ASM for performance benefits. (This section of code uses many bitwise operations and is one of the few places where native code may give a real performance increase.) I then plan to simply call the native function in the separate DLL via P/Invoke.

The only data passed between the managed and native code will be purely primitive types (bool, int, long, 1D arrays, etc.). So my question is: will there be any significant overhead using P/Invoke with only primitive types? I am aware that there is a substantial overhead when using more complex types, since they need to be marshalled (pinned/copied), but perhaps in my situation it will be relatively efficient, even compared to calling the code from within the native DLL itself?

If someone could clarify this matter for me, explaining the degrees of performance advantages/hits and the reasons behind them, it would be much appreciated. An alternative way to accomplish the whole task would also be welcome, though since C# lacks support for inline assembly/CIL, I don't believe there is one.

A: 

You could generate a compiled, optimized version of your .NET assembly by using ngen on the end-user's computer (as part of the install process).

In my experience, well-written C# (e.g., keeping allocations outside of loops) performs very well.

Nick
I'm sure you are right in the vast majority of cases. However, I really want to use C/C++ here for the benefit of its very quick bitwise operations (sped up especially by access to specialised x86 instructions via inline ASM). Worth profiling, in any case.
Noldorin
+1  A: 

I seem to recall hearing that there's at least a 30 machine op overhead for each P/Invoke call. But ignore the theory, profile your options and choose the fastest.

Scott Weinstein
+1  A: 

I would personally set up a test harness with a simple expression written in both C# and unmanaged C++, then profile the app to see what kind of performance delta you're working with.

Something else to consider is that you'd be introducing a maintenance issue with the app, especially if you have junior-level developers expected to maintain the code. Make sure you know what you're gaining and what you're losing with respect to performance as well as code clarity and maintainability.

As an aside, JIT'd C# code should have performance on par with C++ with respect to arithmetic operations.

Ryan Emerle
+3  A: 

From MSDN (http://msdn.microsoft.com/en-us/library/aa712982.aspx):

"PInvoke has an overhead of between 10 and 30 x86 instructions per call. In addition to this fixed cost, marshaling creates additional overhead. There is no marshaling cost between blittable types that have the same representation in managed and unmanaged code. For example, there is no cost to translate between int and Int32."

So, it's reasonably cheap, but as always you should measure carefully to be sure you are benefitting from it, and bear in mind any maintenance overhead. As an aside, I would recommend C++/CLI ("managed" C++) over P/Invoke for any complex interop, especially if you're comfortable with C++.
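
Since the per-call cost quoted above is fixed, one way to keep it negligible is to do a lot of work per call, for example by passing a whole array across the boundary once instead of calling per element. The following is only a minimal sketch of what the native side might look like. The names `BITOPS_API` and `bitops_popcount_array` are hypothetical, and the bit count is just a stand-in for whatever bitwise routine is actually being ported.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical export macro; the name is illustrative only. */
#if defined(_WIN32)
#define BITOPS_API __declspec(dllexport)
#else
#define BITOPS_API
#endif

/* Count the set bits in one 64-bit word (portable Kernighan loop). */
static int popcount64(uint64_t x)
{
    int n = 0;
    while (x != 0) {
        x &= x - 1;   /* clears the lowest set bit */
        n++;
    }
    return n;
}

/*
 * Hypothetical exported entry point. It takes only fixed-width
 * primitive types and processes the whole array in one call, so the
 * fixed call overhead is paid once rather than once per element.
 */
BITOPS_API int64_t bitops_popcount_array(const uint64_t *words, int32_t count)
{
    int64_t total = 0;
    if (words == NULL || count <= 0) {
        return 0;
    }
    for (int32_t i = 0; i < count; i++) {
        total += popcount64(words[i]);
    }
    return total;
}
```

On the C# side this would presumably be declared with `[DllImport]` as taking a `ulong[]` and an `int` and returning a `long`. On 32-bit x86 the calling convention specified in the attribute has to match the one the DLL was compiled with. Whether this beats a pure C# loop is exactly what profiling should decide.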

Jim Arnold
Thanks, I must have missed that when looking over the docs. Indeed, I will profile the various implementations to be sure. You also make a good point about managed C++, though would you know whether it is typically cheaper to call a managed C++ function than a native one via P/Invoke?
Noldorin
Just to confirm: managed C++ can contain inline ASM, right? Given that this is the case, it would seem to be the better solution.
Noldorin
The key advantage of using C++/CLI for interop is that it takes care of a lot of the marshaling gunk for you. I doubt there is any performance difference, but don't know for sure. And yes, you can use inline asm too.
Jim Arnold