I'm writing a DLL wrapper for my C++ library, to be called from C#. The wrapper should also support callback functions that are called from the library and implemented in C#. These functions have, for instance, std::vector output parameters. I don't know how to do this. How do I pass a buffer of unknown size from C# to C++ via a callback function? Take this example:

CallbackFunction FunctionImplementedInCSharp;

void FunctionCalledFromLib(const std::vector<unsigned char>& input, std::vector<unsigned char>& output)
{
    // Here FunctionImplementedInCSharp (C# delegate) should somehow be called
}

void RegisterFunction(CallbackFunction f)
{
    FunctionImplementedInCSharp = f;
}

The question is how CallbackFunction should be defined, and what code goes inside FunctionCalledFromLib. One of the things that stumps me is how to delete a buffer created by C# inside C++ code.

+2  A: 

There are some things you should be aware of. The first is that if you are calling a .NET delegate from unmanaged code, then unless you follow some pretty narrow constraints you will be in for pain. Ideally, you can create a delegate in C#, pass it into unmanaged code, marshal it into a function pointer, hold onto it for as long as you like, then call it with no ill effects. The .NET documentation says so. I can tell you that this is simply not true. Eventually, part of your delegate or its thunk will get garbage collected, and when you call the function pointer from unmanaged code you will get sent into oblivion. I don't care what Microsoft says, I've followed their prescription to the letter and watched function pointers get turned into garbage, especially in server-side code-behind.

Given that, the most effective way to use function pointers is thus:

  1. C# code calls unmanaged code, passing in a delegate
  2. Unmanaged code marshals the delegate to a function pointer
  3. Unmanaged code does some work, possibly calling the function pointer
  4. Unmanaged code drops all references to the function pointer
  5. Unmanaged code returns to managed code

Given that, suppose we have the following in C#:

// "delegate" is a reserved word in C#, so the parameter is named callback
public void PerformTrick(MyManagedDelegate callback)
{
    APIGlue.CallIntoUnmanagedCode(callback);
}

and then in managed C++ (not C++/CLI):

static void CallIntoUnmanagedCode(MyManagedDelegate *delegate)
{
    MyManagedDelegate __pin *pinnedDelegate = delegate;
    // GetFunctionPointerForDelegate returns an IntPtr, so convert it to the native pointer type
    SOME_CALLBACK_PTR p = (SOME_CALLBACK_PTR)Marshal::GetFunctionPointerForDelegate(pinnedDelegate).ToPointer();
    CallDeepIntoUnmanagedCode(p); // this will call p
}

I haven't done this recently in C++/CLI - the syntax is different - I think it ends up looking like this:

// this is declared in a class
static void CallIntoUnmanagedCode(MyManagedDelegate ^delegate)
{
    pin_ptr<MyManagedDelegate ^> pinnedDelegate = &delegate;
    // GetFunctionPointerForDelegate takes the delegate handle and returns an IntPtr
    SOME_CALLBACK_PTR p = (SOME_CALLBACK_PTR)Marshal::GetFunctionPointerForDelegate(delegate).ToPointer();
    CallDeepIntoUnmanagedCode(p); // this will call p
}

When you exit these routines, the pinning gets released.

When you really, really need to have function pointers hanging around for a while before calling, I have done the following in C++/CLI:

  1. Made a hashtable that is a map from int -> delegate
  2. Made register/unregister routines that add new delegates into the hashtable, bumping up a counter for the hash int
  3. Made a single static unmanaged callback routine that is registered into unmanaged code with an int from the register call. When this routine is called, it calls back into managed code saying "find the delegate associated with this int and call it on these arguments".
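The three steps above can be sketched in plain C++. This is only an illustration of the shape of the idea, not the C++/CLI code described: the names `CallbackRegistry`, `Registry`, `Trampoline` and the `int(int)` handler signature are all hypothetical, and in the real case the map values would be managed delegates rather than `std::function` objects.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <mutex>
#include <utility>

// Hypothetical handler signature; a real wrapper would use its own argument list.
using Handler = std::function<int(int)>;

class CallbackRegistry {
public:
    // Step 2: store the handler and hand back the int that stands in for it.
    int Register(Handler handler) {
        std::lock_guard<std::mutex> lock(mutex_);
        int id = next_id_++;
        handlers_[id] = std::move(handler);
        return id;
    }

    // Returns true if the id was known and has now been removed.
    bool Unregister(int id) {
        std::lock_guard<std::mutex> lock(mutex_);
        return handlers_.erase(id) > 0;
    }

    // Looks the handler up by id and calls it; returns false for an unknown id.
    bool Invoke(int id, int arg, int& result) {
        Handler handler;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            auto it = handlers_.find(id);
            if (it == handlers_.end()) return false;
            handler = it->second;  // copy so the call happens outside the lock
        }
        result = handler(arg);
        return true;
    }

private:
    std::mutex mutex_;
    std::map<int, Handler> handlers_;  // Step 1: map from int -> handler
    int next_id_ = 1;
};

// One shared registry for the whole process.
CallbackRegistry& Registry() {
    static CallbackRegistry registry;
    return registry;
}

// Step 3: the single static routine given to the library. The library only
// ever holds this function pointer plus an int, never a per-handler pointer.
extern "C" int Trampoline(int id, int arg) {
    int result = -1;  // returned unchanged when the id is not registered
    Registry().Invoke(id, arg, result);
    return result;
}
```

With this layout, a stale id results in a failed lookup instead of a call through a dangling function pointer.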

What happens is that the delegates don't have thunks that do transitions anymore since they're implied. They're free to hang around in limbo being moved by the GC as needed. When they get called, the delegate will get pinned by the CLR and released as needed. I have also seen this method fail, particularly in the case of code that statically registers callbacks at the beginning of time and expects them to stay around to the end of time. I've seen this fail in ASP.NET code behind as well as server side code for Silverlight working through WCF. It's rather unnerving, but the way to fix it is to refactor your API to allow late(r) binding to function calls.

To give you an example of when this will happen - suppose you have a library that includes a function like this:

typedef void * (*f_AllocPtr) (size_t nBytes);
typedef void *t_AllocCookie;

extern void RegisterAllocFunction(f_AllocPtr allocPtr, t_AllocCookie cookie);

and the expectation is that when you call an API that allocates memory, it will be vectored off into the supplied f_AllocPtr. Believe it or not, you can write this in C#. It's sweet:

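public IntPtr ManagedAllocMemory(long nBytes)
{
    byte[] data = new byte[nBytes];
    // Pin the array so the GC cannot move it while unmanaged code holds the pointer
    GCHandle dataHandle = GCHandle.Alloc(data, GCHandleType.Pinned);
    // The pinned handle already exposes a stable address, so no unsafe/fixed block is needed
    IntPtr dataPtr = dataHandle.AddrOfPinnedObject();
    RegisterPointerHandleAndArray(dataPtr, dataHandle, data);
    return dataPtr;
}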

RegisterPointerHandleAndArray stuffs the triplet away for safe keeping. That way when the corresponding free gets called, you can do this:

public void ManagedFreeMemory(IntPtr dataPointer)
{
    GCHandle dataHandle;
    byte[] data;
    if (TryUnregister(dataPointer, out dataHandle, out data)) {
        dataHandle.Free();
        // do anything with data?  I dunno...
    }
}

And of course this is stupid because allocated memory is now pinned in the GC heap and will fragment it to hell - but the point is that it's doable.

But again, I have personally seen this fail unless the actual pointers are short lived. This typically means wrapping your API so that when you call into a routine that accomplishes a specific task, it registers callbacks, does the task, then pulls the callbacks out.
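A minimal sketch of that "register, do the task, pull the callback out" wrapping, written as a native C++ scope guard. Everything here is a stand-in: `RegisterCallback`, `LibraryDoTask`, `ScopedCallback`, `PerformTask` and `AddOne` are hypothetical names, and the single global slot is only an assumption about how the library stores its callback.

```cpp
#include <cassert>

typedef int (*Callback)(int);

// Stand-in for the library's registration slot (hypothetical).
static Callback g_callback = nullptr;

void RegisterCallback(Callback cb) { g_callback = cb; }

// Stand-in for a library routine that may call the registered callback.
int LibraryDoTask(int value) {
    return g_callback ? g_callback(value) : -1;
}

// Registers a callback on construction and restores the previous one on
// destruction, so the pointer never outlives the call that needs it.
class ScopedCallback {
public:
    explicit ScopedCallback(Callback cb) : previous_(g_callback) {
        RegisterCallback(cb);
    }
    ~ScopedCallback() { RegisterCallback(previous_); }

    ScopedCallback(const ScopedCallback&) = delete;
    ScopedCallback& operator=(const ScopedCallback&) = delete;

private:
    Callback previous_;
};

// The wrapped API: the callback is live only for the duration of this call.
int PerformTask(Callback cb, int value) {
    ScopedCallback guard(cb);
    return LibraryDoTask(value);
}

// Example callback used to exercise the wrapper.
int AddOne(int x) { return x + 1; }
```

Calling `PerformTask(AddOne, 41)` runs the task with the callback installed, and the slot is back to its previous value as soon as the call returns.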

plinth
OMG, this is way above my current knowledge! It'll take me 2 days just to figure out what all this means. I just hope there's an answer to my question in there. Maybe I should point out that I lack advanced skills in interop, and so far don't even know what the difference is between managed C++ and C++/CLI (I do know plenty of straight C++ and quite enough C#). Your effort is commendable, but now I have to put in some effort of my own to understand your post. Meanwhile, could someone post some simple solution that works at least sometimes?
Dialecticus
Now I understand the difficulties with keeping callbacks alive. My current solution is to call RegisterFunction with a delegate that is stored in a static field and points to a static function. Is this safe in the application's long run?
Dialecticus
A: 

As it turns out, the answer to the original question is rather simple once you know it, and the whole callback issue was no issue. The input buffer parameter is replaced with the parameter pair unsigned char *input, int input_length, and the output buffer parameter is replaced with the parameter pair unsigned char **output, int *output_length. The C# delegate should be something like this:

public delegate int CallbackDelegate(byte[] input, int input_length,
                                     out byte[] output, out int output_length);

and the wrapper in C++ should be something like this:

void FunctionCalledFromLib(const std::vector<unsigned char>& input, std::vector<unsigned char>& output)
{
    unsigned char *output_aux = NULL;
    int output_length = 0;

    // size() returns size_t, so cast it to the int the callback expects
    FunctionImplementedInCSharp(
        &input[0], (int)input.size(), &output_aux, &output_length);

    // copy the returned bytes into the caller's vector
    output.assign(output_aux, output_aux + output_length);

    CoTaskMemFree(output_aux); // IS THIS NECESSARY?
}

The last line is the last part of the mini-puzzle. Do I have to call CoTaskMemFree, or will the marshaller do it for me automagically?
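For reference, here is a self-contained native sketch of the same pointer/length pattern with the ownership made explicit. It does not settle what the marshaller does; it only assumes the callback returns a buffer that the wrapper must release with the deallocator matching the callback's allocator. `FakeManagedCallback` is a hypothetical stand-in for the C# side and uses malloc/free; in real interop the matching pair would be something like Marshal.AllocCoTaskMem on the managed side and CoTaskMemFree on the native side.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Callback shape matching the pointer/length pairs described above.
typedef int (*CallbackFunction)(const unsigned char* input, int input_length,
                                unsigned char** output, int* output_length);

static CallbackFunction FunctionImplementedInCSharp = nullptr;

void RegisterFunction(CallbackFunction f) { FunctionImplementedInCSharp = f; }

// Hypothetical stand-in for the managed side: returns the input reversed
// in a buffer allocated with malloc, which the caller must free.
int FakeManagedCallback(const unsigned char* input, int input_length,
                        unsigned char** output, int* output_length) {
    unsigned char* buffer = static_cast<unsigned char*>(
        std::malloc(input_length > 0 ? input_length : 1));
    if (!buffer) return -1;
    for (int i = 0; i < input_length; ++i) {
        buffer[i] = input[input_length - 1 - i];
    }
    *output = buffer;
    *output_length = input_length;
    return 0;
}

// Wrapper: copies the returned bytes into the vector, then releases the
// buffer with the deallocator that matches the callback's allocator.
bool FunctionCalledFromLib(const std::vector<unsigned char>& input,
                           std::vector<unsigned char>& output) {
    if (!FunctionImplementedInCSharp) return false;

    unsigned char* output_aux = nullptr;
    int output_length = 0;
    int status = FunctionImplementedInCSharp(
        input.data(), static_cast<int>(input.size()),
        &output_aux, &output_length);

    if (status == 0 && output_aux != nullptr && output_length >= 0) {
        output.assign(output_aux, output_aux + output_length);
    }
    std::free(output_aux);  // free(nullptr) is a no-op
    return status == 0;
}
```

The point of the sketch is the contract: the buffer crosses the boundary once, is copied into the std::vector, and is released exactly once by the side that knows which allocator produced it.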

As for the beautiful essay by plinth, I hope to bypass the whole problem by using a static function.

Dialecticus