In my C# code I'm trying to fetch an array of structures from a legacy C++ DLL (the code I cannot change).

In that C++ code, the structure is defined like this:

struct MyStruct
{
    char* id;
    char* description;
};

The method that I'm calling (get_my_structures) returns a pointer to an array of MyStruct structures:

MyStruct* get_my_structures()
{
    ...
}

There is another method that returns the number of structures, so I do know how many structures get returned.

In my C# code, I have defined MyStruct like this:

[StructLayout(LayoutKind.Sequential)]  
public class MyStruct
{
  [MarshalAsAttribute(UnmanagedType.LPStr)]    // <-- also tried without this
  private string _id;
  [MarshalAsAttribute(UnmanagedType.LPStr)]
  private string _description;
}

The interop signature looks like this:

[DllImport("legacy.dll", EntryPoint="get_my_structures")]
public static extern IntPtr GetMyStructures();

Finally, the code that fetches the array of MyStruct structures looks like this:

int structuresCount = ...;
IntPtr myStructs = GetMyStructures();
int structSize = Marshal.SizeOf(typeof(MyStruct));    // <- returns 8 in my case
for (int i = 0; i < structuresCount; i++)
{
    IntPtr data = new IntPtr(myStructs.ToInt64() + structSize * i);
    MyStruct ms = (MyStruct) Marshal.PtrToStructure(data, typeof(MyStruct));
    ...
}

The trouble is, only the very first structure (one at the offset zero) gets marshaled correctly. Subsequent ones have bogus values in _id and _description members. The values are not completely trashed, or so it seems: they are strings from some other memory locations. The code itself does not crash.
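For reference, here is a minimal C++ sketch of what the address arithmetic in the loop above amounts to on the native side: step from the base address by the structure size, then follow each pointer member to a null-terminated string. The names `read_structures` and `IdAndDescription` are hypothetical, and this only illustrates the stride logic; it is not how the CLR marshaller is actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct MyStruct
{
    char* id;
    char* description;
};

// Hypothetical result type: copies of the two strings of one structure.
using IdAndDescription = std::pair<std::string, std::string>;

// Hypothetical helper: reads `count` structures starting at `base`,
// advancing by sizeof(MyStruct) bytes per element, just as the C# loop
// advances by Marshal.SizeOf(typeof(MyStruct)).
std::vector<IdAndDescription> read_structures(const void* base, std::size_t count)
{
    assert(base != nullptr || count == 0);
    const unsigned char* bytes = static_cast<const unsigned char*>(base);
    std::vector<IdAndDescription> result;
    for (std::size_t i = 0; i < count; ++i)
    {
        // Address of element i is base + i * element size.
        const MyStruct* ms =
            reinterpret_cast<const MyStruct*>(bytes + i * sizeof(MyStruct));
        // Each member points at a null-terminated string stored elsewhere;
        // copy both strings out.
        result.emplace_back(ms->id, ms->description);
    }
    return result;
}
```

If a loop like this reads correct strings for every element when called from native code in the same process, the stride itself is probably not the culprit.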

I have verified that the C++ code in get_my_structures() does return correct data. The data is not accidentally deleted or modified during or after the call.

Viewed in a debugger, C++ memory layout of the returned data looks like this:

0: id (char*)           <---- [MyStruct 1]
4: description (char*)
8: id (char*)           <---- [MyStruct 2]
12: description (char*)
16: id (char*)          <---- [MyStruct 3]
...
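
The layout above can also be pinned down at compile time. The following is a small sketch, assuming a typical ABI where two adjacent pointers need no padding; `element_offset` is a hypothetical helper, not part of the legacy DLL.

```cpp
#include <cstddef>

struct MyStruct
{
    char* id;
    char* description;
};

// Two pointers back to back: no padding expected between or after them
// (an assumption that holds on mainstream ABIs, not a language guarantee).
static_assert(offsetof(MyStruct, id) == 0,
              "id should be the first member");
static_assert(offsetof(MyStruct, description) == sizeof(char*),
              "description should directly follow id");
static_assert(sizeof(MyStruct) == 2 * sizeof(char*),
              "no trailing padding expected");

// Hypothetical helper: byte offset of element `index` in an array of MyStruct.
constexpr std::size_t element_offset(std::size_t index)
{
    return index * sizeof(MyStruct);
}
```

In a 32-bit build, where pointers are 4 bytes, this gives element offsets of 0, 8 and 16, matching the debugger view; a 64-bit build would give 0, 16 and 32.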

[Update 18/11/2009]

Here is how the C++ code prepares these structures (the actual code is much uglier, but this is a close enough approximation):

static char buffer[12345] = {0};
MyStruct* myStructs = (MyStruct*) &buffer;
for (int i = 0; i < structuresCount; i++)
{
    MyStruct* ms = <some other permanent address where the struct is>;
    myStructs[i].id = (char*) ms->id;
    myStructs[i].description = (char*) ms->description;
}
return myStructs;

Admittedly, the code above does some ugly casting and copies raw pointers around, but it still does seem to do that correctly. At least that's what I see in the debugger: the above (static) buffer does contain all these naked char* pointers stored one after another, and they point to valid (non-local) locations in memory.
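
To experiment with this outside the legacy code base, the approximation above can be turned into a self-contained sketch. Everything except `MyStruct` and the general shape of the loop is an assumption: the `permanent` table and its strings stand in for "some other permanent address", and the count is passed as a parameter for convenience. The buffer is declared as an aligned `unsigned char` array so that the cast to `MyStruct*` is well-defined, which the original `char` buffer does not guarantee.

```cpp
#include <cassert>
#include <cstddef>

struct MyStruct
{
    char* id;
    char* description;
};

// Hypothetical stand-ins for the permanent structures the real code reads.
static char id_a[] = "id1";
static char desc_a[] = "desc1";
static char id_b[] = "id2";
static char desc_b[] = "desc2";
static char id_c[] = "id3";
static char desc_c[] = "desc3";
static MyStruct permanent[] = {
    { id_a, desc_a }, { id_b, desc_b }, { id_c, desc_c }
};

// Static storage, so the returned pointer stays valid after the call.
// alignas makes sure the bytes are suitably aligned for MyStruct.
alignas(MyStruct) static unsigned char buffer[12345] = {0};

MyStruct* get_my_structures(int structuresCount)
{
    const std::size_t available = sizeof(permanent) / sizeof(permanent[0]);
    assert(structuresCount >= 0 &&
           static_cast<std::size_t>(structuresCount) <= available);

    MyStruct* myStructs = reinterpret_cast<MyStruct*>(buffer);
    for (int i = 0; i < structuresCount; i++)
    {
        const MyStruct* ms = &permanent[i];
        // Only the raw pointers are copied; the strings stay where they are.
        myStructs[i].id = ms->id;
        myStructs[i].description = ms->description;
    }
    return myStructs;
}
```

Because only pointers are copied, the result is valid exactly as long as the strings they point to remain alive and unmodified.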

Pavel's example shows that this is really the only place where things can go wrong. I will try to analyze what happens with those 'end' locations where the strings really are, not the locations where the pointers get stored.

A: 

You have to use UnmanagedType.LPTStr for char*. A StringBuilder is also recommended for a non-const char*, along with a CharSet specification:

[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Auto)]
public class MyStruct
{
  [MarshalAsAttribute(UnmanagedType.LPTStr)]
  private StringBuilder _id;
  [MarshalAsAttribute(UnmanagedType.LPTStr)]
  private StringBuilder _description;
}

As for the DllImport declaration, have you tried

[DllImport("legacy.dll", EntryPoint="get_my_structures")]
[return: MarshalAs(UnmanagedType.LPArray)]
public static extern MyStruct[] GetMyStructures();

?

Also, if the previous doesn't work, leave it at IntPtr and try to marshal the returned structs like this:

for (int i = 0; i < structuresCount; i++)
{
    MyStruct ms = (MyStruct) Marshal.PtrToStructure(myStructs, typeof(MyStruct));
    ...
    myStructs += Marshal.SizeOf(ms);
}
fretje
If I try to use a StringBuilder I get an ArgumentException when I try to do this: int itemSize = Marshal.SizeOf(typeof(MyStruct)); The error message is "Type 'MyStruct' cannot be marshaled as an unmanaged structure; no meaningful size or offset can be computed".
vladimir
@vladimir: are you using `UnmanagedType.LPTStr`? Also: try specifying the `CharSet` like the others suggested.
fretje
"Strings are valid members of structures; however, StringBuilder buffers are invalid in structures." -> http://msdn.microsoft.com/en-us/library/s9ts558h.aspx#Mtps_DropDownFilterText
hjb417
@hjb417: I stand corrected... It seems StringBuilder can only be used in DllImport statements of functions that take a char* as parameter.
fretje
@Vladimir: I guess the problem is not with the declaration of your struct, but with the way you convert the returned IntPtr to your individual structs.
fretje
A: 

I usually end up working these things out by trial and error. Make sure you have the CharSet property set on your StructLayout. I would also try UnmanagedType.LPTStr; it seems to work better for char*, even though I am not sure why.

[StructLayout(LayoutKind.Sequential, CharSet=CharSet.Auto)]  
public class MyStruct
{
    [MarshalAsAttribute(UnmanagedType.LPTStr)]
    private string _id;
    [MarshalAsAttribute(UnmanagedType.LPTStr)]
    private string _description;
}
Mark Heath
A: 

I think, in addition to the answers given, that you need to supply the length as well, i.e. [MarshalAsAttribute(UnmanagedType.LPTStr, SizeConst = , ArraySubType = System.Runtime.InteropServices.UnmanagedType.AnsiBStr)]

Getting this right takes some trial and error. Another thing to consider: for WinAPI calls that expect a string parameter (usually a ref parameter), it might be worth your while to try the StringBuilder class as well. Nothing else comes to mind beyond the points I have mentioned here. Hope this helps, Tom

tommieb75
char* is null terminated, so I wouldn't supply the length. That's why we have UnmanagedType.LPWStr and UnmanagedType.LPTStr. If you were to supply the length, you'd also have to make it really big as to not truncate any data. Not an elegant solution.
ParmesanCodice
A: 

I would change the structure. Instead of strings, use IntPtr:

[StructLayout(LayoutKind.Sequential)]  
public class MyStruct
{
  private IntPtr _id;
  private IntPtr _description;
}

Then each value of the C# array can be manually marshalled to a string using one of the Marshal.PtrToString* methods (for example Marshal.PtrToStringAnsi for a char*), taking the charset into account.

rossoft
Sounds like a viable approach, but unfortunately it doesn't seem to address the issue I have. Even with IntPtr members, those members still have the same (invalid) values in all structures other than the very first one.
vladimir
A: 

I cannot reproduce your problem, which leads me to suspect that it's really on the C++ side of things. Here's the complete source code for my attempt.

dll.cpp - compile with cl.exe /LD:

extern "C" {

struct MyStruct
{
    char* id;
    char* description;
};

__declspec(dllexport)
MyStruct* __stdcall get_my_structures()
{
    static MyStruct a[] =
    {
        { "id1", "desc1" },
        { "id2", "desc2" },
        { "id3", "desc3" }
    };
    return a;

}

}

test.cs - compile with csc.exe /platform:x86:

using System;
using System.Runtime.InteropServices;


[StructLayout(LayoutKind.Sequential)]  
public class MyStruct
{
  [MarshalAsAttribute(UnmanagedType.LPStr)]
  public string _id;
  [MarshalAsAttribute(UnmanagedType.LPStr)]
  public string _description;
}


class Program
{
    [DllImport("dll")]
    static extern IntPtr get_my_structures();

    static void Main()
    {
        int structSize = Marshal.SizeOf(typeof(MyStruct));
        Console.WriteLine(structSize);

        IntPtr myStructs = get_my_structures();
        for (int i = 0; i < 3; ++i)
        {
            IntPtr data = new IntPtr(myStructs.ToInt64() + structSize * i);
            MyStruct ms = (MyStruct) Marshal.PtrToStructure(data, typeof(MyStruct));

            Console.WriteLine();
            Console.WriteLine(ms._id);
            Console.WriteLine(ms._description);
        }
    }
}

This correctly prints out all 3 structs.

Can you show your C++ code that fills the structs? The fact that you can call it from C++ directly and get correct results does not necessarily mean it's correct. For example, you could be returning a pointer to a stack-allocated struct. When doing a direct call, then, you'd get a technically invalid pointer, but the data would likely remain preserved. When doing P/Invoke marshalling, the stack could be overwritten by P/Invoke data structures by the point it tries to read values from there.
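
To make the stack-allocation scenario concrete, here is a small sketch contrasting the two cases. The function names, the string contents and the checked accessor are hypothetical; the broken variant is left commented out because calling it and reading the result would be undefined behaviour.

```cpp
#include <cassert>
#include <cstddef>

struct MyStruct
{
    char* id;
    char* description;
};

// Hypothetical string storage with static lifetime.
static char id1[] = "id1";
static char desc1[] = "desc1";
static char id2[] = "id2";
static char desc2[] = "desc2";

// BROKEN (for contrast only): `local` lives in this function's stack frame,
// so the returned pointer dangles as soon as the function returns. A direct
// C++ caller may still see intact data by luck; later calls can overwrite it.
//
// MyStruct* get_my_structures_broken()
// {
//     MyStruct local[] = { { id1, desc1 }, { id2, desc2 } };
//     return local;
// }

// OK: the table has static storage duration and outlives every call.
MyStruct* get_my_structures_static()
{
    static MyStruct table[] = { { id1, desc1 }, { id2, desc2 } };
    return table;
}

// Hypothetical checked accessor over the two-element static table.
const MyStruct& structure_at(std::size_t index)
{
    assert(index < 2);
    return get_my_structures_static()[index];
}
```

The same reasoning applies one level down: even when the array of structures is static, the `char*` members must point at storage that is still alive when the managed side reads it.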

Pavel Minaev
I have updated my question, adding the C++ code. Since you have shown that there's nothing wrong with the interop part itself (nor on the C# side), it is indeed only logical to conclude that it must be the C++ code.
vladimir