I am allocating some unmanaged memory in my application via Marshal.AllocHGlobal. I'm then copying a set of bytes to this location and converting the resulting segment of memory to a struct before freeing the memory again via Marshal.FreeHGlobal.

Here's the method:

public static T Deserialize<T>(byte[] messageBytes, int start, int length)
    where T : struct
{
    if (start + length > messageBytes.Length)
        throw new ArgumentOutOfRangeException();

    int typeSize = Marshal.SizeOf(typeof(T));
    int bytesToCopy = Math.Min(typeSize, length);

    IntPtr targetBytes = Marshal.AllocHGlobal(typeSize);
    Marshal.Copy(messageBytes, start, targetBytes, bytesToCopy);

    if (length < typeSize)
    {
        // Zero out additional bytes at the end of the struct
    }

    T item = (T)Marshal.PtrToStructure(targetBytes, typeof(T));
    Marshal.FreeHGlobal(targetBytes);
    return item;
}

This works for the most part. However, if I supply fewer bytes than the struct requires, the last fields end up with 'random' values, because the memory returned by Marshal.AllocHGlobal is not zeroed (I am using LayoutKind.Sequential on the target struct). I'd like to zero out these hanging fields as efficiently as possible.

For context, this code is deserializing high-frequency multicast messages sent from C++ on Linux.

Here is a failing test case:

// Give only one byte, which is too few for the struct
var s3 = MessageSerializer.Deserialize<S3>(new[] { (byte)0x21 });
Assert.AreEqual(0x21, s3.Byte);
Assert.AreEqual(0x0000, s3.Int); // hanging field should be zero, but isn't

[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi, Pack = 1)]
private struct S3
{
    public byte Byte;
    public int Int;
}

Running this test repeatedly causes the second assert to fail with a different value each time.


EDIT

In the end, I used leppie's suggestion of going unsafe and using stackalloc. This allocated a byte array that was zeroed as needed, and improved throughput by between 50% and 100%, depending upon the message size (larger messages saw greater benefit).

The final method ended up resembling:

public static T Deserialize<T>(byte[] messageBytes, int startIndex, int length)
    where T : struct
{
    if (length <= 0)
        throw new ArgumentOutOfRangeException("length", length, "Must be greater than zero.");
    if (startIndex < 0)
        throw new ArgumentOutOfRangeException("startIndex", startIndex, "Must be greater than or equal to zero.");
    if (startIndex + length > messageBytes.Length)
        throw new ArgumentOutOfRangeException("length", length, "startIndex + length must be <= messageBytes.Length");

    int typeSize = Marshal.SizeOf(typeof(T));
    unsafe
    {
        byte* basePtr = stackalloc byte[typeSize];
        byte* b = basePtr;
        int end = startIndex + Math.Min(length, typeSize);
        for (int srcPos = startIndex; srcPos < end; srcPos++)
            *b++ = messageBytes[srcPos];
        return (T)Marshal.PtrToStructure(new IntPtr(basePtr), typeof(T));
    }   
}

Unfortunately this still requires a call to Marshal.PtrToStructure to convert the bytes into the target type.
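For readers on a newer runtime, here is a minimal sketch of one way to avoid the Marshal.PtrToStructure call altogether. It is not the method used above. It assumes the System.Span and MemoryMarshal APIs are available (.NET Core 2.1 or later, which postdates this question) and that T contains no reference-type fields. It copies into the struct's in-memory layout rather than its marshalled layout, so it only matches Marshal.PtrToStructure for blittable structs (for example, a char field under CharSet.Ansi would differ). The class name and the SampleMessage struct are hypothetical. Because the struct starts out as default(T), any bytes not supplied stay zero, and the code does not rely on stackalloc memory being zeroed, which the C# specification leaves undefined.

```csharp
using System;
using System.Runtime.InteropServices;

public static class SpanDeserializerSketch
{
    // Hypothetical struct mirroring S3 from the question.
    [StructLayout(LayoutKind.Sequential, Pack = 1)]
    public struct SampleMessage
    {
        public byte Byte;
        public int Int;
    }

    public static T Deserialize<T>(byte[] messageBytes, int startIndex, int length)
        where T : struct
    {
        if (messageBytes == null)
            throw new ArgumentNullException(nameof(messageBytes));
        if (length <= 0)
            throw new ArgumentOutOfRangeException(nameof(length));
        if (startIndex < 0 || startIndex > messageBytes.Length - length)
            throw new ArgumentOutOfRangeException(nameof(startIndex));

        // default(T) is fully zeroed, so fields we never write stay zero.
        T item = default(T);

        // View the struct's own storage as bytes.
        // AsBytes throws ArgumentException if T contains references.
        Span<byte> itemBytes = MemoryMarshal.AsBytes(MemoryMarshal.CreateSpan(ref item, 1));

        // Copy at most sizeof(T) bytes; extra message bytes are ignored.
        int bytesToCopy = Math.Min(length, itemBytes.Length);
        new ReadOnlySpan<byte>(messageBytes, startIndex, bytesToCopy).CopyTo(itemBytes);
        return item;
    }
}
```

Calling SpanDeserializerSketch.Deserialize<SpanDeserializerSketch.SampleMessage>(new byte[] { 0x21 }, 0, 1) should yield Byte == 0x21 and Int == 0.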

+1  A: 

I've never done this stuff in C# before, but I found Marshal.WriteByte(IntPtr, Int32, Byte) in MSDN. Try that out.

Jon Seigel
+2  A: 

Why not just check whether start + length is within typeSize?

BTW: I would just go unsafe here and use a for loop to zero out the additional memory.

That will also give you the benefit of using stackalloc, which is much safer and faster than AllocHGlobal.

leppie
@leppie -- thanks for the useful info. I'll check out `stackalloc` too. I have to cater for differing message sizes as the two teams can occasionally manage to avoid synchronised releases if we add fields on the end that the other end ignores. Similarly, if you don't require values, you can expect them and get zeroes instead, which is the case I'm trying to achieve here.
Drew Noakes
@leppie, I'm leaning towards this approach. Could you go into some more detail as to _why_ using `stackalloc` is safer and faster? Once I have the `byte*`, what would be the best way to copy into it?
Drew Noakes
I've put together a version that works with `stackalloc` to populate an array on the stack. I don't think it's possible to get around the call to `Marshal.PtrToStructure` though, is it?
Drew Noakes
@Drew: Nah, I also didn't realize generics sucked so hard when going unsafe :( If your types are known, you could generate all the 'templates'. That would keep it fast.
leppie
Unfortunately this is a generic API that will deal with unknown and varied types (though all of fixed size.)
Drew Noakes
+1  A: 

Yes, as Jon Seigel said, you can zero it out using Marshal.WriteByte.

In the following example, I zero out the buffer before copying the struct.

if (start + length > messageBytes.Length)
    throw new ArgumentOutOfRangeException();
int typeSize = Marshal.SizeOf(typeof(T));
int bytesToCopy = Math.Min(typeSize, length);
IntPtr targetBytes = Marshal.AllocHGlobal(typeSize);
// Zero out the buffer
for (int i = 0; i < typeSize; i++)
{
    Marshal.WriteByte(targetBytes, i, 0);
}
Marshal.Copy(messageBytes, start, targetBytes, bytesToCopy);
hjb417
Each call to Marshal.WriteByte will cause a transition between managed and native code and back, which has a certain overhead. Doing that in a loop can get inefficient. If you want to stick to the Marshal class, I'd try this instead: Marshal.Copy(new byte[typeSize], 0, targetBytes, typeSize)
Mattias S
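As a rough sketch of the Marshal.Copy suggestion in the comment above, the whole method might look like the following. The class name and the SampleMessage struct are hypothetical, and the try/finally is an addition not present in the question's code; it is there so the buffer is freed even if PtrToStructure throws.

```csharp
using System;
using System.Runtime.InteropServices;

public static class MessageSerializerSketch
{
    // Hypothetical struct mirroring S3 from the question.
    [StructLayout(LayoutKind.Sequential, Pack = 1)]
    public struct SampleMessage
    {
        public byte Byte;
        public int Int;
    }

    public static T Deserialize<T>(byte[] messageBytes, int start, int length)
        where T : struct
    {
        if (messageBytes == null)
            throw new ArgumentNullException("messageBytes");
        if (start < 0 || length < 0 || start > messageBytes.Length - length)
            throw new ArgumentOutOfRangeException();

        int typeSize = Marshal.SizeOf(typeof(T));
        int bytesToCopy = Math.Min(typeSize, length);

        IntPtr targetBytes = Marshal.AllocHGlobal(typeSize);
        try
        {
            // A newly allocated managed array is zero-filled, so a single
            // bulk copy clears the entire unmanaged buffer.
            Marshal.Copy(new byte[typeSize], 0, targetBytes, typeSize);

            // Overwrite the front of the buffer with the message bytes;
            // whatever is not overwritten stays zero.
            Marshal.Copy(messageBytes, start, targetBytes, bytesToCopy);

            return (T)Marshal.PtrToStructure(targetBytes, typeof(T));
        }
        finally
        {
            Marshal.FreeHGlobal(targetBytes);
        }
    }
}
```

This trades one short-lived managed allocation per call for a single bulk copy instead of typeSize separate Marshal.WriteByte calls.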
The other alternative I was thinking of was P/Invoke the LocalAlloc function and pass in the LPTR flag.
hjb417
A: 
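// The native length parameter is a SIZE_T (pointer-sized), so it is declared as UIntPtr rather than int
[DllImport("kernel32.dll")]
static extern void RtlZeroMemory(IntPtr dst, UIntPtr length);
...
RtlZeroMemory(targetBytes, new UIntPtr((uint)typeSize));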
Mattias S
The docs say it's a macro.
hjb417
Dumpbin.exe on kernel32.dll says it isn't just a macro.
Mattias S
@MattiasS -- I need to zero out at `dst + N`. `IntPtr` doesn't support arithmetic, so how can I address this offset?
Drew Noakes
Can't you simply zero out the entire buffer before the Marshal.Copy call? That way, whatever part you don't overwrite with the struct will remain zero. You can do arithmetic on the pointer value if you cast it to a long and then back to IntPtr.
Mattias S
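To illustrate the pointer arithmetic described in the last comment, here is a minimal sketch that zeroes only the bytes after an offset. ZeroRange and its class are hypothetical names. On .NET 4 and later, IntPtr.Add(buffer, offset) is an equivalent way to compute the offset pointer.

```csharp
using System;
using System.Runtime.InteropServices;

public static class TailZeroingSketch
{
    // Hypothetical helper: zeroes `count` bytes starting `offset` bytes past `buffer`.
    public static void ZeroRange(IntPtr buffer, int offset, int count)
    {
        if (count <= 0)
            return;

        // Round-trip through long to offset the pointer.
        IntPtr tail = new IntPtr(buffer.ToInt64() + offset);

        // A new managed byte array is zero-filled, so copying it clears the range.
        Marshal.Copy(new byte[count], 0, tail, count);
    }
}
```

In the question's method this would be called as ZeroRange(targetBytes, bytesToCopy, typeSize - bytesToCopy) after the Marshal.Copy of the message bytes.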