views: 393
answers: 4

I'm currently optimizing an application; one of the operations that is done very often is reading and writing binary data. I need two types of functions:

void Set(byte[] target, int index, int value);

int Get(byte[] source, int index);

These functions are needed for signed and unsigned short, int and long, in both big- and little-endian byte order.

Here are some examples I've made, but I need an evaluation of their advantages and disadvantages:

The first method uses Marshal to write the value into the memory of the byte[], the second uses plain pointers to accomplish this, and the third uses BitConverter and Buffer.BlockCopy:

unsafe void Set(byte[] target, int index, int value)
{
    fixed (byte* p = &target[0])
    {
        Marshal.WriteInt32(new IntPtr(p), index, value);
    }
}

unsafe void Set(byte[] target, int index, int value)
{
    int* p = &value;
    for (int i = 0; i < 4; i++)
    {
        target[index + i] = *((byte*)p + i);
    }
}

void Set(byte[] target, int index, int value)
{
    byte[] data = BitConverter.GetBytes(value);
    Buffer.BlockCopy(data, 0, target, index, data.Length);
}

And here are the Read/Get methods:

The first uses Marshal to read the value from the byte[], the second uses plain pointers, and the third uses BitConverter again:

unsafe int Get(byte[] source, int index)
{
    fixed (byte* p = &source[0])
    {
        return Marshal.ReadInt32(new IntPtr(p), index);
    }
}

unsafe int Get(byte[] source, int index)
{
    fixed (byte* p = &source[0])
    {
        return *(int*)(p + index);
    }
}

unsafe int Get(byte[] source, int index)
{
    return BitConverter.ToInt32(source, index);
}

Boundary checking needs to be done, but that isn't part of my question yet...

I would be pleased if someone could tell me what the best and fastest way would be in this case, or give me some other solutions to work on. A generic solution would be preferable.


I just did some performance testing; here are the results:

Set Marshal: 45 ms, Set Pointer: 48 ms, Set BitConverter: 71 ms
Get Marshal: 45 ms, Get Pointer: 26 ms, Get BitConverter: 30 ms

It seems that using pointers is the fastest way, but I think Marshal and BitConverter do some internal checking... can someone verify this?

+1  A: 

You should do some profiling on your code to reveal whether this is the bottleneck. Also, looking at your code, it appears that you are using .NET function calls to write a single value to a byte array, involving a pin on the memory and a call to unsafe code...

You might be much better off declaring a .NET System.IO.MemoryStream and seeking and writing to it, wherever possible using a stream writer to push your changes in, which should involve fewer function calls and won't require unsafe code. You'll find the pointer stuff much more useful in C# if you are doing things like DSP, where you need to perform a single operation on every value in an array, etc.
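
For illustration, a minimal sketch of that idea, assuming little-endian data (BinaryWriter and BinaryReader always use little-endian byte order); the class and buffer size are just for the example:

using System;
using System.IO;

static class StreamSketch
{
    static void Main()
    {
        // Wrap an existing buffer; the stream is writable but fixed-size.
        byte[] buffer = new byte[1024];
        var stream = new MemoryStream(buffer);
        var writer = new BinaryWriter(stream);
        var reader = new BinaryReader(stream);

        stream.Position = 100;      // seek to the target offset
        writer.Write(512);          // writes an Int32 (little-endian)

        stream.Position = 100;
        Console.WriteLine(reader.ReadInt32());   // 512
    }
}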

EDIT: Let me also mention that, depending on what you are doing, you might find that CPU caching comes into effect; if you can keep working on a single small area of memory that fits into the cache, you will end up with the best performance.

Spence
The problem is that it can be a bottleneck, because the application communicates with a load of different network devices and runs on low-cost machines; some of these devices use heavy protocols, others don't. Do you know a good way to profile the interfaces? The problem will be the variable delay of the network...
haze4real
+5  A: 

Important: if you only need the one endian, see the pointer magic by wj32 / dtb


Personally, I would be writing directly to a Stream (perhaps with some buffering), and re-using a shared buffer that I can generally assume is clean. Then you can make some shortcuts and assume index 0/1/2/3.

Certainly don't use BitConverter, as that can't be used for both little- and big-endian, which you require. I would also be inclined to just use bit-shifting rather than unsafe etc. It is actually the fastest, based on the following (so I'm glad that this is how I already do it in my code here; look for EncodeInt32Fixed):

Set1: 371ms
Set2: 171ms
Set3: 993ms
Set4: 91ms <==== bit-shifting ;-p

Code:

using System;
using System.Diagnostics;
using System.Runtime.InteropServices;
static class Program
{
    static void Main()
    {
        const int LOOP = 10000000, INDEX = 100, VALUE = 512;
        byte[] buffer = new byte[1024];
        Stopwatch watch;

        watch = Stopwatch.StartNew();
        for (int i = 0; i < LOOP; i++)
        {
            Set1(buffer, INDEX, VALUE);
        }
        watch.Stop();
        Console.WriteLine("Set1: " + watch.ElapsedMilliseconds + "ms");

        watch = Stopwatch.StartNew();
        for (int i = 0; i < LOOP; i++)
        {
            Set2(buffer, INDEX, VALUE);
        }
        watch.Stop();
        Console.WriteLine("Set2: " + watch.ElapsedMilliseconds + "ms");

        watch = Stopwatch.StartNew();
        for (int i = 0; i < LOOP; i++)
        {
            Set3(buffer, INDEX, VALUE);
        }
        watch.Stop();
        Console.WriteLine("Set3: " + watch.ElapsedMilliseconds + "ms");

        watch = Stopwatch.StartNew();
        for (int i = 0; i < LOOP; i++)
        {
            Set4(buffer, INDEX, VALUE);
        }
        watch.Stop();
        Console.WriteLine("Set4: " + watch.ElapsedMilliseconds + "ms");

        Console.WriteLine("done");
        Console.ReadLine();
    }
    unsafe static void Set1(byte[] target, int index, int value)
    {
        fixed (byte* p = &target[0])
        {
            Marshal.WriteInt32(new IntPtr(p), index, value);
        }
    }

    unsafe static void Set2(byte[] target, int index, int value)
    {
        int* p = &value;
        for (int i = 0; i < 4; i++)
        {
            target[index + i] = *((byte*)p + i);
        }
    }

    static void Set3(byte[] target, int index, int value)
    {
        byte[] data = BitConverter.GetBytes(value);
        Buffer.BlockCopy(data, 0, target, index, data.Length);
    }
    static void Set4(byte[] target, int index, int value)
    {
        target[index++] = (byte)value;
        target[index++] = (byte)(value >> 8);
        target[index++] = (byte)(value >> 16);
        target[index] = (byte)(value >> 24);
    }
}
Marc Gravell
I don't think Stream would be a good solution; the problem is that seeks may be needed and the data is not always read and written sequentially. Another problem would be endianness...
haze4real
Fine - don't use the stream; but see `Set4` and the performance evidence...
Marc Gravell
I need to verify this first and may accept this answer as the solution. What about getting/reading?
haze4real
Same, but in reverse; `return ((int)buffer[index++]) | (((int)buffer[index++]) << 8) | (((int)buffer[index++]) << 16) | (((int)buffer[index]) << 24);` (or switch the shifts top-to-bottom to get other-endian). Note that we cast to int early, as int arithmetic is faster than byte arithmetic.
Marc Gravell
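
Written out in full, the reverse from that comment might look like the following (the big-endian variant is added here as an assumption, mirroring the swapped shifts mentioned above):

// Little-endian read, mirroring Set4; the bytes are promoted to int
// before the shifts are applied.
static int GetInt32LittleEndian(byte[] buffer, int index)
{
    return buffer[index]
        | (buffer[index + 1] << 8)
        | (buffer[index + 2] << 16)
        | (buffer[index + 3] << 24);
}

// Big-endian variant: the same shifts applied top-to-bottom in reverse.
static int GetInt32BigEndian(byte[] buffer, int index)
{
    return (buffer[index] << 24)
        | (buffer[index + 1] << 16)
        | (buffer[index + 2] << 8)
        | buffer[index + 3];
}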
I think the shifting solution is the best at this time; the big advantage would be the ease of endian swapping.
haze4real
+2  A: 

Pointers are the way to go. Pinning objects with the fixed keyword is extremely cheap, and you avoid the overhead of calling functions like WriteInt32 and BlockCopy. For a "generic solution" you can simply use void* and your own memcpy (since you're dealing with small amounts of data). However, pointers do not work with true generics.
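
As a rough illustration of the void*/memcpy idea (the method names are illustrative, not from the answer; native byte order only):

// Copies count bytes from an arbitrary value (via void*) into the target
// array; a tiny hand-rolled memcpy for small, fixed-size values.
unsafe static void CopyBytes(void* source, byte[] target, int index, int count)
{
    byte* src = (byte*)source;
    fixed (byte* dst = &target[index])
    {
        for (int i = 0; i < count; i++)
        {
            dst[i] = src[i];
        }
    }
}

// Example: write a ulong in native (little-endian on x86) byte order.
unsafe static void SetUInt64(byte[] target, int index, ulong value)
{
    CopyBytes(&value, target, index, sizeof(ulong));
}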

wj32
Then you clearly didn't write your code properly. Do you really think using bit-shifting (around 8 instructions for each Int32 at *least*) is faster than using a simple mov instruction? And I'm talking about pinning the buffer outside of the loop.
wj32
Can you give me an example of this in C#? You're talking about the second Set method, aren't you? I just need the plain value types: Int16, UInt16, Int32, UInt32, Int64, UInt64.
haze4real
`fixed (byte* b = array) { for (...) { *(int*)(b + offset) = value; } }`. Exactly what you have in your Get method. Look at the Disassembly window if you don't believe that this is in fact the fastest method.
wj32
Ah, right; I had the wrong end of the stick. But the requirement is for both-endian. So you'd have to have a fallback for the other, and some mechanism of abstracting the two. I'd keep it simple and write the shifts. It also raises the question "Do I bother checking/supporting big-endian hardware" etc.
Marc Gravell
It shouldn't need a `for` loop, though (from comment)... unless I missed something? Just get the b + offset, swap to `int*` and assign?
Marc Gravell
Sadly C# doesn't offer a way to utilise the bswap instruction for big/little-endian compatibility, so in that case your solution would be the fastest. As for the for loop, it was meant to indicate that the pinning would be done outside of any loops.
wj32
Cool; I've added an update, but (as you note) endianness is still the problem here. I'm out of votes already today, but a virtual +1
Marc Gravell
Yes, big endian is needed a lot of the time; most of the devices we communicate with use big-endian order.
haze4real
+4  A: 

Using Marc Gravell's Set1 to Set4 and the Set5 below, I get the following numbers on my machine:

Set1: 197ms
Set2: 102ms
Set3: 604ms
Set4: 68ms
Set5: 55ms <==== pointer magic ;-p

Code:

unsafe static void Set5(byte[] target, int index, int value)
{
    fixed (byte* p = &target[index])
    {
        *((int*)p) = value;                
    }
}

Of course, it gets much faster when the byte array isn't pinned on each iteration but only once:

Set6: 10ms (little endian)
Set7: 85ms (big endian)

Code:

if (!BitConverter.IsLittleEndian)
{
    throw new NotSupportedException();
}

watch = Stopwatch.StartNew();
fixed (byte* p = buffer)
{
    for (int i = 0; i < LOOP; i++)
    {
        *((int*)(p + INDEX)) = VALUE;
    }
}
watch.Stop();
Console.WriteLine("Set6: " + watch.ElapsedMilliseconds + "ms");

watch = Stopwatch.StartNew();
fixed (byte* p = buffer)
{
    for (int i = 0; i < LOOP; i++)
    {
        *((int*)(p + INDEX)) = System.Net.IPAddress.HostToNetworkOrder(VALUE);
    }
}
watch.Stop();
Console.WriteLine("Set7: " + watch.ElapsedMilliseconds + "ms");
dtb
The problem here would be the overhead generated by endian swapping.
haze4real
Cool; I've added an update, but endianness is still a problem here. I'm out of votes already today, but a virtual +1
Marc Gravell
@haze4real: the overhead produced by endian swapping is actually not that huge. Example updated.
dtb
:/ The overhead is huge; it's about 8 times slower in your test. You shouldn't pin the byte array outside of the loop in your test, because it's the function that needs to be tested... I can't pin it between these function calls.
haze4real
True, but Set7 is still only a few milliseconds slower than Set4. My recommendation would be Marc's bit-shifting solution.
dtb
Yes, but only because you pinned the array outside the loop, which just simulates a number of function calls and isn't part of the solution, because the function itself gets called at different places in the code.
haze4real
Right, you can only pin the array outside the loop if you want to modify the same array multiple times in a row. If this isn't the case in your application, you can't do this. But if you can, the performance gain is obvious.
dtb
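
For completeness, if the same buffer really can stay pinned across many calls (for example, for the lifetime of a connection), something like the following sketch would avoid re-pinning on every Set/Get. The GCHandle-based wrapper is purely illustrative and not from the answers above; it does no bounds checking and uses native byte order only.

using System;
using System.Runtime.InteropServices;

// Pins the buffer once and keeps a raw pointer; must be disposed to
// release the pin.
unsafe sealed class PinnedBuffer : IDisposable
{
    private GCHandle handle;
    private byte* ptr;

    public PinnedBuffer(byte[] buffer)
    {
        handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        ptr = (byte*)handle.AddrOfPinnedObject();
    }

    public void SetInt32(int index, int value)
    {
        *(int*)(ptr + index) = value;
    }

    public int GetInt32(int index)
    {
        return *(int*)(ptr + index);
    }

    public void Dispose()
    {
        handle.Free();
    }
}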