ansaurus

Question

Best way to combine 2 or more byte arrays in C#

Answer 1

+15 A:

If you simply need a new byte array, then use the following:

byte[] Combine(byte[] a1, byte[] a2, byte[] a3)
{
    byte[] ret = new byte[a1.Length + a2.Length + a3.Length];
    Array.Copy(a1, 0, ret, 0, a1.Length);
    Array.Copy(a2, 0, ret, a1.Length, a2.Length);
    Array.Copy(a3, 0, ret, a1.Length + a2.Length, a3.Length);
    return ret;
}

Alternatively, if you just need a single IEnumerable<byte>, consider using a C# 2.0 iterator block (yield return):

IEnumerable<byte> Combine(byte[] a1, byte[] a2, byte[] a3)
{
    foreach (byte b in a1)
        yield return b;
    foreach (byte b in a2)
        yield return b;
    foreach (byte b in a3)
        yield return b;
}

FryGuy 2009-01-06 03:03:12

I've done something similar to your 2nd option to merge large streams, worked like a charm. :)

Greg D 2009-01-06 03:21:38

The second option is great. +1.

Martinho Fernandes 2009-01-06 04:28:28

Answer 2

+6 A:

To add to the response where one might need an IEnumerable, if you are able to use LINQ, then you can just use the Concat method:

IEnumerable<byte> arrays = array1.Concat(array2).Concat(array3);
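
If a byte[] is needed in the end, the lazy sequence can be materialized with ToArray. Here is a minimal sketch, assuming .NET 3.5 or later with System.Linq available; the class name ConcatExample and both method names are hypothetical, chosen only for illustration:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ConcatExample
{
    // Lazily chains the three arrays; nothing is copied until enumeration.
    public static IEnumerable<byte> CombineLazy(byte[] a1, byte[] a2, byte[] a3)
    {
        return a1.Concat(a2).Concat(a3);
    }

    // Enumerates the chained sequence once and copies it into a new array.
    public static byte[] CombineToArray(byte[] a1, byte[] a2, byte[] a3)
    {
        return CombineLazy(a1, a2, a3).ToArray();
    }
}
```

Note that ToArray has to enumerate every element, so this gives up the laziness that makes Concat cheap to create.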

casperOne 2009-01-06 03:11:29

Answer 3

+17 A:

For primitive types (including bytes), use System.Buffer.BlockCopy instead of System.Array.Copy. It's faster.
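
One thing to watch with non-byte primitive arrays: the offsets and count passed to Buffer.BlockCopy are measured in bytes, not elements. A small sketch for int arrays (the class name BlockCopyInts and the method CombineInts are hypothetical names for this example):

```csharp
using System;

public static class BlockCopyInts
{
    // Combines two int arrays; every offset and count is scaled by sizeof(int)
    // because Buffer.BlockCopy works in bytes rather than array elements.
    public static int[] CombineInts(int[] a, int[] b)
    {
        int[] result = new int[a.Length + b.Length];
        Buffer.BlockCopy(a, 0, result, 0, a.Length * sizeof(int));
        Buffer.BlockCopy(b, 0, result, a.Length * sizeof(int), b.Length * sizeof(int));
        return result;
    }
}
```

For byte arrays the element size is 1, which is why the byte[] versions can pass Length directly.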

I timed each of the suggested methods in a loop executed 1 million times using 3 arrays of 10 bytes each. Here are the results:

New Byte Array using System.Array.Copy - 0.2187556 seconds
New Byte Array using System.Buffer.BlockCopy - 0.1406286 seconds
IEnumerable using C# yield operator - 0.0781270 seconds
IEnumerable using Linq's Concat<> - 0.0781270 seconds

I increased the size of each array to 100 elements and re-ran the test:

New Byte Array using System.Array.Copy - 0.2812554 seconds
New Byte Array using System.Buffer.BlockCopy - 0.2500048 seconds
IEnumerable using C# yield operator - 0.0625012 seconds
IEnumerable using Linq's Concat<> - 0.0781265 seconds

I increased the size of each array to 1000 elements and re-ran the test:

New Byte Array using System.Array.Copy - 1.0781457 seconds
New Byte Array using System.Buffer.BlockCopy - 1.0156445 seconds
IEnumerable using C# yield operator - 0.0625012 seconds
IEnumerable using Linq's Concat<> - 0.0781265 seconds

Finally, I increased the size of each array to 1 million elements and re-ran the test, executing each loop only 4000 times:

New Byte Array using System.Array.Copy - 13.4533833 seconds
New Byte Array using System.Buffer.BlockCopy - 13.1096267 seconds
IEnumerable using C# yield operator - 0 seconds
IEnumerable using Linq's Concat<> - 0 seconds

So, if you need a new byte array, use

    byte[] rv = new byte[ a1.Length + a2.Length + a3.Length ];
    System.Buffer.BlockCopy( a1, 0, rv, 0, a1.Length );
    System.Buffer.BlockCopy( a2, 0, rv, a1.Length, a2.Length );
    System.Buffer.BlockCopy( a3, 0, rv, a1.Length + a2.Length, a3.Length );

But if you can use an IEnumerable<byte>, definitely prefer LINQ's Concat<> method. In the creation-only timings above it is at most slightly slower than the hand-written yield version, and it is more concise and more elegant.

    IEnumerable<byte> rv = a1.Concat(a2).Concat(a3);

If you have an arbitrary number of arrays and are using .NET 3.5, you can make the System.Buffer.BlockCopy solution more generic like this:

    // Requires "using System.Linq;" for Sum.
    private byte[] Combine( params byte[][] arrays )
    {
        byte[] rv = new byte[ arrays.Sum( a => a.Length ) ];
        int offset = 0;
        foreach ( byte[] array in arrays ) {
            System.Buffer.BlockCopy( array, 0, rv, offset, array.Length );
            offset += array.Length;
        }
        return rv;
    }

EDIT: To Jon Skeet's point regarding iteration over the resulting data structures (byte array vs. IEnumerable), I re-ran the last timing test (1 million elements, 4000 iterations), adding a loop that iterated over the full result on each pass:

New Byte Array using System.Array.Copy - 78.20550510 seconds
New Byte Array using System.Buffer.BlockCopy - 77.89261900 seconds
IEnumerable using C# yield operator - 551.7150161 seconds
IEnumerable using Linq's Concat<> - 448.1804799 seconds

The point is, it is VERY important to understand the efficiency of both the creation and the usage of the resulting data structure. Simply focusing on the efficiency of the creation may overlook the inefficiency associated with the usage. Kudos, Jon.
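
To make "creation plus usage" concrete, here is a rough sketch of how both could be timed together. This is not the harness used for the numbers above; the class name CombineTiming and the helpers CopyCombine, Consume, and Time are hypothetical, and the timings will vary by machine and runtime:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class CombineTiming
{
    // Eager combine: allocates a new array and copies all three sources into it.
    public static byte[] CopyCombine(byte[] a1, byte[] a2, byte[] a3)
    {
        byte[] rv = new byte[a1.Length + a2.Length + a3.Length];
        Buffer.BlockCopy(a1, 0, rv, 0, a1.Length);
        Buffer.BlockCopy(a2, 0, rv, a1.Length, a2.Length);
        Buffer.BlockCopy(a3, 0, rv, a1.Length + a2.Length, a3.Length);
        return rv;
    }

    // Consumes an array directly (chosen by overload resolution for byte[]).
    public static long Consume(byte[] data)
    {
        long sum = 0;
        foreach (byte b in data) sum += b;
        return sum;
    }

    // Consumes a lazy sequence through its enumerator.
    public static long Consume(IEnumerable<byte> data)
    {
        long sum = 0;
        foreach (byte b in data) sum += b;
        return sum;
    }

    // Runs the action the given number of times and returns the elapsed time.
    public static TimeSpan Time(Action action, int iterations)
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) action();
        sw.Stop();
        return sw.Elapsed;
    }
}
```

With System.Linq imported, comparing `CombineTiming.Time(() => CombineTiming.Consume(CombineTiming.CopyCombine(a1, a2, a3)), n)` against `CombineTiming.Time(() => CombineTiming.Consume(a1.Concat(a2).Concat(a3)), n)` measures the full cost of each approach rather than just construction.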

Matt Davis 2009-01-06 03:53:10

So the IEnumerable is always faster than the BlockCopy?

Stormenet 2009-01-06 06:30:28

Yes. There is no memory being allocated/copied (aside from the iterator anonymous class).

FryGuy 2009-01-06 06:45:05

But are you actually converting it into an array at the end, as the question requires? If not, of course it's faster - but it's not fulfilling the requirements.

Jon Skeet 2009-01-06 08:17:34

Isn't (lazy) functional programming grand? ;-)

peSHIr 2009-01-06 08:58:15

If you argue the IEnumerable<> solution doesn't fulfill the requirement, then I'll argue the requirement is poor because it offers no context. A combined array *may* not be necessary, in which case it'd be foolish to create one just to satisfy a poor requirement. Better to update the requirement.

Matt Davis 2009-01-06 19:49:05

Answer 4

A:

Concat is the right answer, but for some reason a hand-rolled thing is getting the most votes. If you like that answer, perhaps you'd like this more general solution even more:

    IEnumerable<byte> Combine(params byte[][] arrays)
    {
        foreach (byte[] a in arrays)
            foreach (byte b in a)
                yield return b;
    }

which would let you do things like:

    byte[] c = Combine(new byte[] { 0, 1, 2 }, new byte[] { 3, 4, 5 }).ToArray();

Mark Maxham 2009-01-06 05:33:11

The question specifically asks for the most *efficient* solution. Enumerable.ToArray isn't going to be very efficient, as it can't know the size of the final array to start with - whereas the hand-rolled techniques can.

Jon Skeet 2009-01-06 08:16:08

Answer 5

+4 A:

Many of the answers seem to me to be ignoring the stated requirements:

The result should be a byte array
It should be as efficient as possible

These two together rule out a LINQ sequence of bytes - anything with yield is going to make it impossible to get the final size without iterating through the whole sequence.

If those aren't the real requirements of course, LINQ could be a perfectly good solution (or the IList<T> implementation). However, I'll assume that Superdumbell knows what he wants.

(EDIT: I've just had another thought. There's a big semantic difference between making a copy of the arrays and reading them lazily. Consider what happens if you change the data in one of the "source" arrays after calling the Combine (or whatever) method but before using the result - with lazy evaluation, that change will be visible. With an immediate copy, it won't. Different situations will call for different behaviour - just something to be aware of.)
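
A small sketch of that difference, assuming System.Linq is available (the class name LazyVersusCopy and the method Demo are hypothetical names for this example):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyVersusCopy
{
    // Returns { first byte seen through the eager copy,
    //           first byte seen through the lazy sequence }.
    public static byte[] Demo()
    {
        byte[] first = { 1, 2 };
        byte[] second = { 3, 4 };

        // Eager: the bytes are copied into a new array right now.
        byte[] copied = new byte[first.Length + second.Length];
        Buffer.BlockCopy(first, 0, copied, 0, first.Length);
        Buffer.BlockCopy(second, 0, copied, first.Length, second.Length);

        // Lazy: only a query is built; the sources are read on enumeration.
        IEnumerable<byte> lazy = first.Concat(second);

        // Change a source array after combining but before using the results.
        first[0] = 99;

        return new byte[] { copied[0], lazy.First() };
    }
}
```

Demo returns { 1, 99 }: the copy still holds the old value, while the lazy sequence sees the change.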

Here are my proposed methods - which are very similar to those contained in some of the other answers, certainly :)

public static byte[] Combine(byte[] first, byte[] second)
{
    byte[] ret = new byte[first.Length + second.Length];
    Buffer.BlockCopy(first, 0, ret, 0, first.Length);
    Buffer.BlockCopy(second, 0, ret, first.Length, second.Length);
    return ret;
}

public static byte[] Combine(byte[] first, byte[] second, byte[] third)
{
    byte[] ret = new byte[first.Length + second.Length + third.Length];
    Buffer.BlockCopy(first, 0, ret, 0, first.Length);
    Buffer.BlockCopy(second, 0, ret, first.Length, second.Length);
    Buffer.BlockCopy(third, 0, ret, first.Length + second.Length,
                     third.Length);
    return ret;
}

// Requires "using System.Linq;" for Sum.
public static byte[] Combine(params byte[][] arrays)
{
    byte[] ret = new byte[arrays.Sum(x => x.Length)];
    int offset = 0;
    foreach (byte[] data in arrays)
    {
        Buffer.BlockCopy(data, 0, ret, offset, data.Length);
        offset += data.Length;
    }
    return ret;
}

Of course the "params" version requires creating an array of the byte arrays first, which introduces extra inefficiency.

Jon Skeet 2009-01-06 08:39:46

+1. Answers should at very least satisfy the requirements.

Mehrdad Afshari 2009-01-06 08:54:39

Jon, I understand precisely what you're saying. My only point is that sometimes questions are asked with a particular implementation already in mind without realizing that other solutions exist. Simply providing an answer without offering alternatives seems like a disservice to me. Thoughts?

Matt Davis 2009-01-06 19:58:08

@Matt: Yes, offering alternatives is good - but it's worth explaining that they *are* alternatives rather than passing them off as the answer to the question being asked. (I'm not saying that you did that - your answer is very good.)

Jon Skeet 2009-01-06 20:11:47

(Although I think your performance benchmark should show the time taken to go through all the results in each case, too, to avoid giving lazy evaluation an unfair advantage.)

Jon Skeet 2009-01-06 20:12:22

You can also change `params byte[][]` to `IEnumerable<byte[]>`, thus avoiding the need for an array

ohadsc 2010-10-25 14:54:06

Answer 6

A:

The MemoryStream class does this job pretty nicely for me. I couldn't get the Buffer class to run as fast as MemoryStream.

// Requires "using System.IO;" for MemoryStream.
byte[] combined;
using (MemoryStream ms = new MemoryStream())
{
  ms.Write(BitConverter.GetBytes(22), 0, 4);
  ms.Write(BitConverter.GetBytes(44), 0, 4);
  combined = ms.ToArray(); // copy of everything written so far
}
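
Applied to the original question, the same idea might look like the sketch below. The class name StreamCombine is hypothetical, and passing the total length as the initial capacity is an assumption made here to avoid buffer regrowth, not something measured in this thread:

```csharp
using System.IO;

public static class StreamCombine
{
    // Writes each array into a MemoryStream sized up front, then copies out the result.
    public static byte[] Combine(params byte[][] arrays)
    {
        int total = 0;
        foreach (byte[] a in arrays) total += a.Length;

        using (MemoryStream ms = new MemoryStream(total))
        {
            foreach (byte[] a in arrays) ms.Write(a, 0, a.Length);
            return ms.ToArray();
        }
    }
}
```

ToArray makes one more copy of the internal buffer, so this does at least as much copying as the Buffer.BlockCopy versions.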

Andrew 2010-05-14 12:49:24
