tags:

views:

90

answers:

3

I need to write an integer to a byte array such that leading zeros are omitted and the bytes are written in big endian order.

Example:

int    original = 0x00123456;

byte[] encoded  = Encode(original);  //  == new byte[] { 0x12, 0x34, 0x56 };

int    decoded  = Decode(encoded);   //  == 0x123456

My Decode method:

private static int Decode(byte[] buffer, int index, int length)
{
    int result = 0;
    while (length > 0)
    {
        result = (result << 8) | buffer[index];
        index++;
        length--;
    }
    return result;
}

I'm struggling to come up with an Encode method that doesn't require a temporary buffer or reverses the bytes after writing them in little endian order. Can anyone help?

private static int Encode(int value, byte[] buffer, int index)
{

}
+3  A: 
private static int Encode(int value, byte[] buffer, int index)
{
    int length = 0;
    int valueCopy = value;
    while (valueCopy != 0)
    {
        valueCopy >>= 8;
        length++;
    }
    for (int i = 0; i < length; i++)
    {
        buffer[index + length - i - 1] = (byte)value;
        value >>= 8;
    }
    return length;
}
Pavel Radzivilovsky
The double-loop is somewhat ugly, but not nearly as ugly as what I've come up with so far. +1.
dtb
@dtb: If your input size is fixed, i.e. 32-bit or 64-bit, you can entirely do away with the loops and hard-code the conversion - it will even be a little faster.
casablanca
@casablanca: Well, my input size is always 32-bit (a non-negative `int`), but I want to omit leading zeros when writing the value to the byte[]. Can you post an answer that shows how a solution without loops looks like? (Without omitting leading zeros it's simple; I'm struggling with the variable-length output.)
dtb
there's inherent need for the loop to find how long is the target number.
Pavel Radzivilovsky
@digEmAll: You can only unroll the loop if you know the number of iterations, which isn't obvious in this case.@Pavel: You can break down the loop into `if` statements that check if each byte of the number (from left to right) is non-zero, and if so, append it to the buffer.
casablanca
@casablanca: Yes, I thought to it some minutes after writing, comment removed ;)
digEmAll
@Pavel Radzivilovsky. I think this line `buffer[length - i - 1] = (byte)value;` should be `buffer[index + length - i - 1] = (byte)value;`. Despite the double loop, your method looks like the more readable, so +1.
Juan Pablo Califano
@Juan, thanks, added.
Pavel Radzivilovsky
A: 

Note that you are saving the value to a variable Length byte array. If you save these byteArray, you need save also the length.

You can see the protected Functions Write7BitEncodedInt from BinaryWriter and Read7BitEncodedInt from BinaryReader.

These functions save storage on disk for positive numbers. A number 0-128 needs only one byte.

Microsoft uses these functions to store/retrieve the String length prefix when save to Stream.

To use these Functions, you can create own Class derived from BinaryReader / BinaryWriter.

x77
Yes, I'm also saving the length so I can properly decode the byte[] back to an integer. Unfortunately I can't use Write7BitEncodedInt because I need to write the integer value exactly in the format described in my question.
dtb
+2  A: 

As per OP's request, here is a version without loops for a 32-bit number:

private static int Encode(int value, byte[] buffer, int index)
{
    byte temp;
    bool leading = true;

    temp = (value >> 24) & 0xFF;
    if (temp > 0) {
      buffer[index++] = temp;
      leading = false;
    }

    temp = (value >> 16) & 0xFF;
    if (temp > 0 || leading == false) {
      buffer[index++] = temp;
      leading = false;
    }

    temp = (value >> 8) & 0xFF;
    if (temp > 0 || leading == false) {
      buffer[index++] = temp;
      leading = false;
    }

    temp = value & 0xFF;
    buffer[index++] = temp;

    return index;
}

Version using a loop for 32-bit numbers:

private static int Encode(int value, byte[] buffer, int index)
{
    int length = 0;

    for (int i = 3; i >= 0; i++) {
      byte temp = (byte)(value >> (8 * i));
      if (temp > 0 || length > 0) {
        buffer[index++] = temp;
        length++;
      }
    }

    return length;
}

Note that this version doesn't write anything if the input is just 0.

casablanca
-1. This is wrong. It doesn't exclude leading zeros. It excludes every byte whose value is zero, instead. Try it with 0x00120056 and you'll see the result is not correct.
Juan Pablo Califano
Sorry, I overlooked that part. It's fixed now.
casablanca
No problem. Removed downvote...
Juan Pablo Califano
@casablanca: I like this. Please do me a small favour and turn this back into a loop (the shift is just 8*i). Can you also change it so it returns the number of bytes written?
dtb
@dtb, putting this in a loop will mean that you have a conditional branch in the loop. Once you have the first non-zero byte you can blindly pack the rest of the bytes without the conditional branch.
Chris Taylor
@Chris Taylor: I'm looking for a solution that is simple, readable, maintainable, not necessarily one that takes a few nanoseconds less than other solution. If you see two loops, you need to figure out what each loop's doing, while if there's only one loop, you know immediately that this loop simply writes some bytes to the byte[] in some way.
dtb
Pavel Radzivilovsky