Hi all,

I'm creating a binary file to transmit to a third party that contains images and information about each image. The file uses a record length format, so each record is a particular length. The beginning of each record is the Record Length Indicator, which is 4 characters long and represents the length of the record in Big Endian format.

I'm using a BinaryWriter to write to the file, and for the Record Length Indicator I'm using Encoding.Default.

The problem I'm having is that there is one character in one record that is displaying as a "?" because it is unrecognized. My algorithm to build the string for the record length indicator is this:

private string toBigEndian(int value)
{
    string returnValue = "";
    string binary = Convert.ToString(value, 2).PadLeft(32, '0');
    List<int> binaryBlocks = new List<int>();
    binaryBlocks.Add(Convert.ToInt32(binary.Substring(0, 8), 2));
    binaryBlocks.Add(Convert.ToInt32(binary.Substring(8, 8), 2));
    binaryBlocks.Add(Convert.ToInt32(binary.Substring(16, 8), 2));
    binaryBlocks.Add(Convert.ToInt32(binary.Substring(24, 8), 2));

    foreach (int block in binaryBlocks)
    {
        returnValue += (char)block;
    }

    Console.WriteLine(value);

    return returnValue;
}

It takes the length of the record, converts it to 32-bit binary, converts that to chunks of 8-bit binary, and then converts each chunk to its appropriate character. The string that is returned here does contain the correct characters, but when it's written to the file, one character is unrecognized. This is how I'm writing it:

//fileWriter is BinaryWriter and record is Encoding.Default
fileWriter.Write(record.GetBytes(toBigEndian(length)));

Perhaps I'm using the wrong type of encoding? I've tried UTF-8, which should work, but it gives me extra characters sometimes.

Thanks in advance for your help.

+1  A: 

If you really want a binary four bytes (i.e. not just four characters, but a big-endian 32-bit length value) then you want something like this:

// Big-endian: the most significant byte goes first (index 0).
byte[] bytes = new byte[4];
bytes[0] = (byte)((value >> 24) & 0xff);
bytes[1] = (byte)((value >> 16) & 0xff);
bytes[2] = (byte)((value >> 8) & 0xff);
bytes[3] = (byte)(value & 0xff);
fileWriter.Write(bytes);
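
To check that the four bytes come out in the intended order, it can help to read them back the same way the receiving side would. This is only a minimal sketch: `LengthReader` and `FromBigEndian` are names made up for illustration, and it assumes the array holds exactly four bytes with the most significant byte first.

```csharp
public static class LengthReader
{
    // Reassembles a 32-bit value from four bytes, most significant byte first.
    // Assumes bytes.Length == 4; no validation is done in this sketch.
    public static int FromBigEndian(byte[] bytes)
    {
        return (bytes[0] << 24) | (bytes[1] << 16) | (bytes[2] << 8) | bytes[3];
    }
}
```

For a record length of 300 the bytes on disk should be 00 00 01 2C, and `LengthReader.FromBigEndian(new byte[] { 0x00, 0x00, 0x01, 0x2C })` gives 300 back.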
Simon Steele
+6  A: 

The problem is that you should not return the value as a string at all.

When you cast the value to a char and then encode it as 8-bit characters, some values will be encoded into the wrong byte, and others cannot be encoded at all (resulting in the ? characters). The only way not to lose data in that step would be to encode it as UTF-16, but that would give you eight bytes instead of four.
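
A small sketch of that effect, for one byte value at a time. `EncodingLossDemo` and `EncodeByteAsChar` are hypothetical names, and `Encoding.ASCII` is used here only as a stand-in with predictable behaviour; what `Encoding.Default` does depends on the runtime (the system ANSI code page on .NET Framework, UTF-8 on .NET Core and later), so the exact values that break will differ.

```csharp
using System;
using System.Text;

public static class EncodingLossDemo
{
    // Casts one byte value to a char and encodes it with the given encoding,
    // mirroring what the string-based approach does for each length byte.
    public static byte[] EncodeByteAsChar(byte value, Encoding encoding)
    {
        string s = ((char)value).ToString();
        return encoding.GetBytes(s);
    }
}
```

With the value 200 (0xC8), `Encoding.ASCII` produces the single byte 0x3F, which is the "?" character; `Encoding.UTF8` produces two bytes (0xC3 0x88), which explains the extra characters; and `Encoding.Unicode` (UTF-16) produces two bytes per character (0xC8 0x00), hence eight bytes for four characters.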

You should return it as a byte array, so that you can write it to the file without converting it back and forth between character data and binary data.

private byte[] toBigEndian(int value) {
   byte[] result = BitConverter.GetBytes(value);
   if (BitConverter.IsLittleEndian) Array.Reverse(result);
   return result;
}

fileWriter.Write(toBigEndian(length));
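
Put together, writing one whole record might look roughly like this. It is a sketch, not the actual file format: `RecordWriter` and `BuildRecord` are made-up names, a `MemoryStream` stands in for the real file, and it assumes the length counts only the payload and not the four indicator bytes (the question does not say which).

```csharp
using System;
using System.IO;

public static class RecordWriter
{
    // Platform-order bytes, reversed on little-endian machines so the
    // most significant byte comes first.
    public static byte[] ToBigEndian(int value)
    {
        byte[] result = BitConverter.GetBytes(value);
        if (BitConverter.IsLittleEndian) Array.Reverse(result);
        return result;
    }

    // Hypothetical helper: 4-byte big-endian length followed by the payload.
    // Assumption: the length covers the payload only.
    public static byte[] BuildRecord(byte[] payload)
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(ToBigEndian(payload.Length)); // raw bytes, no string involved
            writer.Write(payload);
            writer.Flush();
            return stream.ToArray();
        }
    }
}
```

For a 300-byte payload this returns 304 bytes, starting with 00 00 01 2C. No Encoding is involved anywhere, so no byte can turn into a "?".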
Guffa
Exactly correct and solved the issue. Thank you.
Aaron
A: 

To read/write bits from binary streams with appropriate endianness use the BitConverter class, since it has explicit support for endianness: http://msdn.microsoft.com/en-us/library/system.bitconverter.islittleendian.aspx

Converting to binary then tokenizing into bytes is, I must say, the most unorthodox way I have seen yet :)

Remus Rusanu
The IsLittleEndian property is read-only; it tells you whether the system is big- or little-endian. It does not allow you to set endianness. For that you need to roll your own converter, or grab one of the many found online.
Jon B
BitConverter has no support for endianness. This property will only indicate whether the current platform is little endian or not. It will not do the conversion to big endian for you.
Dave Van den Eynde
The theory is that you check the endianness to decide whether or not to reverse the GetBytes output. But you're right, my answer was misleading: BitConverter cannot itself produce the output in the required endianness.
Remus Rusanu
+1  A: 

Do not create a string from an int to write bytes. Better to try this:

byte[] result =
    {
      (byte)(value >> 24),
      (byte)(value >> 16),
      (byte)(value >> 8),
      (byte)(value >> 0)
    };
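
One caveat with bare casts: they rely on the cast silently dropping the upper bits. That is the C# default, but if the project is compiled with overflow checking on, `(byte)(value >> 16)` throws an OverflowException whenever the shifted value does not fit in a byte. A sketch that makes the truncation explicit (`ShiftBytes` is just an illustrative name):

```csharp
public static class ShiftBytes
{
    // Big-endian bytes of a 32-bit value, most significant byte first.
    public static byte[] ToBigEndian(int value)
    {
        // unchecked: keep only the low 8 bits of each shifted value,
        // even when the assembly is compiled with overflow checking.
        unchecked
        {
            return new byte[]
            {
                (byte)(value >> 24),
                (byte)(value >> 16),
                (byte)(value >> 8),
                (byte)value
            };
        }
    }
}
```

For example, 70000 is 0x00011170, so `ShiftBytes.ToBigEndian(70000)` gives the bytes 00 01 11 70.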
Dr Spack