I'm looking at the C# library called BitStream, which allows you to write and read any number of bits to a standard C# Stream object. I noticed what seemed to me a strange design decision:

When adding bits to an empty byte, the bits are added to the MSB of the byte. For example:

var s = new BitStream();
s.Write(true);
Debug.Assert(s.ToByteArray()[0] == 0x80);  // and not 0x01

var s = new BitStream();
s.Write(0x7,0,4);
s.Write(0x3,0,4);
Debug.Assert(s.ToByteArray()[0] == 0x73); // and not 0x37

However, when selecting bits from the input number, bit offset 0 refers to the LSB of that number. For example:

// Signature: s.Write(int input, int bit_offset, int count_bits)
// To write the LSB and the bit next to it:
s.Write(data, 0, 2); // and not s.Write(data, data_bits_number, data_bits_number - 2)

This seems inconsistent to me: when "gradually" copying a byte as in the previous example (the first four bits at offset 0, then the last four bits at offset 4), we do not get the original byte back. We need to copy it "backwards" (first the last four bits, then the first four bits).
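To make the mismatch concrete, here is a minimal sketch of a writer with the behaviour described above. `SketchBitWriter` and `NibbleCopy` are hypothetical names invented for this illustration; this is not the BitStream library's code, only an assumption about its semantics based on the assertions in the question (output bytes are filled from the MSB down, while `offset` is counted from the LSB of the input).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the behaviour described in the question.
public sealed class SketchBitWriter
{
    private readonly List<byte> _bytes = new List<byte>();
    private int _bitCount; // total number of bits written so far

    // Writes `count` bits of `value`, starting at bit `offset` counted from
    // the LSB. The selected field keeps its internal bit order and is placed
    // at the most significant free position of the output.
    public void Write(int value, int offset, int count)
    {
        for (int i = count - 1; i >= 0; i--)
        {
            WriteBit(((value >> (offset + i)) & 1) != 0);
        }
    }

    public void WriteBit(bool bit)
    {
        int posInByte = _bitCount % 8;
        if (posInByte == 0) _bytes.Add(0);          // start a new output byte
        if (bit) _bytes[_bytes.Count - 1] |= (byte)(0x80 >> posInByte);
        _bitCount++;
    }

    public byte[] ToByteArray() => _bytes.ToArray();
}

public static class NibbleCopy
{
    // Copies offset 0..3 first, then offset 4..7 (the "natural" order).
    public static byte LowNibbleFirst(byte b)
    {
        var w = new SketchBitWriter();
        w.Write(b, 0, 4);
        w.Write(b, 4, 4);
        return w.ToByteArray()[0];
    }

    // Copies offset 4..7 first, then offset 0..3 (the "backwards" order).
    public static byte HighNibbleFirst(byte b)
    {
        var w = new SketchBitWriter();
        w.Write(b, 4, 4);
        w.Write(b, 0, 4);
        return w.ToByteArray()[0];
    }
}
```

With this sketch, `NibbleCopy.LowNibbleFirst(0x37)` yields 0x73, while `NibbleCopy.HighNibbleFirst(0x37)` yields the original 0x37, which is the asymmetry the question is about.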

Is there a reason for this design that I'm missing? Are there other bit stream implementations with this behaviour? What are the design considerations behind it?

It seems that ffmpeg bitstream behaves in a way I consider consistent. Look at the amount it shifts the byte before ORing it with the src pointer in the put_bits function.

As a side note:

The first byte added is the first byte in the byte array. For example:

var s = new BitStream();
s.Write(0x1,0,4);
s.Write(0x2,0,4);
s.Write(0x3,0,4);
Debug.Assert(s.ToByteArray()[0] == 0x12); // and not s.ToByteArray()[1] == 0x12
+1  A: 

Is there a reason for this design that I'm missing? Are there other bit stream implementations with this behaviour? What are the design considerations behind it?

I doubt there was any significant meaning behind the decision. Technically, it does not matter as long as the writer and the reader agree on the ordering.

csharptest.net
But I just showed it does matter. Quote: "It seems inconsistent to me. Since in this case, when "gradually" copying a byte like in the previous example (the first four bits, and then the last four bits), we will not get the original byte. We need to copy it "backwards" (first the last four bits, then the first four bits)."
Elazar Leibovich
Like I said, when both the reader and writer agree on the bit ordering, it doesn't matter. IMO you should use the BitStream to both read and write. If you have other intentions, like reading the resulting bytes, you should probably just write your own stream.
csharptest.net
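The "reader and writer agree" argument can be illustrated with a round trip. The following is a minimal sketch with a hypothetical `MsbFirstCodec` helper (not part of BitStream): both sides use the same MSB-first convention, so the fields come back unchanged regardless of how the packed bytes look in between.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: packs and unpacks bit fields using one shared,
// MSB-first convention.
public static class MsbFirstCodec
{
    // Packs each (Value, Bits) field MSB-first. Unused bits of the last
    // byte are left as zero padding.
    public static byte[] Pack(IReadOnlyList<(uint Value, int Bits)> fields)
    {
        var bytes = new List<byte>();
        int bitPos = 0;
        foreach (var (value, bits) in fields)
        {
            for (int i = bits - 1; i >= 0; i--, bitPos++)
            {
                if (bitPos % 8 == 0) bytes.Add(0);
                if (((value >> i) & 1) != 0)
                    bytes[bytes.Count - 1] |= (byte)(0x80 >> (bitPos % 8));
            }
        }
        return bytes.ToArray();
    }

    // Reads the fields back with the same convention and the same widths.
    public static uint[] Unpack(byte[] data, IReadOnlyList<int> widths)
    {
        var result = new uint[widths.Count];
        int bitPos = 0;
        for (int f = 0; f < widths.Count; f++)
        {
            uint value = 0;
            for (int i = 0; i < widths[f]; i++, bitPos++)
            {
                int bit = (data[bitPos / 8] >> (7 - bitPos % 8)) & 1;
                value = (value << 1) | (uint)bit;
            }
            result[f] = value;
        }
        return result;
    }
}
```

Packing the fields (1, 1 bit), (5, 3 bits), (3, 4 bits) gives the single byte 0xD3, and unpacking with widths 1, 3, 4 returns 1, 5, 3 again. The disagreement in the question only shows up when the input offsets or the raw bytes are interpreted with a different convention than the one used for packing.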
+1  A: 

Here are some additional considerations:

In the case of a boolean, only one bit is required to represent true or false. When that bit is added to the beginning of the stream, the bit stream is "1". When you extend that stream to byte length, zero bits have to be padded onto the end of the stream, even though those bits did not exist in the stream to begin with. Position in the stream is important information, just like the values of the bits, and a bit stream of "10000000" (0x80) preserves the expectation of subsequent readers that the first bit they read is the first bit that was added.

Second, other data types like integers require more bits to represent so they are going to take up more room in the stream than booleans. Mixing different size data types in the same stream can be very tricky when they aren't aligned on byte boundaries.

Finally, if you are on Intel x86, your CPU architecture is "little-endian", which means the least significant byte of a multi-byte value is stored first. Strictly speaking, this is about the order of bytes rather than the order of bits within a byte, but it is the same kind of convention you are describing. If you need to store values in the stream as big-endian, you'll need to add a conversion layer in your code, similar to what you've shown above, where you push one byte at a time into the stream in the order you want. This is annoying, but commonly required if you need to interoperate with big-endian Unix boxes or when a protocol specification demands it.
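As a sketch of such a conversion layer, the standard `System.Buffers.Binary.BinaryPrimitives` class (available in .NET Core 2.1 and later) can serialize an integer in an explicit byte order, independent of the CPU. The `ByteOrderDemo` wrapper below is a hypothetical name used only for this example.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical wrapper around BinaryPrimitives for illustration.
public static class ByteOrderDemo
{
    // Most significant byte first ("network order").
    public static byte[] ToBigEndian(uint value)
    {
        var buffer = new byte[4];
        BinaryPrimitives.WriteUInt32BigEndian(buffer, value);
        return buffer;
    }

    // Least significant byte first (the native layout on x86).
    public static byte[] ToLittleEndian(uint value)
    {
        var buffer = new byte[4];
        BinaryPrimitives.WriteUInt32LittleEndian(buffer, value);
        return buffer;
    }
}
```

For the value 0x11223344, `ToBigEndian` produces the bytes 11 22 33 44 and `ToLittleEndian` produces 44 33 22 11. The resulting bytes could then be pushed into a bit stream one at a time in whichever order the protocol requires.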

Hope that helps!

csharpguy
A: 

I agree with Elazar.

As he/she points out, this is a case where the reader and writer do NOT agree on the bit ordering. In fact, they're incompatible.

chennai