I have an unsigned char buffer, and I'm wondering how I would write signed and unsigned bits to this byte buffer and read them back.

In the Source Engine there is a class named bf_write whose main methods (WriteString, WriteChar, WriteLong, etc.) are built on two functions named WriteUBitLong and WriteSBitLong.

Thanks in advance

+7  A: 

If the number of bits is a compile-time constant:

#include <bitset>
...
std::bitset<100> b;
b[2]=true;

If it's not, use Boost.dynamic_bitset

Or, if you're desperate, std::vector<bool>, which is indeed a packed bit vector:

#include <vector>
...
std::vector<bool> b(100);
b[2]=true;

You seem to want to use a library that requires bit vectors packed in an array of bytes. Without knowing exactly what order it places the bits in, I can only note that:

1) all of the above will probably use at least 32-bit ints with bits ordered least->most or most->least significant

2) on little endian (Intel/AMD) CPUs, this means that the memory layout of the bytes of an array of ints may not be consistent with the ordering of bits within the int. If it's "bit 0 is the lsb of int 0, ... bit 32 is the lsb of int 1, ..." then that's the same in little endian as "bit 0 is the lsb of char 0, ... bit 32 is the lsb of char 4 ...", in which case you can just cast a pointer to the int array to a pointer to char array

3) supposing the native order of bytes in your bit set / vector isn't exactly what the library needs, then you either have to create your own that has the layout they want, or transcribe a copy into their layout.

a) if the order of bits within a byte is different, a 256 entry lookup table giving the byte with bits reversed would be efficient. you could generate the table with a small routine.

b) to reverse bytes from little<->big endian:

inline void endian_swap(unsigned short& x)
{
    x = (x>>8) | 
        (x<<8);
}

inline void endian_swap(unsigned int& x)
{
    x = (x>>24) | 
        ((x<<8) & 0x00FF0000) |
        ((x>>8) & 0x0000FF00) |
        (x<<24);
}    

inline void endian_swap(unsigned long long& x)
{
    x = (x>>56) | 
        ((x<<40) & 0x00FF000000000000) |
        ((x<<24) & 0x0000FF0000000000) |
        ((x<<8)  & 0x000000FF00000000) |
        ((x>>8)  & 0x00000000FF000000) |
        ((x>>24) & 0x0000000000FF0000) |
        ((x>>40) & 0x000000000000FF00) |
        (x<<56);
}
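
For point a) above, the lookup table could be generated along these lines. This is only a sketch: the names reverse_byte, make_reverse_table and reverse_bits_in_bytes are made up for illustration, and it assumes 8-bit bytes.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Reverse the bit order of one byte (bit 0 <-> bit 7, bit 1 <-> bit 6, ...).
// Hypothetical helper, used only to fill the table below.
inline std::uint8_t reverse_byte(std::uint8_t v) {
    std::uint8_t r = 0;
    for (int i = 0; i < 8; ++i) {
        r = static_cast<std::uint8_t>((r << 1) | (v & 1u)); // push lowest bit of v into r
        v = static_cast<std::uint8_t>(v >> 1);
    }
    return r;
}

// Build the 256-entry table; table[x] is x with its bits reversed.
inline std::array<std::uint8_t, 256> make_reverse_table() {
    std::array<std::uint8_t, 256> table{};
    for (unsigned x = 0; x < 256; ++x)
        table[x] = reverse_byte(static_cast<std::uint8_t>(x));
    return table;
}

// Reverse the bit order inside every byte of a buffer, in place.
inline void reverse_bits_in_bytes(unsigned char* buf, std::size_t n) {
    static const std::array<std::uint8_t, 256> table = make_reverse_table();
    for (std::size_t i = 0; i < n; ++i)
        buf[i] = table[buf[i]];
}
```

The table is built once on first use, after which each byte costs a single lookup.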

To get/set a particular bit within a word, with bit #0 in the least significant bit of word 0:

typedef unsigned char block_t;
const unsigned block_bits=8;

// set bit i of the buffer to 1
inline void set_bit(block_t *d,unsigned i) {
  unsigned b=i/block_bits;           // index of the block holding bit i
  unsigned bit=i-(block_bits*b);     // same as i%block_bits
  block_t &bl=d[b];
  bl|=(block_t(1)<<bit); // or bit with 1 (others ored w/ 0)
}

// set bit i of the buffer to 0
inline void clear_bit(block_t *d,unsigned i) {
  unsigned b=i/block_bits;
  unsigned bit=i-(block_bits*b);     // same as i%block_bits
  block_t &bl=d[b];
  bl&=(~(block_t(1)<<bit)); // and bit with 0 (other bits anded w/ 1)
}

inline void modify_bit(block_t *d,unsigned i,bool val) {
  if (val) set_bit(d,i); else clear_bit(d,i);
}

inline bool get_bit(block_t const* d,unsigned i) {
  unsigned b=i/block_bits;
  unsigned bit=i-(block_bits*b);     // same as i%block_bits
  return (d[b]&(block_t(1)<<bit))!=0;
}

Obviously if the rule for bit organization differs, you have to change the above.

Using the widest possible int your CPU processes efficiently as block_t is best (don't forget to change block_bits), unless the endianness doesn't work out with the library you're using.
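
Building on the per-bit functions, a multi-bit value (a char, a short, or any field of up to 32 bits) can be written one bit at a time. The following is a rough sketch, not the Source Engine's actual WriteUBitLong/WriteSBitLong: the names write_ubits, read_ubits, write_sbits and read_sbits are hypothetical, and it assumes the same layout as above (bit #0 is the least significant bit of byte 0, values stored least significant bit first) and two's complement for signed values.

```cpp
#include <cstdint>

// Write the low `nbits` bits (at most 32) of `value` at bit offset `pos`, LSB first.
inline void write_ubits(unsigned char* buf, unsigned pos,
                        std::uint32_t value, unsigned nbits) {
    for (unsigned k = 0; k < nbits; ++k) {
        unsigned i = pos + k;  // absolute bit index in the buffer
        unsigned char mask = static_cast<unsigned char>(1u << (i % 8));
        if ((value >> k) & 1u)
            buf[i / 8] |= mask;                                // set the bit
        else
            buf[i / 8] &= static_cast<unsigned char>(~mask);   // clear the bit
    }
}

// Read `nbits` bits (at most 32) starting at bit offset `pos` as an unsigned value.
inline std::uint32_t read_ubits(const unsigned char* buf, unsigned pos,
                                unsigned nbits) {
    std::uint32_t value = 0;
    for (unsigned k = 0; k < nbits; ++k) {
        unsigned i = pos + k;
        std::uint32_t bit = (buf[i / 8] >> (i % 8)) & 1u;
        value |= bit << k;
    }
    return value;
}

// A signed value is stored as the low `nbits` bits of its two's complement form.
inline void write_sbits(unsigned char* buf, unsigned pos,
                        std::int32_t value, unsigned nbits) {
    write_ubits(buf, pos, static_cast<std::uint32_t>(value), nbits);
}

// Read `nbits` bits and sign-extend from the top stored bit.
inline std::int32_t read_sbits(const unsigned char* buf, unsigned pos,
                               unsigned nbits) {
    std::uint32_t raw = read_ubits(buf, pos, nbits);
    if (nbits > 0 && nbits < 32 && ((raw >> (nbits - 1)) & 1u))
        raw |= ~std::uint32_t(0) << nbits;  // fill the upper bits with ones
    // Assumes a two's complement target (guaranteed since C++20).
    return static_cast<std::int32_t>(raw);
}
```

With these, writing a short at bit offset p would be write_sbits(buf, p, s, 16), and a char would be write_ubits(buf, p, c, 8). The caller has to make sure the buffer is large enough and that the value fits in nbits. Processing whole bytes at a time would be faster, but this version is easier to check.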

wrang-wrang
Thanks so much! That's brilliant (would +1 but need 15 rep). If you don't mind, would you please tell me how I would write a short and/or a character using your set_bit functions (don't waste time on padding, I know how to do that :))
Saul Rennison
A: 

I think a few macros are enough:

#define set_bit0(buf, i) ((buf)[(i)/8] &= ~(1u << ((i)%8)))
#define set_bit1(buf, i) ((buf)[(i)/8] |= (1u << ((i)%8)))
#define get_bit(buf, i) (((buf)[(i)/8] >> ((i)%8)) & 1)

In addition, swapping endianness can be done with fewer operations by swapping progressively smaller halves. For example, for a 64-bit unsigned integer v, the following three steps swap its endianness:

v = ((v & 0x00000000FFFFFFFFLLU) << 32) | (v >> 32);
v = ((v & 0x0000FFFF0000FFFFLLU) << 16) | ((v & 0xFFFF0000FFFF0000LLU) >> 16);
v = ((v & 0x00FF00FF00FF00FFLLU) << 8) | ((v & 0xFF00FF00FF00FF00LLU) >> 8);
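
To check that these three steps really reverse the byte order, they can be wrapped in a function. The name byteswap64 is just an illustration, and std::uint64_t is assumed to be available.

```cpp
#include <cstdint>

// Reverse the byte order of a 64-bit value in three halving steps.
inline std::uint64_t byteswap64(std::uint64_t v) {
    // swap the two 32-bit halves
    v = ((v & 0x00000000FFFFFFFFULL) << 32) | (v >> 32);
    // swap the 16-bit halves inside each 32-bit half
    v = ((v & 0x0000FFFF0000FFFFULL) << 16) | ((v & 0xFFFF0000FFFF0000ULL) >> 16);
    // swap the bytes inside each 16-bit half
    v = ((v & 0x00FF00FF00FF00FFULL) << 8) | ((v & 0xFF00FF00FF00FF00ULL) >> 8);
    return v;
}
```

For example, byteswap64(0x0102030405060708) gives 0x0807060504030201, and applying it twice returns the original value.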