I am trying to write processor-independent code that writes files in big-endian byte order. I have a code sample below and I can't understand why it doesn't work. All it is supposed to do is let byte hold each byte of data, one at a time, in big-endian order. In my actual program I would then write each individual byte out to a file, so I get the same byte order in the file regardless of processor architecture.

#include <iostream>

int main (int argc, char * const argv[]) {
    long data = 0x12345678;
    long bitmask = (0xFF << (sizeof(long) - 1) * 8);
    char byte = 0;

    for(long i = 0; i < sizeof(long); i++) {
        byte = data & bitmask;
        data <<= 8;
    }
    return 0;
}

For some reason byte always has the value of 0. This confuses me, I am looking at the debugger and see this:

data    = 00010010001101000101011001111000
bitmask = 11111111000000000000000000000000

I would think that data & bitmask would give 00010010, but it just makes byte 00000000 every time! How can this be? I have written some code for little-endian order and it works great, see below:

#include <iostream>

int main (int argc, char * const argv[]) {
    long data = 0x12345678;
    long bitmask = 0xFF;
    char byte = 0;

    for(long i = 0; i < sizeof(long); i++) {
        byte = data & bitmask;
        data >>= 8;
    }
    return 0;
}

Why does the little-endian one work and the big-endian one not? Thanks for any help :-)

+1  A: 

You're getting the shifting all wrong.

#include <iostream>

int main (int argc, char * const argv[]) {
   long data = 0x12345678;
   int shift = (sizeof(long) - 1) * 8;
   const unsigned long mask = 0xff;
   char byte = 0;

   for (long i = 0; i < sizeof(long); i++, shift -= 8) {
      byte = (data & (mask << shift)) >> shift;
   }
   return 0;
}

Now, I wouldn't recommend you do things this way. I would recommend writing some nice conversion functions instead. Many compilers have these as builtins, so you can write your functions to do it the hard way first, then switch them over to forward to the compiler builtin once you figure out what it is.
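For instance, on GCC and Clang the 32-bit swap builtin is __builtin_bswap32(); this is compiler-specific and only a sketch, and the bswap32 name here is just an example (other compilers spell the intrinsic differently):

#include <cstdint> // for uint32_t (<tr1/cstdint> on older compilers)

#if defined(__GNUC__) || defined(__clang__)
// Compiler-specific sketch: reverses the byte order of a 32-bit value,
// usually compiling down to a single byte-swap instruction.
inline uint32_t bswap32(uint32_t val)
{
    return __builtin_bswap32(val);
}
#endif

Doing it the hard way looks like this: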

#include <tr1/cstdint> // To get uint16_t, uint32_t and so on.

inline void to_bigendian(uint16_t val, char bytes[2])
{
    bytes[0] = (val >> 8) & 0xffu;
    bytes[1] = val & 0xffu;
}

inline void to_bigendian(uint32_t val, char bytes[4])
{
    bytes[0] = (val >> 24) & 0xffu;
    bytes[1] = (val >> 16) & 0xffu;
    bytes[2] = (val >> 8) & 0xffu;
    bytes[3] = val & 0xffu;
}

This code is simpler and easier to understand than your loop. It's also faster. And lastly, it is recognized by some compilers and automatically turned into the single byte swap operation that would be required on most CPUs.
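For completeness, here is a minimal sketch of writing a 32-bit value out with the helper above (the write_u32_bigendian name and the std::ofstream are just illustrative):

#include <fstream>

// Illustrative wrapper: to_bigendian() above fills bytes[] most-significant
// byte first, so writing them gives the same file layout on any host.
void write_u32_bigendian(std::ofstream& out, uint32_t val)
{
    char bytes[4];
    to_bigendian(val, bytes);
    out.write(bytes, sizeof(bytes));
}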

Omnifarious
+1  A: 

In your example, data is 0x12345678.

Your first assignment to byte is therefore:

byte = 0x12000000;

which won't fit in a byte, so it gets truncated to just the low byte, which is zero.

try:

byte = (data & bitmask) >> ((sizeof(long) - 1) * 8);
plinth
hehe beaten by 12 seconds :D
Goz
A: 

Because you are masking off the top byte of the integer and then not shifting it back down 24 bits ...

Change your loop to:

for(long i = 0; i < sizeof(long); i++) {
    byte = (data & bitmask) >> 24;
    data <<= 8;
}
Goz
Thanks people, problem solved!
orgazoid
the shift-right amount should probably be `(sizeof(long) - sizeof(char)) * 8` in order to be portable...
xtofl
+5  A: 

You should use the standard functions ntohl() and kin for this. They operate on explicitly sized types (i.e. uint16_t and uint32_t) rather than the compiler-specific long, which is necessary for portability.

Some platforms provide 64-bit versions in <endian.h>.
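A quick sketch of that approach (the header differs by platform, e.g. <arpa/inet.h> on POSIX and <winsock2.h> on Windows, and the write_u32() wrapper is just for illustration):

#include <arpa/inet.h> // POSIX; Windows declares htonl() in <winsock2.h>
#include <cstdint>
#include <fstream>

// htonl() returns the value in network (big-endian) byte order, so writing
// its bytes produces the same file layout regardless of host endianness.
void write_u32(std::ofstream& out, uint32_t val)
{
    uint32_t be = htonl(val);
    out.write(reinterpret_cast<const char*>(&be), sizeof(be));
}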

Will
They are somewhat platform specific.
Omnifarious
but I can't think of a platform without them, e.g. win32 http://msdn.microsoft.com/en-us/library/ms738556%28VS.85%29.aspx
Will
surely if you use sizeof() then you are platform independent?
orgazoid
sizeof() tells you the size - in bytes - of the host storage, not the size on any other platform.
Will