ansaurus

Question

How is data stored at the bit level according to "endianness"?

Answer 1

+4 A:

Your machine almost certainly can't address individual bits of memory, so the layout of bits inside a byte is meaningless. Endianness refers only to the ordering of bytes inside multibyte objects.

To make your second program make sense (though there isn't really any reason to, since it won't give you any meaningful results) you need to learn about the bitwise operators - particularly & for this application.
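As a rough sketch of what using `&` looks like here: the helper below (the name `bits_msb_first` and its `width` parameter are made up for illustration, not taken from the question) tests each bit of a value with a mask. It operates on the value, not on memory, so it prints the same thing on any machine.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: render the low `width` bits of a value as text,
// most significant bit first. It never looks at how the value is laid
// out in memory, so byte order does not affect the result.
std::string bits_msb_first(std::uint32_t value, int width = 32) {
    assert(width >= 1 && width <= 32);
    std::string out;
    for (int i = width - 1; i >= 0; --i) {
        std::uint32_t mask = std::uint32_t{1} << i;  // selects bit i
        out += (value & mask) ? '1' : '0';
    }
    return out;
}
```

For example, `bits_msb_first(0x0D, 8)` yields `00001101`.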

Carl Norum 2010-05-28 20:40:18

I know how to use a showbits function to print the bits, using AND masks and all. But the bits are not printed as they are stored; not even the "as-it-is-stored" byte order is followed...

bakra 2010-05-28 20:50:43

@bakra, I expect your `showbits` routine just goes in MSB-to-LSB order. As I mentioned in my answer, doing that doesn't give any meaningful information about the way your computer stores bits, since that doesn't matter to you in any way (and can't matter, since they're not addressable anyway).

Carl Norum 2010-05-28 20:54:42

Answer 2

+3 A:

This line here:

temp = *( (bool*)ptr + i );

... when you do pointer arithmetic like this, the compiler moves the pointer on by the number you added times the sizeof the thing you are pointing to. Because you are casting your void* to a bool*, the compiler will be moving the pointer along in steps of sizeof(bool), which is implementation-defined but typically one byte and never a single bit. So each iteration reads a whole byte further along, and you'll be printing out memory from well past the value you meant to inspect.
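A small sketch can make the scaling visible. The helper name `byte_distance` is invented for this illustration; it measures how many bytes a pointer moves when `i` is added to it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of bytes between `arr` and `arr + i`.
// Pointer arithmetic is scaled by sizeof(T), so the answer is
// i * sizeof(T) bytes; it is never a count of bits.
template <typename T, std::size_t N>
std::size_t byte_distance(const T (&arr)[N], std::size_t i) {
    assert(i <= N);  // stay inside the array (one past the end is allowed)
    const T* first = arr;
    const unsigned char* start = reinterpret_cast<const unsigned char*>(first);
    const unsigned char* moved = reinterpret_cast<const unsigned char*>(first + i);
    return static_cast<std::size_t>(moved - start);
}
```

Called on a `bool flags[4]`, `byte_distance(flags, 3)` returns `3 * sizeof(bool)`, which is whole bytes, not three bits.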

You can't address the individual bits in a byte, so it's almost meaningless to ask which way round they are stored. (Your machine can store them whichever way it wants and you won't be able to tell). The only time you might care about it is when you come to actually spit bits out over a physical interface like I2C or RS232 or similar, where you have to actually spit the bits out one-by-one. Even then, though, the protocol would define which order to spit the bits out in, and the device driver code would have to translate between "an int with value 0xAABBCCDD" and "a bit sequence 11100011... [whatever] in protocol order".
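As a hedged sketch of that translation step, the function below lists the bits of a value in whichever order a protocol asks for. `BitOrder` and `wire_bits` are names invented here; a real driver would follow the bit order and framing its protocol specification defines.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class BitOrder { MsbFirst, LsbFirst };  // illustrative names only

// Hypothetical driver helper: return the low `width` bits of `value`
// in the order they would be sent on the wire.
std::vector<int> wire_bits(std::uint32_t value, int width, BitOrder order) {
    assert(width >= 1 && width <= 32);
    std::vector<int> bits;
    for (int n = 0; n < width; ++n) {
        // Pick which bit position goes out on the n-th step.
        int position = (order == BitOrder::MsbFirst) ? (width - 1 - n) : n;
        bits.push_back(static_cast<int>((value >> position) & 1u));
    }
    return bits;
}
```

The same byte `0xA1` comes out as `1 0 1 0 0 0 0 1` MSB-first and `1 0 0 0 0 1 0 1` LSB-first; nothing about how the host stores the byte is involved.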

Vicky 2010-05-28 20:49:27

+1, good answer. Bit-by-bit output like you describe is usually called "MSB-first" or "LSB-first" rather than "big-endian" or "little-endian" to make explicit the expected behaviour.

Carl Norum 2010-05-28 20:50:53

Answer 3

+2 A:

Endianness, as you discovered in your experiment, refers to the order in which bytes are stored in an object.

Bits do not get stored differently: within each 8-bit byte they always appear to your program in "human readable" order (high->low).

Now that we've discussed that you don't need your code... About your code:

for(int i=0;i<32;i++)
{   
  temp = *( (bool*)ptr + i );
  ...
}

This isn't doing what you think it's doing. You're iterating over 0-31, one step per bit in a 32-bit word - good. But your temp assignment is all wrong :)

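It's important to note that on typical platforms a bool* is the same size as an int* is the same size as a BigStruct*. Object pointers on the same machine are generally the same size - 32 bits on a 32-bit machine, 64 bits on a 64-bit machine. The pointer's size says nothing about the size of what it points to.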

(bool*)ptr + i adds i * sizeof(bool) bytes to the address, not i bits. With a one-byte bool, once i>3 you're reading past the end of a 4-byte word... this could possibly cause a segfault.

What you want to use is bit-masks. Something like this should work:

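for (int i = 0; i < 32; i++) {
  unsigned int mask = 1u << i;  // unsigned literal: 1 << 31 overflows a signed int
  // ptr is a void*, so cast it to a pointer to the word and dereference it
  bool bit_is_one = (*static_cast<const unsigned int*>(ptr) & mask) != 0;
  ...
}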

Stephen 2010-05-28 20:57:20

bakra 2010-05-28 21:17:28

@bakra : Sorry, edited. I'm sure it works if you cast the ptr to an `unsigned int`. Just tried it to be sure.

Stephen 2010-05-28 21:39:44

Answer 4

+3 A:

Just for completeness, machines are described in terms of both byte order and bit order.

The Intel x86 is called Consistent Little Endian because it stores multi-byte values in LSB to MSB order as memory address increases. Its bit numbering convention is b0 = 2^0 and b31 = 2^31.

The Motorola 68000 is called Inconsistent Big Endian because it stores multi-byte values in MSB to LSB order as memory address increases. Its bit numbering convention is b0 = 2^0 and b31 = 2^31 (same as Intel, which is why it is called 'Inconsistent' Big Endian).

The 32-bit IBM/Motorola PowerPC is called Consistent Big Endian because it stores multi-byte values in MSB to LSB order as memory address increases. Its bit numbering convention is b0 = 2^31 and b31 = 2^0.

Under normal high level language use the bit order is generally transparent to the developer. When writing in assembly language or working with the hardware, the bit numbering does come into play.
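To make the two numbering conventions concrete, here is a minimal sketch for a 32-bit word. The function names `mask_lsb0` and `mask_msb0` are invented for this example; the point is only that "bit n" selects a different mask depending on which convention a manual uses.

```cpp
#include <cassert>
#include <cstdint>

// Mask for bit n when bit 0 is the least significant bit (b0 = 2^0).
std::uint32_t mask_lsb0(unsigned n) {
    assert(n < 32);
    return std::uint32_t{1} << n;
}

// Mask for bit n when bit 0 is the most significant bit (b0 = 2^31).
std::uint32_t mask_msb0(unsigned n) {
    assert(n < 32);
    return std::uint32_t{1} << (31u - n);
}
```

So "bit 0" in one convention is "bit 31" in the other: `mask_msb0(n)` equals `mask_lsb0(31 - n)`.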

Amardeep 2010-05-28 21:47:20

Answer 5

+2 A:

Byte Endianness

On different machines this code may give different results:

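#include <stdio.h>

union endian_example {
   unsigned long u;
   unsigned char a[sizeof(unsigned long)];
} x;

int main(void) {
    size_t i;

    x.u = 0x0a0b0c0d;

    /* Print the bytes of x.u in increasing address order. */
    for (i = 0; i < sizeof(unsigned long); i++) {
        printf("%u\n", (unsigned)x.a[i]);
    }
    return 0;
}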

This is because different machines are free to store values in any byte order they wish. This is fairly arbitrary. There is no backwards or forwards in the grand scheme of things.

Bit Endianness

Usually you don't ever have to worry about bit endianness. The most common way to access individual bits is with shifts ( >>, << ), but those are really tied to values, not bytes or bits. They perform an arithmetic operation on a value. That value is stored in bits (which are in bytes).
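One way to see that shifts act on the value and not on its stored bytes is the sketch below, which splits a 32-bit value into bytes in a fixed (big-endian) order. The helper names are invented for illustration; because only shifts and masks are used, the result does not depend on the byte order of the machine running it.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: most significant byte first, on any host.
std::array<std::uint8_t, 4> to_big_endian(std::uint32_t v) {
    return {{
        static_cast<std::uint8_t>(v >> 24),
        static_cast<std::uint8_t>(v >> 16),
        static_cast<std::uint8_t>(v >> 8),
        static_cast<std::uint8_t>(v)
    }};
}

// Inverse of to_big_endian: rebuild the value from its four bytes.
std::uint32_t from_big_endian(const std::array<std::uint8_t, 4>& b) {
    return (std::uint32_t{b[0]} << 24) | (std::uint32_t{b[1]} << 16) |
           (std::uint32_t{b[2]} << 8) | std::uint32_t{b[3]};
}

// Sanity check that should hold regardless of host byte order.
void check_round_trip(std::uint32_t v) {
    assert(from_big_endian(to_big_endian(v)) == v);
}
```

With the value from the union example, `to_big_endian(0x0a0b0c0d)` gives the bytes `0a 0b 0c 0d` in that order on every machine, whereas the union shows whatever order the hardware happens to use.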

Where you may run into a problem in C with bit endianness is if you ever use a bit field. This is a rarely used (for this reason and a few others) "feature" of C that allows you to tell the compiler how many bits a member of a struct will use.

struct thing {
     unsigned y:1; // y will be one bit and can have the values 0 and 1
     signed z:1; // z can only have the values 0 and -1
     unsigned a:2; // a can be 0, 1, 2, or 3
     unsigned b:4; // b is just here to take up the rest of the byte
};

Here the bit endianness is compiler dependent. Should y be the most or least significant bit in a thing? Who knows? If you care about the bit ordering (describing things like the layout of an IPv4 packet header, the control registers of a device, or just a storage format in a file) then you probably don't want to worry about some different compiler doing this the wrong way. Also, compilers aren't always as smart about how they work with bit fields as one would hope.
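When the layout has to be exact, a common alternative is to pack the fields by hand with shifts and masks. The sketch below does this for the first byte of an IPv4 header, where the version occupies the high four bits and the header length (IHL) the low four. The function names are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: build the first IPv4 header byte.
// Version goes in the high nibble, IHL (in 32-bit words) in the low nibble.
std::uint8_t pack_version_ihl(unsigned version, unsigned ihl) {
    assert(version < 16 && ihl < 16);  // each field is 4 bits wide
    return static_cast<std::uint8_t>((version << 4) | ihl);
}

// Extract the fields again; the positions are spelled out explicitly,
// so no compiler gets to choose them.
unsigned unpack_version(std::uint8_t byte) { return byte >> 4; }
unsigned unpack_ihl(std::uint8_t byte) { return byte & 0x0Fu; }
```

A typical header without options has version 4 and IHL 5, so `pack_version_ihl(4, 5)` gives `0x45` under any compiler.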

nategoose 2010-05-28 22:21:04
