I want to read 4 bytes which are a little-endian encoding of an unsigned 32-bit integer, and assign the value to a Java int.

(Yes, in reality I will use a 'long', but in this case I 'know' that the unsigned value is never so big that it will overflow a signed int in two's complement notation, and it suits my illustration to use an int).

The 4 bytes in question encode the value '216' little-endian style, so they arrive least-significant byte first:

0xD8000000

And basically I just need to stuff the following bit pattern into a Java int:

0x000000D8

The following simple code ought to do it ... and for the first three '0x00' bytes it succeeds:

byte b1 = din.readByte();
byte b2 = din.readByte();
byte b3 = din.readByte();
byte b4 = din.readByte();
int s = 0;
s = s | b4;
s = (s << 8);
s = s | b3;
s = (s << 8);
s = s | b2;
s = (s << 8);
s = s | b1;
return s;

However, it screws up on:

s = s | b1;

... because the bits of b1 are 1101 1000, which is a negative number (-40) in two's complement notation, since the most-significant bit is a 1. When Java widens b1 to an int before evaluating the bitwise OR operator |, it sign-extends the value, so -40 becomes 0xFFFFFFD8. That breaks our naive assumption that the upper 3 bytes of the widened int will be 0's.
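The widening can be reproduced in isolation, without any stream. A minimal sketch (the class and method names are made up for illustration):

```java
final class SignExtensionDemo {
    // Implicit widening from byte to int, the same conversion
    // Java applies to a byte operand of the | operator.
    static int widen(byte b) {
        return b;
    }

    public static void main(String[] args) {
        byte b1 = (byte) 0xD8;  // bit pattern 1101 1000
        int s = 0;
        s = s | b1;             // b1 is sign-extended before the OR
        System.out.println(widen(b1));               // -40
        System.out.println(Integer.toHexString(s));  // ffffffd8
    }
}
```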

So my strategy runs aground. But what should I do instead? Is it even possible to solve this using primitive operators (please give solution), or must we resort to the class library? (I don't play around with bits and bytes directly very much in my normal coding, so I lack the idiom for what seems like it ought to be 'everyday' code).

As for the class library approach, the following fragment gets the right result:

ByteBuffer b = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).put((byte) 0xD8).put((byte) 0x00).put((byte) 0x00).put((byte) 0x00);
b.flip();
int s = b.getInt();

... which is fine for readability, but uses 8 method invocations which I'd rather dispense with.
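If the four bytes are already sitting in an array, the call count can probably be trimmed by wrapping the array instead of filling a fresh buffer. A rough sketch, assuming a hypothetical helper named littleEndianInt and an input array of at least four bytes:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class WrapDemo {
    // Interprets the first four bytes of raw as a little-endian int.
    // ByteBuffer.wrap starts at position 0, so no flip() is needed.
    static int littleEndianInt(byte[] raw) {
        return ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0xD8, 0x00, 0x00, 0x00};
        System.out.println(littleEndianInt(raw)); // 216
    }
}
```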

thanks! David.

+3  A: 

Just mask with & 0xff at each byte-to-int promotion, to make sure the top 24 bits of the promoted value are 0:

byte b1 = din.readByte();
byte b2 = din.readByte();
byte b3 = din.readByte();
byte b4 = din.readByte();
int s = 0;
s = s | (b4 & 0xff);
s = (s << 8);
s = s | (b3 & 0xff);
s = (s << 8);
s = s | (b2 & 0xff);
s = (s << 8);
s = s | (b1 & 0xff);
return s;

Or more compactly:

byte b1 = din.readByte();
byte b2 = din.readByte();
byte b3 = din.readByte();
byte b4 = din.readByte();
return ((b4 & 0xff) << 24)
     | ((b3 & 0xff) << 16)
     | ((b2 & 0xff) << 8)
     | ((b1 & 0xff) << 0);

(Obviously the "shift left 0" is unnecessary, but it keeps the four lines consistent.)
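For completeness, here is a self-contained sketch that packages the masking approach as a method and can be run against an in-memory stream. The class and method names are hypothetical. It also shows two related options that are assumptions on my part rather than something the question asked for: letting readInt() assemble the bytes big-endian and then swapping them with Integer.reverseBytes, and widening the result to a long when the full unsigned range matters.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

final class LittleEndianReads {
    // Mask each byte as it is promoted to int, then shift into place.
    static int readIntLE(DataInputStream din) throws IOException {
        int b1 = din.readByte() & 0xff;
        int b2 = din.readByte() & 0xff;
        int b3 = din.readByte() & 0xff;
        int b4 = din.readByte() & 0xff;
        return (b4 << 24) | (b3 << 16) | (b2 << 8) | b1;
    }

    // Alternative: readInt() reads big-endian, reverseBytes swaps the order.
    static int readIntLEViaReverse(DataInputStream din) throws IOException {
        return Integer.reverseBytes(din.readInt());
    }

    // Widen to long without sign extension (note the L on the mask).
    static long toUnsigned(int value) {
        return value & 0xFFFFFFFFL;
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = {(byte) 0xD8, 0x00, 0x00, 0x00};
        DataInputStream din = new DataInputStream(new ByteArrayInputStream(raw));
        System.out.println(readIntLE(din)); // 216
    }
}
```

The mask used for the long must be 0xFFFFFFFFL; without the L suffix it is the int value -1, and the AND would leave the sign extension in place.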

Jon Skeet
Gotta watch for that sign-extension.
dty
Ah, I'd seen that sort of thing during my hunting without taking the time to execute the code in my head. But now that you say it, it makes perfect sense and I won't be intimidated by that terse bit packing in future! Thanks.
David Bullock