Is there an easy way to read these integers in? I'd prefer a built-in method, but I assume it is also possible with some bit operations.
Cheers

edit
I thought of another way to do it that differs from the answers below and is, in my opinion, clearer. It pads with a zero byte at the other (least significant) end, then shifts the result right by 8 bits. No `if` is required, because the right shift fills with whatever the MSB was initially.

struct.unpack('<i', '\0' + bytes)[0] >> 8
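
For anyone on Python 3, here is a minimal sketch of the same pad-and-shift idea. It assumes little-endian input and that the three bytes arrive as a `bytes` object; the function name `read_int24_le` is made up for illustration.

```python
import struct

def read_int24_le(raw):
    """Decode 3 little-endian bytes as a signed 24-bit integer (hypothetical helper)."""
    if len(raw) != 3:
        raise ValueError("expected exactly 3 bytes")
    # Pad at the least significant end so the 24-bit value occupies the top
    # three bytes of a signed 32-bit int, then shift it back down by 8 bits.
    # Python's >> on ints is an arithmetic shift, so the sign is preserved.
    return struct.unpack('<i', b'\x00' + raw)[0] >> 8
```

With this, `read_int24_le(b'\xff\xff\xff')` gives -1 and `read_int24_le(b'\xff\xff\x7f')` gives 8388607.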
+4  A: 

Python's struct module lets you interpret bytes as different kinds of data structure, with control over endianness.

If you read a single three-byte number from the file, you can convert it thus:

struct.unpack('<I', bytes + '\0')[0]

The module doesn't appear to support 24-bit words, hence the '\0'-padding.

EDIT: Signed numbers are trickier. You can sign-extend by choosing the padding byte according to the high bit of the most significant byte:

struct.unpack('<i', bytes + ('\0' if bytes[2] < '\x80' else '\xff'))[0]
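
A rough Python 3 version of the sign-byte padding, as a sketch only. It assumes little-endian input held in a `bytes` object; `unpack_int24_le` is a hypothetical name. The main difference is that indexing `bytes` yields an int in Python 3, so the comparison is against 0x80 rather than '\x80'.

```python
import struct

def unpack_int24_le(raw):
    """Sign-extend 3 little-endian bytes to 32 bits, then unpack (hypothetical helper)."""
    # raw[2] is the most significant byte; its top bit is the sign bit.
    # In Python 3, raw[2] is an int, so compare it with the integer 0x80.
    pad = b'\x00' if raw[2] < 0x80 else b'\xff'
    return struct.unpack('<i', raw + pad)[0]
```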
Marcelo Cantos
Thanks, but I forgot to say that they are signed, so padding the most significant bits will not work, as far as my limited understanding of two's complement goes.
jolly swagman
@jolly: I've amended my answer accordingly.
Marcelo Cantos
@Marcelo: For improved clarity and speed, try `.... if bytes[2] < '\x80' else ....`
John Machin
Awesome, thanks a lot.
jolly swagman
@John: Silly me. Thanks for the improvement.
Marcelo Cantos
+2  A: 

Are your 24-bit integers signed or unsigned? Big-endian or little-endian?

struct.unpack('<I', bytes + '\x00')[0] # unsigned little-endian
struct.unpack('>I', '\x00' + bytes)[0] # unsigned big-endian

Signed is a little more complicated ... get the unsigned value as above, then do this:

signed = unsigned if not (unsigned & 0x800000) else unsigned - 0x1000000
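
Putting the two steps together, a sketch for Python 3, assuming the three bytes are a `bytes` object; the helper name `int24_from_bytes` and its `byteorder` parameter are invented here. The last lines use `int.from_bytes`, a Python 3 built-in that is a different route to the same value, not the `struct` method above.

```python
import struct

def int24_from_bytes(raw, byteorder='little'):
    """Signed 24-bit int from 3 bytes: unsigned unpack plus fix-up (hypothetical helper)."""
    if byteorder == 'little':
        unsigned = struct.unpack('<I', raw + b'\x00')[0]
    else:
        unsigned = struct.unpack('>I', b'\x00' + raw)[0]
    # Bit 23 is the sign bit; if it is set, subtract 2**24 to get the
    # negative two's-complement value.
    return unsigned - 0x1000000 if unsigned & 0x800000 else unsigned

# Python 3 built-in alternative that handles 3-byte input directly.
sample = b'\xfe\xff\xff'
builtin = int.from_bytes(sample, byteorder='little', signed=True)
```

Here both `int24_from_bytes(sample)` and `builtin` come out as -2.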
John Machin
Sorry mate, didn't see you there below the fold. Apparently I don't have enough reputation to upvote but thanks for the effort!
jolly swagman
+2  A: 

If you don't mind using an external library then my bitstring module could be helpful here.

from bitstring import Bits
s = Bits(filename='some_file')
a = s.read('uintle:24')

This reads in the first 24 bits and interprets them as an unsigned little-endian integer. After the read, s.pos is set to 24 (the bit position in the stream), so you can then read more. For example, to get a list of the next 10 signed integers you could use

l = s.readlist('10*intle:24')

or if you prefer you could just use slices and properties and not bother with reads:

a = s[0:24].uintle

Another alternative, if you already have the 3 bytes of data from your file, is just to create and interpret:

a = Bits(bytes=b'abc').uintle
Scott Griffiths
@Scott: I'd prefer not to use an external library for this particular project, but I'll check it out nonetheless. What is performance like for something like this? They will potentially be coming in at around 3Mb/s, so 130,000 of them every second. Frankly the sample rate is far higher than necessary so I can just discard the majority of them, but if I don't will this library manage?
jolly swagman
@jolly: If performance is a concern then you should stick with the `struct` method. Bitstring is (for now) pure Python so it won't beat this. It's reasonably efficient but the emphasis has been on making bitwise tasks as easy as possible, not as fast as possible - at least not yet :)
Scott Griffiths