I have read samples from a WAV file using the wave module, but it returns the samples as a string. Because the data comes from a WAV file, it is little-endian (for example, '`\x00').

What is the easiest way to convert this into a Python integer or a numpy.int16? (It will eventually become a numpy.int16, so going directly there is fine.)

The code needs to work on both little-endian and big-endian processors.

+8  A: 

The struct module converts packed data to Python values, and vice-versa.

>>> import struct
>>> struct.unpack("<h", "\x00\x05")
(1280,)
>>> struct.unpack("<h", "\x00\x06")
(1536,)
>>> struct.unpack("<h", "\x01\x06")
(1537,)

"h" means a signed short int, i.e. a 16-bit int. "<" means little-endian byte order, regardless of the processor the code runs on.
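
If the string holds more than one sample, the same format can carry a repeat count. The following is only a sketch: the helper name `unpack_samples` is made up for illustration, and it assumes the input contains whole 16-bit samples (struct raises `struct.error` otherwise).

```python
import struct

def unpack_samples(raw):
    """Decode a byte string of little-endian signed 16-bit samples.

    Hypothetical helper for illustration; assumes len(raw) is even.
    """
    count = len(raw) // 2
    # "<%dh" expands to e.g. "<4h": four little-endian signed shorts.
    return struct.unpack("<%dh" % count, raw)

samples = unpack_samples(b"\x00\x05\x00\x06\x01\x06\xff\xff")
print(samples)  # (1280, 1536, 1537, -1)
```

Because the byte order is spelled out in the format string, the result should be the same on little-endian and big-endian hosts.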

Ned Batchelder
+4  A: 

struct is fine if you have to convert one or a small number of 2-byte strings to integers, but array and numpy itself are better options for larger amounts of data. Specifically, numpy.fromstring (called with the appropriate dtype argument) converts the bytes of your string directly to an array of that dtype; in current NumPy, numpy.frombuffer is the replacement, since the binary mode of fromstring is deprecated. If numpy.little_endian is false, you then have to swap the bytes by calling the byteswap method on the array object you just built.
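
A minimal sketch of that approach, assuming a current NumPy where `numpy.frombuffer` stands in for the binary mode of `numpy.fromstring`; the sample bytes are made up for illustration:

```python
import numpy as np

# Three little-endian 16-bit samples (illustrative data).
raw = b"\x00\x05\x00\x06\x01\x06"

# Interpret the bytes in the host's native byte order first.
samples = np.frombuffer(raw, dtype=np.int16)

# On a big-endian host the values come out byte-swapped, so swap them back.
if not np.little_endian:
    samples = samples.byteswap()

print(samples.tolist())  # [1280, 1536, 1537]
```

Note that `frombuffer` returns a read-only view of the original bytes; call `.copy()` on the result if the array needs to be modified.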

Alex Martelli
This is really good to know as well. I'm going to go with the struct solution, though, so that I don't need to worry about correcting endianness manually.
Jeffrey Aylesworth
If you're in no hurry, or have very few data points, that's fine. Otherwise, you _can_ encode the endianness as a string as part of the `dtype` (I don't know the details offhand, though).
Alex Martelli
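
For reference, a sketch of what putting the endianness into the `dtype` can look like. NumPy dtype strings accept a byte-order prefix, so `"<i2"` describes a little-endian signed 2-byte integer; the sample bytes below are made up for illustration:

```python
import numpy as np

# Three little-endian 16-bit samples (illustrative data).
raw = b"\x00\x05\x00\x06\xff\xff"

# "<" = little-endian, "i" = signed integer, "2" = two bytes wide.
samples = np.frombuffer(raw, dtype="<i2")
print(samples.tolist())  # [1280, 1536, -1]

# Convert to the host's native int16; this copies the data and
# swaps the bytes only if the host is big-endian.
native = samples.astype(np.int16)
```

With the byte order stated in the `dtype`, no manual `byteswap` call should be needed on either kind of processor.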