ansaurus

Question

Answer 1

A:

You could cobble together an ASM-based solution using CorePy. I wonder, though, whether you might gain enough performance from some other part of your algorithm. I/O and manipulations on 1 GB chunks of data are going to take a while whichever way you slice it.

One other thing you might find helpful would be to switch to C once you have prototyped the algorithm in Python. I did this once for manipulations on a whole-world DEM (height) data set. The whole thing was much more tolerable once I got away from the interpreted script.

Ewan Todd 2009-10-27 18:30:37

Answer 2

A:

I'd expect something like this to be faster:

arrayName[0] = unpack('>'+'f'*line_count*sample_count, map.read(arrayName.itemsize*line_count*sample_count))

Please don't use map as a variable name; it shadows the built-in map function.
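As a rough, self-contained sketch of the same idea: the helper below reads a known number of big-endian 32-bit floats with a single struct.unpack call. The function name read_big_endian_floats, the variable names, and the temporary demo file are all made up for illustration; they are not from the question.

```python
import os
import struct
import tempfile


def read_big_endian_floats(path, count):
    """Read `count` big-endian 32-bit floats from the start of `path`."""
    fmt = ">%df" % count  # '>' = big-endian, 'f' = 4-byte float
    with open(path, "rb") as data_file:  # not named `map`
        raw = data_file.read(struct.calcsize(fmt))
    return struct.unpack(fmt, raw)


# Demo: write three big-endian floats to a temporary file, then read them back.
sample = (1.0, -2.5, 3.25)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(struct.pack(">3f", *sample))
    demo_path = tmp.name

values = read_big_endian_floats(demo_path, len(sample))
os.remove(demo_path)
```

Writing the format as a repeat count ('>3f') rather than 'f' repeated N times keeps the format string short for large arrays.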

gnibbler 2009-10-27 18:51:14

Answer 3

+4 A:

with open(fileName, "rb") as f:
  arrayName = numpy.fromfile(f, numpy.float32)
arrayName.byteswap(True)

Pretty hard to beat for speed AND conciseness ;-). See the NumPy documentation for byteswap (the True argument means "do it in place") and for fromfile.

This works as is on little-endian machines (since the data are big-endian, the byteswap is needed). To do the byteswap only when the machine really is little-endian, change the last line from an unconditional call to byteswap into, for example:

if struct.pack('=f', 2.3) == struct.pack('<f', 2.3):
  arrayName.byteswap(True)

i.e., a call to byteswap conditional on a test of little-endianness.
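For comparison, here is a small standard-library sketch of the same conditional swap. It is not the NumPy code above: it swaps in sys.byteorder for the struct.pack comparison and array.array for the NumPy array, and it assumes the platform's 'f' type code is 4 bytes wide (true on common platforms). The variable names are illustrative only.

```python
import array
import struct
import sys

# Three floats encoded big-endian, standing in for the file contents.
raw = struct.pack(">3f", 1.0, -2.5, 3.25)

values = array.array("f")  # native-order 32-bit floats (assumed 4 bytes each)
values.frombytes(raw)      # bytes are taken as-is, i.e. still big-endian

# Swap only on little-endian hosts; a big-endian host already reads them correctly.
if sys.byteorder == "little":
    values.byteswap()
```

sys.byteorder gives the same answer as the struct.pack test, just more directly.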

Alex Martelli 2009-10-27 20:19:49

that is remarkably straightforward. thank you! what's weird is i had seen those when trying to figure out how to do this, but it just didn't register for some reason. comes with experience i suppose =)

Rayjan 2009-10-27 20:39:08

numpy.float32 has native byte order, which might not always be big-endian. http://stackoverflow.com/questions/1632673/python-file-slurp-w-endian-conversion/1633525#1633525

J.F. Sebastian 2009-10-27 20:44:00

Indeed it will mostly be little-endian, but if you're running e.g. on a PowerPC machine it will be big-endian (if that's an issue just conditionally omit the byteswap call -- let me edit the answer to add that bit).

Alex Martelli 2009-10-27 21:38:57

Answer 4

+3 A:

Slightly modified @Alex Martelli's answer:

arr = numpy.fromfile(filename, numpy.dtype('>f4'))
# no byteswap is needed regardless of endianness of the machine
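A minimal round-trip sketch of this approach, assuming NumPy is installed. The temporary file and variable names are invented for the demo. It also shows one optional follow-up that is not part of the answer above: the resulting array keeps the explicit big-endian dtype, so astype can be used to get a native-order copy if later code expects one.

```python
import os
import struct
import tempfile

import numpy

# Write three big-endian 32-bit floats to a temporary file (demo data only).
sample = [1.0, -2.5, 3.25]
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(struct.pack(">3f", *sample))
    demo_path = tmp.name

# '>f4' tells NumPy the on-disk layout, so values are correct on any host.
big_endian = numpy.fromfile(demo_path, dtype=numpy.dtype(">f4"))
os.remove(demo_path)

# Optional: copy into the machine's native byte order.
native = big_endian.astype(numpy.float32)
```

Whether the native-order copy is worth making depends on what the array is used for afterwards; for many operations the big-endian array can be used directly.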

J.F. Sebastian 2009-10-27 20:42:17


Python File Slurp w/ endian conversion
