I am new to PyTables, and am looking at using it to process data generated from an agent-based modeling simulation and stored in HDF5. I'm working with a 39 MB test file, and am experiencing some strangeness. Here's the layout of the table:
/example/agt_coords (Table(2000000,)) ''
description := {
"agent": Int32Col(shape=(), dflt=0, pos=0),
"x": Float64Col(shape=(), dflt=0.0, pos=1),
"y": Float64Col(shape=(), dflt=0.0, pos=2)}
byteorder := 'little'
chunkshape := (20000,)
Here's how I'm accessing it in Python:
from tables import *
>>> h5file = openFile("alternate_hose_test.h5", "a")
h5file.root.example.agt_coords
/example/agt_coords (Table(2000000,)) ''
description := {
"agent": Int32Col(shape=(), dflt=0, pos=0),
"x": Float64Col(shape=(), dflt=0.0, pos=1),
"y": Float64Col(shape=(), dflt=0.0, pos=2)}
byteorder := 'little'
chunkshape := (20000,)
>>> coords = h5file.root.example.agt_coords
Now here's where things get weird.
[x for x in coords[1:100] if x['agent'] == 1]
[(1, 25.0, 78.0), (1, 25.0, 78.0)]
>>> [x for x in coords if x['agent'] == 1]
[(1000000, 25.0, 78.0), (1000000, 25.0, 78.0)]
>>> [x for x in coords.iterrows() if x['agent'] == 1]
[(1000000, 25.0, 78.0), (1000000, 25.0, 78.0)]
>>> [x['agent'] for x in coords[1:100] if x['agent'] == 1]
[1, 1]
>>> [x['agent'] for x in coords if x['agent'] == 1]
[1, 1]
I don't understand why the values are screwed up when I iterate over the whole table, but not when I take a small subset of the whole set of rows. I'm sure this is an error in how I'm using the library, so any help in this matter would be extremely appreciated.