I need to process files whose data segments are separated by a blank line, for example:
93.18 15.21 36.69 33.85 16.41 16.81 29.17
21.69 23.71 26.38 63.70 66.69 0.89 39.91
86.55 56.34 57.80 98.38 0.24 17.19 75.46
[...]
1.30 73.02 56.79 39.28 96.39 18.77 55.03
99.95 28.88 90.90 26.70 62.37 86.58 65.05
25.16 32.61 17.47 4.23 34.82 26.63 57.24
36.72 83.30 97.29 73.31 31.79 80.03 25.71
[...]
2.74 75.92 40.19 54.57 87.41 75.59 22.79
.
.
.
For this I am using the following function. Every call returns the data I need, but I need to speed up the code.
Is there a more efficient way?
EDIT: I will keep updating the code with the changes that improve performance.
ORIGINAL:
import numpy as np

def get_pos_nextvalues(pos_file, indices):
    result = []
    for line in pos_file:
        line = line.strip()
        if not line:
            break  # blank line: end of the current segment
        # every field is converted, although only some are kept
        values = [float(value) for value in line.split()]
        result.append([float(values[i]) for i in indices])
    return np.array(result)
NEW:
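def get_pos_nextvalues(pos_file, indices):
    result = ''
    for line in pos_file:
        if len(line) > 1:
            s = line.split()
            # trailing space keeps consecutive rows from running together
            result += ' '.join([s[i] for i in indices]) + ' '
        else:
            break  # blank line: end of the current segment
    else:
        # the loop ended without meeting a blank line (end of file)
        return np.array([])
    result = np.fromstring(result, dtype=float, sep=' ')
    # integer division, because reshape needs an int row count
    result = result.reshape(result.size // len(indices), len(indices))
    return result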
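The join, parse and reshape steps of the NEW version can be checked in isolation. The following is only a sketch: the two input lines are copied from the sample data above, and the names `lines`, `picked`, `text`, `flat` and `table` are illustrative, not part of the real program. It assumes a space between every token, including across row boundaries, and lets NumPy infer the row count.

```python
import numpy as np

indices = (4, 5, 6)

# Two rows copied from the sample data (illustrative input only).
lines = [
    "93.18 15.21 36.69 33.85 16.41 16.81 29.17\n",
    "21.69 23.71 26.38 63.70 66.69 0.89 39.91\n",
]

# Keep only the wanted fields of each line, still as text.
picked = []
for line in lines:
    fields = line.split()
    picked.extend(fields[i] for i in indices)

# One space between every token, so values from different rows never merge.
text = " ".join(picked)

# Parse all numbers in one call, then fold the flat array back into rows.
flat = np.fromstring(text, dtype=float, sep=" ")
table = flat.reshape(-1, len(indices))  # -1 lets NumPy infer the row count

print(table)
```

Joining the tokens once at the end, instead of growing a string with `+=` on every line, also avoids repeated string copies, although whether that matters here would have to be measured.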
.
pos_file = open(filename, 'r', buffering=1024*10)
[...]
while some_condition:
    vs = get_pos_nextvalues(pos_file, (4,5,6))
    [...]
speedup = 2.36
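For reference, the calling pattern can be tried without a file on disk. This is a minimal sketch under stated assumptions: `read_segment` is a hypothetical stand-in for get_pos_nextvalues that treats end of file like a blank line, `sample` is an in-memory file built from three rows of the example data, and an empty array is assumed to mean there is nothing left to read (it stands in for some_condition).

```python
import io
import numpy as np

def read_segment(pos_file, indices):
    """Read rows up to the next blank line (or EOF), keeping the chosen columns."""
    rows = []
    for line in pos_file:
        fields = line.split()
        if not fields:  # blank line ends the current segment
            break
        rows.append([float(fields[i]) for i in indices])
    return np.array(rows)

# Two segments separated by one blank line (rows taken from the example data).
sample = io.StringIO(
    "93.18 15.21 36.69 33.85 16.41 16.81 29.17\n"
    "21.69 23.71 26.38 63.70 66.69 0.89 39.91\n"
    "\n"
    "1.30 73.02 56.79 39.28 96.39 18.77 55.03\n"
)

segments = []
while True:
    vs = read_segment(sample, (4, 5, 6))
    if vs.size == 0:  # assumed stop condition: nothing left to read
        break
    segments.append(vs)

print([s.shape for s in segments])
```

Each call resumes where the previous one stopped, because the file object keeps its position between calls. Two consecutive blank lines would end this loop early, so the stop condition would need adjusting for files that contain them.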