If you want to write it to disk so that it will be easy to read back in as a NumPy array, look into numpy.save. Pickling it will work fine as well, but it's less efficient for large arrays (which yours isn't, so either is perfectly fine).
If you want it to be human readable, look into numpy.savetxt.
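As a minimal sketch of the numpy.save route (the file name and the temporary directory are just assumptions for illustration), a round trip through the binary .npy format keeps the shape and dtype intact, even for a 3D array:

```python
import os
import tempfile

import numpy as np

# Small 3D test array; np.save handles any number of dimensions
data = np.arange(200).reshape((4, 5, 10))

# Use a temporary directory so the sketch leaves nothing behind
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'test.npy')  # hypothetical file name
    np.save(path, data)       # binary .npy format, stores shape and dtype
    restored = np.load(path)  # comes back with the original shape

print(restored.shape)
print(np.array_equal(restored, data))
```

The trade-off is that the .npy file is not human readable, which is what the rest of this answer is about.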
Edit: So, it seems like savetxt isn't quite as great an option for arrays with more than 2 dimensions... But just to draw everything out to its full conclusion:
I just realized that numpy.savetxt chokes on ndarrays with more than 2 dimensions... This is probably by design, as there's no inherently defined way to indicate additional dimensions in a text file.
E.g. this (a 2D array) works fine:
import numpy as np
x = np.arange(20).reshape((4,5))
np.savetxt('test.txt', x)
While the same thing would fail for a 3D array (with a rather uninformative error: TypeError: float argument required, not numpy.ndarray; the exact exception depends on the NumPy version, and newer versions raise a ValueError instead):
import numpy as np
x = np.arange(200).reshape((4,5,10))
np.savetxt('test.txt', x)
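If you want to confirm the failure on your own install, here is a small sketch. It writes to an in-memory buffer instead of a file, and it catches both TypeError and ValueError, on the assumption that the exception type depends on which NumPy version you have:

```python
import io

import numpy as np

x = np.arange(200).reshape((4, 5, 10))

# In-memory text buffer, so nothing is written to disk
buffer = io.StringIO()

try:
    np.savetxt(buffer, x)
    failed = False
except (TypeError, ValueError) as err:
    # Which of the two is raised depends on the NumPy version
    failed = True
    print(type(err).__name__, err)
```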
One workaround is just to break the 3D (or greater) array into 2D slices. E.g.
x = np.arange(200).reshape((4,5,10))
with open('test.txt', 'w') as outfile:
    for slice_2d in x:
        np.savetxt(outfile, slice_2d)
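A variation on the same idea (my own sketch, not something savetxt does for you): collapse all of the leading axes into one with reshape and make a single savetxt call. Under the default format this should produce exactly the same rows as the slice-by-slice loop. The in-memory buffers are only there so the two outputs can be compared:

```python
import io

import numpy as np

x = np.arange(200).reshape((4, 5, 10))

# Slice-by-slice, as in the workaround above
buf_slices = io.StringIO()
for slice_2d in x:
    np.savetxt(buf_slices, slice_2d)

# Single call: merge every axis except the last into one row axis
flat_2d = x.reshape(-1, x.shape[-1])
buf_flat = io.StringIO()
np.savetxt(buf_flat, flat_2d)

# Both approaches write the same text
same = buf_slices.getvalue() == buf_flat.getvalue()
print(flat_2d.shape, same)
```

The loop is still the better fit if you want to write something between slices, such as a comment line.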
However, our goal is to be clearly human readable, while still being easily read back in with numpy.loadtxt. Therefore, we can be a bit more verbose, and differentiate the slices using commented out lines. By default, numpy.loadtxt will ignore any lines that start with # (or whichever character is specified by the comments kwarg). (This looks more verbose than it actually is...)
import numpy as np

# Generate some test data
data = np.arange(200).reshape((4,5,10))

# Write the array to disk
with open('test.txt', 'w') as outfile:
    # I'm writing a header here just for the sake of readability
    # Any line starting with "#" will be ignored by numpy.loadtxt
    outfile.write('# Array shape: {0}\n'.format(data.shape))

    # Iterating through an n-dimensional array produces slices along
    # the first axis. This is equivalent to data[i,:,:] in this case
    for data_slice in data:
        # The formatting string indicates that I'm writing out
        # the values in left-justified columns 7 characters in width
        # with 2 decimal places.
        np.savetxt(outfile, data_slice, fmt='%-7.2f')

        # Writing out a break to indicate different slices...
        outfile.write('# New slice\n')
This yields:
# Array shape: (4, 5, 10)
0.00 1.00 2.00 3.00 4.00 5.00 6.00 7.00 8.00 9.00
10.00 11.00 12.00 13.00 14.00 15.00 16.00 17.00 18.00 19.00
20.00 21.00 22.00 23.00 24.00 25.00 26.00 27.00 28.00 29.00
30.00 31.00 32.00 33.00 34.00 35.00 36.00 37.00 38.00 39.00
40.00 41.00 42.00 43.00 44.00 45.00 46.00 47.00 48.00 49.00
# New slice
50.00 51.00 52.00 53.00 54.00 55.00 56.00 57.00 58.00 59.00
60.00 61.00 62.00 63.00 64.00 65.00 66.00 67.00 68.00 69.00
70.00 71.00 72.00 73.00 74.00 75.00 76.00 77.00 78.00 79.00
80.00 81.00 82.00 83.00 84.00 85.00 86.00 87.00 88.00 89.00
90.00 91.00 92.00 93.00 94.00 95.00 96.00 97.00 98.00 99.00
# New slice
100.00 101.00 102.00 103.00 104.00 105.00 106.00 107.00 108.00 109.00
110.00 111.00 112.00 113.00 114.00 115.00 116.00 117.00 118.00 119.00
120.00 121.00 122.00 123.00 124.00 125.00 126.00 127.00 128.00 129.00
130.00 131.00 132.00 133.00 134.00 135.00 136.00 137.00 138.00 139.00
140.00 141.00 142.00 143.00 144.00 145.00 146.00 147.00 148.00 149.00
# New slice
150.00 151.00 152.00 153.00 154.00 155.00 156.00 157.00 158.00 159.00
160.00 161.00 162.00 163.00 164.00 165.00 166.00 167.00 168.00 169.00
170.00 171.00 172.00 173.00 174.00 175.00 176.00 177.00 178.00 179.00
180.00 181.00 182.00 183.00 184.00 185.00 186.00 187.00 188.00 189.00
190.00 191.00 192.00 193.00 194.00 195.00 196.00 197.00 198.00 199.00
# New slice
Reading it back in is very easy, as long as we know the shape of the original array. We can just do numpy.loadtxt('test.txt').reshape((4,5,10)). As an example (you can do this in one line; I'm just being verbose to clarify things):
# Read the array from disk
new_data = np.loadtxt('test.txt')
# Note that this returned a 2D array!
print(new_data.shape)
# However, going back to 3D is easy if we know the
# original shape of the array
new_data = new_data.reshape((4,5,10))
# Just to check that they're the same...
assert np.all(new_data == data)
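If you don't want to hard-code the shape, one option is to pull it back out of the "# Array shape: ..." header line. This is only a sketch: it assumes the first line of the file has exactly the format written above, and the parsing with ast.literal_eval is my own addition rather than anything numpy.loadtxt does for you.

```python
import ast
import os
import tempfile

import numpy as np

data = np.arange(200).reshape((4, 5, 10))

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'test.txt')  # hypothetical location

    # Write the file in the same layout as above
    with open(path, 'w') as outfile:
        outfile.write('# Array shape: {0}\n'.format(data.shape))
        for data_slice in data:
            np.savetxt(outfile, data_slice, fmt='%-7.2f')
            outfile.write('# New slice\n')

    # Assumes the first line looks like "# Array shape: (4, 5, 10)"
    with open(path) as infile:
        header = infile.readline()
    shape = ast.literal_eval(header.split(':', 1)[1].strip())

    # loadtxt skips the "#" lines and returns a 2D array of floats
    restored = np.loadtxt(path).reshape(shape)

print(shape, restored.shape)
```

Note that the values come back as floats, since that is how they were written out.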