If you're concerned with keeping the size of the data file as small as possible, here are some suggestions:
- Write the data to a binary file (i.e. using FWRITE) instead of to a text file (i.e. using FPRINTF).
- If your data contains all integer values, convert it to or save it as a signed or unsigned integer type instead of the default double precision type MATLAB uses.
- If your data contains floating point values, but you don't need the range or resolution of the default double precision type, convert it to or save it as a single precision type.
- If your data is sufficiently sparse (i.e. there are many more zeroes than non-zeroes in your matrix), you can use the FIND function to get the row and column indices of the non-zero values along with the values themselves, then save only these to your file.
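Before the file-writing examples, a quick in-memory check of the per-element cost of each type may help. This is only a minimal sketch: the variable names are made up for illustration, and it uses WHOS to report how many bytes each version of the same four values occupies.

```matlab
x = [0 1 2 250];        %# MATLAB creates doubles by default (8 bytes each)
xUint8  = uint8(x);     %# Exact here, since all values are integers in 0..255
xSingle = single(x);    %# 4 bytes each, with less range and resolution

infoDouble = whos('x');
infoSingle = whos('xSingle');
infoUint8  = whos('xUint8');
fprintf('double: %d bytes, single: %d bytes, uint8: %d bytes\n', ...
        infoDouble.bytes, infoSingle.bytes, infoUint8.bytes);

%# Integer conversion rounds and saturates, so check the range of your data first
clipped = uint8([250.6 300 -5]);   %# Gives 251, 255 and 0
```

The same ratios (8 : 4 : 1 bytes per value) carry over to the file when the matching precision string is passed to FWRITE.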
Here are a few examples to illustrate:
data = double(rand(16,2^20) <= 0.00001); %# A large but very sparse matrix
%# Writing the values as type double:
fid = fopen('data_double.dat','w'); %# Open the file
fwrite(fid,size(data),'uint32'); %# Write the matrix size (2 values)
fwrite(fid,data,'double'); %# Write the data as type double
fclose(fid); %# Close the file
%# Writing the values as type uint8:
fid = fopen('data_uint8.dat','w'); %# Open the file
fwrite(fid,size(data),'uint32'); %# Write the matrix size (2 values)
fwrite(fid,data,'uint8'); %# Write the data as type uint8
fclose(fid); %# Close the file
%# Writing out only the non-zero values:
[rowIndex,columnIndex,values] = find(data); %# Get the row and column indices
%# and the non-zero values
fid = fopen('data_sparse.dat','w'); %# Open the file
fwrite(fid,numel(values),'uint32'); %# Write the length of the vectors (1 value)
fwrite(fid,rowIndex,'uint32'); %# Write the row indices
fwrite(fid,columnIndex,'uint32'); %# Write the column indices
fwrite(fid,values,'uint8'); %# Write the non-zero values
fclose(fid); %# Close the file
The files created above will differ drastically in size. The file 'data_double.dat' will be about 131,073 KB, 'data_uint8.dat' will be about 16,385 KB, and 'data_sparse.dat' will be less than 2 KB.
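The arithmetic behind those sizes can be sketched as follows. The count of non-zero values is an assumption: the matrix is random, so 168 is just the average you would expect from 16*2^20 elements at a probability of 0.00001, and your actual count will vary a little.

```matlab
%# Expected file sizes (header + payload) in bytes for a 16-by-2^20 matrix
nElements   = 16 * 2^20;                  %# 16,777,216 values
headerBytes = 2 * 4;                      %# Two uint32 values for the matrix size
bytesDouble = headerBytes + nElements*8;  %# 8 bytes per double
bytesUint8  = headerBytes + nElements*1;  %# 1 byte per uint8

%# Sparse layout: one uint32 count, then 4 + 4 + 1 bytes per non-zero value
nNonZero    = 168;                        %# ASSUMED count (about nElements*0.00001)
bytesSparse = 4 + nNonZero*(4 + 4 + 1);

fprintf('double: %.1f KB\n', bytesDouble/1024);
fprintf('uint8:  %.1f KB\n', bytesUint8/1024);
fprintf('sparse: %.1f KB\n', bytesSparse/1024);
```

The sizes quoted above are these values rounded up to the next whole KB, which is how a file browser typically reports them.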
Note that I also wrote the data/vector sizes to the files so that the data can be read back in (using FREAD) and reshaped properly. Note also that if I did not supply a 'double' or 'uint8' argument to FWRITE, it would fall back on its default precision, which is 'uint8', and write each data value using 8 bits. That works here only because the values are all 0 and 1; it is not MATLAB inspecting the data, so values that don't fit in an unsigned 8-bit integer would not be stored exactly.
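For completeness, here is a minimal sketch of reading the sparse layout back in with FREAD. It writes a tiny 4-by-6 matrix to a temporary file first so that it runs on its own; the file name and the variable `restored` are made up for illustration. One assumption to flag: the sparse layout above stores only the vector length, not the matrix dimensions, so the reader has to know them some other way (here they are hard-coded).

```matlab
%# Write a small matrix using the same layout as 'data_sparse.dat'
data = zeros(4,6);
data(2,3) = 1;
data(4,5) = 1;
[rowIndex,columnIndex,values] = find(data);
fileName = [tempname '.dat'];              %# Hypothetical temporary file
fid = fopen(fileName,'w');
fwrite(fid,numel(values),'uint32');        %# Length of the vectors
fwrite(fid,rowIndex,'uint32');
fwrite(fid,columnIndex,'uint32');
fwrite(fid,values,'uint8');
fclose(fid);

%# Read everything back in the same order it was written
fid = fopen(fileName,'r');
nValues     = fread(fid,1,'uint32');       %# FREAD returns doubles by default
rowIndex    = fread(fid,nValues,'uint32');
columnIndex = fread(fid,nValues,'uint32');
values      = fread(fid,nValues,'uint8');
fclose(fid);
delete(fileName);                          %# Clean up the temporary file

%# Rebuild the full matrix; the 4-by-6 size is assumed to be known
restored = full(sparse(rowIndex,columnIndex,values,4,6));
```

For the dense files the same idea applies: read the two size values with `fread(fid,2,'uint32')`, then read the data with the matching precision and RESHAPE it to that size.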