Hi, I'm printing a variable, say z1, which is a 1-D array of floating-point numbers, to a text file so that I can import it into Matlab or GNUPlot for plotting. I've heard that binary files (.dat) are smaller than .txt files. The definition that I currently use for printing to a .txt file is:

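#include <share.h>   /* _SH_DENYWR (MSVC-specific) */
#include <stdio.h>
#include <stdlib.h>

/* Write z_size values from z1 to a text file, one per line. */
void create_out_file(const char *file_name, const long double *z1, size_t z_size){
    FILE *out;
    size_t i;
    if((out = _fsopen(file_name, "w+", _SH_DENYWR)) == NULL){
        fprintf(stderr, "***> Open error on output file %s\n", file_name);
        exit(EXIT_FAILURE);
    }
    for(i = 0; i < z_size; i++)
        fprintf(out, "%.16Le\n", z1[i]);
    fclose(out);
}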

I have three questions:

  1. Are binary files really more compact than text files?

  2. If yes, I would like to know how to modify the above code so that I can write the values of the array z1 to a binary file. I've read that fprintf has to be replaced with fwrite. My output file, say dodo.dat, should contain the values of array z1 with one floating-point number per line.

  3. I have %.16Le in my code, but I think that %.15Le is right, as I have 15 digits of precision with long double. I have put a dot (.) in the width position because I believe this lets the field expand to whatever width is needed to hold the number. Am I right? As an example, with %.16Le I can get an output like 1.0047914240730432e-002, which gives me 16 digits of precision in a field wide enough to display the number correctly. Is placing a dot (.) in the width position instead of a width value good practice?

Thanks a lot...

UPDATE: Is changing the loop to:

for(i = 0; i < z_size; i++)
    fwrite(&z1, sizeof(long double), 1, out);

OK, in addition to changing the mode to "wb+"? I can't read the binary file in Matlab.

A: 
  1. Yes, binary files are more compact, but you lose portability and there are various other potential problems too. Unless your data files are problematically huge, or slow to export/import, it's a good idea to stick with text if you can (you can always compress them for storage, e.g. with zip).

  2. Open your file with "wb" instead of "w" and use fwrite(). You no longer have "lines" in your file; it will just be a stream of (binary) floating-point values.

  3. You may be getting confused between double and long double. A long double can be up to 16 bytes in size and have a precision of up to around 32 digits (however, this is implementation-dependent: long double can commonly be 10, 12 or 16 bytes). A double is usually 8 bytes and has a precision of around 16 digits.
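As a rough illustration of points 1 and 3, the sketch below compares the bytes one value takes in each kind of file and reports what the local compiler says about its floating-point types. The helper names (text_bytes_per_value, binary_bytes_per_value, print_float_limits) are made up for this example, and it assumes a C99 library (snprintf and %zu), which older Microsoft compilers may not provide. Note that in a printf format the ".16" is a precision, not a width: with "e" it asks for 16 digits after the decimal point, and because no width is given the field is simply as wide as the number needs.

```c
#include <float.h>
#include <stdio.h>

/* Bytes one value occupies in the text file: the "%.16e" field plus '\n'.
   snprintf with a NULL buffer and size 0 only reports the length (C99). */
size_t text_bytes_per_value(double x)
{
    int n = snprintf(NULL, 0, "%.16e\n", x);
    return n < 0 ? 0 : (size_t)n;
}

/* Bytes one value occupies in a raw binary file of doubles. */
size_t binary_bytes_per_value(void)
{
    return sizeof(double);
}

/* Report the size and decimal precision of double and long double
   as defined by this compiler; the numbers vary between platforms. */
void print_float_limits(void)
{
    printf("sizeof(double)      = %zu, DBL_DIG  = %d\n",
           sizeof(double), DBL_DIG);
    printf("sizeof(long double) = %zu, LDBL_DIG = %d\n",
           sizeof(long double), LDBL_DIG);
}
```

Calling print_float_limits() on the target machine shows whether long double is really wider than double there, and comparing the two byte counts gives a feel for how much smaller an uncompressed binary file would be.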

MATLAB may not be able to cope with long double (as it is not well standardized) so you probably just want to write doubles to your data file, e.g.

for (i = 0; i < z_size; i++)
{
    double z = (double)z1[i];
    fwrite(&z, sizeof(double), 1, out);
}
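The loop above could be wrapped into a complete function along the following lines. This is only a sketch: the names write_doubles and read_doubles are invented for the example, it uses standard fopen rather than the MSVC-specific _fsopen from the question, and the file holds raw doubles in the machine's native byte order with no header.

```c
#include <stdio.h>

/* Write n values from z1 to file_name as raw doubles.
   Returns 0 on success, -1 on any open, write or close failure. */
int write_doubles(const char *file_name, const long double *z1, size_t n)
{
    FILE *out = fopen(file_name, "wb");   /* "b": no newline translation */
    size_t i;

    if (out == NULL)
        return -1;
    for (i = 0; i < n; i++) {
        double z = (double)z1[i];         /* narrow each element to double */
        if (fwrite(&z, sizeof z, 1, out) != 1) {
            fclose(out);
            return -1;
        }
    }
    return fclose(out) == 0 ? 0 : -1;
}

/* Read up to n doubles from file_name into buf.
   Returns the number of values actually read (0 if the file cannot be opened). */
size_t read_doubles(const char *file_name, double *buf, size_t n)
{
    FILE *in = fopen(file_name, "rb");
    size_t got;

    if (in == NULL)
        return 0;
    got = fread(buf, sizeof *buf, n, in);
    fclose(in);
    return got;
}
```

Reading the file back with read_doubles and comparing against the original array is a quick sanity check before trying to import the file elsewhere.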
Paul R
Thanks. Is my new implementation ok?
yCalleecharan
@yCalleecharan: not quite - you missed out the index of z1, and you probably want to write a file of `double` rather than `long double` - see edit above.
Paul R
Thanks a lot. It's working fine now. I've tested it in Matlab using `format long; fid = fopen('vz3.dat', 'r'); mydata = fread(fid, 'double')` and it seems fine. I defined z1 as long double, but when you convert it to double here, you're casting it? Both double and long double have 15 digits of precision on my machine.
yCalleecharan
@yCalleecharan: I'd be very surprised if `long double` had the same precision as `double` on your system - it would make it kind of pointless - but maybe it's an old or non-standard compiler like MSVC. Anyway, glad to hear it's all working now. Good luck!
Paul R
I'm using MVS2008. The parameters DBL_DIG and LDBL_DIG are both 15 for my machine. I can try with Borland though. In any case, eventually I'm importing the result in Matlab and I will have to live with the fact that Matlab has some issue with long double.
yCalleecharan
With Microsoft's compiler, `long double` has the same 64-bit representation as `double`, which would explain why both report 15 digits; an 80-bit extended format, as some other compilers use, would give about 18. At least you have something that works now. Very few numerical applications require more than double precision, except for intermediate values - double should be fine for most input and output data.
Paul R
Thanks for the info. I've tested Borland C++ BuilderX. Smallest long double LDBL_MIN : 3.362103e-4932, Largest long double LDBL_MAX : 1.189731e+4932 and Precision for long double LDBL_DIG : 18.
yCalleecharan