How to convert a floating point number into a sequence of bytes so that it can be persisted in a file? Such algorithm must be fast and highly portable. It must allow also the opposite operation, deserialization. It would be nice if only very tiny excess of bits per value (persistent space) is required.
Take a look at Google Protocol Buffers. Granted it's C++, but you can get some ideas from there. The code is BSD licensed, so you can do whatever with it :)
What level of portability do you require? If the file is to be read on a computer with the same OS that it was generated on, than you using a binary file and just saving and restoring the bit pattern should work. Otherwise as boytheo said, ASCII is your friend.
Converting to an ascii representation would be the simplest, but if you need to deal with a colossal number of floats, then of course you should go binary. But this can be a tricky issue if you care about portability. Floating point numbers are represented differently in different machines.
If you don't want to use a canned library, then your float-binary serializer/deserializer will simply have to have "a contract" on where each bit lands and what it represents.
Here's a fun website to help with that: link.
Assuming you're using mainstream compilers, floating point values in C and C++ obey the IEEE standard and when written in binary form to a file can be recovered in any other platform, provided that you write and read using the same byte endianess. So my suggestion is: pick an endianess of choice, and before writing or after reading, check if that endianess is the same as in the current platform; if not, just swap the bytes.
What do you mean, "portable"?
For portability, remember to keep the numbers within the limits defined in the Standard: use a single number outside these limits, and there goes all portability down the drain.
double planck_time = 5.39124E-44; /* second */
5.2.4.2.2 Characteristics of floating types <float.h>
[...] 10 The values given in the following list shall be replaced by constant expressions with implementation-defined values [...] 11 The values given in the following list shall be replaced by constant expressions with implementation-defined values [...] 12 The values given in the following list shall be replaced by constant expressions with implementation-defined (positive) values [...] [...]
Note the implementation-defined in all these clauses.
You could always convert to IEEE-754 format in a fixed byte order (either little endian or big endian). For most machines, that would require either nothing at all or a simple byte swap to serialize and deserialize. A machine that doesn't support IEEE-754 natively will need a converter written, but doing that with ldexp and frexp (stanard C library functions)and bit shuffling is not too tough.
This might give you a good start - it packs a floating point value into an int
and long long
pair, which you can then serialise in the usual way.
#define FRAC_MAX 9223372036854775807LL /* 2**63 - 1 */
struct dbl_packed
{
int exp;
long long frac;
};
void pack(double x, struct dbl_packed *r)
{
double xf = fabs(frexp(x, &r->exp)) - 0.5;
if (xf < 0.0)
{
r->frac = 0;
return;
}
r->frac = 1 + (long long)(xf * 2.0 * (FRAC_MAX - 1));
if (x < 0.0)
r->frac = -r->frac;
}
double unpack(const struct dbl_packed *p)
{
double xf, x;
if (p->frac == 0)
return 0.0;
xf = ((double)(llabs(p->frac) - 1) / (FRAC_MAX - 1)) / 2.0;
x = ldexp(xf + 0.5, p->exp);
if (p->frac < 0)
x = -x;
return x;
}
This version has excess of only one byte per one floating point value to indicate the endianness. But I think, it is still not very portable however.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
#define LITEND 'L'
#define BIGEND 'B'
typedef short INT16;
typedef int INT32;
typedef double vec1_t;
typedef struct {
FILE *fp;
} WFILE, RFILE;
#define w_byte(c, p) putc((c), (p)->fp)
#define r_byte(p) getc((p)->fp)
static void w_vec1(vec1_t v1_Val, WFILE *p)
{
INT32 i;
char *pc_Val;
pc_Val = (char *)&v1_Val;
w_byte(LITEND, p);
for (i = 0; i<sizeof(vec1_t); i++)
{
w_byte(pc_Val[i], p);
}
}
static vec1_t r_vec1(RFILE *p)
{
INT32 i;
vec1_t v1_Val;
char c_Type,
*pc_Val;
pc_Val = (char *)&v1_Val;
c_Type = r_byte(p);
if (c_Type==LITEND)
{
for (i = 0; i<sizeof(vec1_t); i++)
{
pc_Val[i] = r_byte(p);
}
}
return v1_Val;
}
int main(void)
{
WFILE x_FileW,
*px_FileW = &x_FileW;
RFILE x_FileR,
*px_FileR = &x_FileR;
vec1_t v1_Val;
INT32 l_Val;
char *pc_Val = (char *)&v1_Val;
INT32 i;
px_FileW->fp = fopen("test.bin", "w");
v1_Val = 1234567890.0987654321;
printf("v1_Val before write = %.20f \n", v1_Val);
w_vec1(v1_Val, px_FileW);
fclose(px_FileW->fp);
px_FileR->fp = fopen("test.bin", "r");
v1_Val = r_vec1(px_FileR);
printf("v1_Val after read = %.20f \n", v1_Val);
fclose(px_FileR->fp);
return 0;
}
fwrite(), fread()? You will likely want binary, and you cannot pack the bytes any tighter unless you want to sacrifice precision which you would do in the program and then fwrite() fread() anyway; float a; double b; a=(float)b; fwrite(&a,1,sizeof(a),fp);
If you are carrying different floating point formats around they may not convert in a straight binary sense, so you may have to pick apart the bits and perform the math, this to the power that plus this, etc. IEEE 754 is a dreadful standard to use but widespread so it would minimize the effort.