I have written a C++ library that saves my data (a collection of custom structs, etc.) into a binary file. I currently use (i.e. CREATE and CONSUME) the files locally on my Windows (XP) machine. For simplicity, let's think of the library in two parts: a WRITER (creates the files) and a READER or CONSUMER (simply reads data from the files).

Recently, though, I have also wanted to CONSUME (i.e. read) on my Linux machine the data files I CREATED on my XP machine. I must point out at this stage that both machines are PCs (so they have the same endianness, etc.).

I can build a reader (and compile for Linux [Ubuntu 9.10 to be precise]), since I am the library creator. My question, before I embark down this road (of building the reader etc) is:

Assuming I have successfully built the reader for Linux,

Can I simply copy files that were CREATED on the Windows (XP) machine across to the Linux (Ubuntu 9.10) machine and use the Linux reader to read the copied files successfully?

+9  A: 

For the files to be binary compatible:

  • endianness must match (as it does for you)
  • bitfield packing order must be the same
  • sizes and signedness of types must be the same
  • the compiler must make the same decisions about padding and alignment

It's certainly possible for all of these conditions to be fulfilled, or for your data simply not to hit any of the cases where they differ. At the very least, though, I'd add some sanity checks and/or sentinel members to detect problems.
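One way to turn the conditions above into sanity checks is to assert the expected layout at compile time and build the same header with both compilers. The following is only a sketch: the `Record` struct and its fields are hypothetical, and it assumes a C++11-capable compiler (older compilers would need an equivalent such as `BOOST_STATIC_ASSERT`).

```cpp
#include <cstddef>      // offsetof
#include <cstdint>      // fixed-width integer types
#include <limits>
#include <type_traits>

// Hypothetical on-disk record; the names and layout are illustrative only.
struct Record {
    std::uint32_t id;
    std::int16_t  kind;
    std::uint16_t flags;
    double        value;
};

// Plain data only, so that offsetof is well defined.
static_assert(std::is_standard_layout<Record>::value,
              "Record must be standard layout");

// Sizes and representation of the types involved.
static_assert(sizeof(double) == 8, "double is expected to be 8 bytes");
static_assert(std::numeric_limits<double>::is_iec559,
              "double is expected to be IEEE 754");

// Padding and alignment: every member must sit where the file format assumes.
static_assert(offsetof(Record, id)    == 0, "unexpected offset for id");
static_assert(offsetof(Record, kind)  == 4, "unexpected offset for kind");
static_assert(offsetof(Record, flags) == 6, "unexpected offset for flags");
static_assert(offsetof(Record, value) == 8, "unexpected offset for value");
static_assert(sizeof(Record) == 16, "Record has unexpected padding");
```

If either compiler disagrees about the layout, the build fails on that platform instead of the reader silently producing garbage.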

moonshadow
Hi moonshadow, thanks for the feedback. Could you elaborate some more when you have the time? With a simple class that contains a std::vector<someStruct>, I would be able to understand unambiguously what you mean by sanity checks and/or sentinel members. The only way I can think of implementing these checks would be by using the constants defined in limits.h. Is that what you meant, or do you have a more elegant approach?
Stick it to THE MAN
Also, I am not sure how to check if requirements 2, 3 and 4 (that you listed above) hold true. I build using VS2008 on XP, and using gcc 4.4.1 on Ubuntu - any tips on how I can check that these requirements are NOT violated?
Stick it to THE MAN
@Stick it: By "sentinel members", I mean arrange for the top-level structures written to your file to contain a member with a known constant value, and also place one at the end of the file; at load time, check that these members contain the value you expect - this should catch problems with sizes / padding differing between the compilers.
moonshadow
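A rough sketch of the sentinel idea for a `std::vector` of structs might look like the following. Everything here is an assumption made for illustration: the `SomeStruct` fields, the `FileHeader` layout, the two magic constants, and the `save`/`load` names. The header also records `sizeof(SomeStruct)` as seen by the writer, since padding can differ between compilers (32-bit gcc on Linux typically aligns a `double` inside a struct to 4 bytes, while Visual C++ aligns it to 8).

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

struct SomeStruct { std::int32_t id; double value; };  // hypothetical payload

const std::uint32_t kHeadMagic = 0x44415441u;  // arbitrary known constants
const std::uint32_t kTailMagic = 0x454E4421u;

struct FileHeader {
    std::uint32_t magic;       // leading sentinel
    std::uint32_t recordSize;  // sizeof(SomeStruct) on the writing machine
    std::uint32_t count;       // number of records that follow
};

bool save(std::ostream& out, const std::vector<SomeStruct>& items) {
    FileHeader h = { kHeadMagic,
                     static_cast<std::uint32_t>(sizeof(SomeStruct)),
                     static_cast<std::uint32_t>(items.size()) };
    out.write(reinterpret_cast<const char*>(&h), sizeof h);
    if (!items.empty())
        out.write(reinterpret_cast<const char*>(&items[0]),
                  static_cast<std::streamsize>(items.size() * sizeof(SomeStruct)));
    out.write(reinterpret_cast<const char*>(&kTailMagic), sizeof kTailMagic);
    return out.good();
}

bool load(std::istream& in, std::vector<SomeStruct>& items) {
    FileHeader h;
    if (in.read(reinterpret_cast<char*>(&h), sizeof h).fail()) return false;
    if (h.magic != kHeadMagic) return false;               // not our file
    if (h.recordSize != sizeof(SomeStruct)) return false;  // layout differs
    items.resize(h.count);
    if (h.count != 0 &&
        in.read(reinterpret_cast<char*>(&items[0]),
                static_cast<std::streamsize>(h.count * sizeof(SomeStruct))).fail())
        return false;
    std::uint32_t tail = 0;
    if (in.read(reinterpret_cast<char*>(&tail), sizeof tail).fail()) return false;
    return tail == kTailMagic;  // trailing sentinel catches size drift
}

// Round trip through an in-memory stream.
void sentinelRoundTrip() {
    std::vector<SomeStruct> written(2);
    written[0].id = 1; written[0].value = 1.5;
    written[1].id = 2; written[1].value = -3.0;
    std::stringstream buf(std::ios::in | std::ios::out | std::ios::binary);
    bool saved = save(buf, written);
    std::vector<SomeStruct> readBack;
    bool loaded = load(buf, readBack);
    assert(saved && loaded && readBack.size() == 2);
    assert(readBack[1].id == 2 && readBack[1].value == -3.0);
}
```

A reader built with a compiler that lays out `SomeStruct` differently would fail the `recordSize` check, and a truncated or misaligned file would fail the trailing sentinel check.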
An excellent example of a database that is binary compatible is the FoxPro database. It was designed from the start to be compatible across a wide range of platforms.
Dave
+1  A: 

Binary files should be compatible across machines with the same endianness.

The issue you may have in your code is the size of ints: you can't necessarily assume that compilers on different OSes use the same size for int. So either copy blocks of bytes and cast them, or use fixed-width types (int16_t, int32_t, etc.).
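As a sketch of the fixed-width approach, each field can be written on its own with an explicitly sized type, so the bytes on disk do not depend on how either compiler sizes `int` or pads the struct. The `Sample` struct and the helper names below are hypothetical, and the code still assumes both machines share the same endianness.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical record; sizeof(Sample) is usually 8 because of padding,
// but only 6 bytes are ever written to the stream.
struct Sample {
    std::int16_t channel;
    std::int32_t reading;
};

// Write the raw bytes of one fixed-width value (native byte order).
template <typename T>
void writeRaw(std::ostream& out, const T& value) {
    out.write(reinterpret_cast<const char*>(&value), sizeof value);
}

// Read the raw bytes of one fixed-width value; false on a short read.
template <typename T>
bool readRaw(std::istream& in, T& value) {
    return !in.read(reinterpret_cast<char*>(&value), sizeof value).fail();
}

void writeSample(std::ostream& out, const Sample& s) {
    writeRaw(out, s.channel);  // 2 bytes
    writeRaw(out, s.reading);  // 4 bytes, no padding in between
}

bool readSample(std::istream& in, Sample& s) {
    return readRaw(in, s.channel) && readRaw(in, s.reading);
}

// Round trip through an in-memory stream.
void sampleRoundTrip() {
    std::stringstream buf(std::ios::in | std::ios::out | std::ios::binary);
    Sample original = { 3, -123456 };
    writeSample(buf, original);
    assert(buf.str().size() == 6);  // 2 + 4 bytes, independent of padding
    Sample copy = { 0, 0 };
    bool ok = readSample(buf, copy);
    assert(ok && copy.channel == 3 && copy.reading == -123456);
}
```

Writing field by field costs a few more calls than dumping the whole struct, but it takes compiler padding out of the file format.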

Martin Beckett
A: 

If:

  • the machines have the same endianness (as you stated they have) and
  • you open the streams in binary mode, as text mode might do funny things, e.g. with line endings, and
  • you have programmed cleanly so you don't stumble over implementation-defined stuff like alignments, data type sizes, and struct packing,

then yes, your files should be portable.

The third bullet point is what makes a file format a "portable" one. Depending on what kind of data you have in your structs, it can be very easy or a bit tricky. Bitfields and data reinterpreted from a different type are especially tricky.
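For the binary-mode condition, a minimal sketch with standard file streams could look like this. The helper names and the demo file name are made up for illustration. Without `std::ios::binary`, a Windows text-mode stream translates `\n` to `\r\n` on output, which would corrupt any payload that happens to contain the byte 0x0A.

```cpp
#include <cassert>
#include <cstdio>    // std::remove
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Write a byte buffer without any newline translation.
bool writeBytes(const std::string& path, const std::vector<char>& bytes) {
    std::ofstream out(path.c_str(),
                      std::ios::out | std::ios::binary | std::ios::trunc);
    if (!out) return false;
    if (!bytes.empty())
        out.write(&bytes[0], static_cast<std::streamsize>(bytes.size()));
    return out.good();
}

// Read the whole file back, again in binary mode.
bool readBytes(const std::string& path, std::vector<char>& bytes) {
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (!in) return false;
    bytes.assign(std::istreambuf_iterator<char>(in),
                 std::istreambuf_iterator<char>());
    return true;
}

// Round trip a payload containing bytes that text mode could alter.
void binaryModeRoundTrip() {
    const std::string path = "binary_mode_demo.bin";  // hypothetical file name
    const char raw[] = { 0x0D, 0x0A, 0x00, 0x0A, 0x7F };
    std::vector<char> payload(raw, raw + sizeof raw);
    bool written = writeBytes(path, payload);
    std::vector<char> readBack;
    bool read = readBytes(path, readBack);
    std::remove(path.c_str());
    assert(written && read);
    assert(readBack == payload);  // byte-for-byte identical
}
```

On Linux the flag makes no practical difference, but keeping it in both the writer and the reader means the same source behaves identically on Windows.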

DevSolar
+1  A: 

You might consider taking a look at the Boost Serialization Library. A lot of thought has been put into it, and it will handle many of the potential cross-platform incompatibilities for you. Of course, it's possible that it's overkill for your particular use case, especially if you've already got your writers & readers implemented.

Edward Loper