Hi, could somebody point me in the right direction on how to read a binary file whose layout is defined by a C struct? It has a few #defines inside the struct, which makes me think that will complicate things.
The structure looks something like this (although it's larger and more complicated than this):

struct Format {
    unsigned long str_totalstrings;
    unsigned long str_name;
    #define STR_ORDERED 0x2
    #define STR_ROT13 0x4
    unsigned char stuff[4];
    #define str_delimiter stuff[0]
};

I would really appreciate it if somebody could point me in the right direction on how to do this, or to a tutorial that covers this topic.

Thanks a lot in advance for your help.

+4  A: 

Reading a binary defined by a struct is easy.

Format myFormat;
fread(&myFormat, sizeof(Format), 1, fp);

The #defines don't affect the structure's layout at all, since the preprocessor handles them before the compiler sees the struct. (Inside the struct is an odd place to put them, though.)
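To illustrate that point, here is a minimal sketch. The names WithDefines, WithoutDefines and Delimiter are made up for this example and are not from the original header. It only checks that the two layouts agree and that the str_delimiter macro is plain text substitution for stuff[0].

```cpp
#include <cstddef>

// The struct from the question, with the #defines left in place.
struct WithDefines {
    unsigned long str_totalstrings;
    unsigned long str_name;
    #define STR_ORDERED 0x2
    #define STR_ROT13 0x4
    unsigned char stuff[4];
    #define str_delimiter stuff[0]
};

// The same members with the #defines stripped out.
struct WithoutDefines {
    unsigned long str_totalstrings;
    unsigned long str_name;
    unsigned char stuff[4];
};

// The preprocessor directives add no members, so size and offsets match.
static_assert(sizeof(WithDefines) == sizeof(WithoutDefines),
              "#defines must not change the struct size");
static_assert(offsetof(WithDefines, stuff) == offsetof(WithoutDefines, stuff),
              "#defines must not change member offsets");

// f.str_delimiter expands textually to f.stuff[0].
inline unsigned char Delimiter(const WithDefines& f) {
    return f.str_delimiter;
}
```

Note that the macros stay visible for the rest of the translation unit, not just inside the struct, which is one reason the placement is unusual.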

However, this is not cross-platform safe. It is the simplest thing that will possibly work, in situations where you are assured the reader and writer are using the same platform.

The better way would be to re-define your structure as such:

struct Format {
    Uint32 str_totalstrings;  //assuming unsigned long was 32 bits on the writer.
    Uint32 str_name;
    unsigned char stuff[4];
};

and then have a "platform_types.h" which typedefs Uint32 correctly for your compiler. Now you can read directly into the structure, but for endianness issues you still need to do something like this:

myFormat.str_totalstrings = FileToNative32(myFormat.str_totalstrings);
myFormat.str_name = FileToNative32(myFormat.str_name);

where FileToNative32 is either a no-op or a byte reverser, depending on whether the platform's byte order matches the file's.
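A rough sketch of what FileToNative32 might look like. It assumes the file stores its integers little-endian, which the question does not state, so flip the condition if the writer was big-endian. HostIsLittleEndian and ByteSwap32 are helper names invented for this example.

```cpp
#include <cstdint>
#include <cstring>

// Detects the host byte order at run time by inspecting the first byte of a
// known 16-bit value.
inline bool HostIsLittleEndian() {
    const std::uint16_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Reverses the four bytes of a 32-bit value.
inline std::uint32_t ByteSwap32(std::uint32_t v) {
    return (v >> 24) |
           ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) |
           (v << 24);
}

// ASSUMPTION: the file is little-endian. On a little-endian host this is a
// no-op; on a big-endian host it reverses the bytes.
inline std::uint32_t FileToNative32(std::uint32_t v) {
    return HostIsLittleEndian() ? v : ByteSwap32(v);
}
```

With that in place, the two assignments above convert the fields after the fread, whichever kind of host the reader runs on.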

AShelly
I'd recommend sizeof myFormat instead, and I think you're missing an argument to fread(). Also, this assumes the endianness of the host is the same as the machine that wrote the file. In general, doing I/O on whole structs is a bad idea, imo.
unwind
How safe is it that way? I've done it by reading in a specific number of bytes and filling in the struct elements one by one. There was a reason for that, but I've managed to forget it.
Kevin
Right, Unwind. That answers my question already! :)
Kevin
It's not safe across platforms and compilers, because the actual size of built-in types is not fixed by the standard at all. Doing this is brittle and dangerous.
MadKeithV
I agree it's not cross-platform safe. Editing my answer to reflect that...
AShelly
+3  A: 

You can also use unions to do this parsing if you have the data you want to parse already in memory.

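#include <cstring>

union A {
    char buffer[sizeof(Format)];  // raw bytes overlaying the struct
    Format format;
};

A a;
// Copy the bytes in; a char* member would only overlay the pointer itself.
memcpy(a.buffer, stuff_you_want_to_parse, sizeof a.buffer);

// You can now access the members of the struct through the union.
// (Well defined in C; in C++ it relies on a common compiler extension.)
if (a.format.str_name == ...)
    // do stuff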

Also remember that long could be different sizes on different platforms. If you are depending on long being a certain size, consider using the types defined in stdint.h, such as uint32_t.
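For instance, a sketch of the struct using fixed-width types. It assumes the writer's unsigned long was 32 bits, which the question does not confirm. The static_assert is an extra guard so that a compiler which adds padding fails the build instead of silently misreading the file.

```cpp
#include <cstdint>

// ASSUMPTION: both integer fields were 32 bits wide on the machine that
// wrote the file.
struct Format {
    std::uint32_t str_totalstrings;
    std::uint32_t str_name;
    unsigned char stuff[4];
};

// 4 + 4 + 4 bytes; any other size means the compiler inserted padding.
static_assert(sizeof(Format) == 12, "unexpected padding in Format");
```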

bradtgmurray
As an alternative to this, I would prefer using char* with reinterpret_cast. For example, take a char* buffer and fill it with your data. Then: Format* format = reinterpret_cast<Format*>(buffer); format->str_name = "...";
Tom
This has the same cross-platform issues as my answer
AShelly
A: 

You have to find out the endianness of the machine where the file was written so you can interpret integers properly. Look out for an ILP32 vs LP64 mismatch (long is 32 bits under one model and 64 bits under the other). The original structure packing/alignment might also be important.
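One way to sidestep all three issues is to read the raw bytes and decode each field by hand, so neither the host's byte order, its long size, nor its struct padding matters. A minimal sketch, assuming the file holds two 32-bit little-endian integers followed by four bytes with no padding (none of which the question confirms). DecodeLE32, ParseFormat, ReadFormat and kFormatDiskSize are names made up for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>

struct Format {
    std::uint32_t str_totalstrings;
    std::uint32_t str_name;
    unsigned char stuff[4];
};

// ASSUMPTION: on-disk record is 4 + 4 + 4 bytes, tightly packed.
const std::size_t kFormatDiskSize = 12;

// Assembles a 32-bit value from four little-endian bytes. Shifts operate on
// values, so this gives the same result on any host byte order.
inline std::uint32_t DecodeLE32(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Fills the in-memory struct from kFormatDiskSize raw bytes.
inline Format ParseFormat(const unsigned char* raw) {
    Format f;
    f.str_totalstrings = DecodeLE32(raw);
    f.str_name = DecodeLE32(raw + 4);
    for (int i = 0; i < 4; ++i) {
        f.stuff[i] = raw[8 + i];
    }
    return f;
}

// Reads one record from fp; returns false on a short read or I/O error.
inline bool ReadFormat(std::FILE* fp, Format& out) {
    unsigned char raw[kFormatDiskSize];
    if (std::fread(raw, 1, sizeof raw, fp) != sizeof raw) {
        return false;
    }
    out = ParseFormat(raw);
    return true;
}
```

The cost is a few more lines per field, but the in-memory struct is then free to differ from the file layout.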

Nikolai N Fetissov
+1  A: 

Using C++ I/O library:

#include <fstream>
using namespace std;

ifstream ifs("file.dat", ios::binary);
Format f;
ifs.read(reinterpret_cast<char*>(&f), sizeof f);  // read(), not get(), for raw bytes

Using C I/O library:

#include <cstdio>
using namespace std;

FILE *fin = fopen("file.dat", "rb");
Format f;
fread(&f, sizeof f, 1, fin);
Ferruccio