Hi there,

A couple of days ago, I asked how you could reverse engineer a file format. While that didn't really work out, someone gave me the file format. (Click Here) Thank you Xadet.

I'm still quite new to all this, and I was wondering where I should go from here. I am guessing I will have to use inline-asm in C++ to use this format, but I wouldn't know how to actually open the file using this, or insert data into it.

So the question would be: how do I use the file format to get or insert data? The file format looks like asm, but I don't want to start programming in pure ASM. I've seen people programming asm in C++ before, which is why I think it would be a good choice.

Any help would be greatly appreciated.

+1  A: 

The file format description doesn't look like asm, it looks like pseudocode.

Alf P. Steinbach
Aah, that would be a bummer... But let's say I do have the file format, do you have an example of how to use it? It's all a little confusing: I can find file formats, and how to make file formats, but I can't find anything about actually using/compiling one.
Nick
@Nick: What's your experience with C++? You don't "compile" formats, they're just a predefined way of storing information, and you can read it because it's in an expected format. Just parse the file and get the information you need.
GMan
Not that great to be honest, I'm more of a C# person. That doesn't mean I don't know anything about it though. :P
Nick
@Nick: Is there a reason why you're not using C#, then?
GMan
I was under the assumption that this was asm, and as far as I know C# doesn't support inline asm, unless I use some kind of dll. I just thought it might be easier, and save some space this way.
Nick
It doesn't look like ASM to me... you can probably parse this almost as easily in C# as you can in C++... those strange names (DWORD, FLOAT) etc. just tell you the bitwise format each bit of data is encoded in - your language probably has another name for the same format.
Tony
Thanks for the help everyone, I guess I should do more research before I actually start working on this.
Nick
@Nick: I still don't understand that. Even if the file were assembly, why on Earth would that mean you need to program in assembly?
GMan
A: 

I assume you don't want a C++ program that reads that file format document when it starts and then parses the actual data file on that basis. Instead, you just want a C++ program dedicated to reading the current version of that file format? (This is much simpler and will run faster.) You don't need to use ASM. What you do need to do is work out the C++ types that are equivalent to the names used in the format file. For example, DWORD is the name the Windows headers use for a 32-bit unsigned integer, and WORD for a 16-bit one; check that the format's author means the same widths. Track that stuff down, then create C++ structs with equivalent members.

For example:

#include <stdint.h> // if your Windows compiler lacks this, try unsigned __int32, unsigned __int16 etc. instead

typedef uint32_t DWORD;  // 32 bits on Windows; adjust if you find the format means another width
typedef uint16_t WORD;   // 16 bits on Windows
typedef ??? ZSTR;  // google it...?
typedef float FLOAT;

struct dds
{
    ZSTR path;
    WORD is_skin;
    WORD alpha_enabled;
    WORD two_sided;
    WORD alpha_test_enabled;
    WORD alpha_ref;
    WORD z_write_enabled;
    WORD z_test_enabled;
    WORD blending_mode; // None = 0, Custom = 1, Normal = 2, Lighten = 3
    WORD specular_enabled;
    FLOAT alpha;
    WORD glow_type; // None = 0, NotSet = 1, Simple = 2, Light = 3, Texture = 4, TextureLight = 5, Alpha = 6
    FLOAT red;
    FLOAT green;
    FLOAT blue;
};

// point p at the entire input, which you'll have loaded into memory somewhere
// (e.g. f/stat() the file size then allocate heap and read into it, or memory map)
const char* p = input;
DWORD mesh_count = *(const DWORD*)p;
p += sizeof(DWORD);
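for (DWORD i = 0; i < mesh_count; ++i) // same type as mesh_count, avoids a mixed-type comparison
{
    const dds& d = *(const dds*)p;
    // you can use d.red, d.alpha etc. here to do anything you like
    p += sizeof(dds); // sizeof needs parentheses when applied to a type name
}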
}

// continue processing effect count etc... in same style
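
The snippet above assumes the whole file is already in memory. As a rough sketch of one way to get it there with the standard library (load_file is a made-up name, not part of the format or of any library):

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: read the whole file at `path` into a byte buffer.
std::vector<char> load_file(const std::string& path)
{
    // Open in binary mode, positioned at the end so tellg() reports the size.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in)
        throw std::runtime_error("cannot open " + path);

    const std::streamsize size = in.tellg();
    if (size < 0)
        throw std::runtime_error("cannot determine size of " + path);

    in.seekg(0, std::ios::beg); // rewind before reading
    std::vector<char> bytes(static_cast<std::size_t>(size));
    if (size > 0 && !in.read(bytes.data(), size))
        throw std::runtime_error("cannot read " + path);
    return bytes;
}
```

With that, `input` in the loop above could be `bytes.data()`, where `bytes` is the vector returned by load_file and is kept alive for as long as `p` is used.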

HTH, Tony

Tony
const dds is not a good idea because of packing/alignment issues.
frast
Alright, thank you. I still don't actually know how to use it, but that's probably just because I should do more research on what a file format is. At least now I know how to convert the 'pseudocode'.
Nick
@frast: yeah... was a bit lazy about that. Googling ZSTR, it looks like an ASCIIZ string with no padding, so the following data may well be misaligned. If your architecture's sensitive to that, then it may be best to read each successive field (or group sans ZSTR) into the same maximally-aligned buffer.
Tony
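A minimal sketch of that field-by-field reading, assuming ZSTR really is a zero-terminated string with no padding and that the file's byte order matches the machine's. Reader and its members are illustrative names, not anything defined by the format:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical cursor over the bytes of the loaded file.
struct Reader
{
    const char* p;   // next unread byte
    const char* end; // one past the last byte

    // Copy sizeof(T) bytes into a properly aligned local variable, then advance.
    // memcpy works no matter how the source bytes are aligned.
    template <typename T>
    T read()
    {
        if (static_cast<std::size_t>(end - p) < sizeof(T))
            throw std::runtime_error("unexpected end of data");
        T value;
        std::memcpy(&value, p, sizeof(T));
        p += sizeof(T);
        return value;
    }

    // Read the bytes up to (not including) the terminating zero, then skip the zero.
    std::string read_zstr()
    {
        const void* nul = std::memchr(p, 0, static_cast<std::size_t>(end - p));
        if (!nul)
            throw std::runtime_error("unterminated string");
        const char* stop = static_cast<const char*>(nul);
        std::string s(p, stop);
        p = stop + 1;
        return s;
    }
};
```

Each record would then be read as `r.read_zstr()` for the path, followed by `r.read<std::uint16_t>()` or `r.read<float>()` for each remaining field, in the order the format lists them.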
@Nick: this file format is similar to having C++/C#/whatever variables in memory, except that the relative location of each variable is exactly specified (instead of being wherever the compiler deems suitable), and that they're located inside a file on disk instead of in RAM. You can think of each field as being a sequence of bits somehow copied from RAM to disk: your job is to copy them back, so they end up in your variables.
Tony
That cleared up a lot, thank you!
Nick
A: 

It's a kind of scripting language for defining a data format, something similar to XDR. You just have to write a parser for the data it describes (don't try to interpret the script at runtime). Write some functions like get_WORD_BE() or get_DWORD_LE(), etc., so that you don't depend on the endianness of the machine you run on.
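
A rough sketch of what such helpers might look like, assuming WORD is 16 bits and DWORD is 32 bits (the usual Windows meaning, which the format may or may not share). They build the value from individual bytes, so the result is the same on any host:

```cpp
#include <cstdint>

// 16-bit value stored least-significant byte first (little-endian).
inline std::uint16_t get_WORD_LE(const unsigned char* p)
{
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// 16-bit value stored most-significant byte first (big-endian).
inline std::uint16_t get_WORD_BE(const unsigned char* p)
{
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// 32-bit value stored least-significant byte first (little-endian).
inline std::uint32_t get_DWORD_LE(const unsigned char* p)
{
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}
```

For example, the bytes 0x34 0x12 give 0x1234 through get_WORD_LE and 0x3412 through get_WORD_BE. Which one the file needs depends on how it was written; files produced on x86 Windows are typically little-endian.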

Yes, and if you want to use Tony's approach, add some #pragma pack(1).
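
A small sketch of what the pragma changes, using a made-up two-field record rather than the real one, and assuming a compiler that supports #pragma pack (MSVC, GCC and Clang do) and a 4-byte float:

```cpp
#include <cstddef>
#include <cstdint>

#pragma pack(push, 1) // no padding between members from here on
struct PackedExample  // hypothetical record, not taken from the format
{
    std::uint16_t flag;
    float         alpha;
};
#pragma pack(pop)     // restore the previous packing

// Without the pragma, most compilers would insert 2 bytes of padding after
// `flag`, giving 8 bytes instead of the 6 that would be stored in the file.
static_assert(sizeof(PackedExample) == 6, "struct layout does not match the file");
```

Packing only removes padding between fixed-size members. A variable-length field such as a zero-terminated string still cannot be overlaid this way and has to be read separately.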

ruslik