I have a byte stream I need parsed into a struct, and I also need to be able to parse the struct back to a byte stream.

Below is an example of what I want where I've used BitConverter to parse the values. I hope there is a more efficient way of doing this, because my structs are HUGE!

ref struct TestStruct
{
    int TestInt;
    float TestFloat;
};

int main(array<System::String ^> ^args)
{
    // populating array - just for demo, it's really coming from a file
    array<unsigned char>^ arrBytes = gcnew array<unsigned char>(8);
    Array::Copy(BitConverter::GetBytes((int)1234), arrBytes, 4);
    Array::Copy(BitConverter::GetBytes((float)12.34), 0, arrBytes, 4, 4);

    // parsing to struct - I want help
    TestStruct^ myStruct = gcnew TestStruct();
    myStruct->TestInt = BitConverter::ToInt32(arrBytes, 0);
    myStruct->TestFloat = BitConverter::ToSingle(arrBytes, 4);

    String^ str = Console::ReadLine();
    return 0;
}
+1  A: 

Here is an explanation of serialization in .NET

For general C++ (not managed), look at Boost.Serialization

Dmitry Khalatov
The binary format is already defined. That pushes me toward custom serialization, which brings back the same issue: several hundred lines of BitConverter calls. Or is there a better method?
rozon
A: 

You mention both C++ and .NET. For C++ only, you should be able to do something along the lines of

char buffer[sizeof(MYSTRUCT)];
// copy the raw bytes from the buffer into the struct (destination comes first)
memcpy(&mystruct, buffer, sizeof(MYSTRUCT));
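
A slightly fuller sketch of the plain C++ route, covering both directions. TestRecord, fromBytes and toBytes are made-up names for illustration. The approach assumes the struct is trivially copyable and that its in-memory layout (field order, padding, byte order) matches the byte stream exactly, which the language does not guarantee for you.

```cpp
#include <array>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical fixed-layout record; the fields mirror the question's TestStruct.
struct TestRecord {
    std::int32_t testInt;
    float testFloat;
};

// memcpy is only valid for trivially copyable types, and the struct must
// have exactly the size of the on-disk record (no unexpected padding).
static_assert(std::is_trivially_copyable<TestRecord>::value,
              "memcpy needs a trivially copyable type");
static_assert(sizeof(TestRecord) == 8, "unexpected padding in TestRecord");

using RecordBytes = std::array<unsigned char, sizeof(TestRecord)>;

// Bytes -> struct: copy the raw bytes over the struct.
TestRecord fromBytes(const RecordBytes& bytes) {
    TestRecord record;
    std::memcpy(&record, bytes.data(), sizeof(record));
    return record;
}

// Struct -> bytes: copy the struct's raw bytes into a buffer.
RecordBytes toBytes(const TestRecord& record) {
    RecordBytes bytes;
    std::memcpy(bytes.data(), &record, sizeof(record));
    return bytes;
}
```

This only round-trips between machines with the same byte order and the same padding rules; if the file format fixes either of those differently, the fields have to be read one by one.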

For .NET you MUST use serialization if you want to avoid saving each item separately - the classes are not guaranteed to be stored in a contiguous block of memory. It's annoying, but that's one of the 'features' of managed code - you have to let it manage it for you.

Adam Davis
Yes! Could you explain how to do this in .NET with managed ref types?
rozon
You can't. You have to save each item separately, or use serialization.
Adam Davis
With serialization, can I define what the binary output format should be without saving each item separately/manually?
rozon
+1  A: 

For stuff like this, you usually use a code generator. Let's assume the source looks like this:

struct a {
    int i;
};

struct b {
    string name;
    struct a a;
};

What you do is you write a simple parser which searches the source (probably some header file) for "struct", then you read the name of the struct (anything between "struct" and "{"). Write this to the output:

cout << "struct " << name << " * read_struct_" << name << " (stream in) {" << NL
     << "    struct " << name << " * result = malloc (sizeof(struct " << name << "));" << NL;
parseFields (headerStream);
cout << "    return result;" << NL << "}" << NL;

Note my C++ is a bit rusty so this probably doesn't compile but you should get the idea.

In parseFields, you read each line and split it into two parts: anything before the last space (i.e. "int" in the first example) and the stuff between the last space and the ";". In this case, that would be "i". You now write to the output:

cout << "read_" << fieldType << "(in, &result->" << fieldName << ");" << NL;

Note: You'll need to replace all the spaces in the field type with "_".
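
The split-and-replace step could be sketched roughly as below. generateReadCall is a hypothetical helper name, and it assumes one simple declaration per line (no arrays, pointers, initializers or comments), which real headers will not always satisfy.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: turns one field declaration such as
// "unsigned int count;" into the generated line
// "read_unsigned_int(in, &result->count);".
// Returns an empty string if the line is not a "type name;" pair.
std::string generateReadCall(const std::string& declaration) {
    const char* whitespace = " \t";

    // Keep only the text before the ';' (the whole line if there is none).
    std::string line = declaration.substr(0, declaration.find(';'));

    // Trim leading and trailing whitespace.
    const std::size_t first = line.find_first_not_of(whitespace);
    if (first == std::string::npos) return "";
    const std::size_t last = line.find_last_not_of(whitespace);
    line = line.substr(first, last - first + 1);

    // Everything before the last space is the type, the rest is the name.
    const std::size_t split = line.find_last_of(whitespace);
    if (split == std::string::npos) return "";
    std::string fieldType = line.substr(0, split);
    const std::string fieldName = line.substr(split + 1);

    // Drop extra spacing before the name, then make the type usable
    // as part of a function name by replacing spaces with '_'.
    fieldType.erase(fieldType.find_last_not_of(whitespace) + 1);
    std::replace(fieldType.begin(), fieldType.end(), ' ', '_');

    return "read_" + fieldType + "(in, &result->" + fieldName + ");";
}
```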

In the output, this looks like so:

struct a * read_struct_a (stream in) {
   struct a * result = malloc(sizeof(struct a));
   read_int(in, &result->i);
   return result;
}

This allows you to define how to read or write an int somewhere else (in a utility module).

Now, you have code which reads the structure definitions from a header file and creates new code that can read the structure out of some stream. Duplicate this to write the structure to a stream. Compile the generated code and you're done.

You will also want to write unit tests to verify that the parsing works correctly :) Just create a structure in memory, use the write methods to save it somewhere and read it back again. The two structures should now be identical. You will want to write a third code generator to create code to compare two structures.
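
A round-trip test of that kind might look like the sketch below. The three struct-a functions are hand-written stand-ins for what the generators would emit, with simplified signatures (no malloc, native byte order), and roundTripsUnchanged is a hypothetical name.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

struct a {
    std::int32_t i;
};

// Stand-in for the generated writer: dumps the field's raw bytes.
void write_struct_a(std::ostream& out, const a& value) {
    out.write(reinterpret_cast<const char*>(&value.i), sizeof(value.i));
}

// Stand-in for the generated reader: returns false on a short read.
bool read_struct_a(std::istream& in, a* result) {
    return static_cast<bool>(
        in.read(reinterpret_cast<char*>(&result->i), sizeof(result->i)));
}

// Stand-in for the generated comparison: field-by-field equality.
bool equals_struct_a(const a& lhs, const a& rhs) {
    return lhs.i == rhs.i;
}

// Writes the structure to an in-memory stream, reads it back and
// reports whether the copy matches the original.
bool roundTripsUnchanged(const a& original) {
    std::stringstream buffer(std::ios::in | std::ios::out | std::ios::binary);
    write_struct_a(buffer, original);
    a restored{};
    return read_struct_a(buffer, &restored)
        && equals_struct_a(original, restored);
}
```

Note that a round trip only shows the reader and writer agree with each other; checking against a known-good sample of the real file format is still needed to show they agree with the format.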

Aaron Digulla
You'll need to explain this more. The structures are identical to the source format, and I don't control the source (or format).
rozon
A: 

Serialization is great, but in my case I don't need the 'extras' and I'd have to do the same job anyhow to have full control of the bits. For most it could be the solution I guess.

Giving Mr. Digulla the correct answer, as it is the one resembling my solution the most. Also thanks to Mr. Davis who put me straight on the fact that it had to be done...

rozon