I'm looking for the most convenient way to serialize a bunch of C++ structs so that the serialized form is portable across C++ and Java (at a minimum) and across 32-bit/64-bit and big/little-endian platforms. The structures to be serialized contain only data, i.e. they're pure data objects with no state or behavior.

The idea is to serialize the structs into an octet blob that we can store in a database "generically" and read back later. This avoids changing the database whenever a struct changes, and avoids assigning each data member to its own field, i.e. we want a single table that holds everything "generically" as a binary blob. This should mean less work for developers and fewer changes when structures change.
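To make the requirement concrete, here is a minimal hand-rolled sketch of such a blob, not a recommendation over an IDL-based tool. The `Sample` struct and the helpers `put_be`, `get_be`, `serialize` and `deserialize` are hypothetical names invented for this illustration. Each field is written explicitly, most significant byte first, so the layout does not depend on the host's endianness, word size, or struct padding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record, for illustration only.
struct Sample {
    std::uint32_t id;
    std::int16_t  temperature;
    std::uint64_t timestamp;
};

// Append an unsigned value, most significant byte first. Shifts operate on
// the value, not on memory, so the output is identical on any host.
template <typename T>
void put_be(std::vector<std::uint8_t>& out, T value) {
    for (std::size_t i = sizeof(T); i-- > 0;) {
        out.push_back(static_cast<std::uint8_t>(value >> (8 * i)));
    }
}

// Read an unsigned big-endian value and advance pos.
// A real decoder would report an error instead of asserting.
template <typename T>
T get_be(const std::vector<std::uint8_t>& in, std::size_t& pos) {
    assert(pos + sizeof(T) <= in.size());
    T value = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        value = static_cast<T>((value << 8) | in[pos++]);
    }
    return value;
}

// Fields are written one by one in a fixed order; the struct is never
// copied as raw memory, so compiler padding never reaches the blob.
std::vector<std::uint8_t> serialize(const Sample& s) {
    std::vector<std::uint8_t> blob;
    put_be<std::uint32_t>(blob, s.id);
    put_be<std::uint16_t>(blob, static_cast<std::uint16_t>(s.temperature));
    put_be<std::uint64_t>(blob, s.timestamp);
    return blob;
}

Sample deserialize(const std::vector<std::uint8_t>& blob) {
    std::size_t pos = 0;
    Sample s;
    s.id = get_be<std::uint32_t>(blob, pos);
    // Assumes two's complement for the signed conversion
    // (guaranteed since C++20, universal in practice before that).
    s.temperature = static_cast<std::int16_t>(get_be<std::uint16_t>(blob, pos));
    s.timestamp = get_be<std::uint64_t>(blob, pos);
    return s;
}
```

The resulting blob is 14 bytes for this struct. Big-endian was chosen here because it is the default byte order of Java's `java.nio.ByteBuffer` and `DataInputStream`, so the Java side could read the same fields in the same order. The drawback is that every struct change has to be mirrored by hand in both languages, which is what an IDL-driven tool would avoid.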

I've looked at Boost.Serialization but don't think there's a way to make it compatible with Java. Likewise, implementing Serializable in Java doesn't seem to offer compatibility with C++.

If there is a way to do it starting from an IDL file, that would be best, as we already have IDL files that describe the structures.

Cheers in advance!

+1  A: 

Why haven't you chosen XML? It suits your requirements perfectly, and both C++ and Java make it easy to implement.

Furthermore, I have doubts about your idea of storing everything as a blob in the database. Use a relational database for what it was designed for, or switch to an object-oriented database like http://www.versant.com/en%5FUS/products/objectdatabase which supports both Java and C++.

Jan Jongboom
XML and other human-readable formats aren't really an option because of the overhead they would create. At present we are looking at storing 1TB of raw data to a single disk in under a day. The significant overhead of XML would mean we can't store as much raw data as we need to.
fwgx
+2  A: 

I'm surprised Jon Skeet hasn't already pounced on this one :-)

Protocol Buffers is pretty much designed for this sort of scenario -- passing structured data cross-language.
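One reason the format is compact and independent of byte order is that its wire format encodes integers as base-128 "varints": seven payload bits per byte, least significant group first, with the high bit set on every byte except the last. The sketch below only illustrates that scheme; it is not the Protocol Buffers library API, and `put_varint` and `get_varint` are hypothetical names. In practice you would describe the structs in a `.proto` file and use the generated C++ and Java classes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Append value as a base-128 varint: low 7 bits first, and the high bit
// of each byte signals that more bytes follow.
void put_varint(std::vector<std::uint8_t>& out, std::uint64_t value) {
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
}

// Decode one varint starting at pos and advance pos past it.
// Returns false if the input ends early or the varint exceeds 10 bytes.
bool get_varint(const std::vector<std::uint8_t>& in, std::size_t& pos,
                std::uint64_t& value) {
    value = 0;
    for (int shift = 0; shift < 64 && pos < in.size(); shift += 7) {
        const std::uint8_t byte = in[pos++];
        value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) {
            return true;  // last byte of this varint
        }
    }
    return false;
}
```

With this scheme the value 300 takes two bytes (`0xAC 0x02`) instead of a fixed four or eight, and small values take a single byte.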

That said, if you're using a database the way you suggest, you really shouldn't be using a full-strength RDBMS like Oracle or SQL Server but rather a lightweight key-value store such as Berkeley DB or one of the many "cloud table" engines.

Jeffrey Hantin
A: 

You need ASN.1! (Some people refer to this as binary XML.) ASN.1 is very compact and thus ideal to transfer data between two systems. And for those who don't think this is ever used: several Internet protocols are based upon the ASN.1 model for data serialization!

Unfortunately, there aren't many libraries available for Java or C++ that support ASN.1. I had to work with it several years ago and just couldn't find a good, free or inexpensive tool that supports ASN.1 in C++. Objective Systems sells ASN.1/XML solutions, but they are extremely expensive. (The ASN.1 compiler for C++ and Java, that is!) It costs you an arm and a leg at least! (But then you will have a tool that you can use with only one hand...)

Workshop Alex
Nice suggestion, but anything expensive is a no-no on this project. I'll bear it in mind though :)
fwgx
+1  A: 

If I want to go really, really cross-language, I would normally suggest JSON, because of the ease of JavaScript support and the abundance of libraries, and because it is human-readable and modifiable (I prefer it to XML, as I find it smaller in terms of characters, faster, and more readable). It's not the most space-efficient format, however, and a more machine-oriented format like Protocol Buffers or Thrift would have advantages there (Thrift code can be generated from an IDL, but it is also designed for defining services, so it could be heavier than you want).

Todd Gardner
A: 

I'd suggest saving the data in an SQLite database. The structs can be stored as rows in SQLite tables.

The resulting database file is binary-compatible across many different platforms and can be stored as a BLOB in your main database. I believe the file size is comparable to that of a compressed XML file with the same data, but memory usage during processing will be significantly lower than with an XML DOM.

etanizar