ansaurus

Question

serialize any data type as vector<uint8_t> - use reinterpret_cast?

Answer 1

+6 A:

My suggestion is to ignore all the people telling you that reinterpret_cast is bad. They tell you it is bad, because it's generally not a good practice to take the memory map of one type and pretend that it's another type. But in this case, that is exactly what you want to do, as your entire purpose is to transmit the memory map as a series of bytes.

It is far better than using a double-static_cast, as it fully details the fact that you are taking one type and purposefully pretending that it is something else. This situation is exactly what reinterpret_cast is for, and dodging using it with a void pointer intermediary is simply obscuring your meaning with no benefit.

Also, I'm sure that you're aware of this, but watch for pointers in T.

jdmichal 2010-07-01 20:21:58

Answer 2

+1 A:

You're not doing any actual encoding here, you're just copying the raw representation of the data from memory into a byte array and then sending that out over the network. That's not going to work. Here's a quick example as to why:

struct A {
  int a;
};

struct B {
  A* p_a;
}

What happens when you use your method to send a B out over the network? The recipient receives p_a, the address of some A object on your machine, but that object is not on their machine. And even if you sent them the A object too, it wouldn't be at the same address. There's no way that can work if you just send the raw B struct. And that's not even considering more subtle issues like endianness and floating point representation which can affect the transmission of such simple types as int and double.

What you are doing right now is fundamentally no different than just casting to uint8_t* as far as whether it's going to work or not is concerned (it won't work, except for the most trivial cases).

What you need to do is devise a method of serialization. Serialization means any way of solving this sort of problem: how to get objects in memory out onto the network in a form such that they can be meaningfully reconstructed on the other side. This is a tricky problem, but it is a well-known and repeatedly solved problem. Here's a good starting point for reading: http://www.parashift.com/c++-faq-lite/serialization.html

Tyler McHenry 2010-07-01 20:27:25

So, yes, misnomer. Regarding the rest of your comment: the question, as posed, is a simplification to inquire about whether or not to `reinterpret_cast` (or similar) - I'll rename to be more specfic. I'm aware of the subtleties in transferring data and internally everything has a pack/unpack which essentially does what I describe above for its own data.

ezpz 2010-07-01 20:48:48

Answer 3

+1 A:

You can get rid of one cast by exploiting the fact that any pointer can be implicitly cast to void*. Also, you might want to add a few const:

//Beware, brain-compiled code ahead!
template <typename T>
inline void encode (std::vector< uint8_t >& dst, const T& data)
{
    const void* pdata = &data;
    uint8_t* src = static_cast<uint8_t*>(pdata);
    dst.insert(dst.end(), src, src + sizeof(T));
}

You might want to add a compile-time check for T being a POD, no struct, and no pointer.

However, interpreting some object's memory at the byte-level is never going to be save, period. If you have to do it, then do it in a nice wrapper (as you have done), and get over it. When you port to a different platform/compiler, have an eye on these things.

sbi 2010-07-01 20:31:28

I have the `const` in there but elided for brevity. I do not, however, have the check for pointer and/or struct. This is used only by myself, but it would probably be safest to add those checks to be sure. Thanks.

ezpz 2010-07-01 20:53:56

ansaurus

tags:

views:

answers:

serialize any data type as vector<uint8_t> - use reinterpret_cast?

related questions