Lately I've been diving into network programming, and I'm having some difficulty constructing a packet with a variable-length "data" property. Several prior questions have helped tremendously, but I'm still lacking some implementation details. I'm trying to avoid variable-sized arrays and just use a vector. But I can't get it to be transmitted correctly, and I believe the problem is somewhere in serialization.

Now for some code.

Packet Header

class Packet {

    public:         
        void* Serialize();
        bool Deserialize(void *message);

        unsigned int sender_id;
        unsigned int sequence_number;
        std::vector<char> data;
};

Packet Impl

typedef struct {
    unsigned int sender_id;
    unsigned int sequence_number;
    std::vector<char> data;
} Packet;

void* Packet::Serialize(int size) {
    Packet* p = (Packet *) malloc(8 + 30);
    p->sender_id = htonl(this->sender_id);
    p->sequence_number = htonl(this->sequence_number);
    p->data.assign(size,'&'); //just for testing purposes
}

bool Packet::Deserialize(void *message) {
   Packet *s = (Packet*)message;
   this->sender_id = ntohl(s->sender_id);
   this->sequence_number = ntohl(s->sequence_number);
   this->data = s->data;
}

During execution, I simply create a packet, assign its members, and send/receive accordingly. The above methods are only responsible for serialization. Unfortunately, the data never gets transferred.

Couple of things to point out here. I'm guessing the malloc is wrong, but I'm not sure how else to compute it (i.e. what other value it would be). Other than that, I'm unsure of the proper way to use a vector in this fashion, and would love for someone to show me how (code examples please!) :)

Edit: I've awarded the question to the most comprehensive answer regarding the implementation with a vector data property. Appreciate all the responses!

+3  A: 

This trick works with a C-style array at the end of the struct, but not with a C++ vector. There is no guarantee that the C++ vector class will (and it most likely won't) put its contained data in the "header object" that is present in the Packet struct. Instead, that object will contain a pointer to somewhere else, where the actual data is stored.
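A quick way to see this for yourself is the rough sketch below. The helper names `vector_object_size` and `storage_is_outside_object` are made up for illustration; the point is only that `sizeof` a vector stays the same no matter how many elements it holds, and that the elements live at an address outside the vector object.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Size of the vector "header object" itself. This is a compile-time constant
// and does not depend on how many elements a particular vector holds.
std::size_t vector_object_size() {
    return sizeof(std::vector<char>);
}

// Returns true when the element storage lies outside the bytes occupied by
// the vector object, i.e. the object only refers to its data.
bool storage_is_outside_object(const std::vector<char>& v) {
    assert(!v.empty());
    const char* obj_begin = reinterpret_cast<const char*>(&v);
    const char* obj_end = obj_begin + sizeof(v);
    const char* elems = v.data();
    // std::less gives a well-defined ordering for unrelated pointers.
    std::less<const char*> before;
    return before(elems, obj_begin) || !before(elems, obj_end);
}
```

So copying or sending the bytes of the struct itself only transmits the vector's bookkeeping (pointers and sizes), not the payload.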

Thomas Padron-McCarthy
Thanks for the quick response. Are you forced to use C-style arrays for a variable payload struct then?
Rev316
Yes, you can't use vector.
Hans Passant
+1  A: 

I think the problem centres around you trying to 'serialise' the vector that way, and you're probably assuming that the vector's state information gets transmitted. As you've found, it doesn't work that way: you're trying to move an object across the network, and things like pointers don't mean anything on the other machine.

I think the easiest way to handle this would be to change Packet to the following structure:

struct Packet {
    unsigned int sender_id;
    unsigned int sequence_number;
    unsigned int vector_size;
    char data[1];
};

The data[1] bit is an old C trick for a variable-length array. It has to be the last element in the struct, as you're essentially writing past the end of the struct. You have to get the allocation for the data structure right for this, otherwise you'll be in a world of hurt.

Your serialisation function then looks something like this:

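void* Packet::Serialize(std::vector<char> &data) {
    // Room for the fixed fields plus the payload that follows them
    Packet* p = (Packet *) malloc(sizeof(Packet) + data.size());
    p->sender_id = htonl(this->sender_id);
    p->sequence_number = htonl(this->sequence_number);
    p->vector_size = htonl(data.size());
    // Copy the vector's contents into the trailing array;
    // &data[0] is not valid for an empty vector, so guard against that
    if (!data.empty())
        ::memcpy(p->data, &data[0], data.size());
    return p;  // the caller must free() this buffer
}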

As you can see, we'll transmit the data size and the contents of the vector, copied into a plain C array which transmits easily. Keep in mind that your network sending routine has to calculate the size of the structure properly: you'll have to send sizeof(Packet) + data.size() bytes (not sizeof(data), which is the size of the vector object rather than of its contents). Otherwise you'll get the vector cut off and are back in nice buffer overflow territory.

Disclaimer - I haven't tested the code above, it's just written from memory so you might have to fix the odd compilation error.

Timo Geusch
Thanks for showing the code to use the variable array "hack", but I gave the "answer" to the solution with vectors.
Rev316
+1  A: 

I think you need to work directly with byte arrays returned by the socket functions.

For these purposes it's good to have two distinct parts of a message in your protocol. The first part is a fixed-size "header". This will include the size of the bytes that follow: the "payload", or data in your example.

So, to borrow some of your snippets and expand on them, maybe you'll have something like this:

typedef struct {
    unsigned int sender_id;
    unsigned int sequence_number;
    unsigned int data_length;   // this is new
} PacketHeader;

So then when you get a buffer in, you'll treat it as a PacketHeader*, and check data_length to know how many bytes will appear in the byte vector that follows.
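As a rough sketch of that step, the header could be decoded like this. It assumes fixed-width 32-bit fields sent in network (big-endian) order and reads them byte by byte instead of casting the buffer, which sidesteps alignment and padding questions. `decode_header`, `read_u32_be` and `kHeaderSize` are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Fixed-size header, using fixed-width fields (an assumption of this sketch).
struct PacketHeader {
    std::uint32_t sender_id;
    std::uint32_t sequence_number;
    std::uint32_t data_length;
};

// Header size on the wire: three 32-bit fields with no padding.
const std::size_t kHeaderSize = 12;

// Reads one 32-bit value stored in network (big-endian) byte order.
// This does the same job as ntohl on a value copied out of the buffer.
std::uint32_t read_u32_be(const unsigned char* p) {
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
}

// Decodes the header at the start of buf. Returns false, leaving out
// untouched, when fewer than kHeaderSize bytes are available.
bool decode_header(const unsigned char* buf, std::size_t len, PacketHeader& out) {
    assert(buf != nullptr);
    if (len < kHeaderSize) return false;
    out.sender_id = read_u32_be(buf);
    out.sequence_number = read_u32_be(buf + 4);
    out.data_length = read_u32_be(buf + 8);
    return true;
}
```

Once `decode_header` succeeds, `data_length` tells you how many payload bytes to expect after the first `kHeaderSize` bytes.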

I would also add a few points...

  • Making these fields unsigned int is not wise. The standards for C and C++ don't specify how big int is, and you want something that will be predictable on all compilers. I suggest the C99 type uint32_t defined in <stdint.h>

  • Note that when you get bytes from the socket... It is in no way guaranteed to be the same size as what the other end wrote to send() or write(). You might get incomplete messages ("packets" in your terminology), or you might get multiple ones in a single read() or recv() call. It's your responsibility to buffer these if they are short of a single request, or loop through them if you get multiple requests in the same pass.
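To illustrate the buffering described in the last point, here is a minimal sketch. `FrameBuffer` is a hypothetical helper, and for brevity it assumes a simplified wire format: a 4-byte big-endian payload length followed by the payload (a real header would carry the other fields too). You feed it whatever each recv() or read() call returned and ask for complete messages.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: accumulates received bytes and hands back complete
// messages. Assumed layout: 4-byte big-endian length, then the payload.
class FrameBuffer {
public:
    // Append the bytes from one recv()/read() call, however many there were.
    void feed(const char* bytes, std::size_t n) {
        assert(bytes != nullptr || n == 0);
        pending_.insert(pending_.end(), bytes, bytes + n);
    }

    // If a complete message is buffered, copies its payload into out,
    // removes the message from the buffer and returns true.
    // Returns false when more bytes are still needed.
    bool next(std::vector<char>& out) {
        if (pending_.size() < 4) return false;  // length prefix incomplete
        std::uint32_t len = 0;
        for (int i = 0; i < 4; ++i)
            len = (len << 8) | static_cast<unsigned char>(pending_[i]);
        if (pending_.size() - 4 < len) return false;  // payload incomplete
        const std::ptrdiff_t end = 4 + static_cast<std::ptrdiff_t>(len);
        out.assign(pending_.begin() + 4, pending_.begin() + end);
        pending_.erase(pending_.begin(), pending_.begin() + end);
        return true;
    }

private:
    std::vector<char> pending_;  // bytes received but not yet consumed
};
```

In a receive loop you would call `feed` after every read and then call `next` repeatedly until it returns false, which handles both short reads and several messages arriving at once.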

asveikau
Thanks for the advice!
Rev316
+1  A: 

This cast is very dangerous as you have allocated some raw memory and then treated it as an initialized object of a non-POD class type. This is likely to cause a crash at some point.

Packet* p = (Packet *) malloc(8 + 30);

Looking at your code, I assume that you want to write out a sequence of bytes from the Packet object that the serialize function is called on. In this case you have no need of a second packet object. You can create a vector of bytes of the appropriate size and then copy the data across.

e.g.

void* Packet::Serialize(int size)
{
    char* raw_data = new char[sizeof sender_id + sizeof sequence_number + data.size()];
    char* p = raw_data;
    unsigned int tmp;

    tmp = htonl(sender_id);
    std::memcpy(p, &tmp, sizeof tmp);
    p += sizeof tmp;

    tmp = htonl(sequence_number);
    std::memcpy(p, &tmp, sizeof tmp);
    p += sizeof tmp;

    std::copy(data.begin(), data.end(), p);

    return raw_data;
}

This may not be exactly what you intended, as I'm not sure what your size parameter is ultimately for. Your interface is also potentially unsafe: you return a pointer to raw data that I assume is supposed to be dynamically allocated. It is much safer to return an object that manages the lifetime of the dynamically allocated memory, so the caller doesn't have to guess whether and how to deallocate it.

Also the caller has no way of knowing how much memory was allocated. This may not matter for deallocation but presumably if this buffer is to be copied or streamed then this information is needed.

It may be better to return a std::vector<char> or to take one by reference, or even make the function a template and use an output iterator.
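For example, the first of those options might look roughly like this. It is only a sketch: it is written as a free function with a made-up signature rather than as a member of Packet, it assumes 32-bit fields, and it writes the big-endian bytes by hand (same result as htonl followed by a copy) so that it needs no platform headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends v to out in network (big-endian) byte order.
void append_u32_be(std::vector<char>& out, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<char>(static_cast<unsigned char>((v >> shift) & 0xFF)));
}

// Builds the wire representation: two 32-bit fields, then the payload.
// The returned vector owns the memory and knows its own size.
std::vector<char> serialize(std::uint32_t sender_id,
                            std::uint32_t sequence_number,
                            const std::vector<char>& data) {
    std::vector<char> out;
    out.reserve(8 + data.size());
    append_u32_be(out, sender_id);
    append_u32_be(out, sequence_number);
    out.insert(out.end(), data.begin(), data.end());
    assert(out.size() == 8 + data.size());
    return out;
}
```

The caller can then pass the result's `data()` and `size()` to the sending routine, and nothing needs to be freed by hand.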

Charles Bailey
+1  A: 

I think you might want to do something like this:

struct PacketHeader 
{
    unsigned int senderId;
    unsigned int sequenceNum;
};

class Packet
{
    protected:
        PacketHeader header;
        std::vector<char> data;
    public:
        char* serialize(int& packetSize);
        void deserialize(const char* data,int dataSize);
};

char* Packet::serialize(int& packetSize)
{
    packetSize = this->data.size()+sizeof(PacketHeader);
    char* packetData = new char[packetSize];
    PacketHeader* packetHeader = (PacketHeader*)packetData;
    packetHeader->senderId = htonl(this->header.senderId);
    packetHeader->sequenceNum = htonl(this->header.sequenceNum);
    // Skip the whole header struct: sizeof(PacketHeader), not the size of the pointer
    char* packetBody = (packetData + sizeof(PacketHeader));
    for(size_t i=0 ; i<this->data.size() ; i++)
    {
        packetBody[i] = this->data.at(i);
    }
    return packetData;
}

void Packet::deserialize(const char* data,int dataSize)
{
    PacketHeader* packetHeader = (PacketHeader*)data;
    this->header.senderId = ntohl(packetHeader->senderId);
    this->header.sequenceNum = ntohl(packetHeader->sequenceNum);
    this->data.clear();
    for(int i=sizeof(PacketHeader) ; i<dataSize ; i++)
    {
        this->data.push_back(data[i]);
    }
}


This code does not include bounds checking and does not free the allocated data, so don't forget to delete[] the buffer returned from serialize(). You can also use memcpy instead of a loop that copies byte by byte into or from the std::vector.
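As a rough sketch of those two suggestions, the byte-by-byte loops could be replaced as shown here. `write_body` and `extract_body` are hypothetical helper names, and the header size is passed in so the sketch doesn't depend on a particular header layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Copies the payload into the packet buffer right after the header.
// The buffer must have room for header_size + body.size() bytes.
void write_body(char* packet, std::size_t header_size, const std::vector<char>& body) {
    assert(packet != nullptr);
    if (!body.empty())  // data() of an empty vector may be null
        std::memcpy(packet + header_size, body.data(), body.size());
}

// Copies everything after the header into body in one call.
// Returns false (the bounds check) when the packet is shorter than the header.
bool extract_body(const char* packet, std::size_t packet_size,
                  std::size_t header_size, std::vector<char>& body) {
    if (packet == nullptr || packet_size < header_size) return false;
    body.assign(packet + header_size, packet + packet_size);
    return true;
}
```

The range form of `assign` does the same work as the push_back loop, but lets the vector allocate once.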

Most compilers may add padding inside a structure, which causes problems if you send the struct's bytes as-is without disabling the padding. If you are using Visual Studio, you can disable it with #pragma pack(1).
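If you'd rather not rely on a compiler-specific pragma, one option is to at least check the layout at compile time. The sketch below assumes 32-bit fields; `Mixed` is a made-up struct that only exists to show where padding tends to appear, and the exact amount is up to the compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Made-up example: a compiler will usually insert padding after tag so
// that value is suitably aligned.
struct Mixed {
    char tag;
    std::uint32_t value;
};

// Header with two 32-bit fields (fixed-width types are an assumption here).
struct PacketHeader {
    std::uint32_t senderId;
    std::uint32_t sequenceNum;
};

// Fails the build if the compiler pads the header, instead of silently
// sending a different layout over the wire.
static_assert(sizeof(PacketHeader) == 8, "PacketHeader has unexpected padding");

// Number of padding bytes the compiler added to Mixed (platform dependent).
std::size_t padding_bytes_in_mixed() {
    return sizeof(Mixed) - (sizeof(char) + sizeof(std::uint32_t));
}
```

Writing each field into the buffer separately avoids the question entirely, since the struct's in-memory layout never reaches the wire.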

Disclaimer: I haven't actually compiled this code, so you might want to recheck it.

uray
Thanks Uray for the example, helped a lot!
Rev316