Is this the best way to make a variable sized struct in C++? I don't want to use vector because the length doesn't change after initialization.

struct Packet
{
    unsigned int bytelength;
    unsigned int data[];
};

Packet* CreatePacket(unsigned int length)
{
    Packet *output = (Packet*) malloc((length+1)*sizeof(unsigned int));
    output->bytelength = length;
    return output;
}

Edit: renamed variable names and changed code to be more correct.

+2  A: 

If you are truly doing C++, there is no practical difference between a class and a struct except the default member visibility - classes have private visibility by default while structs have public visibility by default. The following are equivalent:

struct PacketStruct
{
    unsigned int bitlength;
    unsigned int data[];
};
class PacketClass
{
public:
    unsigned int bitlength;
    unsigned int data[];
};

The point is, you don't need the CreatePacket(). You can simply initialize the struct object with a constructor.

struct Packet
{
    unsigned long bytelength;
    unsigned char data[];

    Packet(unsigned long length = 256)  // default constructor replaces CreatePacket()
      : bytelength(length),
        data(new unsigned char[length])
    {
    }

    ~Packet()  // destructor to avoid memory leak
    {
        delete [] data;
    }
};

A few things to note. In C++, use new instead of malloc. I've taken some liberty and changed bitlength to bytelength. If this class represents a network packet, you'll be much better off dealing with bytes instead of bits (in my opinion). The data array is an array of unsigned char, not unsigned int. Again, this is based on my assumption that this class represents a network packet. The constructor allows you to create a Packet like this:

Packet p;  // default packet with 256-byte data array
Packet p(1024);  // packet with 1024-byte data array

The destructor is called automatically when the Packet instance goes out of scope and prevents a memory leak.
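For the constructor approach to compile, data has to be a pointer rather than an array of unspecified length. Below is a minimal sketch of that variant, assuming C++11 (for = delete) and assuming you are willing to give up the contiguous layout, since the buffer now lives in a second allocation. The deleted copy operations are my addition, not part of the code above; without them two Packets could end up deleting the same buffer.

```cpp
#include <cstddef>

// Sketch only: data is a pointer here, so the buffer is a separate
// allocation and is no longer contiguous with bytelength.
struct Packet
{
    std::size_t bytelength;
    unsigned char* data;

    explicit Packet(std::size_t length = 256)
      : bytelength(length),
        data(new unsigned char[length]())  // () zero-initializes the buffer
    {
    }

    ~Packet()
    {
        delete [] data;
    }

    // Copying would leave two Packets owning the same buffer, so forbid it.
    Packet(const Packet&) = delete;
    Packet& operator=(const Packet&) = delete;
};
```

With this, Packet p; and Packet q(1024); both work and release their buffers automatically when they go out of scope.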

Matt Davis
You would be better off using a default value in the constructor and have a single one than having duplicated code with a default constructor
JProgrammer
Good point. I've been doing a lot of C# lately, which doesn't allow default parameters. I'll update the post.
Matt Davis
Your initialization of the data member isn't going to work, because it's not a pointer. If you change it to a pointer, you lose the contiguous layout of the length with the data, which I think is the OP's goal.
Mark Ransom
I say the opposite - don't ever use new[]. It's no more lightweight than vector, but you have to remember to delete it yourself. The C++ FAQ agrees with me: http://www.parashift.com/c++-faq-lite/containers.html
Jimmy J
+3  A: 

If you never add a constructor/destructor, assignment operators or virtual functions to your structure, using malloc/free for allocation is safe.

It's frowned upon in C++ circles, but I consider its use okay if you document it in the code.

Some comments to your code:

struct Packet
{
    unsigned int bitlength;
    unsigned int data[];
};

If I remember right, declaring an array without a length is non-standard. It works on most compilers but may give you a warning. If you want to be compliant, declare your array with length 1.

Packet* CreatePacket(unsigned int length)
{
    Packet *output = (Packet*) malloc((length+1)*sizeof(unsigned int));
    output->bitlength = length;
    return output;
}

This works, but you don't take the size of the structure into account. The code will break once you add new members to your structure. Better do it this way:

Packet* CreatePacket(unsigned int length)
{
    size_t s = sizeof (Packet) - sizeof (Packet.data);
    Packet *output = (Packet*) malloc(s + length * sizeof(unsigned int));
    output->bitlength = length;
    return output;
}

And write a comment into your packet structure definition that data must be the last member.

Btw - allocating the structure and the data with a single allocation is a good thing. You halve the number of allocations that way, and you improve the locality of data as well. This can improve the performance quite a bit if you allocate lots of packets.

Unfortunately C++ does not provide a good mechanism to do this, so you often end up with such malloc/free hacks in real-world applications.
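One way to compute the allocation size without taking sizeof of the array member is offsetof, which gives the byte offset at which data starts. This is only a sketch: PacketAllocSize and DestroyPacket are names I made up for illustration, and the data[] member relies on the compiler accepting flexible array members in C++ as an extension (GCC and Clang do by default).

```cpp
#include <cstddef>   // offsetof, std::size_t
#include <cstdlib>   // std::malloc, std::free

struct Packet
{
    unsigned int bitlength;
    unsigned int data[];   // compiler extension in C++; must stay the last member
};

// Bytes needed for the header plus `length` trailing elements.
// Adding members before `data` later does not break this computation.
std::size_t PacketAllocSize(unsigned int length)
{
    return offsetof(Packet, data) + length * sizeof(unsigned int);
}

Packet* CreatePacket(unsigned int length)
{
    Packet* output = static_cast<Packet*>(std::malloc(PacketAllocSize(length)));
    if (output != nullptr)          // malloc reports failure by returning null
        output->bitlength = length;
    return output;
}

// Hypothetical counterpart to CreatePacket: memory from malloc goes back via free.
void DestroyPacket(Packet* p)
{
    std::free(p);
}
```

Pairing CreatePacket with a dedicated release function keeps callers from mixing malloc'ed packets with delete.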

Nils Pipenbrinck
Hi, this seems to be the best solution, but my compiler (gcc on mingw) won't let me do sizeof(Packet.data). It will, however, let me do Packet test; sizeof(test.data);
Unknown
A: 

You should declare a pointer, not an array with an unspecified length.

Paul Nathan
But then you have two separately-managed chunks of memory. The OP is allocating the whole structure, array and all, in a single contiguous chunk.
bk1e
+5  A: 

This is OK (and was standard practice for C).

But this is not a good idea for C++.
This is because the compiler generates a whole set of other methods automatically for you around the class. These methods do not understand that you have cheated.

For Example:

void copyRHSToLeft(Packet& lhs,Packet& rhs)
{
    lhs = rhs;  // The compiler-generated assignment operator kicks in here.
                // Are your objects going to cope correctly?
}


Packet*   a = CreatePacket(3);
Packet*   b = CreatePacket(5);
copyRHSToLeft(*a,*b);

Use std::vector<>; it is much safer and works correctly.
I would also bet it is just as efficient as your implementation after the optimizer kicks in.

Alternatively boost contains a fixed size array:
http://www.boost.org/doc/libs/1_38_0/doc/html/array.html
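To illustrate the std::vector<> suggestion, here is a minimal sketch of the same copy scenario with the data held in a vector. The Packet layout and the length() helper are assumptions for illustration only; the point is that the compiler-generated assignment now copies the whole buffer and resizes the destination.

```cpp
#include <cstddef>
#include <vector>

struct Packet
{
    std::vector<unsigned int> data;

    explicit Packet(std::size_t length) : data(length) {}  // elements start at 0

    std::size_t length() const { return data.size(); }
};

void copyRHSToLeft(Packet& lhs, const Packet& rhs)
{
    lhs = rhs;  // vector's assignment reallocates and copies every element
}
```

After Packet a(3); Packet b(5); copyRHSToLeft(a, b); both packets hold five elements and own separate buffers.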

Martin York
The problem with the boost array template for this user is that the size for boost::array<> is determined at compile time.
Michael Burr
If I use vector though, won't that be non-contiguous from the length member?
Unknown
It may be. There is no guarantee of that. But why would you want to make sure of it?
Benoît
@unknown - You didn't mention that as a requirement in your question - you might want to clarify (and mention why you need the data to be contiguous with the length).
Michael Burr
Well, I'm not sure it needs to be a requirement, but it just makes more sense to me to make it contiguous. Doesn't it?
Unknown
No. If it is not a requirement, then expecting or requiring a specific memory layout is silly. Let the compiler work that out while you work on how to use the object correctly.
Martin York
He is worried about there being a second allocation: new vector<int>(50); will cause two allocations: one for the vector object and one for the array of 50 ints maintained by the vector object.
jmucchiello
+2  A: 

I'd probably just stick with using a vector<> unless the minimal extra overhead (probably a single extra word or pointer over your implementation) is really posing a problem. There's nothing that says you have to resize() a vector once it's been constructed.

However, there are several advantages to going with vector<>:

  • it already handles copy, assignment & destruction properly - if you roll your own you need to ensure you handle these correctly
  • all the iterator support is there - again, you don't have to roll your own.
  • everybody already knows how to use it

If you really want to prevent the array from growing once constructed, you might want to consider having your own class that inherits from vector<> privately or has a vector<> member and only expose via methods that just thunk to the vector methods those bits of vector that you want clients to be able to use. That should help get you going quickly with pretty good assurance that leaks and what not are not there. If you do this and find that the small overhead of vector is not working for you, you can reimplement that class without the help of vector and your client code shouldn't need to change.
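A rough sketch of the "has a vector<> member" variant might look like the following. FixedPacket and the particular set of forwarded methods are just an illustration of the idea; which parts of vector you expose is up to you.

```cpp
#include <cstddef>
#include <vector>

class FixedPacket
{
public:
    typedef std::vector<unsigned int>::iterator iterator;
    typedef std::vector<unsigned int>::const_iterator const_iterator;

    // The length is chosen once, here; nothing below can change it.
    explicit FixedPacket(std::size_t length) : data_(length) {}

    std::size_t size() const { return data_.size(); }

    unsigned int& operator[](std::size_t i) { return data_[i]; }
    const unsigned int& operator[](std::size_t i) const { return data_[i]; }

    iterator begin() { return data_.begin(); }
    iterator end() { return data_.end(); }
    const_iterator begin() const { return data_.begin(); }
    const_iterator end() const { return data_.end(); }

private:
    std::vector<unsigned int> data_;  // resize(), push_back() etc. stay hidden
};
```

Note that the compiler-generated assignment operator can still replace one FixedPacket's contents with another's, length included; if that matters, you would have to disable or restrict assignment as well.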

Michael Burr
+4  A: 

Some thoughts on what you're doing:

  • Using the C-style variable length struct idiom allows you to perform one free store allocation per packet, which is half as many as would be required if struct Packet contained a std::vector. If you are allocating a very large number of packets, then performing half as many free store allocations/deallocations may very well be significant. If you are also doing network accesses, then the time spent waiting for the network will probably be more significant.
  • This structure represents a packet. Are you planning to read/write from a socket directly into a struct Packet? If so, you probably need to consider byte order. Are you going to have to convert from host to network byte order when sending packets, and vice versa when receiving packets? If so, then you could byte-swap the data in place in your variable length struct. If you converted this to use a vector, it would make sense to write methods for serializing / deserializing the packet. These methods would transfer it to/from a contiguous buffer, taking byte order into account.
  • Likewise, you may need to take alignment and packing into account.
  • You can never subclass Packet. If you did, then the subclass's member variables would overlap with the array.
  • Instead of malloc and free, you could use Packet* p = static_cast<Packet*>(::operator new(size)) and ::operator delete(p), since struct Packet is a POD type and does not currently benefit from having its default constructor and its destructor called. The (potential) benefit of doing so is that the global operator new handles errors using the global new-handler and/or exceptions, if that matters to you.
  • It is possible to make the variable length struct idiom work with the new and delete operators, but not well. You could create a custom operator new that takes an array length by implementing static void* operator new(size_t size, unsigned int bitlength), but you would still have to set the bitlength member variable. If you did this with a constructor, you could use the slightly redundant expression Packet* p = new(len) Packet(len) to allocate a packet. The only benefit I see compared to using global operator new and operator delete would be that clients of your code could just call delete p instead of ::operator delete(p). Wrapping the allocation/deallocation in separate functions (instead of calling delete p directly) is fine as long as they get called correctly.
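The custom operator new from the last bullet could be sketched roughly as follows. This assumes a compiler that accepts the data[] flexible array member in C++ (an extension) and a constructor that cannot throw, which is why the sketch leaves out a matching placement operator delete.

```cpp
#include <cstddef>
#include <new>

struct Packet
{
    unsigned int bitlength;
    unsigned int data[];   // compiler extension in C++; must stay the last member

    // The constructor still has to record the length; operator new only
    // reserves the memory. It cannot throw, so no placement delete is provided.
    explicit Packet(unsigned int len) : bitlength(len) {}

    // Placement form: `size` is sizeof(Packet), `len` comes from new(len).
    static void* operator new(std::size_t size, unsigned int len)
    {
        return ::operator new(size + len * sizeof(unsigned int));
    }

    // Used by a plain `delete p`.
    static void operator delete(void* p)
    {
        ::operator delete(p);
    }
};
```

Clients would then write Packet* p = new(len) Packet(len); and later delete p;, with the length repeated as noted above.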
bk1e
+1  A: 

There are already many good thoughts mentioned here, but one is missing. Flexible array members are part of C99 and thus aren't part of C++. Some C++ compilers may provide this functionality, but there is no guarantee of that. If you find an acceptable way to use them in C++ but your compiler doesn't support them, you can perhaps fall back to the "classical" way.

quinmars
+1  A: 

You probably want something lighter than a vector for high performance. You also want to be very specific about the size of your packet to be cross-platform. But you don't want to worry about memory leaks either.

Fortunately the boost library did most of the hard part:

#include <cstring>               // std::memcpy
#include <boost/cstdint.hpp>     // boost::uint32_t
#include <boost/scoped_array.hpp>
#include <boost/shared_ptr.hpp>

struct packet
{
    boost::uint32_t _size;
    boost::scoped_array<unsigned char> _data;

    packet() : _size(0) {}

    explicit packet(boost::uint32_t s) : _size(s), _data(new unsigned char [s]) {}

    explicit packet(const void * const d, boost::uint32_t s) : _size(s), _data(new unsigned char [s])
    {
        // scoped_array is not a raw pointer, so hand memcpy the result of get()
        std::memcpy(_data.get(), d, _size);
    }
};

typedef boost::shared_ptr<packet> packet_ptr;

packet_ptr build_packet(const void * data, boost::uint32_t s)
{
    return packet_ptr(new packet(data, s));
}
Edouard A.
How is that lighter than std::vector? You've used more memory and put an extra level of indirection into it.
Jimmy J
No exceptions. No validation. You are certain that the memory you have is the memory actually allocated (vector may reserve more). The packet structure can evolve. Not vector. Which means you'd have to put a vector into the message in the end...
Edouard A.
+1  A: 

You can use the "C" method if you want but for safety make it so the compiler won't try to copy it:

struct Packet
{
    unsigned int bytelength;
    unsigned int data[];

private:
   // Will cause compiler error if you misuse this struct
   Packet(const Packet&);
   void operator=(const Packet&);
};
Jimmy J
A: 

There's nothing whatsoever wrong with using vector for arrays of unknown size that will be fixed after initialization. IMHO, that's exactly what vectors are for. Once you have it initialized, you can pretend the thing is an array, and it should behave the same (including time behavior).

T.E.D.