So I have a couple of structs...

struct myBaseStruct
{
};

struct myDerivedStruct : public myBaseStruct
{
    int a, b, c, d;
    unsigned char* ident;
};

myDerivedStruct* pNewStruct;

...and I want to dynamically allocate enough space so that I can 'memcpy' in some data, including a zero-terminated string. The size of the base struct is apparently '1' (I assume because it can't be zero) and the size of the derived is 20, which seems to make sense (5 x 4).

So, I have a data buffer which is a size of 29, the first 16 bytes being the ints and the remaining 13 being the string.

How can I allocate enough memory for pNewStruct so that there is enough for the string? Ideally, I just want to go:

  • allocate 29 bytes at pNewStruct;
  • memcpy from buffer into pNewStruct;

Thanks,

+1  A: 

You can dynamically allocate space by doing:

myDerivedStruct* pNewStruct = reinterpret_cast<myDerivedStruct*>(new char[size]);

However, are you sure you want to do this?

Also, note that if you intend to use ident as the pointer to the start of your string, that would be incorrect. You in fact need &ident: the ident variable itself sits at the start of your extra space, so interpreting the bytes stored there as a pointer is most likely going to be meaningless. Hence, it would make more sense if ident were unsigned char or char rather than unsigned char*.

[edit again] I'd just like to emphasise that what you're doing is a really, really bad idea.

Autopulated
Oh God please make the pain stop!
Nikola Smiljanić
Why is it a bad idea?
acron
Because: - You'll have to *always* do all of the memory allocation for these structures yourself, otherwise you'll end up corrupting memory. - The values of the fields of your structure when you memcpy into it will be platform dependent (most obviously they will depend on endianness and the size of int). - sizeof(myDerivedStruct) will be misleading. - Your code will be very confusing to read, and anyone who maintains it in future may not understand what you're doing, even if you do.
Autopulated
+1  A: 

Mixing memcpy and new seems like a terrible idea in this context. Consider using malloc instead.

Cory Petosky
I think this is probably the best idea. "D'oh."
acron
A: 
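const char* buffer = [some data here];
myDerivedStruct* pNewStruct = new myDerivedStruct();
// memcpy takes the destination first: copy the four ints into the struct.
memcpy(pNewStruct, buffer, 4 * sizeof(int));
// The string follows the ints; allocate strlen + 1 bytes to hold the terminator.
const char* str = buffer + 4 * sizeof(int);
pNewStruct->ident = new unsigned char[strlen(str) + 1];
memcpy(pNewStruct->ident, str, strlen(str) + 1);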

Something like that.

Bryan Ross
I would like to add the disclaimer that doing things this way is a Bad Idea(tm). If you're using C++, you should be using std::string.
Bryan Ross
STL isn't considered optimal for the platform I'm working on, unfortunately :/
acron
+4  A: 

In the current C++ standard, myDerivedStruct is non-POD, because it has a base class. The result of memcpying anything into it is undefined.

I've heard that C++0x will relax the rules, so that more classes are POD than in C++98, but I haven't looked into it. Also, I doubt that very many compilers would lay out your class in a way that's incompatible with PODs. I expect you'd only have trouble with something that didn't do the empty base class optimisation. But there it is.

If it were POD, or if you're willing to take your chances with your implementation, then you could use malloc(sizeof(myDerivedStruct)+13) or new char[sizeof(myDerivedStruct)+13] to allocate enough space, basically the same as you would in C. The motivation presumably is to avoid the memory and time overhead of just putting a std::string member in your class, but at the cost of having to write the code for the manual memory management.
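For readers on a C++11 or later compiler, the POD question can be checked directly with type traits rather than argued from the standard text. The following is only a sketch under that assumption (the answer above was written against C++98/03, where no such traits exist); the struct definitions are copied from the question.

```cpp
#include <type_traits>

struct myBaseStruct {};

struct myDerivedStruct : public myBaseStruct {
    int a, b, c, d;
    unsigned char* ident;
};

// memcpy into or out of an object is only well-defined for trivially
// copyable types.
static_assert(std::is_trivially_copyable<myDerivedStruct>::value,
              "myDerivedStruct must be trivially copyable to memcpy into it");

// Standard-layout types have a predictable member order and may be used
// with offsetof. An empty base class does not break this in C++11.
static_assert(std::is_standard_layout<myDerivedStruct>::value,
              "myDerivedStruct must be standard-layout");
```

If either assertion fails on a given toolchain, the build stops instead of the program silently relying on undefined behaviour. Note that passing these checks says nothing about whether the bytes in the incoming buffer match the compiler's padding, int size, or endianness.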

Steve Jessop
Off the top of my head, I don't think this is true. It has no virtual functions or custom default constructor, so I think it is POD.
Autopulated
Off the top of my head, 9/4: "a POD-struct is an aggregate class", and 8.5.1/1:"An aggregate is an array or class with ... no base classes". OK, I lied about it being off the top of my head ;-)
Steve Jessop
Okay, Spec wins :D
Autopulated
@Steve: Then why do compilers (GCC) that complain about certain operations on not-POD classes not complain on simple derived structs?
Zan Lynx
@Steve: In particular I can declare arrays of them to overlay mmap()'d data without triggering default constructors.
Zan Lynx
I don't know anything about that particular GCC warning, so I can't explain it, sorry. Since you say the warning is about default constructors, I would speculate that perhaps it has something to do with the fact that myDerivedStruct's constructor still does nothing on GCC. If you use `offsetof` with myDerivedStruct, then you do get a warning from GCC, but it gives the "right" answer anyway.
Steve Jessop
+6  A: 

You go back to C or abandon these ideas and actually use C++ as it's intended.

  • Use the constructor to allocate memory and destructor to delete it.
  • Don't let some other code write into your memory space, create a function that will ensure memory is allocated.
  • Use a std::string or std::vector to hold the data rather than rolling your own container class.

Ideally you should just say:

myDerivedClass* foo = new myDerivedClass(a, b, c, d, ident);
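As a rough illustration of that approach, the class below owns its string through std::string, and a separate function unpacks the question's buffer (four ints followed by a zero-terminated string). The function name parse and its validation rules are assumptions made for this sketch, not something the answer specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>

class myDerivedClass {
public:
    myDerivedClass(int a, int b, int c, int d, const std::string& ident)
        : a(a), b(b), c(c), d(d), ident(ident) {}

    int a, b, c, d;
    std::string ident;  // owns its own memory; no manual delete needed
};

// Hypothetical helper: unpacks four ints and a zero-terminated string.
myDerivedClass parse(const unsigned char* buffer, std::size_t size) {
    assert(buffer != nullptr);
    const std::size_t header = 4 * sizeof(int);
    // Reject buffers with no room for a string or without a terminator.
    if (size <= header || buffer[size - 1] != '\0')
        throw std::invalid_argument("buffer too short or string not terminated");

    int fields[4];
    std::memcpy(fields, buffer, header);  // copy the ints out of the packed data
    return myDerivedClass(fields[0], fields[1], fields[2], fields[3],
                          reinterpret_cast<const char*>(buffer + header));
}
```

The cost compared with the single-allocation approach is one extra allocation and copy for the string (possibly none for short strings on implementations with a small-string optimisation).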

jmucchiello
I could do this, but it breaks down efficiency massively. Also, the incoming buffer has the data packed nicely - what is the point in splitting it all up only to reassemble it? All I want to do is copy the data from the buffer and give it some context using my struct.
acron
@acron: Ask yourself if efficiency matters *so much* that you can't afford to write reliable, readable C++ code. If you determine that it does, consider using C, as suggested.
Steve S
+1  A: 

You can allocate any size you want with malloc:

myDerivedStruct* pNewStruct = (myDerivedStruct*) malloc(
      sizeof(myDerivedStruct) + sizeof_extra_data);

You have a different problem though, in that myDerivedStruct::ident is a very ambiguous construct. It is a pointer to a char (array), so does the struct end with the address where the char array starts? ident can point anywhere, and it is very ambiguous who owns the array ident points to. It seems to me that you expect the struct to end with the actual char array itself, and the struct to own the extra array. Such structures usually have a size member to keep track of their own size so that API functions can properly manage and copy them, and the extra data starts, by convention, after the structure ends. Or they end with a 0-length array, char ident[0], although that creates problems with some compilers. For many reasons, there is no place for inheritance in such structs:

struct myStruct
{
    size_t size;
    int a, b, c, d;
    char ident[0];
};
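The "size member plus data after the struct" convention can be sketched without a zero-length array at all. In the hedged example below, SizedRecord, trailing_data, create_record and clone_record are all hypothetical names invented for illustration; the point is only that the stored size lets generic code copy the whole object without knowing the string length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Fixed-size header; the string is stored directly after it by convention.
struct SizedRecord {
    std::size_t size;  // total allocation in bytes: header plus trailing data
    int a, b, c, d;
};

// Returns the first byte past the header, where the extra data starts.
inline char* trailing_data(SizedRecord* r) {
    return reinterpret_cast<char*>(r + 1);
}

// Allocates header and string in one malloc block; release with free().
inline SizedRecord* create_record(int a, int b, int c, int d, const char* ident) {
    assert(ident != nullptr);
    const std::size_t len = std::strlen(ident) + 1;  // include the terminator
    const std::size_t total = sizeof(SizedRecord) + len;
    SizedRecord* r = static_cast<SizedRecord*>(std::malloc(total));
    if (r == nullptr) return nullptr;
    r->size = total;
    r->a = a; r->b = b; r->c = c; r->d = d;
    std::memcpy(trailing_data(r), ident, len);
    return r;
}

// Copies the whole object using only the size stored in the header.
inline SizedRecord* clone_record(const SizedRecord* r) {
    SizedRecord* copy = static_cast<SizedRecord*>(std::malloc(r->size));
    if (copy != nullptr) std::memcpy(copy, r, r->size);
    return copy;
}
```

Because the header is built field by field here, this variant does not depend on the incoming buffer matching the struct's in-memory layout.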
Remus Rusanu
Array declarations of size 0 are ill-formed in C++ (as well as in C, BTW). *All* compilers have problems with it, not just some.
AndreyT
Yeah, so based on the assumption that each string will have at least 1 char, I have used 'char ident;'. This method has worked.
acron
@AndreyT: GNU C allows that 0-length array, as an extension. Not sure whether that means it "has problems with it". It forbids it in pedantic mode. C99 allows a flexible-length array at the end of a struct, `char ident[]`, which is intended for this purpose.
Steve Jessop
Old compilers tolerated zero-length arrays for the 'struct hack'. Old system headers were full of structs ending in 0-length arrays. Newer compilers explicitly forbid it, and some accept the new syntax of a sizeless array. What 'old' and 'new' mean is relative I guess... This is also discussed at http://stackoverflow.com/questions/627364/zero-length-arrays-vs-pointers
Remus Rusanu
Since he's using a 0-terminated string in his struct, he should set the array size to 1. Then he doesn't need the +1 after the strlen() call. Regardless, this is not a C++ solution at all.
jmucchiello
@Steve Jessop: I know that GCC allows it. Yet the question is about C, and not about GCC. In C99 it is `[]`, not `[0]`, which is not the same.
AndreyT
@Remus Rusanu: "Struct hack" does not require a 0-length array. An array of any length will work just as well. The "correct" form of struct hack normally uses an array of size 1, as I used in my answer. Using arrays of size 0 for struct hack is a typical sign of "dirty" code. Some people are known to use size 0 since it simplifies (not really) the total size calculation, but I don't accept this as an excuse. Once again, canonical "struct hack" uses an array of size 1, not 0.
AndreyT
@acron: Huh? You said you used `char ident;`. Such an `ident` is *not an array*. You can't use `ident[0]` or `ident[n]` for this reason. What you are saying makes no sense. Please, clarify.
AndreyT
A: 

Is the buffer size known at compile time? A statically allocated array would be an easier solution in that case. Otherwise, see Remus Rusanu's answer above. That's how the Win32 API manages variable-sized structs.

struct myDerivedStruct : public myBaseStruct
{
    int a, b, c, d;
    unsigned char ident[BUFFER_SIZE];
};
christopher_f
A: 

Firstly, I don't get the point of having a myBaseStruct base. You provided no explanation.

Secondly, what you declared in your original post will not work with the data layout you described. For what you described in the OP, you need the last member of the struct to be an array, not a pointer:

struct myDerivedStruct : public myBaseStruct {
    int a, b, c, d;
    unsigned char ident[1];
};

Array size doesn't matter, but it should be greater than 0. Arrays of size 0 are explicitly illegal in C++.

Thirdly, if you for some reason want to use new specifically, you'll have to allocate a buffer of char objects of the required size and then convert the resulting pointer to your pointer type:

char *raw_buffer = new char[29];
myDerivedStruct* pNewStruct = reinterpret_cast<myDerivedStruct*>(raw_buffer);

After that you can do your memcpy, assuming that the size is right.
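Putting the three points together, a minimal sketch might look like the following. The helper names load and unload are hypothetical, and the static_assert (C++11) encodes an assumption the answer leaves implicit: that the compiler places ident directly after the four ints, which relies on the empty base class optimisation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct myBaseStruct {};

struct myDerivedStruct : public myBaseStruct {
    int a, b, c, d;
    unsigned char ident[1];  // first byte of the string; the rest follows in the buffer
};

// The packed data only lines up if ident sits right after the four ints.
static_assert(offsetof(myDerivedStruct, ident) == 4 * sizeof(int),
              "unexpected layout: packed buffer would not match the struct");

// Copies `size` bytes of packed data into a new char array viewed as the struct.
myDerivedStruct* load(const void* data, std::size_t size) {
    assert(data != nullptr);
    // Allocate at least sizeof(myDerivedStruct) so every member lies inside the buffer.
    const std::size_t alloc =
        size < sizeof(myDerivedStruct) ? sizeof(myDerivedStruct) : size;
    char* raw_buffer = new char[alloc];
    std::memcpy(raw_buffer, data, size);
    return reinterpret_cast<myDerivedStruct*>(raw_buffer);
}

// The memory was obtained as char[], so it has to be released as char[].
void unload(myDerivedStruct* p) {
    delete[] reinterpret_cast<char*>(p);
}
```

Calling plain delete on the myDerivedStruct pointer would not match the new char[] allocation, which is one of the ways this technique forces all allocation and deallocation to go through dedicated helpers.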

AndreyT
+1  A: 

You can overallocate for any class instance, but it implies a certain amount of management overhead. The only valid way to do this is by using a custom memory allocation call. Without changing the class definition, you can do this.

void* pMem = ::operator new(sizeof(myDerivedStruct) + n);
myDerivedStruct* pObject = new (pMem) myDerivedStruct;

Assuming that you don't overload operator delete in the hierarchy then delete pObject will be a correct way to destroy pObject and deallocate the allocated memory. Of course, if you allocate any objects in the excess memory area then you must correctly free them before deallocating the memory.

You then have access to n bytes of raw memory at this address: void* p = pObject + 1. You can memcpy data to and from this area as you like. You can assign to the object itself and shouldn't need to memcpy its data.

You can also provide a custom memory allocator in the class itself that takes an extra size_t describing the amount of excess memory to allocate enabling you to do the allocation in a single new expression, but this requires more overhead in the class design.

myDerivedStruct* pObject = new (n) myDerivedStruct;

and

struct myDerivedStruct
{
    // ...
    void* operator new(std::size_t objsize, std::size_t excess_storage);

    // other operator new and delete overrides to make sure that you have no memory leaks
};
Charles Bailey