views:

299

answers:

11

I'm using C++, and I have the following structures:

struct ArrayOfThese {
  int a;
  int b;
};

struct DataPoint {
  int a;
  int b;
  int c;
};

In memory, I want to have 1 or more ArrayOfThese elements at the end of each DataPoint. There are not always the same number of ArrayOfThese elements per DataPoint.

Because I have a ridiculous number of DataPoints to assemble and then stream across a network, I want all my DataPoints and their ArrayOfThese elements to be contiguous. Wasting space for a fixed number of the ArrayOfThese elements is unacceptable.

In C, I would have made an element at the end of DataPoint that was declared as ArrayOfThese d[0];, allocated a DataPoint plus enough extra bytes for however many ArrayOfThese elements I had, and used the dummy array to index into them. (Of course, the number of ArrayOfThese elements would have to be in a field of DataPoint.)

In C++, is using placement new and the same 0-length array hack the correct approach? If so, does placement new guarantee that subsequent calls to new from the same memory pool will allocate contiguously?

A: 

Could you make those into classes with the same superclass and then use your favourite stl container of choice, using the superclass as the template?

ldog
Don't do this unless you want to learn about slicing. You can store pointers to a superclass in an STL container, but if you start trying to store subclasses in a container specialized for the superclass, you'll end up losing any subclass information.
Eclipse
+1  A: 

Prior to C++0X, the language had no memory model to speak of. And with the new standard, I don't recall any talk of guarantees of contiguity.

Regarding this particular question, it sounds as if what you want is a pool allocator, many examples of which exist. Consider, for instance, Modern C++ Design, by Alexandrescu. The small object allocator discussion is what you should look at.

Don Wakefield
+1  A: 

I think boost::variant might accomplish this. I haven't had an opportunity to use it, but I believe it's a wrapper around unions, and so a std::vector of them should be contiguous, but of course each item will take up the larger of the two sizes, you can't have a vector with differently-sized elements.

Take a look at the comparison of boost::variant and boost::any.

If you want the offset of each element to be dependent on the composition of the previous elements, you will have to write your own allocator and accessors.

Tim Sylvester
+1 I was going to say if there isnt a problem with having some of the data outside of the struct that vector would be a great choice.
acidzombie24
+5  A: 

Since you are dealing with plain structures that have no constructors, you could revert to C memory management:

void *ptr = malloc(sizeof(DataPoint) + n * sizeof(ArrayOfThese));
DataPoint *dp = reinterpret_cast<DataPoint *>(ptr));
ArrayOfThese *aotp = reinterpet_cast<ArrayOfThese *>(reintepret_cast<char *>(ptr) + sizeof(DataPoint));
R Samuel Klatchko
His *ArrayOfThese* is the element, not an array, so you need _ArrayOfThese*_, I think.
NVRAM
.. but only in the last line.
NVRAM
NVRAM spotted a coding error (his comment will now seem wrong as I corrected my code, but it was correct when he wrote it).
R Samuel Klatchko
+1  A: 

Seems like it would be simpler to allocate an array of pointers and work with that rather than using placement new. That way you could just reallocate the whole array to the new size with little runtime cost. Also if you use placement new, you have to explicitly call destructors, which means mixing non-placement and placement in a single array is dangerous. Read http://www.parashift.com/c++-faq-lite/dtors.html before you do anything.

Jherico
+3  A: 

Since your structs are PODs you might as well do it just as you would in C. The only thing you'll need is a cast. Assuming n is the number of things to allocate:

DataPoint *p=static_cast<DataPoint *>(malloc(sizeof(DataPoint)+n*sizeof(ArrayOfThese)));

Placement new does come into this sort of thing, if your objects have a a non-trivial constructor. It guarantees nothing about any allocations though, for it does no allocating itself and requires the memory to have been already allocated somehow. Instead, it treats the block of memory passed in as space for the as-yet-unconstructed object, then calls the right constructor to construct it. If you were to use it, the code might go like this. Assume DataPoint has the ArrayOfThese arr[0] member you suggest:

void *p=malloc(sizeof(DataPoint)+n*sizeof(ArrayOfThese));
DataPoint *dp=new(p) DataPoint;
for(size_t i=0;i<n;++i)
    new(&dp->arr[i]) ArrayOfThese;

What gets constructed must get destructed so if you do this you should sort out the call of the destructor too.

(Personally I recommend using PODs in this sort of situation, because it removes any need to call constructors and destructors, but this sort of thing can be done reasonably safely if you are careful.)

brone
A: 

Two questions:

  1. Is the similarity between ArrayOfThese and DataPoint real, or a simplification for posting? I.e. is the real difference just one int (or some arbitrary number of the same type of items)?
  2. Is the number of ArrayOfThese associated with a particular DataPoint known at compile time?

If the first is true, I'd think hard about simply allocating an array of as many items as necessary for one DataPoint+N ArrayOfThese. I'd then build a quick bit of code to overload operator[] for that to return item N+3, and overload a(), b() and c() to return the first three items.

If the second is true, I was going to suggest essentially what I see fnieto has just posted, so I won't go into more detail.

As far as placement new goes, it doesn't really guarantee anything about allocation -- in fact, the whole idea about placement new is that it's completely unrelated to memory allocation. Rather, it allows you to create an object at an arbitrary address (subject to alignment restrictions) in a block of memory that's already allocated.

Jerry Coffin
The similarity was just a simplification, and the numbers of ArrayOfThese are not known at compile time.
Jim Hunziker
+1  A: 

don't confuse data organisation inside your program and data organisation for serialization: they do not have the same goal.

for streaming across a network, you have to consider both side of the channel, the sending and the receiving side: how does the receiving side differentiate between a DataPoint and an ArrayOfThese ? how does the receiving side know how many ArrayOfThese are appended after a DataPoint ? (also to consider: what is the byte ordering of each side ? does data types have the same size in memory ?)

personally, i think you need a different structure for streaming your data, in which you add the number of DataPoint you are sending as well as the number of ArrayOfThese after each DataPoint. i would also not care about the way data is already organized in my program and reorganize/reformat to suit my protocol and not my program. after that writing a function for sending and another for receiving is not a big deal.

Adrien Plisson
+1  A: 

Why not have DataPoint contain a variable-length array of ArrayOfThese items? This will work in C or C++. There are some concerns if either struct contains non-primitive types

But use free() rather than delete on the result:

struct ArrayOfThese {
  int a;
  int b;
};


struct DataPoint {
  int a;
  int b;
  int c;
  int length;
  ArrayOfThese those[0];
};

DataPoint* allocDP(int a, int b, int c, size_t length)
{
    // There might be alignment issues, but not for most compilers:
    size_t sz = sizeof(DataPoint) + length * sizeof(ArrayOfThese);
    DataPoint dp = (DataPoint*)calloc( sz );
    // (Check for out of memory)
    dp->a = a; dp->b = b; tp->c = c; dp->length = length;
}

Then you can use it "normally" in a loop where the DataPoint knows its length:

DataPoint *dp = allocDP( 5, 8, 3, 20 );

for(int i=0; i < dp->length; ++i)
{
    // Initialize or access: dp->those[i]
}
NVRAM
BTW, the concern with non-primitive types is the lack of CTOR/DTOR use vs. alloc/free. Those can be addressed, but it's non-trivial.
NVRAM
This answer is the C way of doing it that I mentioned in the question.
Jim Hunziker
+2  A: 

As Adrian said in his answer, what you do in memory doesn't have to be the same as what you stream over the network. In fact, it might even be good to clearly divide this, because having a communication protocol relying on your data being designed in a specific way makes huge problem if you later need to refactor your data.

The C++ way to store an arbitrary number of elements contiguously is of course to std::vector. Since you didn't even consider this, I assume that there's something that makes this undesirable. (Do you only have small numbers of ArrayOfThese and fear the space overhead associated with std::vector?)

While the trick with over-allocating a zero-length array probably isn't guaranteed to work and might, technically, invoke the dreaded undefined behavior, it's a widely spread one. What platform are you on? On Windows, this is done in the Windows API, so it's hard to imagine a vendor shipping a C++ compiler which wouldn't support this.

If there's a limited number of possible ArrayOfThese element counts, you could also use fnieto's trick to specify those few numbers and then new one of the resulting template instances, depending on the run-time number:

struct DataPoint {
  int a;
  int b;
  int c;
};

template <std::size_t sz>
struct DataPointWithArray : DataPoint {
  ArrayOfThese array[sz];
};

DataPoint* create(std::size_t n)
{
  switch(n) {
    case 1: return new DataPointWithArray[1];
    case 2: return new DataPointWithArray[2];
    case 5: return new DataPointWithArray[5];
    case 7: return new DataPointWithArray[7];
    case 27: return new DataPointWithArray[27];
    default: assert(false);
  }
  return NULL;
}
sbi
`vector` doesn't work well with variably sized elements. See http://stackoverflow.com/questions/1626846/how-do-i-allocate-variably-sized-structures-contiguously-in-memory/1626884#1626884
Jim Hunziker
@Jim: But `ArrayOfThese` isn't variably-sized, is it? And that answer you linked to dealt with polymorphic classes, so it's irrelevant.
sbi
I think you meant to write things like `new DataPointWithArray<27>`. But this is allocating data from the heap at random locations in memory. Besides having to stream this over a network, I need to access tons of these in sequence during processing, so having locality of reference is important to me.
Jim Hunziker
A: 

Here's the code I ended up writing:

#include <iostream>
#include <cstdlib>
#include <cassert>
using namespace std;

struct ArrayOfThese {
  int e;
  int f;
};

struct DataPoint {
  int a;
  int b;
  int c;
  int numDPars;
  ArrayOfThese d[0];

  DataPoint(int numDPars) : numDPars(numDPars) {}

  DataPoint* next() {
    return reinterpret_cast<DataPoint*>(reinterpret_cast<char*>(this) + sizeof(DataPoint) + numDPars * sizeof(ArrayOfThese));
  }

  const DataPoint* next() const {
    return reinterpret_cast<const DataPoint*>(reinterpret_cast<const char*>(this) + sizeof(DataPoint) + numDPars * sizeof(ArrayOfThese));
  }
};

int main() {
  const size_t BUF_SIZE = 1024*1024*200;

  char* const buffer = new char[BUF_SIZE];
  char* bufPtr = buffer;

  const int numDataPoints = 1024*1024*2;
  for (int i = 0; i < numDataPoints; ++i) {
    // This wouldn't really be random.
    const int numArrayOfTheses = random() % 10 + 1;

    DataPoint* dp = new(bufPtr) DataPoint(numArrayOfTheses);

    // Here, do some stuff to fill in the fields.
    dp->a = i;

    bufPtr += sizeof(DataPoint) + numArrayOfTheses * sizeof(ArrayOfThese);
  }

  DataPoint* dp = reinterpret_cast<DataPoint*>(buffer);
  for (int i = 0; i < numDataPoints; ++i) {
    assert(dp->a == i);
    dp = dp->next();
  }

  // Here, send it out.

  delete[] buffer;

  return 0;
}
Jim Hunziker