views:

334

answers:

3

I've been going deeper into C++ recently and my bugs seem to get complex.

I have a vector of objects, each object contains a vector of floats. I decided I needed to create a further flat array containing all the float values of all objects in one. It's a little more complex than that but the gist of the problem is that as I loop through my objects extracting the float values, at some point my vector of objects is changed, or corrupted in some strange way. (My read operations are all const functions)

Another example was with MPI. I was just getting started so I just wanted to run the exact same code on two different nodes with their own memory and with no data transfer happening, all very simple. To my surprise I got segmentation errors and after hours tracking, I found that one assignment of one variable was setting an entirely different variable to NULL.

So I am curious, how is it possible that read operations can affect my data structures. Similarly how can a seemingly unrelated operation affect another. I couldn't expect solutions to my problems with those brief descriptions but any advice will be greatly appreciated.

Update: Here's a segment of the code, I didn't post originally because I am not sure how much can be extracted from it without understanding the whole system.

One thing I just found out though was that when I stopped assigning the value to my flat array and just cout'ed instead, the seg errors disappeared. So perhaps I am declaring my array wrong, but even if I was I'm not sure how it would affect the object vector.

void xlMasterSlaveGpuEA::FillFlatGenes() {
    int stringLength = pop->GetGenome(0).GetLength();
    for (int i=0;i<pop->GetPopSize();i++)
        for (int j=0;j<stringLength;j++)
            flatGenes[(i*stringLength)+j]<< pop->GetGenome(i).GetFloatGene(j);
}

float xlVectorGenome::GetFloatGene(unsigned int i) const {
    return GetGene(i);
}

my flat array is a member function

float * flatFitness;

initailsed in the constructor like so:

flatFitness = new float(popSize);

Update 2:

I just want to point out that the two examples above are not related, the first one is not multi threaded. The second MPI example is technically, but MPI is distributed memory and I deliberately attempted the most simple implementation I could think of, which is both machines running code independently. There is however one extra detail, I put in a condtional saying

if node 1 then do bottom half of loop

if node 1 then do top half

Again the memory should be isolated, they should be working as if they know nothing about each other.. but removing this conditional and making both loops do all cubes, eliminates the error

A: 

I suspect you have multi-threading or memory corruption issues you are not aware of. The behavior you describe isn't any sort of standard, by-design, desirable behavior.

jeffamaphone
Yes, this sounds like an almost textbook case of missing memory barriers and insufficient inter-thread synchronisation. It only takes one thread to be updating the data object at some point after it becomes visible to another thread, and lack of synchronisation will bite you sooner or later.
Bill Michell
Maybe, except he never said he was using threads. MPI is process-level parallelism unless you combine it with something else.
tgamblin
Heh, there was no code posted when I responded.
jeffamaphone
A: 

jeffamaphone may be right that this is a threading issue. Another possibility is that the objects that you're reading have already been deleted. You would then be reading from an invalid address. It's also possible that the data structures you're writing into at this time are stored at the same location that the vectors formerly occupied. This would result in the behavior you're describing.

EDIT (based on your update):

This may be faulty: stringLength is initialized outside of the outer loop, but it looks like it needs to be updated during that outer loop:

int stringLength = pop->GetGenome(0).GetLength();
for (int i=0;i<pop->GetPopSize();i++)
    for (int j=0;j<stringLength;j++)
        flatGenes[(i*stringLength)+j]<< pop->GetGenome(i).GetFloatGene(j);

Suggested fix:

for (int i=0;i<pop->GetPopSize();i++) {
    int stringLength = pop->GetGenome(i).GetLength();
    for (int j=0;j<stringLength;j++) {
        flatGenes[(i*stringLength)+j]<< pop->GetGenome(i).GetFloatGene(j);
    }
}
Dan Breslau
+13  A: 

This is not an array constructor:

float * flatFitness;
flatFitness = new float(popSize);

You're creating one float on the heap here, initialized with value popSize. If you want an array of floats you need to use brackets instead of parentheses:

float *flatFitness = new float[popSize];

This could easily be causing the problems you describe. Also, remember when you create arrays, you need to delete using delete [] (eventually):

delete [] flatFitness;

If you just use delete, it might work, but the behavior is undefined.

If you want to avoid using array syntax altogether, why not use std::vector? You can create a vector of popSize elements like this:

#include <vector>

std::vector<float> flatFitness(popSize);

This will be freed automatically when it falls out of scope, so you don't have to worry about new or delete.

Update (re: comment): If you're already using std::vectors elsewhere in your code, take a look at std::vector::swap(). You may be able to avoid copying things altogether and just swap a couple vectors back and forth between buffering for CUDA and the processing you're doing here.

tgamblin
thanks, i'll check that out
zenna
funnily enough I'm extracting the data from std::vectors as I need to send it to the GPU through CUDA. There's probably a much more elegant way
zenna
Take a look at vector::swap(). You might be able to instantiate some vectors here, then swap them with the ones you use for CUDA so that you don't have to do any copying at all. Just make sure they're the right size using resize() or the constructor (as above), or you might end up writing to memory you don't have.
tgamblin