tags:

views:

282

answers:

4

We recently received a report that our application will occasionally fail to run. I tracked down the problem code to this:

struct ARRAY2D
{
   long[] col;
}

int numRows = 800000;
int numCols = 300;
array = (ARRAY2D*) malloc(numRows * numCols * sizeof(long))

This allocation of 800 Mb can fail if the user doesn't have a large enough free block. What is the best way to change how I allocate the memory?

Keep in mind that I have a large amount of code that accesses this object like this: array[row].col[colNum], so I need something that requires minor or primarily find & replace editing of the array access code.

+7  A: 

Will there be a lot of default values in your ARRAY2D? If yes, you need a sparse array. The minimal change would be to use an unordered_map (or hash_map or map):

static const int numRows = 800000;
static const int numCols = 300;

struct ARRAY2D {
  long col[numCols];
  // initialize a column to zero; not necessary.
  ARRAY2D() { memset(col, 0, sizeof(col)); }
};


// no need to malloc
std::unordered_map<int, ARRAY2D> array;
...
// accessing is same as before ...
array[1204].col[212] = 4423;
printf("%d", array[1204].col[115]);
...
// no need to free.

If the row indices are always continuous but much less than numRows, use a std::vector instead.

std::vector<ARRAY2D> array;
...
// resize to the approach value.
array.resize(2000);
...
// accessing is same as before ...
array[1204].col[212] = 4423;
printf("%d", array[1204].col[115]);
...
// no need to free.
KennyTM
There will be default values in certain parts of the array. Sometimes there were be less rows than numRows and less columns than numCols.
Brian
OP has tagged with C and C++, so I'd assume he is looking for solution usable in C as well
mloskot
Then why did he tag it C++?
Andreas Bonini
@Brian: Having less rows and columns than the maximum is fine because an unordered map will use more memory only when the number of entries increases.
KennyTM
@Andreas Bonini presumably to indicate he is using C features only, elements of C standard library.
mloskot
@mloskot: I find this unlikely
Andreas Bonini
@mloskot: Even if OP restricts to C there is still lots of hash table libraries for C.
KennyTM
If STL is the best choice, then I can use it.
Brian
@Andreas Bonini I'm curious, just a guess or have read it from the question itself? I don't have access to my magic crystal ball, so I can only read words that are written. It says: C and C++ (see tags)
mloskot
If I'd use an unordered map for the columns, should I also use it for the rows?
Brian
@Brian: Sorry. How will you access the 2D array again? `array.col[colNum]` is syntax error. Do you mean `array[rowNum].col[colNum]`?
KennyTM
@KennyTM - yes you're correct
Brian
@Brian: See update.
KennyTM
+6  A: 

You can allocate smaller chunks of memory separately, instead of one huge block.

long** array = NULL;  
array = (long**) malloc(numCols * sizeof(long*));  
for (int i = 0; i < numCols; i++)  
   array[i] = (long*)  malloc(numRows * sizeof(long));

Generally, memory allocation may fail, every allocation. However, let's say statistically, due to memory fragmentation, allocating a single large block of memory has higher chance to fail more often than allocating N number of smaller blocks. Although, also the solution above may cause problems as it is a bit like a double-bladed sword because it may lead to further memory fragmentation.

In other words, there is no generally perfect answer and solution depends on details of a system and application.

As from the comments it seems C++ library is a possibility, then solution based on std::vector (i.e. generic vector of vectors in C++) or using Boost.MultiArray

mloskot
I'm not sure I understand your code snippet. The allocation can still fail, right?
batbrat
@barbrat I updated my answer to address your question
mloskot
A: 

Your code has syntax errors: you are missing a semicolon and long[] col; is invalid C or C++.

Given:

struct ARRAY2D
{
   long *col;
};
ARRAY2D *array;
int numRows = 800000;
int numCols = 300;
array = (ARRAY2D*) malloc(numRows * numCols * sizeof(long));

you are potentially allocating the wrong amount of memory: sizeof(long) should be replaced by sizeof *array, or sizeof(ARRAY2D).

Assuming you got the right amount, you can index your array as: array[i], for i in the range [0, numRows*numCols). You haven't allocated any memory for col members of any of the array[i], so you can't index into col of any of those. Your use of array[row].col[colNum] is therefore wrong given the allocation scheme you have posted.

Perhaps it would help if you posted some real code that works.

Alok
A: 
sambowry