views:

57

answers:

4

I'm trying to create a matrix data structure in C. I have a struct and have a two dimensional void pointer array (size is dynamically defined at the heap) for the cargo part(data) in this struct.

Given a column index, I want to get the values of this column in a one dimensional array. It is easy to this with one for or while loop. But if the number of rows in this matrix is N, then it'll take O(N) time for getting a column vector. Can I do this more efficiently with memory operations like memcpy and how? Otherwise how can I improve the performance(My data is pretty structured and I need to store this in some kind of matrix).

+4  A: 

If the number of rows in a column is N, you cannot copy, read, or otherwise manipulate the whole column in less time than O(N). This is a firm lower bound; every element must be considered, and there are N of them.

So no, you cannot make it faster than O(N).

Do note that saying x[3][5] is translated by the compiler to x+((3*num_cols)+5)*size_of_element for known-sized 2D arrays. One way to make your array faster would, thus, be to remove its dynamic sizing.

Another important point is that sequential access to memory is not always the fastest - so just rotating your array ninety degrees would not necessarily give you the best results. Look into blocking as an optimization technique. Bottom line: what memory layout is best depends on both your access patterns and hardware parameters such as cache line length and cache size.

Borealid
+1  A: 

As Borealid says, you can't improve on O(N). You could, however speed up the copying operation if you re-order your data so that rows are columns and columns are rows. This would allow you to use memcpy to duplicate the data.

Jon Cage
memcpy takes O(N) time too.
JeremyP
Yep JeremyP is right memcpy is O(N) too:http://google.com/codesearch/p?hl=en#lIRf952n7hs/libc/memcpy.c
systemsfault
Agreed, but it's probably more efficient than looping through the array ...probably ;-)
Jon Cage
I suspect there's a need to copy out both rows and columns, so transposing the matrix just moves the problem to rows..
R..
Possibly, but the OP didn't mention it so I thought it would be worth pointing out.
Jon Cage
+1  A: 

My solution:

  1. Don't use multi-dimensional arrays. They're inflexible pre-C99 (can't vary all the dimensions) and preclude doing efficient operations like the following. Instead just use a one-dimensional array and do the element indexing arithmetic yourself.

  2. Now, you can setup a pointer src pointing to the first element of a column (src = &matrix[row*ncols+col];), and copy the column with: for (i=0; i<nrows; i++, src+=ncols) dest[i] = *src;

R..
Hmm actually i'm familiar with mapping one dimensional arrays to multidimensions from cuda, but I'm not sure if this will make a significant performance difference.
systemsfault
Unless your compiler can factor the multiplication out of the loop, it should make a huge difference.
R..
+1  A: 

If you want to copy the data in your matrix, you can't do it in less than O(N) time whether it is a row or a column, except for small N where hardware features might help.

However, if your matrices are immutable, you can use smoke and mirrors to give the illusion of having a separate column vector.

The below code is typed straight in to the answer text box and has not even been compiled. Use at your own risk!

Your matrix type is defined as a struct thus:

typedef struct 
{
    unsigned int refCount;  // how many Matrixes are referencing this data ref
    size_t lineWidth;       // number of doubles between element at row = n, col = 0 and row = n +1, col = 0 
    double* data;           // the actual data
} DataRef;

typedef struct
{
    size_t rows;            // num rows in matrix
    size_t cols;            // num cols in matrix
    size_t dataOffset;      // offset in doubles from the start of data of element at row = 0, col = 0
    DataRef* data;
} Matrix;

To create a brand new matrix (I've left out all the error handling to make it simpler).

Matrix* matrix_create(size_t rows, size_t cols, const double* values)
{
    Matrix* ret = calloc(1, sizeof *ret);
    ret->rows = rows;
    ret->cols = cols;
    ret->dataOffset = 0;
    ret->data = calloc(1, sizeof *dataRef);
    ret->data->lineWidth = cols;
    ret->data->data = allocateAndCopy(rows * cols, values); // mallocs a new block of doubles big enough for the values
    ret->data->refCount = 1;
    return ret;
}

To access an element (again no error handling, eg bounds errors)

double matrix_elementAt(Matrix* matrix, size_t row, size_t col)
{
    size_t offset = matrix->dataOffset + row * matrix->data->lineWidth + col;
    return *(matrix->data->data + offset);
}

To create a new matrix from a rectangular region of another matrix (again, error handling needed)

Matrix* matrix_createFromRegion(Matrix* old, size_t startRow, size_t startCol, size_t rows, size_t cols)
{
    Matrix* ret = calloc(1, sizeof *ret);
    ret->rows = rows;
    ret->cols = cols;
    ret->dataOffset = old->dataOffset + startRow * old->dataLineWidth + startCol;
    ret->data = old->data;
    ret->data->refCount++;
    return ret;
}

To create a new matrix from a column in another matrix:

Matrix* vector = matrix_createFromRegion(aMatrix, 0, colYouWant, matrix_numRows(aMatrix), 1);

To free a matrix

void matrix_free(Matrix* aMatrix)
{
    if (aMatrix->data->refCount == 1)
    {
        free(aMatrix->data->data);
        free(aMatrix->data);
    }
    else
    {
        aMatrix->data->refCount--;
    }
    free(aMatrix);
}

If you want mutable matrices, any time you modify an element, check the refCount and if it is greater than 1, copy the DataRef before modifying it (decrement the refCount on the old dataRef), otherwise modify the dataRef in place.

Now the above uses lots of mallocs and so might be less efficient than the naive implementation for small matrices. However, you could maintain a list of unused DataRef structs and Matrix structs and instead of freeing them when you are done, put them on the free list. When allocating new ones, get the structs from the free lists unless they are empty. That way, getting a matrix that represents a column of an existing matrix will often take constant time.

JeremyP
Thanx i think this is the best solution here, actually my matrices are big very big so it doesn't fit to 5 gb RAM :D.
systemsfault