Which is the best way to store a 2D array in C# in order to optimize performance when performing lots of arithmetic on the elements in the array?

We have large (approx. 1.5 GB) arrays, which we want, for example, to multiply with each other element by element. Performance is critical. The context in which this is done is C#. Is there any smart way of storing the arrays and iterating over them? Could we write these parts in unmanaged C++, and would this really increase performance? The arrays need to be accessible to the rest of the C# program.

Currently (in C) the array is stored as a single long vector. We perform calculations on each element in the array and overwrite the old value. The calculations are usually unique for each element in the vector.

Timing experiments show that storing and iterating over the data as a flat array in C# is slower than storing it as a 2D array. I would like to know if there is an even better way of handling the data. The specific arithmetic performed is not relevant to the question.

A: 

You may find your best option is C++/CLI, so you can have the matrix multiplication done using a vector architecture (SSE/AltiVec). However, I would first write what you'd like to do in C# and measure the performance. Try techniques you would normally use, such as loop unrolling, matrix chunking/blocking, and simple parallelization. Only if that does not provide the required performance should you jump to C++/CLI.
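As a rough sketch of that plain-C# baseline (the class and method names, the float element type, and the per-row partitioning are illustrative choices of mine, not from the answer), an element-by-element multiply over flat, row-major storage with simple parallelization might look like this:

using System.Threading.Tasks;

static class ElementwiseOps
{
    // Multiply two matrices element by element, in place, with the data
    // stored as flat 1D arrays in row-major order.
    public static void MultiplyInPlace(float[] a, float[] b, int rows, int cols)
    {
        // Partition by rows: each worker touches a contiguous slice,
        // so the memory access pattern stays sequential per thread.
        Parallel.For(0, rows, row =>
        {
            int offset = row * cols;
            for (int j = 0; j < cols; j++)
            {
                a[offset + j] *= b[offset + j];
            }
        });
    }
}

Chunking by whole rows is the simplest form of the blocking mentioned above; finer-grained blocking mainly pays off once the access pattern is non-sequential.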

sixlettervariables
A: 

Can you explain the problem in detail? Do you want each element to be multiplied with the rest of the elements in the array? Or should each subarray be multiplied with the other arrays in the 2D array?

If each operation (like multiplying an element with the rest of the elements in the array) doesn't have side effects, it's better to read the array elements in batches from a file and save the results to a file, as sketched below. That way you don't need to load all the elements of the array at once.
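A minimal sketch of that batched approach (the file paths, batch size, double element type, and the squaring operation are placeholders of mine, not from the answer):

using System.IO;

static class BatchedProcessing
{
    // Stream the data through a fixed-size batch so the whole array
    // never has to sit in memory at once.
    public static void ProcessInBatches(string inputPath, string outputPath, int batchSize)
    {
        using (var reader = new BinaryReader(File.OpenRead(inputPath)))
        using (var writer = new BinaryWriter(File.Create(outputPath)))
        {
            long remaining = reader.BaseStream.Length / sizeof(double);
            while (remaining > 0)
            {
                int count = (int)System.Math.Min(batchSize, remaining);
                for (int i = 0; i < count; i++)
                {
                    double value = reader.ReadDouble();
                    writer.Write(value * value); // placeholder per-element operation
                }
                remaining -= count;
            }
        }
    }
}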

Also, unmanaged code is almost standard for these kinds of operations.

Thanks.

+3  A: 

For best array performance, make sure you're using a single-dimension array with a lower bound of 0.

To access the elements of the array as fast as possible, you can use unsafe pointers like so:

using System.Linq; // for Enumerable.Range

int[] array = Enumerable.Range(0, 1000).ToArray();

int count = 0;
unsafe {
    // Pin the array so the GC can't move it while we hold a raw pointer.
    fixed (int* pArray = array) {
        for (int i = 0; i < array.Length; i++) {
            count += *(pArray + i); // pointer arithmetic skips bounds checks
        }
    }
}

EDIT Drat! Didn't notice you said 2D array. This trick won't work with a multi-dimensional array, so I'm not sure how much help it will be. You could, however, turn any array into a single-dimension array by doing some arithmetic on the array index. It just depends on whether you care more about the performance hit of indexing the array or of iterating over it.

Cameron MacFarland
@Cameron MacFarland: actually it can work for 2D arrays; you just have to do something like *(pArray + (ii * cols) + jj).
sixlettervariables
sixlettervariables: Good to know. Although this only applies to rectangular arrays [,] and not jagged arrays [][], unless you use the column count for the current array, which would make things slower.
Cameron MacFarland
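Putting the answer and comments together, here is a hedged sketch of that row-major indexing trick for a rectangular matrix stored in a flat array (the dimensions and int element type are illustrative, not from the thread):

int rows = 1000, cols = 1000;
int[] flat = new int[rows * cols]; // 2D data stored row-major in a 1D array

long sum = 0;
unsafe {
    fixed (int* pArray = flat) {
        for (int ii = 0; ii < rows; ii++) {
            for (int jj = 0; jj < cols; jj++) {
                // Row-major index arithmetic from the comment above.
                sum += *(pArray + (ii * cols) + jj);
            }
        }
    }
}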
+6  A: 
Jason Stevenson
+2  A: 

If you download F# and reference one of the runtime libraries (I think it's FSharp.PowerPack), you can use Microsoft.FSharp.Maths.Matrix. It optimises itself based on whether you are using a dense or sparse matrix.

TraumaPony
A: 

Do you iterate the matrix by row, by column, or both? Do you always access nearby elements, or do you do random accesses on the matrix?

If there is some locality in your accesses but you're not accessing the data sequentially (typical in matrix multiplication, for example), then you can get a huge performance difference by storing your matrix in a more cache-friendly way.

A pretty easy way to do that is to write a little access function that turns your row/column indices into a single index and to work on a one-dimensional matrix laid out in the cache-friendly order.

The function should group nearby coordinates into nearby indices. Morton order can be used if you work on power-of-two sizes. For non-power-of-two sizes you can often bring just the lowest 4 bits into Morton order and use normal index arithmetic for the upper bits. You'll still get a significant speed-up, even if the coordinate-to-index conversion looks like a costly operation.

http://en.wikipedia.org/wiki/Z-order_(curve)
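A minimal sketch of that coordinate-to-index conversion for power-of-two dimensions (the helper names are my own; the constants are the standard bit-interleaving pattern, not taken from the answer):

static class MortonIndex
{
    // Spread the lower 16 bits of x so there is a zero bit between each
    // pair of original bits (the usual "Part1By1" bit-twiddling idiom).
    static uint Part1By1(uint x)
    {
        x &= 0x0000FFFF;
        x = (x | (x << 8)) & 0x00FF00FF;
        x = (x | (x << 4)) & 0x0F0F0F0F;
        x = (x | (x << 2)) & 0x33333333;
        x = (x | (x << 1)) & 0x55555555;
        return x;
    }

    // Z-order index: even bits come from the column, odd bits from the row,
    // so nearby (row, col) pairs map to nearby positions in the flat array.
    public static uint Encode(uint row, uint col)
    {
        return Part1By1(col) | (Part1By1(row) << 1);
    }
}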

A speed-up of a factor of 10 or more is realistic, by the way. It depends on the algorithm you run over your matrices, though.

Nils Pipenbrinck