views:

1408

answers:

10

I need a way to represent a 2-D array (a dense matrix) of doubles in C++, with absolute minimum accessing overhead.

I've done some timing on various linux/unix machines and gcc versions. An STL vector of vectors, declared as:

vector<vector<double> > matrix(n,vector<double>(n));

and accessed through matrix[i][j] is between 5% and 100% slower to access than an array declared as:

double *matrix = new double[n*n];

accessed through an inlined index function matrix[index(i,j)], where index(i,j) evaluates to i+n*j. Other ways of arranging a 2-D array without STL - an array of n pointers to the start of each row, or defining the whole thing on the stack as a constant size matrix[n][n] - run at almost exactly the same speed as the index function method.

Recent GCC versions (> 4.0) seem to be able to compile the STL vector-of-vectors to nearly the same efficiency as the non-STL code when optimisations are turned on, but this is somewhat machine-dependent.

I'd like to use STL if possible, but will have to choose the fastest solution. Does anyone have any experience in optimising STL with GCC?

+8  A: 

My guess would be that, for a matrix, the fastest option is to use a 1-D STL array and overload the () operator so it can be used as a 2-D matrix.

However, the STL also defines a type specifically for non-resizeable numerical arrays: valarray. It also provides various optimisations for in-place operations.

valarray takes a numerical type as its template argument:

valarray<double> a;

Then you can use slices, indirect arrays, and so on, and of course you can inherit from valarray and define your own operator()(int i, int j) for 2-D arrays ...
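A minimal sketch of that idea, wrapping a std::valarray<double> behind an operator()(i, j) (composition rather than inheritance; the class and member names are invented for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <valarray>

// Minimal 2-D wrapper over std::valarray, row-major layout.
// Slices make extracting a column as easy as extracting a row.
class Matrix2D {
    std::size_t rows_, cols_;
    std::valarray<double> data_;
public:
    Matrix2D(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(0.0, rows * cols) {}

    double& operator()(std::size_t i, std::size_t j) { return data_[i * cols_ + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data_[i * cols_ + j]; }

    // A row is a contiguous slice; a column is a strided slice.
    std::valarray<double> row(std::size_t i) const {
        return data_[std::slice(i * cols_, cols_, 1)];
    }
    std::valarray<double> col(std::size_t j) const {
        return data_[std::slice(j, rows_, cols_)];
    }
};
```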

PierreBdR
My upvote is for valarray, not necessarily to make a custom matrix type. Well, custom matrix type could work, but should still be based off valarray instead of vector (valarray supports slicing, which makes getting a column just as easy as getting a row).
Chris Jester-Young
Careful inheriting from std::valarray; it is not designed for inheritance, as most of the "STL".
Patrick Johnmeyer
You can inherit from any class of the STL as long as you don't add data members to it. There is no problem adding methods, though.
PierreBdR
You can inherit "not-designed-for-inheritance" classes if you're using _private_ inheritance, if I understand correctly. e.g., class matrix : private std::valarray<double> or the like. Essentially you're not relying on any virtual behaviour.
Chris Jester-Young
Public or protected inheritance of "not-designed-for-inheritance" classes is a no-no, however.
Chris Jester-Young
I disagree with the last statement. The problem with "not-designed-for-inheritance" classes is that the destructor is not virtual => the derived destructor might not be called. So any resource allocated by the derived class won't be freed => you should not add any data members.
PierreBdR
Note that even if you don't add member data, deleting a derived object via pointer to base type is undefined behavior if the destructor is nonvirtual.
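The private-inheritance suggestion from the comments above might look like this sketch (names invented for illustration). Because the inheritance is private, client code cannot hold a std::valarray<double>* to a matrix object, so the non-virtual base destructor is never invoked through a base pointer:

```cpp
#include <cassert>
#include <cstddef>
#include <valarray>

// Private inheritance from std::valarray, per the comment thread above.
// No data members are added; selected base facilities are re-exported.
class matrix : private std::valarray<double> {
    std::size_t n_;
public:
    explicit matrix(std::size_t n) : std::valarray<double>(0.0, n * n), n_(n) {}

    double& operator()(std::size_t i, std::size_t j) {
        return (*this)[i * n_ + j];  // row-major indexing into the base storage
    }

    using std::valarray<double>::sum;  // expose std::valarray::sum()
};
```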
+5  A: 

My recommendation would be to use Boost.UBLAS, which provides fast matrix/vector classes.

Martin Cote
I should have clarified that while I'm dealing with matrices, the operations I'm performing aren't typical linear algebra. UBLAS looks very good for linear algebra, but perhaps overkill if I'm only using it as 2D array storage.
Chris Johnson
I have tried various linear algebra libraries for use with 2-D data (maps), but they are neither convenient to use for non-linear-algebra purposes nor faster than a vector of vectors. UBLAS (and others) are only fast for multiplication and other 'typical' matrix usages, not so much for element access.
Roel
+5  A: 

Very likely this is a locality-of-reference issue. vector uses new to allocate its internal array, so each row will be at least a little apart in memory due to each block's header; it could be a long distance apart if memory is already fragmented when you allocate them. Different rows of the array are likely to at least incur a cache-line fault and could incur a page fault; if you're really unlucky two adjacent rows could be on memory lines that share a TLB slot and accessing one will evict the other.

In contrast your other solutions guarantee that all the data is adjacent. It could help your performance if you align the structure so it crosses as few cache lines as possible.

vector is designed for resizable arrays. If you don't need to resize the arrays, use a regular C++ array. STL operations can generally operate on C++ arrays.

Do be sure to walk the array in the correct direction, i.e. across (consecutive memory addresses) rather than down. This will reduce cache faults.
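The traversal-order point can be demonstrated with a sketch (row-major layout assumed; both loops visit every element, but only the first walks consecutive addresses):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major n x n matrix stored flat: element (i, j) lives at i*n + j.

// Cache-friendly: the inner loop makes stride-1 accesses.
double sum_row_order(const std::vector<double>& m, std::size_t n) {
    double s = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            s += m[i * n + j];
    return s;
}

// Cache-hostile: the inner loop makes stride-n accesses.
double sum_col_order(const std::vector<double>& m, std::size_t n) {
    double s = 0.0;
    for (std::size_t j = 0; j < n; ++j)
        for (std::size_t i = 0; i < n; ++i)
            s += m[i * n + j];
    return s;
}
```

Both functions compute the same result; only their memory-access pattern differs, which is where the 4x slowdown the questioner measured comes from.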

Mike Dimmick
I hadn't thought about the block headers in the vector solution. I knew about the potential slowdown from walking the wrong way though: my speed tests show that walking the wrong way can be four times slower than doing it the right way!
Chris Johnson
+7  A: 

If you're using GCC, the compiler can analyze your matrix accesses and change the memory layout in certain cases. The magic compiler flag is defined as:

-fipa-matrix-reorg

Perform matrix flattening and transposing. Matrix flattening tries to replace an m-dimensional matrix with its equivalent n-dimensional matrix, where n < m. This reduces the level of indirection needed for accessing the elements of the matrix. The second optimization is matrix transposing, which attempts to change the order of the matrix's dimensions in order to improve cache locality. Both optimizations require the -fwhole-program flag. Transposing is enabled only if profiling information is available.

Note that this option is not enabled by -O2 or -O3. You have to pass it yourself.

Nils Pipenbrinck
Does this really work with std::vector? I doubt it.
lothar
Would be both amazing and scary indeed.
peterchen
+1  A: 

To be fair, it depends on the algorithms you are using on the matrix.

The double name[n*m] format is very fast when you are accessing data by rows, both because it has almost no overhead besides a multiplication and an addition, and because your rows are packed data that will be contiguous in cache.

If your algorithms access column-ordered data, then other layouts might have much better cache coherence. If your algorithm accesses data in quadrants of the matrix, still other layouts might be better.

Try to do some research into the type of usage and algorithms you are working with. That is especially important if the matrices are very large, since cache misses may hurt your performance far more than the 1 or 2 extra arithmetic operations needed to compute each address.

OldMan
+1  A: 

You could just as easily do vector< double >( n*m );

tfinniga
A: 

There is the uBLAS implementation in Boost. It is worth a look.

http://www.boost.org/doc/libs/1_36_0/libs/numeric/ublas/doc/matrix.htm

ceretullis
+1  A: 

You may want to look at the Eigen C++ template library at http://eigen.tuxfamily.org/ . It generates AltiVec or SSE2 code to optimize the vector/matrix calculations.

lothar
A: 

Another related library is Blitz++: http://www.oonumerics.org/blitz/docs/blitz.html

Blitz++ is designed to optimize array manipulation.

A: 

I have done this some time back for raw images by declaring my own 2 dimensional array classes.

In a normal 2D array, you access the elements like:

array[2][3]. Now to get that effect, you'd have an array class with an overloaded [] accessor. But this would essentially return another array, thereby giving you the second dimension.

The problem with this approach is that it has a double function call overhead.

The way I did it was to use the () style overload.

So instead of array[2][3], I had it use the style array(2,3).

That () function was very tiny and I made sure it was inlined.

See this link for the general concept of that: http://www.learncpp.com/cpp-tutorial/99-overloading-the-parenthesis-operator/

You can template the type if you need to.
You can template the type if you need to.
The difference in my case was that my array was dynamic: I allocated a block of char memory, and I employed a cache of row offsets so I knew where in my sequence of bytes each row began. Access was optimized for neighbouring values, because I was using it for image processing.

It's hard to explain without the code but essentially the result was as fast as C, and much easier to understand and use.
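The description above (a raw byte buffer plus precomputed row offsets, accessed through an inlined operator()) might look roughly like this sketch; all names are invented, since the original code isn't shown:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of a dynamic 2-D image buffer with precomputed row offsets,
// loosely following the answer's description. Element type is templated.
template <typename T>
class Image2D {
    std::size_t width_, height_;
    std::vector<T> bytes_;               // backing storage, one allocation
    std::vector<std::size_t> rowStart_;  // cached byte offset of each row
public:
    Image2D(std::size_t width, std::size_t height)
        : width_(width), height_(height),
          bytes_(width * height), rowStart_(height) {
        for (std::size_t y = 0; y < height; ++y)
            rowStart_[y] = y * width;    // precompute where each row begins
    }

    // Inlined accessor: img(x, y) style instead of img[y][x].
    inline T& operator()(std::size_t x, std::size_t y) {
        return bytes_[rowStart_[y] + x];
    }
};
```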

Matt H