Hi there.

I'm putting together some scientific code in Fortran 77, and I'm debating which of two approaches would be faster.

Basically, I have an MxN matrix, let's call it A. M is larger than N. Later on in the code, I need to multiply transpose(A) by a bunch of vectors.

My question is, would it be faster to take A, transpose it on my own and store that, or when I call BLAS, just give it the transpose flag?

Thanks! -Patrick

A: 

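My gut feeling is to use the transpose flag. Fortran stores A column by column, so each entry of transpose(A) times a vector is a dot product of that vector with one column of A, and that column is read with a stride of one.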
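To make the stride-one point concrete, here is a minimal sketch of the transpose-flag call, assuming double precision data and a standard BLAS library to link against (for example with -lblas). The program name, the sizes M = 4 and N = 3, and the fill values are made up for illustration and are not from the question.

```fortran
      PROGRAM ATX
C     Sketch: y = transpose(A) * x using DGEMV with TRANS = 'T'.
C     Sizes and values below are invented for illustration only.
      INTEGER M, N
      PARAMETER (M = 4, N = 3)
      DOUBLE PRECISION A(M,N), X(M), Y(N)
      INTEGER I, J
      EXTERNAL DGEMV
C     Fill A(i,j) = i + 10*j and x = 1, so y(j) is the sum of
C     column j of A, i.e. 50, 90 and 130.
      DO 20 J = 1, N
         DO 10 I = 1, M
            A(I,J) = DBLE(I + 10*J)
   10    CONTINUE
   20 CONTINUE
      DO 30 I = 1, M
         X(I) = 1.0D0
   30 CONTINUE
C     M and N describe A as stored, not transpose(A). With 'T',
C     X has length M and Y has length N. BETA = 0, so Y need not
C     be initialised before the call.
      CALL DGEMV('T', M, N, 1.0D0, A, M, X, 1, 0.0D0, Y, 1)
      DO 40 J = 1, N
         WRITE (*,*) J, Y(J)
   40 CONTINUE
      END
```

No transposed copy of A is stored here. If the vectors are all available at once, they could also be packed as the columns of a matrix and handled in a single DGEMM call with TRANSA = 'T', though whether that beats repeated DGEMV calls is again something to measure.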

In reality, it's very hard to tell without actually running the code. Modern BLAS implementations use cache-blocking techniques, which make simple analysis difficult at best.

aaa
Thanks! I had a feeling there would be no straightforward answer on this one. It won't be too hard to test it out, probably depends on the size of A and how many times I end up multiplying A'...
Patrick
@Patrick, if you do test it, you can post your results as an answer.
aaa