views:

448

answers:

1

Hi,

I have a Fortran subroutine which uses BLAS' subroutines dgemm, dgemv and ddot, which calculate matrix * matrix, matrix * vector and vector * vector. I have m * m matrices and m * 1 vectors. In some cases m=1. It seems that those subroutines doesn't work well in those cases. They doesn't give errors, but there seems to be some numerical unstability in results. So I have to write something like:

if(m>1) then 
  vtuni(i,t) = yt(i,t) - ct(i,t) - ddot(m, zt(i,1:m,(t-1)*tvar(3)+1), 1, arec, 1)
else 
   vtuni(i,t) = yt(i,t) - ct(i,t) - zt(i,1,(t-1)*tvar(3)+1)*arec(1)

So my actual question is, am I right that those BLAS' subroutines doesn't work properly when m=1 or is there just something wrong in my code? Can the compiler affect this? I'm using gfortran.

+4  A: 

BLAS routines are supposed to behave correctly with objects of size 1. I don't think it can depend on compiler, but it could possible depend on the implementation of BLAS you're relying on (though I'd consider it a bug of the implementation). The reference (read: not target-optimised) implementation of BLAS, which can be found on Netlib, handles that case fine.

I've done some testing on both arrays of size 1, and size-1 slices of larger array (as in your own code), and they both work fine:

 $ cat a.f90 
 implicit none
 double precision :: u(1), v(1)
 double precision, external :: ddot
 u(:) = 2
 v(:) = 3
 print *, ddot(1, u, 1, v, 1)
 end
 $ gfortran a.f90 -lblas && ./a.out
  6.0000000000000000     

 $ cat b.f90                       
 implicit none
 double precision, allocatable :: u(:,:,:), v(:)
 double precision, external :: ddot
 integer :: i, j
 allocate(u(3,1,3),v(1))
 u(:,:,:) = 2
 v(:) = 3
 i = 2
 j = 2
 print *, ddot(1, u(i,1:1,j), 1, v, 1)
 end
 $ gfortran b.f90 -lblas && ./a.out
  6.0000000000000000

Things I'd consider to debug this problem further:

  • Check that your ddot definition is correct
  • Substitute the reference BLAS to your optimised one, to check if it changes anything (you can just compile and link in the ddot.f file I linked to earlier in my answer)
FX