I've started using Fortran 95 for some numerical code (generating Python modules). Here is a simple example:
subroutine bincount (x,c,n,m)
    implicit none
    integer, intent(in) :: n,m
    integer, dimension(0:n-1), intent(in) :: x   ! values to bin, each expected in 0..m-1
    integer, dimension(0:m-1), intent(out) :: c  ! resulting counts
    integer :: i
    c = 0
    do i = 0, n-1
        c(x(i)) = c(x(i)) + 1
    end do
end subroutine bincount
I've found that this performs very well when compiled as 32-bit, but when compiled as x86_64 it is about 5x slower (MacBook Pro Core 2 Duo, Snow Leopard, gfortran 4.2.3 from r.research.att.com). I eventually realised this might be due to using a 32-bit integer type instead of the native type, and indeed when I replace it with integer*8, the 64-bit performance is only about 25% worse than the 32-bit one.
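For concreteness, the integer*8 variant is just the same routine with every integer declaration widened (integer*8 is a common compiler extension rather than standard Fortran; the standard spelling would use an explicit kind):

subroutine bincount (x,c,n,m)
    implicit none
    integer*8, intent(in) :: n,m
    integer*8, dimension(0:n-1), intent(in) :: x   ! same logic as before, just 8-byte integers throughout
    integer*8, dimension(0:m-1), intent(out) :: c
    integer*8 :: i
    c = 0
    do i = 0, n-1
        c(x(i)) = c(x(i)) + 1
    end do
end subroutine bincount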
Why is using a 32-bit integer so much slower on a 64-bit machine? Are there implicit casts going on in the indexing that I might not be aware of?
Is it always the case that 64-bit will be slower than 32-bit for this kind of code (I was surprised by this), or is there a chance I could get the 64-bit build running at the same speed or faster?
(Main question) Is there any way in modern Fortran to declare an integer variable as the 'native' type, i.e. 32-bit when compiled as 32-bit and 64-bit when compiled as 64-bit? Without this it seems impossible to write portable Fortran code that isn't much slower depending on how it's compiled, and I think that would mean I have to stop using Fortran for my project. I have looked at kind and selected_int_kind but haven't been able to find anything that does this. A sketch of what I mean is below.
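For illustration, this is roughly what I'm imagining, sketched with the Fortran 2003 iso_c_binding kind constants (assuming the compiler supports that module; c_long follows the platform's C long, so 4 bytes on a 32-bit build and 8 bytes on an LP64 64-bit build like this one):

subroutine bincount (x,c,n,m)
    use, intrinsic :: iso_c_binding, only: ip => c_long  ! kind matching C long on the target
    implicit none
    integer(ip), intent(in) :: n,m
    integer(ip), dimension(0:n-1), intent(in) :: x
    integer(ip), dimension(0:m-1), intent(out) :: c
    integer(ip) :: i
    c = 0
    do i = 0, n-1
        c(x(i)) = c(x(i)) + 1
    end do
end subroutine bincount

I don't know whether f2py handles a kind parameter like this cleanly, or whether c_long is the right kind to match numpy's default integer; that is part of what I'm asking.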
[Edit: the large performance hit was from the f2py wrapper copying the array to cast it from 64-bit int to 32-bit int, so nothing inherent to the Fortran.]