I've started using Fortran (95) for some numerical code (generating Python modules). Here is a simple example:

subroutine bincount (x,c,n,m)
  implicit none
  integer, intent(in) :: n,m
  integer, dimension(0:n-1), intent(in) :: x
  integer, dimension(0:m-1), intent(out) :: c
  integer :: i

  c = 0
  do i = 0, n-1
    c(x(i)) = c(x(i)) + 1
  end do
end subroutine bincount

I've found that this performs very well in 32-bit, but when compiled as x86_64 it is about 5x slower (MacBook Pro Core 2 Duo, Snow Leopard, gfortran 4.2.3 from r.research.att.com). I finally realised this might be due to using a 32-bit integer type instead of the native type, and indeed when I replace it with integer*8, the 64-bit performance is only 25% worse than the 32-bit one.

Why is using a 32-bit integer so much slower on a 64-bit machine? Are there any implicit casts going on with the indexing that I might not be aware of?

Is it always the case that 64-bit will be slower than 32-bit for this type of code (I was surprised at this) - or is there a chance I could get the 64-bit compiled version running at the same speed or faster?

(main question) Is there any way in modern Fortran to declare an integer variable to be the 'native' type, i.e. 32-bit when compiled as 32-bit and 64-bit when compiled as 64-bit? Without this it seems impossible to write portable Fortran code that won't be much slower depending on how it's compiled - and I think this means I will have to stop using Fortran for my project. I have looked at kind and selected_int_kind but have not been able to find anything that does this.
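One possible approach, offered as a minimal sketch rather than a definitive answer: Fortran 2003's intrinsic ISO_C_BINDING module provides the kind c_intptr_t, which matches C's intptr_t and so normally follows the pointer width of the target. This assumes a compiler with Fortran 2003 support (so probably not the gfortran 4.2 mentioned above). The module name native_kinds and the kind name ik are made up for illustration, and the sketch says nothing about how f2py would map this kind.

```fortran
! Sketch only: assumes a compiler with Fortran 2003 ISO_C_BINDING support.
! The module name native_kinds and the kind name ik are hypothetical.
module native_kinds
  use, intrinsic :: iso_c_binding, only: c_intptr_t
  implicit none
  ! c_intptr_t matches C's intptr_t, i.e. the pointer width of the target:
  ! typically 32 bits in a 32-bit build and 64 bits in a 64-bit build.
  integer, parameter :: ik = c_intptr_t
end module native_kinds

! Same algorithm as the question's bincount, with every integer using ik.
subroutine bincount(x, c, n, m)
  use native_kinds, only: ik
  implicit none
  integer(ik), intent(in) :: n, m
  integer(ik), dimension(0:n-1), intent(in) :: x
  integer(ik), dimension(0:m-1), intent(out) :: c
  integer(ik) :: i

  c = 0
  do i = 0, n-1
    c(x(i)) = c(x(i)) + 1
  end do
end subroutine bincount

! Small driver: counts occurrences of 0, 1 and 2 in a five-element array.
program demo
  use native_kinds, only: ik
  implicit none
  integer(ik) :: x(0:4), c(0:2)

  x = (/ 0_ik, 2_ik, 2_ik, 1_ik, 2_ik /)
  call bincount(x, c, 5_ik, 3_ik)
  print *, 'bits per integer:', bit_size(0_ik)
  print *, 'counts:', c
end program demo
```

Compiling the same file as 32-bit and as 64-bit and comparing the reported bit size is a quick way to check whether the kind really tracks the build.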

[Edit: the large performance hit came from the f2py wrapper copying the array in order to cast it from 64-bit int to 32-bit int, so it was nothing inherent to the Fortran.]

+1  A: 

Hi

The answer to your 'main question' is to select the correct compiler option to have the default integer declared with 32 or 64 bits. I never use gfortran (I prefer g95, even better a paid-for compiler) so I Googled and it seems that -fdefault-integer-8 is the option you need.
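To see what such an option actually does to the default integer, a small check program can help. This is only a sketch: the program name default_int_width is made up, and it uses nothing beyond the standard inquiry intrinsics kind, bit_size and huge.

```fortran
! Sketch: report the kind and width of the default integer as compiled.
! The program name default_int_width is hypothetical.
program default_int_width
  implicit none
  integer :: i   ! default integer; only inquired about, never read

  print *, 'default integer kind:', kind(i)
  print *, 'bits:', bit_size(i)
  print *, 'largest value:', huge(i)
end program default_int_width
```

Building it once normally and once with -fdefault-integer-8, then comparing the output, shows whether the flag took effect on a given compiler.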

Like you I'm surprised that the 64 bit version is slower than the 32 bit version. I don't have anything illuminating on that point.

Regards

Mark

High Performance Mark
Thanks - I had looked for something like that but hadn't been able to find it. Unfortunately it doesn't really solve my problem: I'm wrapping the subroutine with f2py, and when I use that option it chokes (bus error), I guess because f2py generated the interface for 32-bit integers. So if I do this I still have to manually edit the generated interface for each platform, which is what I wanted to avoid (I just wanted to give it to people). I really want something like kind=7 from http://gcc.gnu.org/onlinedocs/gcc-3.4.6/g77/Kind-Notation.html but that seems to be out of date (it doesn't work in gfortran).
thrope
I haven't been able to get g95 working on Snow Leopard, but I am looking at getting a paid-for one (Intel).
thrope
+1  A: 

While I haven't done careful studies, I haven't seen such large speed differences.

I suggest trying a newer version of gfortran. Version 4.2 is an early release (gfortran started with 4.0) and is considered obsolete. 4.3 and 4.4 are much improved and have more features. 4.4 is the current non-beta version. An easy way to obtain them on a Mac is via MacPorts: the gcc43 and gcc44 packages include gfortran. The compilers are installed as gcc-mp-4.3, gfortran-mp-4.3, etc., so as not to conflict with other versions. Or you can try the latest build of 4.5 from the gfortran wiki page.

Intel Fortran is sometimes significantly faster than gfortran.

M. S. B.