views:

1139

answers:

3

I have this 'simplified' fortran code

real B(100, 200) 
real A(100,200)

... initialize B array code.

do I = 1, 100
  do J = 1, 200
    A(J,I) = B(J,I)
  end do
end do

One of the programming gurus warned me, that fortran accesses data efficiently in column order, while c accesses data efficiently in row order. He suggested that I take a good hard look at the code, and be prepared to switch loops around to maintain the speed of the old program.

Being the lazy programmer that I am, and recognizing the days of effort involved, and the mistakes I am likely to make, I started wondering if there might a #define technique that would let me convert this code safely, and easily.

Do you have any suggestions?

+3  A: 

Are you sure your FORTRAN guys did things right?

The code snippet you originally posted is already accessing the arrays in row-major order (which is 'inefficient' for FORTRAN, 'efficient' for C).

As illustrated by the snippet of code and as mentioned in your question, getting this 'correct' can be error prone. Worry about getting the FORTRAN code ported to C first without worrying about details like this. When the port is working - then you can worry about changing column-order accesses to row-order accesses (if it even really matters after the port is working).

Michael Burr
i will correct it.thanks.
EvilTeach
+2  A: 

One of my first programming jobs out of college was to fix a long-running C app that had been ported from FORTRAN. The arrays were much larger than yours and it was taking something around 27 hours per run. After fixing it, they ran in about 2.5 hours... pretty sweet!

(OK, it really wasn't assigned, but I was curious and found a big problem with their code. Some of the old timers didn't like me much despite this fix.)

It would seem that the same issue is found here.

real B(100, 200) 
real A(100,200)

... initialize B array code.

do I = 1, 100
  do J = 1, 200
    A(I,J) = B(I,J)
  end do
end do

Your looping (to be good FORTRAN) would be:

real B(100, 200) 
real A(100,200)

... initialize B array code.

do J = 1, 200
  do I = 1, 100
    A(I,J) = B(I,J)
  end do
end do

Otherwise you are marching through the arrays in row-major, which could be highly inefficient.

At least I believe that's how it would be in FORTRAN - it's been a long time.


Saw you updated the code...

Now, you'd want to swap the loop control variables so that you iterate on the rows and then inside that iterate on the columns if you are converting to C.

itsmatt
This is a contrived example, so as to keep things less complex.I was hoping not to have to change the loops, in thousands of lines of code, through some macro trickery.
EvilTeach
+1  A: 

In C, multi-dimensional arrays work like this:

#define array_length(a) (sizeof(a)/sizeof((a)[0]))
float a[100][200];
a[x][y] == ((float *)a)[array_length(a[0])*x + y];

In other words, they're really flat arrays and [][] is just syntactic sugar.

Suppose you do this:

#define at(a, i, j) ((typeof(**(a)) *)a)[(i) + array_length((a)[0])*(j)]
float a[100][200];
float b[100][200];
for (i = 0; i < 100; i++)
    for (j = 0; j < 200; j++)
        at(a, j, i) = at(b, j, i);

You're walking sequentially through memory, and pretending that a and b are actually laid out in column-major order. It's kind of horrible in that a[x][y] != at(a, x, y) != a[y][x], but as long as you remember that it's tricked out like this, you'll be fine.

Edit

Man, I feel dumb. The intention of this definition is to make at(a, x, y) == at[y][x], and it does. So the much simpler and easier to understand

#define at(a, i, j) (a)[j][i]

would be better that what I suggested above.

ephemient
It looks viable.thank you.
EvilTeach