In certain applications, I need to collapse nested loops into a single loop while retaining the individual index information.
    for j in N:
        for i in M:
            ... A(i,j) ...

    // Collapse the loops
    for ij in MN:
        ... A(i,j) ...
So far I have looked at the obvious ways to recover i,j from ij: division/modulo (expensive operations) and if statements (which break vectorization and cause branch-prediction problems). In the end I came up with the following (using C-style comparisons, which evaluate to 0 or 1):
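    j += (i == m)   // advance j once i has reached m
    i *= (i != m)   // reset i to 0 when it has reached m, otherwise keep it
    ++i, ++ij       // step both indices (i wraps from m back to 1)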
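To make the comparison concrete, here is a minimal C sketch of the same update next to the division/modulo version it replaces. It assumes 0-based indices with i varying fastest, so the increment comes before the wrap test rather than after it. The function name count_index_mismatches is made up for illustration. Whether the compiler really emits branch-free, vectorizable code for this depends on the compiler and flags, so it is worth checking the generated assembly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: walks a collapsed loop over m*n elements, advancing
 * (i, j) without division or if statements, and counts how often the result
 * disagrees with the division/modulo reference. Returns 0 if they agree. */
size_t count_index_mismatches(size_t m, size_t n)
{
    assert(m > 0 && n > 0);

    size_t mismatches = 0;
    size_t i = 0, j = 0;

    for (size_t ij = 0; ij < m * n; ++ij) {
        /* Reference recovery: i = ij % m, j = ij / m (assumes i is the
         * fastest-varying index, as in the nested loops). */
        mismatches += (size_t)((i != ij % m) | (j != ij / m));

        /* Advance using comparisons only (0-based variant). */
        ++i;
        j += (i == m);   /* carry into j when i runs off the end */
        i *= (i != m);   /* wrap i back to 0 on carry, else keep it */
    }
    return mismatches;
}
```

Calling count_index_mismatches(3, 4) walks all 12 (i, j) pairs and should report no disagreement with the division/modulo reference. In a real kernel the reference line would be dropped and A(i,j) used in its place.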
Is there perhaps an even better way to do this? Thanks.