I have a very large nested for loop in which some multiplications and additions are performed on floating-point numbers:
    for (int i = 0; i < length1; i++)
    {
        double aa = 0;
        for (int h = 0; h < 10; h++)
        {
            aa += omega[i][outsideGeneratedAddress[h]];
        }
        double alphaOld = alpha;
        alpha = Math.Sqrt(alpha * alpha + aa * aa);
        s = -aa / alpha;
        c = alphaOld / alpha;
        for (int j = 0; j <= i; j++)
        {
            double oldU = u[j];
            u[j] = c * oldU + s * omega[i][j];
            omega[i][j] = c * omega[i][j] - s * oldU;
        }
    }
This loop is taking up the majority of my processing time and is a bottleneck.
Would I be likely to see any speed improvements if I rewrite this loop in C and interface to it from C#?
EDIT: I updated the code to show how s and c are generated. Also, the inner loop actually runs from 0 to i, though that probably doesn't make much difference to the question.
EDIT2: I implemented the algorithm in VC++ and linked it to C# through a DLL, and saw a 28% speed boost over C# with all optimisations enabled. The compiler option that enables SSE2 works particularly well. Compiling with MinGW and gcc 4.4 gave only a 15% speed boost. I also tried the Intel compiler and saw a 49% speed boost for this code.