I’ve run my code under several configurations and I'm seeing what I believe to be odd behavior. My testing was on a dual-core Intel Xeon processor with Hyper-Threading (HT).

No OpenMP '#pragma' statement, total runtime = 507 seconds

With OpenMP '#pragma' statement specifying 1 thread, total runtime = 117 seconds

With OpenMP '#pragma' statement specifying 2 threads, total runtime = 150 seconds

With OpenMP '#pragma' statement specifying 3 threads, total runtime = 157 seconds

With OpenMP '#pragma' statement specifying 4 threads, total runtime = 144 seconds

I can’t figure out why commenting out my OpenMP line slows the program down so much: a single thread without OpenMP takes far longer than a single thread with OpenMP.

All I am changing is between:

//#pragma omp parallel for shared(segs) private(i, j, p_hough) num_threads(1) schedule(guided)

and...

#pragma omp parallel for shared(segs) private(i, j, p_hough) num_threads(1,2,3,4) schedule(guided)
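A comparison like this can be made repeatable by passing the thread count as a variable, since `num_threads` accepts an integer expression. The following is a minimal sketch, not the actual Hough code: `work`, `run_loop`, `now_seconds`, and `compare_thread_counts` are hypothetical names, and `work` is a stand-in for the real loop body, which is not shown here. It returns a checksum so that runs with different thread counts can be checked for doing the same work.

```c
#include <stdio.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical stand-in for the real loop body (not shown in the post). */
static double work(int i)
{
    double acc = 0.0;
    for (int k = 1; k <= 1000; k++)
        acc += (double)((i + k) % 7);
    return acc;
}

/* Runs the loop with the requested thread count and returns a checksum.
 * All terms are small integers, so the sum is exact in a double and
 * does not depend on the order in which threads add their parts. */
double run_loop(int n_iters, int n_threads)
{
    double sum = 0.0;
    (void)n_threads; /* unused when compiled without OpenMP */
    #pragma omp parallel for num_threads(n_threads) schedule(guided) reduction(+:sum)
    for (int i = 0; i < n_iters; i++)
        sum += work(i);
    return sum;
}

/* Wall-clock seconds with OpenMP; CPU seconds as a fallback without it. */
double now_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Prints one line per thread count from 1 to max_threads. */
void compare_thread_counts(int n_iters, int max_threads)
{
    for (int t = 1; t <= max_threads; t++) {
        double t0 = now_seconds();
        double sum = run_loop(n_iters, t);
        printf("threads=%d checksum=%.0f time=%.3f s\n",
               t, sum, now_seconds() - t0);
    }
}
```

Calling `compare_thread_counts(900, 4)` prints a checksum and a time for each thread count. If the checksums differ between runs, the runs are not doing the same work, and the timings are not comparable.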

If anyone has any idea why this may be happening, please let me know!

Thanks for any help,

Brett

EDIT: I'll address some of the comments here

I am using num_threads(1), num_threads(2), etc., in separate runs; the num_threads(1,2,3,4) above was just shorthand.

On further investigation, it turns out that my results are inconsistent depending on whether the schedule(guided) clause is included in the code.

- With schedule(guided), I get the fastest runs, regardless of the number of threads.
- With the default schedule, the runs are significantly slower and produce different values.
- With schedule(guided), adding threads gives no improvement.
- Without schedule(guided), adding threads does improve the runtime.

I haven't found a good enough description of what schedule(guided) does. My understanding is that it tries to split up the loop so that the most time-intensive iterations happen first, which should minimize the time any one thread spends waiting for the others to finish their iterations.
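For reference, the OpenMP specification describes schedule(guided) differently: iterations are handed out in chunks as threads request them, and the chunk size shrinks in proportion to the number of unassigned iterations divided by the number of threads. It does not sort iterations by cost. The exact rule is implementation-defined, so the sketch below only simulates one plausible rule, assuming each chunk is ceil(remaining / threads) with a minimum chunk size. `guided_chunks` and `print_guided_chunks` are hypothetical helpers, not OpenMP functions.

```c
#include <stdio.h>

/* Fills sizes[] with successive chunk sizes for n_iters iterations under an
 * ASSUMED guided rule: chunk = ceil(remaining / n_threads), never smaller
 * than min_chunk and never larger than what is left.
 * Returns the number of chunks written (at most max_chunks). */
int guided_chunks(int n_iters, int n_threads, int min_chunk,
                  int *sizes, int max_chunks)
{
    int remaining = n_iters;
    int count = 0;
    while (remaining > 0 && count < max_chunks) {
        int chunk = (remaining + n_threads - 1) / n_threads;
        if (chunk < min_chunk)
            chunk = min_chunk;
        if (chunk > remaining)
            chunk = remaining;
        sizes[count++] = chunk;
        remaining -= chunk;
    }
    return count;
}

/* Prints the simulated chunk sizes in the order they would be handed out.
 * Handles loops of up to 1024 chunks. */
void print_guided_chunks(int n_iters, int n_threads)
{
    int sizes[1024];
    int n = guided_chunks(n_iters, n_threads, 1, sizes, 1024);
    for (int c = 0; c < n; c++)
        printf("chunk %d: %d iterations\n", c, sizes[c]);
}
```

Under this assumed rule, a 900-iteration loop on 4 threads starts with a chunk of 225 iterations and the chunks shrink from there, but together they still cover all 900 iterations.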

It appears that for my ~900-iteration loop, with schedule(guided) I'm only processing ~200 iterations, whereas without it I'm processing all 900. Any thoughts?
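One way to check the iteration count independently of the real loop body is to count iterations per thread in a bare loop. This is a minimal sketch under assumptions: `count_iterations` and `MAX_THREADS` are hypothetical names, and the loop body is empty apart from the counting. The total uses a reduction, so it is not affected by threads racing on a shared counter.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

#define MAX_THREADS 64

/* Runs an n_iters loop under schedule(guided) and records how many
 * iterations each thread executed in per_thread[]. Returns the total
 * number of iterations executed across all threads.
 * Assumes n_threads >= 1. */
int count_iterations(int n_iters, int n_threads, int per_thread[MAX_THREADS])
{
    int total = 0;
    for (int t = 0; t < MAX_THREADS; t++)
        per_thread[t] = 0;
    if (n_threads > MAX_THREADS)
        n_threads = MAX_THREADS;

    #pragma omp parallel for num_threads(n_threads) schedule(guided) reduction(+:total)
    for (int i = 0; i < n_iters; i++) {
        int tid = 0;
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif
        per_thread[tid]++; /* each thread only writes its own slot */
        total++;           /* combined safely by the reduction */
    }
    return total;
}
```

With a correct parallel for, `count_iterations(900, 4, per_thread)` should return 900 whatever the schedule, and the per-thread counts should add up to the same number. If the real loop reports only ~200, it may be worth checking whether the count comes from a single thread's share, or from a shared counter updated without a reduction or `#pragma omp atomic`.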