I'm using the threaded version of FFTW (an FFT library) to try to speed up some code on a dual-CPU machine. Here is the output of time with only 1 thread:
131.838u 1.979s 2:13.91 99.9%
Here it is with 2 threads:
166.261u 30.392s 1:52.67 174.5%
The user times and the CPU load percentages seem to indicate that it is threading pretty effectively, but the wallclock time (which is what I really care about) only drops from 2:13.91 to 1:52.67, while the system time goes up by about 28 seconds. That tells me (I think) that it is spending around 28 extra seconds dealing with the threads. Is that an accurate way to describe the situation? If so, is it fairly normal, or do I probably have something configured incorrectly? Thanks for any light.
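
To make the arithmetic behind those two lines explicit, here is a rough sketch in Python. It assumes the fields are user seconds, system seconds, elapsed time, and CPU percentage, as printed by the csh/tcsh time builtin; the helper names (parse_elapsed, summarize) are made up for illustration and are not part of FFTW or any tool.

```python
def parse_elapsed(text):
    """Convert an elapsed field such as '2:13.91' into seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def summarize(user, system, elapsed_text):
    """Collect the figures of one run (hypothetical helper)."""
    elapsed = parse_elapsed(elapsed_text)
    cpu = user + system  # total CPU seconds, summed over all threads
    return {
        "user": user,
        "system": system,
        "cpu": cpu,
        "elapsed": elapsed,
        "load": 100.0 * cpu / elapsed,  # the percentage column
    }


# Numbers copied from the two runs above.
one = summarize(131.838, 1.979, "2:13.91")
two = summarize(166.261, 30.392, "1:52.67")

speedup = one["elapsed"] / two["elapsed"]         # wallclock speedup
efficiency = speedup / 2                          # fraction of the ideal 2x
wall_saved = one["elapsed"] - two["elapsed"]      # seconds of wallclock saved
extra_system = two["system"] - one["system"]      # extra kernel time
extra_user = two["user"] - one["user"]            # extra user-mode time
extra_cpu = two["cpu"] - one["cpu"]               # total extra CPU work

print(f"speedup        {speedup:.2f}x")
print(f"efficiency     {efficiency:.0%}")
print(f"wallclock saved {wall_saved:.2f} s")
print(f"extra system   {extra_system:.2f} s")
print(f"extra user     {extra_user:.2f} s")
print(f"extra CPU      {extra_cpu:.2f} s")
```

Read this way, the "around 28 seconds" figure matches the growth in system time rather than a wallclock penalty: the wallclock time still improves, just by much less than the ideal factor of two, and user time grows as well.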