views:

170

answers:

3

Hi.

I'm writing a ray tracer.

Recently, I added threading to the program to exploit the additional cores on my i5 Quad Core.

In a weird turn of events the debug version of the application is now running slower, but the optimized build is running faster than before I added threading.

I'm passing the "-g -pg" flags to gcc for the debug build and the "-O3" flag for the optimized build.

Host system: Ubuntu Linux 10.4 AMD64.

I know that debug symbols add significant overhead to the program, but the relative performance has always been maintained. I.e. a faster algorithm will always run faster in both debug and optimization builds.

Any idea why I'm seeing this behavior?

Debug version is compiled with "-g3 -pg". Optimized version with "-O3".

Optimized no threading:        0m4.864s
Optimized threading:           0m2.075s

Debug no threading:            0m30.351s
Debug threading:               0m39.860s
Debug threading after "strip": 0m39.767s

Debug no threading (no-pg):    0m10.428s
Debug threading (no-pg):       0m4.045s

This convinces me that "-g3" is not to blame for the odd performance delta, but that it's rather the "-pg" switch. It's likely that the "-pg" option adds some sort of locking mechanism to measure thread performance.

Since "-pg" is broken on threaded applications anyway, I'll just remove it.

+1  A: 

You are talking about two different things here. Debug symbols and compiler optimization. If you use the strongest optimization settings the compiler has to offer, you do so at the consequence of losing symbols that are useful in debugging.

Your application is not running slower due to debugging symbols, its running slower because of less optimization done by the compiler.

Debugging symbols are not 'overhead' beyond the fact that they occupy more disk space. Code compiled at maximum optimization (-O3) should not be adding debug symbols. That's a flag that you would set when you have no need for said symbols.

If you need debugging symbols, you gain them at the expense of losing compiler optimization. However, once again, this is not 'overhead', its just the absence of compiler optimization.

Tim Post
I know what the difference is between debugging symbols and optimization. The problem I'm seeing is that the relative performance of two algorithms are different in the optimized version and the debug version.
fluffels
@fluffels - I don't see why this is unexpected?
Tim Post
I was expecting that the debug symbols / optimizations would have more or less the same effect on both algorithms, maintaining the relative performance between them.In retrospect that seems naive.But since the second algorithm is threaded and the problem is highly parallel, I would've still expected the relative performance to remain the same.I just want to know *why* I was wrong ;)
fluffels
+4  A: 

What do you get without the -pg flag? That's not debugging symbols (which don't affect the code generation), that's for profiling (which does).

It's quite plausible that profiling in a multithreaded process requires additional locking which slows the multithreaded version down, even to the point of making it slower than the non-multithreaded version.

caf
That's a good idea, I'm going to go test this.
fluffels
+1  A: 

Is the profile code inserting instrumentation calls in enough functions to hurt you?
If you single-step at the assembly language level, you'll find out pretty quick.

Mike Dunlavey
Good idea, thanks. I'll go check it out.
fluffels