views:

288

answers:

6

I have been using Yourkit 8.0 to profile a mathematically-intensive application running under Mac OS X (10.5.7, Apple JDK 1.6.0_06-b06-57), and have noticed some strange behavior in the CPU profiling results.

For instance - I did a profiling run using sampling, which reported that 40% of the application's 10-minute runtime was spent in the StrictMath.atan method. I found this puzzling, but I took it at it's word and spent a bit of time replacing atan with an extremely simply polynomial fit.

When I ran the application again, it took almost exactly the same time as before (10 minutes) - but my atan replacement showed up nowhere in the profiling results. Instead, the runtime percentages of the other major hotspots simply increased to make up for it.

To summarize:

RESULTS WITH StrictMath.atan (native method)
Total runtime: 10 minutes
Method 1: 20%
Method 2: 20%
Method 3: 20%
StrictMath.atan: 40%

RESULTS WITH simplified, pure Java atan
Total runtime: 10 minutes
Method 1: 33%
Method 2: 33%
Method 3: 33%

(Methods 1,2,3 do not perform any atan calls)

Any idea what is up with this behavior? I got the same results using EJ-Technologies' JProfiler. It seems like the JDK profiling API reports inaccurate results for native methods, at least under OS X.

+2  A: 

This can happen because of inconsistencies of when samples are taken. So for example, if a method uses a fair amount of time, but doesn't take very long to execute, it is possible for the sampling to miss it. Also, I think garbage collection never happens during a sample, but if some code causes a lot of garbage collection it can greatly contribute to a slowdown without appearing in the sample.

In similar situation I've found it very helpful to run twice, once with tracing as well as once with sampling. If a method appears in both it is probably using a lot of CPU, otherwise it could well just be an artifact of the sampling process.

Nick Fortescue
A: 

I find YourKit greatly exaggerates the cost of calling sub-methods (due to its logging method, I assume). If you only follow the advice that the profile gives you you'll end up just merging functions with no real gain as the HotSpot usually does excellently for this.

Therefore, I'd highly advise to test batches completely outside profilers too, to get a better idea whether changes are really beneficial (it may seem obvious but this cost me some development time).

Pool
A: 

Since you're using a Mac, you might try Apple's Shark profiler (free download from ADC) which has Java support and Apple's performance group has put a fair amount of time into the tool.

As Nick pointed out, sampling can be misleading if the sample interval is close enough to the function's execution time and the profiler rarely checks when the function is actually executing. I don't know whether YourKit supports this but in Shark you can change the sampling interval to something other than the default 10ms and see if the results are substantially different. There's also a separate call-tracing mode which will record every function enter/return - this completely avoids the possibility of aliasing errors but collects a ton of data and higher overhead, which could matter if your app is doing any sort of real-time processing.

Chris Adams
A: 

You may want to look at the parameters that are being passed into the three methods. It may be that the time is being spent generating return values or in methods that are creating a lot of temporary objects.

Dan
A: 

Profilers can be like that.

This is the method I use.

Works every time.

And this is why.

Mike Dunlavey
A: 

It worth noting that Java methods can be inlined if they are small enough, however native methods are inlined under different rules. If a method is inlined it doesn't appear in the profiler (certainly not YourKit anyway)

Peter Lawrey