views:

400

answers:

4

I'm using TPTP to profile some slow running Java code an I came across something interesting. One of my private property getters has a large Base Time value in the Execution Time Analysis results. To be fair, this property is called many many times, but I never would have guessed a property like this would take very long:

public class MyClass{
    private int m_myValue;    
    public int GetMyValue(){
        return m_myValue;
    }
}

Ok so there's obviously more stuff in the class, but as you can see there is nothing else happening when the getter is called (just return an int). Some numbers for you:

  • About 30% of the Calls of the run are on the getter (I'm working to reduce this)
  • About 25% of the base time of the run is spent in this getter
  • Average base time is 0.000175s

For comparison, I have another method in a different class that uses this getter:

private boolean FasterMethod(MyClass instance, int value){
    return instance.GetMyValue() > m_localInt - value;
}

Which has a much lower average base time of 0.000018s (one order of magnitude lower).

What's the deal here? I assume there is something that I don't understand or something I'm missing:

  1. Does returning a local primitive really take longer than returning a calculated value?
  2. Should I look at metric other than Base Time?
  3. Are these results misleading and I need to consider some other profiling tool?

Edit 1: Based on some suggestions below, I marked the method as final and re-ran the test, but I got the same results.

Edit 2: I installed a demo version of YourKit to re-run my performance tests, and the YourKit results look much closer to what I was expecting. I will continue to test YourKit and report back what I find.

Edit 3: Changing to YourKit seems to have resolved my issue. I was able to use YourKit to determine the actual slow points in my code. There are some excellent comments and posts below (upvoted appropriately), but I'm accepting the first person to suggest YourKit as "correct." (I am not affiliated with YourKit in any way / YMMV)

+2  A: 

It does sound slightly misleading - perhaps the profiler is removing some optimizations?

Just for kicks, try making the method final, which will make it easier to inline. That may well be the difference between the property and FasterMethod. In real use, HotSpot will inline even virtual methods until the first time they're overridden (IIRC).

EDIT: Responding to Brian's comment: Yes, it's usually the case that making something final won't help performance (although it may be a good thing in terms of design :) because Hotspot will normally work out whether it can inline or not based on whether it's overridden or not. I was suggesting this profiler may have messed with that.

EDIT: I've now managed to reproduce the way that HotSpot "undoes" optimisation of classes which haven't been extended yet (or methods which haven't been overridden). This was harder to do for the server VM than the client, but I've done it :)

public class Test
{
    public static void main(String[] args)
        throws Exception
    {
        final long iterations = 1000000000L;
        Base b = new Base();
        // Warm up Hotspot
        time(b, 1000);

        // Before we load Derived
        time(b, iterations);

        // Load Derived and use it quickly
        // (Just loading is enough to make the client VM
        // undo its optimizations; the server VM needs more effort)
        Base d = (Base) Class.forName("Derived").newInstance();
        time(d, 1);

        // Time it again with Base
        time(b, iterations);
    }

    private static void time(Base b, long iterations)
    {
        long total = 0;
        long start = System.currentTimeMillis();
        for (long i = 0; i < iterations; i++)
        {
            total += b.getValue();
        }
        long end = System.currentTimeMillis();
        System.out.println("Time: " + (end-start));
        System.out.println("Total: " + total);
    }
}

class Base
{
    public int getValue() { return 1; }
}

class Derived extends Base
{
    @Override
    public int getValue() { return 2; }
}
Jon Skeet
I'd be interested if final makes a difference. In theory the JVM can safely cache that value and not have to read (the slow) main memory.
Steve Kuo
See Brian Goetz's info here re. finalisation - http://www.ibm.com/developerworks/java/library/j-jtp04223.html
Brian Agnew
FYI - I marked the method as final and re-ran the test, but I got the same results.
Mark
Right. Ah well. I'm editing my answer with an example of overriding making a difference anyway, just because it's interesting :)
Jon Skeet
Yes, Hotspot can inline things and then undo the inlining if a class is loaded that breaks the inlining. C++ cannot (easily) inline virtual methods (ther are compilers that can do it apparently, but racking runtime usage). A good example for future Java is slow debates :-)
TofuBeer
@TofuBeer: Yes, it's one area where Java can win over .NET, too, as .NET just JITs once.
Jon Skeet
oops, racking = tracking. Good to know about .NET.
TofuBeer
If you look at the source HotSpot on uses final for looking up itself whether the method is overridden - it's not used outside of compilation and validation at all (it is only extra work for the validator).
Tom Hawtin - tackline
+1  A: 

That sounds very peculiar. You're not calling an overriding method by mistake, are you ?

I would be tempted to download a demo version of YourKit. It's trivial to set up, and it should give an indication as to what's really occurring. If both TPTP and YourKit agree, then something peculiar is happening (and I know that's not a lot of help!)

Brian Agnew
Good question, but there is no overriding method.
Mark
+4  A: 

If possible try using another profiler (the Netbeans one works well). This may be hard to do depending on how your code is setup.

It is possible that, just like many other tools, a different profiler will result in different information.

Does returning a local primitive really take longer than returning a calculated value?

Returning an instance variable takes longer than returning an local variable (VM dependent). The getter that you have is simple so it should be inlined, so it becomes as fast as accessing a public instance variable (which, again, is slower than accessing a local variable).

But you don't have a local value (local meaning in the method as opposed to in the class).

What exactly do you mean by "local"?

Should I look at metric other than Base Time?

I have not used the Eclipse tools, so I am not sure how it works... it might make a difference if it is a tracing or a sampling profiler (the two can give different results for things like this).

Are these results misleading and I need to consider some other profiling tool?

I would consider another tool, just to see if the result is the same.

Edit based on comments:

If it is a sampling profiler what happens, essentially, that every "n-time units" the program is sampled to see where the program is. If you are calling the one method way more than the other it will show up as being in the method that is called more (it is simply more likely that that method is being run).

A tracing profiler adds code to your program (a process known as instrumentation) to essentially log what is going on.

Tracing profilers are slower but more accurate, they also require that the program be changed (the instrumentation process) which could potentially introduce bugs (not that I have heard of it happening... but I am sure it does at least while they are developing the profiler).

Sampling profilers are faster but less accurate (they just guess at how often a line of code is executed).

So, if Eclipse uses a sampling profiler you could see what you consider to be strange behaviour. Changing to a tracing profiler would show more accurate results.

If Eclipse uses a tracing profiler then chaning profilers should show the same result (however they new profiler may make it more obvious to you as to what is going on).

TofuBeer
You are correct: I typed the wrong thing. Instead of "local" variable, I should have said "instance" variable. Part of my confusion comes in on the FasterMethod() method. I use both local (passed-in) and instance variables, but it is faster (according to TPTP) than use using an instance variable.
Mark
Digging around I _think_ that Eclipse uses a sampling profiler. It also sounds like it is possible to make it perform tracing as well, either through plugins or options (sorry, not an Eclipse user).
TofuBeer
A: 

Something that used to make a lot of difference to performance of these sort of methods (although this may be to some extent historical) is that the size of the calling method can be an issue. HotSpot (and serious rivals) will happily inline small methods (some may choke on synchronized/try-finally). However, if the calling method is large, then it may not. This was particularly a problem with old versions of the HotSpot C1/client which had a really bad register allocation algorithm (it now has an algorithm that is both quite good and fast).

Tom Hawtin - tackline