Look at this piece of code:

MessageParser parser = new MessageParser();
for (int i = 0; i < 10000; i++) {
    parser.parse(plainMessage, user);
}

For some reason, it runs SLOWER (by about 100ms) than

for (int i = 0; i < 10000; i++) {
    MessageParser parser = new MessageParser();
    parser.parse(plainMessage, user);
}

Any ideas why? The tests were repeated a lot of times, so it wasn't just random. How could creating an object 10000 times be faster than creating it once?

A: 

The parser may contain some logic that cleans up its internal state on each subsequent call.

Did the GC run during your benchmark? It's fairly cheap to instantiate a new object, and it's not a fair comparison if you don't count the time to dispose of all of the objects you created in the faster case.
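To make the comparison fairer, something along these lines could be used. It is only a rough sketch: the `MessageParser` here is a hypothetical stand-in (stateless, just a `startsWith` check) since the real class isn't shown, and the `String` type for `user`, the class name `ParserBenchmark`, and the round counts are all assumptions. It warms both variants up first so the JIT has compiled them, then times each one with `System.nanoTime()` over several rounds.

```java
class ParserBenchmark {
    // Hypothetical stand-in for the asker's MessageParser (assumption: stateless).
    static final class MessageParser {
        boolean parse(String message, String user) {
            return message.startsWith("/");
        }
    }

    static final int ITERATIONS = 10_000;

    // Variant 1: one parser, reused for every call.
    static int reuseParser(String message, String user) {
        MessageParser parser = new MessageParser();
        int commands = 0;
        for (int i = 0; i < ITERATIONS; i++) {
            if (parser.parse(message, user)) commands++;
        }
        return commands;
    }

    // Variant 2: a fresh parser on every iteration.
    static int newParserEachTime(String message, String user) {
        int commands = 0;
        for (int i = 0; i < ITERATIONS; i++) {
            MessageParser parser = new MessageParser();
            if (parser.parse(message, user)) commands++;
        }
        return commands;
    }

    public static void main(String[] args) {
        String message = "/join lobby"; // made-up input
        String user = "someUser";       // made-up input
        long sink = 0; // results are accumulated so the loops are not optimised away

        // Warm-up: let the JIT compile both variants before timing anything.
        for (int round = 0; round < 200; round++) {
            sink += reuseParser(message, user);
            sink += newParserEachTime(message, user);
        }

        // Timed rounds: several of them, so a one-off GC pause stands out.
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            sink += reuseParser(message, user);
            long t1 = System.nanoTime();
            sink += newParserEachTime(message, user);
            long t2 = System.nanoTime();
            System.out.printf("reuse: %d us, new each time: %d us%n",
                    (t1 - t0) / 1_000, (t2 - t1) / 1_000);
        }
        System.out.println("sink = " + sink);
    }
}
```

Hand-rolled timing like this is still fragile; a dedicated harness such as JMH is the more reliable option for micro-benchmarks.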

Eric J.
+9  A: 

Because Java uses generational garbage collection, it can quickly identify that the objects created inside the loop are never reused, so the GC cost for them is virtually nil. Your long-lived object, on the other hand, will survive a pass over the nursery generation and have to be moved out to the main generation.

In summary, you can't really make assumptions about performance; you have to measure it in place.

AlBlue
+1 for not assuming about performance.
Tim
Later JVMs also do escape analysis: they work out which objects do not escape from a block or method and allocate those on the stack, not on the heap. The MessageParser declared inside the for loop is an easy candidate for that optimisation. Depending on how complex the method is, the MessageParser declared outside the for loop might not be.
Nat
No assumptions being made... this isn't even my actual use-case. I posted it because I was curious as to why :D
Sudhir Jonathan
@AlBlue so then the cost of checking if an object is garbage collectible is actually less than re-creating it?
Sudhir Jonathan
Objects that are dead (not reachable) in the nursery generation can be thrown away en masse rather than one at a time. Objects that survive a GC pass over that generation are promoted into the main heap, which can take some time. That is what will happen in your first example, which may explain the extra hit. (In both cases, the GC needs to find out whether the object is reachable anyway.) You can see what's happening with java -verbose:gc if you're interested.
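The same information can also be read from inside the program. A minimal sketch, assuming the standard `java.lang.management` beans are available on your JVM; the class name `GcCounter` and the allocation loop are made up for illustration and are not the asker's code:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcCounter {
    // Sums the collection counts reported by all collectors of this JVM.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 when the collector does not report it
            if (count > 0) total += count;
        }
        return total;
    }

    // Sums the approximate accumulated collection time in milliseconds.
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) total += millis;
        }
        return total;
    }

    public static void main(String[] args) {
        long countBefore = totalCollections();
        long millisBefore = totalCollectionMillis();

        // Illustrative workload: many short-lived objects, like a parser created per iteration.
        long sink = 0;
        for (int i = 0; i < 10_000; i++) {
            StringBuilder sb = new StringBuilder("/msg ").append(i);
            sink += sb.length();
        }

        System.out.println("collections during loop: " + (totalCollections() - countBefore));
        System.out.println("GC time during loop (ms): " + (totalCollectionMillis() - millisBefore));
        System.out.println("sink = " + sink);
    }
}
```

Taking these readings around each benchmark variant shows whether a collection happened inside the timed region.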
AlBlue
A: 

I have no idea what MessageParser does or where it comes from. It may be "leaking" internally. Another possibility is that the object ends up further away in memory from the data created during the parsing, which means you are more likely to get a TLB miss. Also, if the MessageParser keeps internal state and moves into the tenured generation, the GC mechanics of noting that it references new data can be a problem ("card marking" is the jargon that pops to mind).

Tom Hawtin - tackline
The parser has no internal state... it just checks whether the given string starts with '/' (using startsWith) and returns.
Sudhir Jonathan
A: 

What happens if you benchmark the first example while restricting the scope of parser, ie

{
    MessageParser parser = new MessageParser();
    for (int i = 0; i < 10000; i++) {
        parser.parse(plainMessage, user);
    }
}
// `parser` no longer visible

I'd expect this to be fastest as only one object has to be created, and the VM still knows that parser can be gc'd immediately after the loop.

Christoph
That's how it is right now in both test cases... both are in their own block.
Sudhir Jonathan
A: 

How did you measure the running time of your program? I would like to know.

Prasanth