views:

681

answers:

3

O Groovy Gurus,

This code snippet runs in around 1 second

 for (int i in (1..10000000)) {
   j = i;
 }

while this one takes almost 9 seconds:

 for (int i = 1; i < 10000000; i++) {
   j = i;
 }

Why is it so?

+1  A: 

In your testing, be sure to "warm up" the JVM before taking any measurements; otherwise you may end up timing various platform startup actions (class loading, JIT compilation) along with your loop. Run your tests many times in a row too. Also, if a garbage collection happened to run during the second test, that might have had an impact. Try running each of your tests 100 times, print the time after each run, and see what that tells you.
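As a rough sketch of that advice, written in plain Java rather than Groovy: the `WarmupTimer` class and its `timeRuns` method below are hypothetical names made up for illustration, not part of any library. The helper runs a task a given number of times and records how long each run took, so the early (cold) runs can be compared with the later ones.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Groovy or the JDK: times a task repeatedly
// so that early (cold) runs can be compared with later (warmed-up) runs.
class WarmupTimer {

    // Writing the loop result here gives the loop an observable effect.
    static volatile int sink;

    // Runs the task `runs` times and returns the elapsed nanoseconds of each run.
    static List<Long> timeRuns(Runnable task, int runs) {
        List<Long> elapsed = new ArrayList<>();
        for (int r = 0; r < runs; r++) {
            long start = System.nanoTime();
            task.run();
            elapsed.add(System.nanoTime() - start);
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Same shape as the Java-style loop in the question.
        Runnable loop = () -> {
            int j = 0;
            for (int i = 1; i < 10000000; i++) {
                j = i;
            }
            sink = j;
        };
        List<Long> times = timeRuns(loop, 100);
        for (int r = 0; r < times.size(); r++) {
            System.out.println("run " + r + ": " + times.get(r) / 1_000_000 + " ms");
        }
    }
}
```

If the first few runs are much slower than the rest, startup effects are probably part of what was measured. A gap that persists across all 100 runs points at the loop itself.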

Jim Ferrans
+1  A: 

If you can eliminate potential artifacts from startup time as Jim suggests, then I'd hazard a guess that the Java-style for loop in Groovy is not so well implemented as the original Groovy-style for loop. It was only added as of v1.5 after user requests, so perhaps its implementation was a bit of an afterthought.

Have you taken a look at the bytecode generated for your two examples to see if there are any differences? There was a discussion about Groovy performance here in which one of the comments (from one 'johnchase') says this:

I wonder if the difference you saw related to how Groovy uses numbers (primitives) - since it wraps all primitives in their equivalent Java wrapper classes (int -> Integer), I’d imagine that would slow things down quite a bit. I’d be interested in seeing the performance of Java code that loops 10,000,000 using the wrapper classes instead of ints.

So perhaps the original Groovy for loop does not suffer from this? Just speculation on my part really though.
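The experiment that comment asks for might look roughly like the following in plain Java. This is only a sketch: `BoxedLoop` and its methods are hypothetical names, and the loops accumulate a sum (instead of the question's `j = i`) so that both versions can be checked to do the same work. It says nothing definite about what Groovy does internally.

```java
// Hypothetical comparison, in plain Java rather than Groovy: the same loop
// written once with primitive ints and once with java.lang.Integer wrappers.
class BoxedLoop {

    // Primitive version: no objects are created inside the loop.
    static long sumPrimitive(int limit) {
        long total = 0;
        for (int i = 1; i < limit; i++) {
            total += i;
        }
        return total;
    }

    // Wrapper version: i++ unboxes, increments, and boxes again on each pass.
    static long sumBoxed(int limit) {
        long total = 0;
        for (Integer i = 1; i < limit; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        int limit = 10000000;
        long start = System.nanoTime();
        long a = sumPrimitive(limit);
        long mid = System.nanoTime();
        long b = sumBoxed(limit);
        long end = System.nanoTime();
        System.out.println("primitive: " + (mid - start) / 1_000_000 + " ms, sum " + a);
        System.out.println("boxed:     " + (end - mid) / 1_000_000 + " ms, sum " + b);
    }
}
```

A single timed pass like this is subject to the warm-up caveats raised in the first answer, so the printed numbers are only indicative.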

Xiaofu
Xiaofu, these are good points. The thing I find confusing is that the bottom loop should be the faster one, since it deals with ints only, not Integers (though j's type is not specified). The top loop should be slower given the sequence.
Jim Ferrans
+4  A: 

OK, here is my take on why.

If you convert both scripts to bytecode, you will notice that

  1. ForInLoop uses a Range. An Iterator advances on each pass, and the comparison (<) that decides whether the exit condition has been met is made directly on int (or Integer) values.
  2. ForLoop uses the traditional sequence: increment, check the condition, perform the action. To check the condition i < 10000000 it calls Groovy's ScriptBytecodeAdapter.compareLessThan. If you dig into that method's code, you will find that both sides of the comparison are taken in as Object, and a lot goes on from there: casting, comparing them as objects, and so on.

ScriptBytecodeAdapter.compareLessThan --> ScriptBytecodeAdapter.compareTo --> DefaultTypeTransformation.compareTo

There are other classes in the typehandling package that implement compareTo specifically for numeric data types. I am not sure why they are not being used (if indeed they are not).

I suspect that is why the second loop takes longer. Again, please correct me if I am wrong or missing something.
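To make the kind of overhead described above concrete, here is a simplified Java sketch. It is explicitly not Groovy's ScriptBytecodeAdapter or DefaultTypeTransformation code: `ObjectCompare` and both of its methods are hypothetical, written only to contrast a direct int comparison with one that receives both operands as Object and has to work out their types first.

```java
// Hypothetical illustration only: this is NOT Groovy's ScriptBytecodeAdapter code.
// It contrasts a direct int comparison with one that takes both sides as Object.
class ObjectCompare {

    // Direct comparison: a single primitive int comparison.
    static boolean lessThanPrimitive(int left, int right) {
        return left < right;
    }

    // Generic comparison: both operands arrive boxed as Object, so the method
    // must inspect their runtime types and cast before it can compare them.
    @SuppressWarnings("unchecked")
    static boolean lessThanObject(Object left, Object right) {
        if (left instanceof Number && right instanceof Number) {
            if (left instanceof Integer && right instanceof Integer) {
                return ((Integer) left).compareTo((Integer) right) < 0;
            }
            // Simplification: all other numeric pairs are compared as doubles.
            return ((Number) left).doubleValue() < ((Number) right).doubleValue();
        }
        if (left instanceof Comparable && right != null
                && left.getClass() == right.getClass()) {
            return ((Comparable<Object>) left).compareTo(right) < 0;
        }
        throw new IllegalArgumentException("cannot compare " + left + " and " + right);
    }
}
```

Calling `lessThanObject(i, 10000000)` with an int `i` also boxes both arguments on every call, which is extra work that `lessThanPrimitive` never does. Whether Groovy's real call chain pays a similar cost is the suspicion voiced above, not something this sketch proves.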

peacefulfire
That almost certainly explains it. I know the Java-style for loop offers more flexibility in what you can do, but surely they could have applied some optimization to its most basic (and most commonly used) form so that it performs as well as the for..in loop? That's a small performance trap for people coming over from Java or C#...
Xiaofu
No one would expect this much disparity between these two operations. This is an area where Groovy could improve, especially since similar code in Java executes in 300 ms.
rest_day
+1 for looking at the bytecode.
Leonel