tags:

views:

205

answers:

4

See the following snippet:

    Long first_begin = System.currentTimeMillis();

    // first nested loops
    for (int i = 0; i < 10; i++) {
        for (int j = 0; j < 1000000; j++) {
            // do some stuff
        }
    }
    System.out.println(System.currentTimeMillis() - first_begin);
    // second nested loops
    Long seconde_begin = System.currentTimeMillis();
    for (int i = 0; i < 1000000; i++) {
        for (int j = 0; j < 10; j++) {
            // do some stuff
        }
    }
    System.out.println(System.currentTimeMillis() - seconde_begin);

I am wondering why the first nested loops is running slower than the second one?

Regards!

Important Note!: I am sorry i made the variable j beginning with 1 accidentally when this question is first asked, i have made the correction.

Update:there is not any specific logic within the loops, i am just doing some test, actually this is a question asked during an interview and the interviewer hint me to change the order of loops to achieve better performance. BTW, I am using JDK1.5. after some test i am more confused now, because the result of program is not consistent---sometime the first loop running faster than the second one, but most of the time it's running slower than second one.

+1  A: 

The question shifted. These are not the droids you seek...

Because you are doing ~1000000 times more work in the first example. ;-)

Sky Sanders
did you mean 1000000 ?
Mitch Wheat
is it 1000000 times ?
NightCoder
It's a bit more than 10 000 but good spotting anyway
too much php
Are you pointing out the same thing as my answer starts with? To me, "10000 times more work" suggests that you think that if the second loop took 1 second, the first loop should take 10000s... whereas in terms of iterations, it's a much smaller margin - it's doing about 10% more work than the second.
Jon Skeet
I'd give this a +1 if you had explained your answer. As it is, it's not very helpful. Jon's answer is much more complete.
zombat
@ALL - the answer to all of your questions is... Yes. fixing now. ;-)
Sky Sanders
@Jon - I think we are on the same page but I see an implied workload `//do some stuff`. I think your comment is correct when addressing an empty loop but with a workload, the difference would be ~1000000, right?
Sky Sanders
Sorry I made a mistake about my snippet, the variable i,j in above nested loop are all started with 0
didxga
@Sky: The workload was in the inner loop in both cases. There would be a difference of about a million iterations *in total*, but the first loop wouldn't be a million *times* slower.
Jon Skeet
@Jon- gotcha. I guess my one line answer wasn't very clear. It will be a million times the duration of doSomeStuff longer.
Sky Sanders
@Sky: Yes. That's what I suspected you might mean, but the wording wasn't clear.
Jon Skeet
+6  A: 

EDIT: Original answer is below. Now that you've fixed the example so that all loop variables start at 0, we're back to simply not having enough information. It seems likely that it's a cache coherency / locality of reference issue - but we're just guessing. If you could provide a short but complete program which demonstrates the problem, that would help... as would telling us which language/platform we're talking about to start with!


The first loop has 10 * 999999 = 9999990 iterations. The second loop has 1000000 * 9 = 9000000 iterations. I would therefore expect (all other things being equal) the first loop to take longer.

However, you haven't indicated what work you're doing or what platform this is on. There are many things which could affect things:

  • The second loop may hit a cache better
  • If you're using a JIT-compiled platform, the JIT may have chosen to optimise the second loop more heavily.
  • The operations you're performing may themselves have caching or something like that
  • If you're performing a small amount of work but it first needs to load and initialize a bunch of types, that could cause the first loop to be slower
Jon Skeet
Thanks Skeet, but i am very Sorry I made a mistake about my snippet, the variable i,j in above nested loop are all started with 0.
didxga
+2  A: 

If you look at the generated byte code, the two loops are almost identical. EXCEPT that when it does the while-condition for the 10 loop, Java gets the 10 as an immediate value from within the instruction, but when it does the while-condition for the 1000000 loop, Java loads the 1000000 from a variable. I don't have any info on how long it takes to execute each instruction, but it seems likely that an immediate load will be faster than a load from a variable.

Note, then, that in the first loop, the compare against 1000000 must be done 10 million times while in the second loop it is only done 1 million times. Of course the compare against 10 is done much more often in the second loop, but if the variable load is much slower than the immediate load, that would explain the results you are seeing.

Jay
Thanks Jay, for your insightful explanation.
didxga
+1  A: 

This answer is for the updated question:

  • If you're accessing two dimensional array such as int[][], the one with the larger value in the inner loop should be slower. Not by much but still. To somewhat understand the problem, read about Shlemiel the street painter in one of Joel's blog posts.
  • The reason you're getting inconsistent results is that you're not performing any JVM warmup. JVM constantly analyzes the bytecode that is run and optimizes it, usually only after 30 to 50 iterations it runs at optimal speed. Yes, this means you need to run the code first a couple of dozen times and then benchmark it from an average of another couple dozen runs because of Garbage Collector which will slow couple of runs.
  • General note, using Long object instead of long primitive is just dumb, JVM most likely optimizes it by replacing it with the primitive one if it can and if it can't, there's bound to be some (albeit extremely minor) constant slowdown from using it.
Esko
Thanks Esko, your explanation is very helpful. I have some insight about this problem now.
didxga
Hmm, perhaps you could expain point #1. Why would the speed of accessing an array element depend on how I generated the indexes? Perhaps you mean that retrieving from an array defined as int[10][1000000] would be faster than if it was int[100000][10]? If that's what you meant, I think you are just incorrect. Java stores a two dimensional array as an array of arrays. So accessing an element is "index into the big array to get a small array, then index into the small array to get the element". Maybe there's some subtle technical complexity I'm missing, but it looks to me like the relative ...
Jay
... size of the indexes should make no differences.
Jay
RE point #2: When I do performance tests, I always enclose them in a loop that runs at least 10 or 20 times and outputs the time for each. I also do a gc after calculating the time for one trial but before starting the next. I still get very inconsistent results, but at least you can see a general pattern.
Jay

related questions