ansaurus

Question

Efficiency of nested Loop

Answer 1

+1 A:

The question shifted. These are not the droids you seek...

~~Because you are doing ~1000000 times more work in the first example. ;-)~~

Sky Sanders 2010-03-31 06:07:41

did you mean 1000000 ?

Mitch Wheat 2010-03-31 06:09:31

is it 1000000 times ?

NightCoder 2010-03-31 06:11:19

It's a bit more than 10 000 but good spotting anyway

too much php 2010-03-31 06:14:02

Are you pointing out the same thing as my answer starts with? To me, "10000 times more work" suggests that you think that if the second loop took 1 second, the first loop should take 10000s... whereas in terms of iterations, it's a much smaller margin - it's doing about 10% more work than the second.

Jon Skeet 2010-03-31 06:15:58

I'd give this a +1 if you had explained your answer. As it is, it's not very helpful. Jon's answer is much more complete.

zombat 2010-03-31 06:18:05

@ALL - the answer to all of your questions is... Yes. fixing now. ;-)

Sky Sanders 2010-03-31 06:23:59

@Jon - I think we are on the same page but I see an implied workload `//do some stuff`. I think your comment is correct when addressing an empty loop but with a workload, the difference would be ~1000000, right?

Sky Sanders 2010-03-31 06:27:54

Sorry, I made a mistake in my snippet: the variables i and j in the nested loops above all start at 0.

didxga 2010-03-31 06:28:44

@Sky: The workload was in the inner loop in both cases. There would be a difference of about a million iterations *in total*, but the first loop wouldn't be a million *times* slower.

Jon Skeet 2010-03-31 06:56:45

@Jon - gotcha. I guess my one-line answer wasn't very clear. The first loop will take longer by about a million times the duration of doSomeStuff.

Sky Sanders 2010-03-31 07:03:15

@Sky: Yes. That's what I suspected you might mean, but the wording wasn't clear.

Jon Skeet 2010-03-31 07:13:33

Answer 2

+6 A:

EDIT: Original answer is below. Now that you've fixed the example so that all loop variables start at 0, we're back to simply not having enough information. It seems likely that it's a cache coherency / locality of reference issue - but we're just guessing. If you could provide a short but complete program which demonstrates the problem, that would help... as would telling us which language/platform we're talking about to start with!

The first loop has 10 * 999999 = 9999990 iterations. The second loop has 1000000 * 9 = 9000000 iterations. I would therefore expect (all other things being equal) the first loop to take longer.
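The arithmetic above can be checked by brute force. This is only a sketch: the original snippet isn't shown here, so the loop bounds are assumptions chosen to reproduce the counts quoted above, and `IterationCount` / `countIterations` are hypothetical names.

```java
// Hypothetical helper that counts how often the body of a nested loop runs.
class IterationCount {
    // Outer loop: i = outerStart .. outerEnd - 1; inner loop: j = innerStart .. innerEnd - 1.
    static long countIterations(int outerStart, int outerEnd, int innerStart, int innerEnd) {
        long count = 0;
        for (int i = outerStart; i < outerEnd; i++) {
            for (int j = innerStart; j < innerEnd; j++) {
                count++; // stands in for "do some stuff"
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Assumed bounds: small outer loop from 0, large inner loop from 1.
        long first = countIterations(0, 10, 1, 1_000_000);
        // Assumed bounds: large outer loop from 0, small inner loop from 1.
        long second = countIterations(0, 1_000_000, 1, 10);
        System.out.println("first:  " + first);
        System.out.println("second: " + second);
        System.out.println("extra iterations in first: " + (first - second));
    }
}
```

With these assumed bounds the first loop runs 999990 more iterations than the second, which is roughly 11% more work, not a million times more.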

However, you haven't indicated what work you're doing or what platform this is on. There are many things which could affect things:

- The second loop may hit a cache better.
- If you're using a JIT-compiled platform, the JIT may have chosen to optimise the second loop more heavily.
- The operations you're performing may themselves have caching or something like that.
- If you're performing a small amount of work but it first needs to load and initialize a bunch of types, that could cause the first loop to be slower.

Jon Skeet 2010-03-31 06:14:17

Thanks Skeet, but I'm very sorry: I made a mistake in my snippet. The variables i and j in the nested loops above all start at 0.

didxga 2010-03-31 06:28:21

Answer 3

+2 A:

If you look at the generated byte code, the two loops are almost identical. EXCEPT that when it does the while-condition for the 10 loop, Java gets the 10 as an immediate value from within the instruction, but when it does the while-condition for the 1000000 loop, Java loads the 1000000 from a variable. I don't have any info on how long it takes to execute each instruction, but it seems likely that an immediate load will be faster than a load from a variable.

Note, then, that in the first loop, the compare against 1000000 must be done 10 million times while in the second loop it is only done 1 million times. Of course the compare against 10 is done much more often in the second loop, but if the variable load is much slower than the immediate load, that would explain the results you are seeing.
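To make those comparison counts concrete, here is a small sketch that tallies how often each loop condition is evaluated, assuming both loops start at 0 as in the corrected question. `ConditionCount` and `countChecks` are hypothetical names. The code only counts condition checks; it says nothing about whether an immediate operand is actually faster than a variable load.

```java
// Hypothetical helper that counts loop-condition evaluations in a nested loop.
class ConditionCount {
    // Models: for (i = 0; i < outer; i++) for (j = 0; j < inner; j++) { ... }
    // Returns {checks of "i < outer", checks of "j < inner"}.
    static long[] countChecks(int outer, int inner) {
        long outerChecks = 0;
        long innerChecks = 0;
        int i = 0;
        while (true) {
            outerChecks++;               // one evaluation of "i < outer"
            if (!(i < outer)) break;
            int j = 0;
            while (true) {
                innerChecks++;           // one evaluation of "j < inner"
                if (!(j < inner)) break;
                j++;
            }
            i++;
        }
        return new long[] {outerChecks, innerChecks};
    }

    public static void main(String[] args) {
        long[] first = countChecks(10, 1_000_000);   // outer 10, inner 1000000
        long[] second = countChecks(1_000_000, 10);  // outer 1000000, inner 10
        System.out.println("first:  vs 10 = " + first[0] + ", vs 1000000 = " + first[1]);
        System.out.println("second: vs 1000000 = " + second[0] + ", vs 10 = " + second[1]);
    }
}
```

Each condition is checked once more than the number of times its loop body runs, because the final failing check also counts.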

Jay 2010-03-31 07:36:18

Thanks Jay, for your insightful explanation.

didxga 2010-03-31 08:06:10

Answer 4

+1 A:

This answer is for the updated question:

1. If you're accessing a two-dimensional array such as int[][], the version with the larger value in the inner loop should be slower. Not by much, but still. To get a feel for the problem, read about Shlemiel the street painter in one of Joel's blog posts.
2. The reason you're getting inconsistent results is that you're not performing any JVM warmup. The JVM constantly analyzes the bytecode it runs and optimizes it; usually it reaches optimal speed only after 30 to 50 iterations. Yes, this means you need to run the code a couple of dozen times first and then benchmark it as an average over another couple of dozen runs, because the garbage collector will slow down a couple of those runs.
3. General note: using the Long object instead of the long primitive is just dumb. The JVM most likely optimizes it by replacing it with the primitive if it can, and if it can't, there's bound to be some (albeit extremely minor) constant slowdown from using it.
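The warmup advice can be sketched roughly as follows. This is a hand-rolled illustration, not a rigorous benchmark: the class and method names, the placeholder workload (`sum += i ^ j`), and the run counts are all assumptions, and a dedicated harness such as JMH is generally more trustworthy for real measurements.

```java
// Hypothetical micro-benchmark sketch: warm up first, then average several timed runs.
class WarmupBenchmark {
    // Written to a volatile field so the JIT cannot discard the loop results.
    static volatile long sink;

    // Placeholder workload; the original "do some stuff" is unknown.
    static long nestedLoop(int outer, int inner) {
        long sum = 0;
        for (int i = 0; i < outer; i++) {
            for (int j = 0; j < inner; j++) {
                sum += i ^ j;
            }
        }
        return sum;
    }

    // Runs the loop warmupRuns times untimed, then returns the mean time in
    // nanoseconds over measuredRuns timed runs.
    static double averageNanos(int outer, int inner, int warmupRuns, int measuredRuns) {
        for (int r = 0; r < warmupRuns; r++) {
            sink = nestedLoop(outer, inner);
        }
        long total = 0;
        for (int r = 0; r < measuredRuns; r++) {
            long start = System.nanoTime();
            sink = nestedLoop(outer, inner);
            total += System.nanoTime() - start;
        }
        return (double) total / measuredRuns;
    }

    public static void main(String[] args) {
        // Run counts are assumptions loosely based on the "30 to 50 iterations" remark.
        double smallOuter = averageNanos(10, 1_000_000, 30, 20);
        double largeOuter = averageNanos(1_000_000, 10, 30, 20);
        System.out.printf("outer=10, inner=1000000: %.0f ns%n", smallOuter);
        System.out.printf("outer=1000000, inner=10: %.0f ns%n", largeOuter);
    }
}
```

Both orderings visit the same (i, j) pairs, so any timing difference comes from loop structure and JIT behaviour rather than from the amount of work.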

Esko 2010-03-31 07:47:19

Thanks Esko, your explanation is very helpful. I have some insight about this problem now.

didxga 2010-03-31 07:56:32

Hmm, perhaps you could explain point #1. Why would the speed of accessing an array element depend on how I generated the indexes? Perhaps you mean that retrieving from an array defined as int[10][1000000] would be faster than if it was int[100000][10]? If that's what you meant, I think you are just incorrect. Java stores a two dimensional array as an array of arrays. So accessing an element is "index into the big array to get a small array, then index into the small array to get the element". Maybe there's some subtle technical complexity I'm missing, but it looks to me like the relative ...

Jay 2010-03-31 16:43:07

... size of the indexes should make no differences.
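The array-of-arrays layout described here can be seen directly in Java. A minimal sketch, with hypothetical names and tiny sizes chosen only for illustration:

```java
// Hypothetical illustration of int[][] as an array of int[] rows.
class ArrayOfArrays {
    // Sums every element by fetching the row first, then indexing into it.
    static long sum(int[][] grid) {
        long total = 0;
        for (int i = 0; i < grid.length; i++) {
            int[] row = grid[i];         // first index: pick one of the inner arrays
            for (int j = 0; j < row.length; j++) {
                total += row[j];         // second index: pick the element
            }
        }
        return total;
    }

    public static void main(String[] args) {
        int[][] wide = new int[2][5];    // 2 inner arrays of length 5
        int[][] tall = new int[5][2];    // 5 inner arrays of length 2
        wide[1][4] = 7;
        tall[4][1] = 7;
        System.out.println(wide.length + " rows of " + wide[0].length);
        System.out.println(tall.length + " rows of " + tall[0].length);
        System.out.println(sum(wide) + " " + sum(tall));
    }
}
```

Either shape takes the same two indexing steps per element; what differs is how many separate inner arrays exist and how long each one is.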

Jay 2010-03-31 16:43:42

RE point #2: When I do performance tests, I always enclose them in a loop that runs at least 10 or 20 times and outputs the time for each. I also do a gc after calculating the time for one trial but before starting the next. I still get very inconsistent results, but at least you can see a general pattern.

Jay 2010-03-31 16:45:22
