For example:
a) int [x][y][z] vs b) int[x*y*z]
Initially thought i'd go with a) for simplicity
I know that Java doesn't store arrays linearly in memory like C does. But what implications does this have for my program?
For example:
a) int [x][y][z] vs b) int[x*y*z]
Initially thought i'd go with a) for simplicity
I know that Java doesn't store arrays linearly in memory like C does. But what implications does this have for my program?
If you choose the latter route then you're going to have to perform arithmetic for every single array access. That's going to be a pain and error prone (unless you wrap it in a class providing this functionality).
I don't believe that there's any (significant) optimisation in choosing your flat array (especially given the arithmetic taken to index into it). As always with optimisations, you would need to perform some measurements and determine if it's really worth it.
Use first variant (3-dimensional) because it's easier for understanding and there are less chances to make some logical error (especially if you're using it for modeling 3-dimensional space)
Usually the best thing to do when searching anwers for such questions is to see how the choices are compiled into JVM bytecode:
multi = new int[50][50];
single = new int[2500];
this is translated into
BIPUSH 50
BIPUSH 50
MULTIANEWARRAY int[][] 2
ASTORE 1
SIPUSH 2500
NEWARRAY T_INT
ASTORE 2
so as you can see the JVM already knows that we are talking about a multi dimensional array..
keeping it further:
for (int i = 0; i < 50; ++i)
for (int j = 0; j < 50; ++j)
{
multi[i][j] = 20;
single[i*50+j] = 20;
}
this is translated (skipping the cycles) into:
ALOAD 1: multi
ILOAD 3: i
AALOAD
ILOAD 4: j
BIPUSH 20
IASTORE
ALOAD 2: single
ILOAD 3: i
BIPUSH 50
IMUL
ILOAD 4: j
IADD
BIPUSH 20
IASTORE
So as you can see the multi-dimensional array is treated internally into the VM, no overhead generated by useless instructions, while using a sigle one uses more instructions since offset is calculated by hand.
I don't think that performance will be such an issue..
EDIT:
I did some simple benchmarks to see what's going down here. I chose try different examples: linear read, linear write and random access. Time are expressed in millisecs (and calculated using System.nanoTime()
.. here are the results:
Linear write
Linear read
Random read
But random one is a little bit misleading since it generates 2 random numbers for multi-dimensional array while just one for single dimensional (and PNRGs may consume some CPU).
Mind that I tried to let JIT work by benchmarking only after the 20th run of the same loop. For completeness my java VM is the following:
java version "1.6.0_17" Java(TM) SE Runtime Environment (build 1.6.0_17-b04) Java HotSpot(TM) 64-Bit Server VM (build 14.3-b01, mixed mode)
On current CPUs, non-cached memory access is hundreds of times slower than arithmetics (see this presentation). The a) option will result in about 3 memory lookups whereas the b) option will result in about 1 memory lookup. So the b) option can be faster in some situations (it's a hot spot and the array does not fit into the CPU's cache). How much faster? - that will depend on the application.
Personally I would first use the a) option, because it will result in simpler code. If a profiler shows that array access to be a bottleneck, then I would convert it to the b) option, so that there is a pair of helper methods for reading and writing array values (that way the messy code will be restricted to those two methods).
UPDATE:
I made a benchmark for comparing 3-dimensional int arrays ("Multi" column) to the equivalent 1-dimensional int arrays ("Single" column). The code is here and tests here. I ran it on 64-bit jdk1.6.0_18, Windows 7 x64, Core 2 Quad Q6600 @ 3.0 GHz, 4 GB DDR2, using the JVM options -server -Xmx3G -verbose:gc -XX:+PrintCompilation
(I have removed the debug output from the following results). The results were:
Out of 20 repeats, the minimum time in milliseconds is reported.
Array dimensions: 100x100x100 (1000000)
Multi Single
Seq Write 1 1
Seq Read 1 1
Random Read 99 90 (of which generating random numbers 59 ms)
Array dimensions: 200x200x200 (8000000)
Multi Single
Seq Write 14 13
Seq Read 11 8
Random Read 1482 1239 (of which generating random numbers 474 ms)
Array dimensions: 300x300x300 (27000000)
Multi Single
Seq Write 53 46
Seq Read 34 24
Random Read 5915 4418 (of which generating random numbers 1557 ms)
Array dimensions: 400x400x400 (64000000)
Multi Single
Seq Write 123 111
Seq Read 71 55
Random Read 16326 11144 (of which generating random numbers 3693 ms)
This shows the 1-dimensional array to be faster. Probably the reason is that the 3-dimensional array requires more memory access.
I did also some measurements to estimate the overhead of generating the random numbers in the Random Read benchmark by replacing preventOptimizingAway += array.get(x, y, z);
with preventOptimizingAway += x * y * z;
and added the measurements to the above results table by hand. Generating the random numbers takes 1/3 or less of the total time of the Random Read benchmark, so the memory access dominates the benchmark as expected. It would be interesting to repeat this benchmark with arrays of 4 and more dimensions. Probably it would make the speed difference bigger, because the multidimensional array's topmost levels will fit into the CPU's cache, and only the other levels will require a memory lookup.