ansaurus

Question

Benchmarking small Arrays vs. Lists in Java: Is my benchmarking code wrong?

Answer 1

+1 A:

Your benchmark is only valid if your actual use case matches the benchmark code, i.e. very few operations on each element, so that execution time is largely determined by access time rather than the operations themselves. If that is the case then yes, you should be using arrays if performance is critical. If however your real use case involves a lot more actual computation per element, then the access time per element will become a lot less significant.

Paul R 2010-02-09 09:33:24

Thanks Paul, understood: yes, real cases like this are all over the app. In fact, addition is even heavier than what I usually do, which is to ask the element whether it wants to handle the current op or not (filters). This is a pattern (what's it called?) that I use all over the app.

Yar 2010-02-09 09:36:40
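For readers trying to picture the "ask the element whether it wants to handle the op" pattern from the comment above, here is a minimal sketch. Every name in it (OpFilter, PrefixFilter, FilterDispatch, dispatch) is hypothetical and chosen only for illustration; the actual application code was not posted.

```java
import java.util.List;

// Hypothetical interface: each element decides whether it wants an op.
interface OpFilter {
    boolean accepts(String op);   // cheap check made once per element
    void handle(String op);       // the real work, only called when accepted
}

// Hypothetical element that accepts ops whose name starts with a prefix.
class PrefixFilter implements OpFilter {
    private final String prefix;
    int handled = 0;

    PrefixFilter(String prefix) { this.prefix = prefix; }

    public boolean accepts(String op) { return op.startsWith(prefix); }
    public void handle(String op) { handled++; }
}

class FilterDispatch {
    // Offers the op to every element; returns how many elements handled it.
    static int dispatch(List<? extends OpFilter> filters, String op) {
        int count = 0;
        for (OpFilter f : filters) {
            if (f.accepts(op)) {
                f.handle(op);
                count++;
            }
        }
        return count;
    }
}
```

With per-element work this small, the cost of walking the list or array is a large share of the total, which is the situation the answer describes.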

Answer 2

A:

It is probably not valid. If I understand the way that JIT compilers work, compiling a method won't affect a call to that method that is already executing. Since the main method is only called once, it will end up being interpreted, and since most of the work is done in the body of that method, the numbers you get won't be particularly indicative of normal execution.

JIT compilation effects may go some way to explaining why the no-collections case was slower than the arrays case. That result is counter-intuitive, and it casts doubt on the other benchmark result that you reported.

Stephen C 2010-02-09 10:03:50

Real applications don't do what kind of thing? Call small methods on a huge list of objects?

Yar 2010-02-09 10:06:48

No ... they don't repeatedly apply the same method to the elements of a 2 element array / list. In the array case, the compiler might even be unrolling the inner loop.

Stephen C 2010-02-09 10:22:12
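To make the unrolling remark concrete, the sketch below shows what a hand-unrolled version of the inner loop looks like for an array known to hold exactly two elements. The class names Item and UnrollSketch are made up for this illustration, and whether a given JIT compiler actually performs this transformation is an assumption, not something the thread demonstrates.

```java
// Stand-in for the element class used in the benchmark (hypothetical name).
class Item {
    int thing;
    int thing2;
    void changeThing(int addThis) { thing2 = addThis + thing; }
}

class UnrollSketch {
    // Loop form, as it would be written in source.
    static void looped(Item[] pair, int runs) {
        for (int i = 0; i < runs; i++) {
            for (Item item : pair) {
                item.changeThing(i);
            }
        }
    }

    // Hand-unrolled equivalent for exactly two elements: no inner loop
    // counter and no per-iteration bounds check on the inner loop.
    static void unrolled(Item[] pair, int runs) {
        for (int i = 0; i < runs; i++) {
            pair[0].changeThing(i);
            pair[1].changeThing(i);
        }
    }
}
```

Both methods leave the elements in the same state, so a benchmark cannot tell from the results alone which form the compiler actually ran.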

Thinking about it hours later, that's a reasonable concern. I was initially thinking about very short lists being iterated very frequently, but it's true that this test would have nothing to say about long lists. Just tried it with longer lists and the difference is about the same. However, what you say about the static `main` may be true.

Yar 2010-02-09 14:28:46

Answer 3

+1 A:

Microbenchmarks are very, very hard to get right on a platform like Java. You definitely have to extract the code to be benchmarked into separate methods, run them a few thousand times as warmup and then measure. I've done that (code below) and the result is that direct access through references is then three times as fast as through an array, but the collection is still slower by a factor of 2.

These numbers are based on the JVM options -server -XX:+DoEscapeAnalysis. Without -server, using the collection is drastically slower (but strangely, direct and array access are quite a bit faster, indicating that there is something weird going on). -XX:+DoEscapeAnalysis yields another 30% speedup for the collection, but it's very much questionable whether it will work as well for your actual production code.
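Since the numbers depend so heavily on the VM and its flags, it can help to print both alongside any timing output. The following is a small sketch using the standard java.vm.name system property and the runtime MXBean; the class name VmReport is hypothetical, and the exact VM name string varies by vendor and version.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class VmReport {
    // Name of the running VM; on HotSpot it typically mentions
    // "Server" or "Client", but the wording is vendor-specific.
    static String vmName() {
        return System.getProperty("java.vm.name");
    }

    // Options passed to the JVM itself (not the arguments to main).
    static List<String> vmFlags() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println("VM:    " + vmName());
        System.out.println("Flags: " + vmFlags());
    }
}
```

Calling these two methods at the start of a benchmark run records which configuration produced which result.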

Overall my conclusion would be: forget about microbenchmarks, they can too easily be misleading. Measure as close to production code as you can without having to rewrite your entire application.

import java.util.ArrayList;

public class ArrayTest {

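    // Iterations of the loop inside each test method.
    static int RUNS_INNER = 1000;
    // Untimed calls made first so the JIT can compile the test method.
    static int RUNS_WARMUP = 10000;
    // Timed calls to each test method.
    static int RUNS_OUTER = 100000;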

    public static void main(String[] args) {
        long t1;
        long t2;

        Test test1 = new Test();
        test1.thing = (int)Math.round(100*Math.random());
        Test test2 = new Test();
        test2.thing = (int)Math.round(100*Math.random());

        for(int i=0; i<RUNS_WARMUP; i++)
        {
            testRefs(test1, test2);            
        }
        t1 = System.nanoTime();
        for(int i=0; i<RUNS_OUTER; i++)
        {
            testRefs(test1, test2);            
        }

        t2 = System.nanoTime();
        System.out.println((t2-t1)/1000000.0 + " How long NO collection");

        ArrayList<Test> list = new ArrayList<Test>(1);
        list.add(test1);
        list.add(test2);
        // tried this too: helps a tiny tiny bit 
        list.trimToSize();

        for(int i=0; i<RUNS_WARMUP; i++)
        {
            testColl(list);
        }
        t1= System.nanoTime();

        for(int i=0; i<RUNS_OUTER; i++)
        {
            testColl(list);
        }

        t2 = System.nanoTime();
        System.out.println((t2-t1)/1000000.0 + " How long collection");


        Test[] array = new Test[2];
        list.toArray(array);

        for(int i=0; i<RUNS_WARMUP; i++)
        {
            testArr(array);            
        }
        t1= System.nanoTime();

        for(int i=0; i<RUNS_OUTER; i++)
        {
            testArr(array);
        }

        t2 = System.nanoTime();
        System.out.println((t2-t1)/1000000.0 + " How long array ");

    }

    private static void testArr(Test[] array)
    {
        for (int i=0; i<RUNS_INNER; i++) {
            for (Test test : array) {
                test.changeThing(i);
            }
        }
    }

    private static void testColl(ArrayList<Test> list)
    {
        for (int i=0; i<RUNS_INNER; i++) {
            for (Test eachTest : list) {
                eachTest.changeThing(i);
            }
        }
    }

    private static void testRefs(Test test1, Test test2)
    {
        for (int i=0; i<RUNS_INNER; i++) {
            test1.changeThing(i);
            test2.changeThing(i);
        }
    }
}

class Test {
    int thing;
    int thing2;
    public void changeThing(int addThis) {
        thing2 = addThis + thing;
    }
}

Michael Borgwardt 2010-02-09 10:29:29
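The warm-up-then-measure sequence appears three times in the main method above. As a sketch only, it could be factored into one helper; the names Timing and measure are hypothetical, and routing the call through a Runnable adds an indirection that may itself influence what the JIT does, so treat any numbers from it with the same suspicion.

```java
class Timing {
    // Runs the task `warmup` times untimed, then `runs` times timed.
    // Returns the elapsed milliseconds of the timed portion only.
    static double measure(Runnable task, int warmup, int runs) {
        for (int i = 0; i < warmup; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            task.run();
        }
        long end = System.nanoTime();
        return (end - start) / 1000000.0;
    }
}
```

Assuming Java 8 or later for the lambda syntax, a call from inside ArrayTest would look like Timing.measure(() -> testArr(array), RUNS_WARMUP, RUNS_OUTER).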

This is awesome, thanks Michael, just the guidance I was looking for. I knew that "microbenchmarks are misleading" but with your example in hand, I can begin to see how my example misled me.

Yar 2010-02-09 10:49:17

The issue is that I won't be running with -client since it's a desktop app. Or is that not right?

Yar 2010-02-09 14:26:36

@yar: you mean -server, right? There's nothing that prevents you from running a desktop app with the server VM. It will take longer to start up and probably use more memory, but if execution speed is your priority, you should use the server VM.

Michael Borgwardt 2010-02-09 14:31:31

Thanks Michael, I'll need to check the memory footprint. If it's not ridiculous then -server might work. Though for right now I'm pre-optimizing, it'll be a nice trick to keep in my hat for later.

Yar 2010-02-09 16:35:52
