Disclaimer: I have looked through this question and this question but they both got derailed by small details and general optimization-is-unnecessary concerns. I really need all the performance I can get in my current app, which is receiving-processing-spewing MIDI data in realtime. Also it needs to scale up as well as possible.

I am comparing three ways of reading a small set of objects a very large number of times: through an array, through an ArrayList, and through plain object references held in variables. I'm finding that the array beats the ArrayList by a factor of 2.5 and even beats the plain object references.

What I would like to know is:

  1. Is my benchmark okay? I have switched the order of the tests and number of runs with no change. I've also used milliseconds instead of nanoseconds to no avail.
  2. Should I be specifying any Java options to minimize this difference?
  3. If this difference is real, shouldn't I prefer Test[] to ArrayList<Test> here and write the code needed to convert between them? Obviously I'm reading a lot more than writing.

The JVM is Java 1.6.0_17 on OS X, and it is definitely running in HotSpot mode.

import java.util.ArrayList;

public class ArraysVsLists {

    static int RUNS = 100000;

    public static void main(String[] args) {
        long t1;
        long t2;

        Test test1 = new Test();
        test1.thing = (int)Math.round(100*Math.random());
        Test test2 = new Test();
        test2.thing = (int)Math.round(100*Math.random());

        t1 = System.nanoTime();

        for (int i=0; i<RUNS; i++) {
            test1.changeThing(i);
            test2.changeThing(i);
        }

        t2 = System.nanoTime();
        System.out.println((t2-t1) + " How long NO collection");

        ArrayList<Test> list = new ArrayList<Test>(1);
        list.add(test1);
        list.add(test2);
        // tried this too: helps a tiny tiny bit 
        list.trimToSize();

        t1= System.nanoTime();

        for (int i=0; i<RUNS; i++) {
            for (Test eachTest : list) {
                eachTest.changeThing(i);
            }
        }

        t2 = System.nanoTime();
        System.out.println((t2-t1) + " How long collection");


        Test[] array = new Test[2];
        list.toArray(array);

        t1= System.nanoTime();

        for (int i=0; i<RUNS; i++) {
            for (Test test : array) {
                test.changeThing(i);
            }
        }

        t2 = System.nanoTime();
        System.out.println((t2-t1) + " How long array ");

    }
}

class Test {
    int thing;
    int thing2;
    public void changeThing(int addThis) {
        thing2 = addThis + thing;
    }
}
+1  A: 

Your benchmark is only valid if your actual use case matches the benchmark code, i.e. very few operations on each element, so that execution time is largely determined by access time rather than the operations themselves. If that is the case then yes, you should be using arrays if performance is critical. If however your real use case involves a lot more actual computation per element, then the access time per element will become a lot less significant.
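
If you do go the array route, one way to keep the convenience of a list for the rare writes is to rebuild an array snapshot whenever the list changes and iterate only the snapshot on the hot path. This is a minimal sketch, not a drop-in solution: `Filter`, `FilterChain` and `ArraySnapshotSketch` are hypothetical names (with `Filter` standing in for the question's `Test` class), and it assumes a single thread does both the writes and the reads.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical element type standing in for the question's Test class.
class Filter {
    int thing;
    int thing2;

    void changeThing(int addThis) {
        thing2 = addThis + thing;
    }
}

// Hypothetical holder: writes go to the list, reads go through a cached array.
// Assumption: single-threaded use; no synchronization is attempted here.
class FilterChain {
    private final List<Filter> filters = new ArrayList<Filter>();
    // Rebuilt on every write, so the hot path never touches the list.
    private Filter[] snapshot = new Filter[0];

    void add(Filter f) {
        filters.add(f);
        snapshot = filters.toArray(new Filter[filters.size()]);
    }

    // Hot path: plain array iteration, no Iterator is created.
    void apply(int value) {
        for (Filter f : snapshot) {
            f.changeThing(value);
        }
    }

    int size() {
        return snapshot.length;
    }
}

class ArraySnapshotSketch {
    public static void main(String[] args) {
        Filter a = new Filter();
        a.thing = 3;
        FilterChain chain = new FilterChain();
        chain.add(a);
        chain.apply(4);
        System.out.println(a.thing2); // prints 7 (4 + 3)
    }
}
```

The conversion cost is paid once per write, which only makes sense if reads really do dominate, as they do in the question.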

Paul R
Thanks Paul, understood, and thanks for that: yes, real cases like this are all over the app. In fact, addition is even heavier than what I usually do, which is to ask the element whether it wants to handle the current op or not (filters). This is a pattern (what's it called?) that I use all over the app.
Yar
A: 

It is probably not valid. If I understand the way that JIT compilers work, compiling a method won't affect a call to that method that is already executing. Since the main method is only called once, it will end up being interpreted, and since most of the work is done in the body of that method, the numbers you get won't be particularly indicative of normal execution.

JIT compilation effects may go some way to explain why the no-collection case was slower than the array case. That result is counter-intuitive, and it casts doubt on the other benchmark result that you reported.

Stephen C
Real applications don't do what kind of thing? Call small methods on a huge list of objects?
Yar
No ... they don't repeatedly apply the same method to the elements of a 2 element array / list. In the array case, the compiler might even be unrolling the inner loop.
Stephen C
Thinking about it hours later, that's a reasonable concern. I was initially thinking about very short lists being iterated very frequently, but it's true that this test would have nothing to say about long lists. Just tried it with longer lists and the difference is about the same. However, what you say about the static `main` may be true.
Yar
+1  A: 

Microbenchmarks are very, very hard to get right on a platform like Java. You definitely have to extract the code to be benchmarked into separate methods, run them a few thousand times as warmup and then measure. I've done that (code below) and the result is that direct access through references is then three times as fast as through an array, but the collection is still slower by a factor of 2.

These numbers are based on the JVM options -server -XX:+DoEscapeAnalysis. Without -server, using the collection is drastically slower (but strangely, direct and array access are quite a bit faster, indicating that there is something weird going on). -XX:+DoEscapeAnalysis yields another 30% speedup for the collection, but it's very much questionable whether it will work as well for your actual production code.
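
Since the results depend so heavily on which VM and flags are in effect, it can help to print them from inside the program before trusting any numbers. A small sketch, with `VmInfoSketch` as a made-up class name: the `java.vm.name` system property and `RuntimeMXBean.getInputArguments()` are standard, but the exact wording of the VM name is up to the vendor (on HotSpot it normally mentions "Server VM" or "Client VM").

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper that reports which VM is running and with which flags.
class VmInfoSketch {

    // Name of the running VM, as reported by the standard system property.
    static String vmName() {
        return System.getProperty("java.vm.name");
    }

    // Arguments passed to the JVM itself (e.g. -server), not to main().
    static List<String> vmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println("VM: " + vmName());
        System.out.println("VM arguments: " + vmArguments());
    }
}
```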

Overall my conclusion would be: forget about microbenchmarks, they can too easily be misleading. Measure as close to production code as you can without having to rewrite your entire application.

import java.util.ArrayList;

public class ArrayTest {

    static int RUNS_INNER = 1000;
    static int RUNS_WARMUP = 10000;
    static int RUNS_OUTER = 100000;

    public static void main(String[] args) {
        long t1;
        long t2;

        Test test1 = new Test();
        test1.thing = (int)Math.round(100*Math.random());
        Test test2 = new Test();
        test2.thing = (int)Math.round(100*Math.random());

        for(int i=0; i<RUNS_WARMUP; i++)
        {
            testRefs(test1, test2);            
        }
        t1 = System.nanoTime();
        for(int i=0; i<RUNS_OUTER; i++)
        {
            testRefs(test1, test2);            
        }

        t2 = System.nanoTime();
        System.out.println((t2-t1)/1000000.0 + " How long NO collection");

        ArrayList<Test> list = new ArrayList<Test>(1);
        list.add(test1);
        list.add(test2);
        // tried this too: helps a tiny tiny bit 
        list.trimToSize();

        for(int i=0; i<RUNS_WARMUP; i++)
        {
            testColl(list);
        }
        t1= System.nanoTime();

        for(int i=0; i<RUNS_OUTER; i++)
        {
            testColl(list);
        }

        t2 = System.nanoTime();
        System.out.println((t2-t1)/1000000.0 + " How long collection");


        Test[] array = new Test[2];
        list.toArray(array);

        for(int i=0; i<RUNS_WARMUP; i++)
        {
            testArr(array);            
        }
        t1= System.nanoTime();

        for(int i=0; i<RUNS_OUTER; i++)
        {
            testArr(array);
        }

        t2 = System.nanoTime();
        System.out.println((t2-t1)/1000000.0 + " How long array ");

    }

    private static void testArr(Test[] array)
    {
        for (int i=0; i<RUNS_INNER; i++) {
            for (Test test : array) {
                test.changeThing(i);
            }
        }
    }

    private static void testColl(ArrayList<Test> list)
    {
        for (int i=0; i<RUNS_INNER; i++) {
            for (Test eachTest : list) {
                eachTest.changeThing(i);
            }
        }
    }

    private static void testRefs(Test test1, Test test2)
    {
        for (int i=0; i<RUNS_INNER; i++) {
            test1.changeThing(i);
            test2.changeThing(i);
        }
    }
}

class Test {
    int thing;
    int thing2;
    public void changeThing(int addThis) {
        thing2 = addThis + thing;
    }
}
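
For measuring something closer to production code, the same warmup-then-measure idea can be wrapped in a small helper that takes any `Runnable` and reports the best of several timed rounds, which makes it easier to see whether the timings have settled. This is only a sketch under assumptions: `TimingSketch` and `bestOfNanos` are invented names, the run counts are arbitrary, and it does nothing about GC pauses or other noise.

```java
// Hypothetical timing helper; anonymous classes keep it Java 6 compatible.
class TimingSketch {

    // Runs the task warmupRuns times untimed, then times `rounds` batches of
    // runsPerRound calls each and returns the fastest batch in nanoseconds.
    static long bestOfNanos(Runnable task, int warmupRuns, int rounds, int runsPerRound) {
        for (int i = 0; i < warmupRuns; i++) {
            task.run();
        }
        long best = Long.MAX_VALUE;
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            for (int i = 0; i < runsPerRound; i++) {
                task.run();
            }
            long elapsed = System.nanoTime() - start;
            if (elapsed < best) {
                best = elapsed;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // The sink gives the work an observable result.
        final int[] sink = new int[1];
        Runnable work = new Runnable() {
            public void run() {
                for (int i = 0; i < 1000; i++) {
                    sink[0] += i;
                }
            }
        };
        long ns = bestOfNanos(work, 10000, 5, 100000);
        System.out.println(ns / 1000000.0 + " ms (best of 5 rounds)");
        System.out.println("sink = " + sink[0]);
    }
}
```

In real use the `Runnable` would call into the actual processing path with representative data rather than a toy loop.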
Michael Borgwardt
This is awesome, thanks Michael, just the guidance I was looking for. I knew that "microbenchmarks are misleading" but with your example in hand, I can begin to see how my example misled me.
Yar
The issue is that I won't be running with -client since it's a desktop app. Or is that not right?
Yar
@yar: you mean -server, right? There's nothing that prevents you from running a desktop app with the server VM. It will take longer to start up and probably use more memory, but if execution speed is your priority, you should use the server VM.
Michael Borgwardt
Thanks Michael, I'll need to check the memory footprint. If it's not ridiculous, then -server might work. Though for right now I'm pre-optimizing, it'll be a nice trick to keep in my hat for later.
Yar