I have a program where I need to make 100,000 to 1,000,000 random-access reads to a List-like object in as little time as possible (as in milliseconds) for a cellular automata-like program. I think the update algorithm I'm using is already optimized (keeps track of active cells efficiently, etc). The Lists do need to change size, but that performance is not as important. So I am wondering if the performance from using Arrays instead of ArrayLists is enough to make a difference when dealing with that many reads in such short spans of time. Currently, I'm using ArrayLists.

Edit: I forgot to mention: I'm just storing integers, so another factor is using the Integer wrapper class (in the case of ArrayLists) versus ints (in the case of arrays). Does anyone know if using ArrayList will actually require 3 pointer lookups (one for the ArrayList, one for the underlying array, and one for the Integer->int) whereas the array would only require 1 (array address + offset to the specific int)? Would HotSpot optimize the extra lookups away? How significant are those extra lookups?

Edit2: Also, I forgot to mention I need to do random access writes as well (writes, not insertions).

A: 

An Array will be faster simply because at a minimum it skips a function call (i.e. get(i)).

If you have a static size, then Arrays are your friend.

Will Hartung
The function call will be inlined with modern JVMs.
Thorbjørn Ravn Andersen
A: 

ArrayLists are slower than Arrays, but most people consider the difference to be minor. In your case it could matter, though, since you're dealing with hundreds of thousands of reads.

By the way, duplicate: http://stackoverflow.com/questions/716597/array-or-list-in-java-which-is-faster

James Skidmore
Apologies; I checked to see if this question had been asked before and missed that. However, he's talking about storing thousands of Strings whereas I'm talking about a million or so ints.
Bryan Head
A: 

If you're not going to be doing a lot more than reads from this structure, then go ahead and use an array as that would be faster when read by index.

However, consider how you're going to get the data in there, and if sorting, inserting, deleting, etc, are a concern at all. If so, you may want to consider other collection based structures.

Sev
Adding to the end and deleting both need to happen, but optimizations like adding n elements at the same time, so that the array only needs to be copied once, are easy. Oh, I'm doing reads and writes btw.
Bryan Head
+5  A: 

Try both, but measure.

Most likely you could hack something together to make the inner loop use arrays without changing all that much code. My suspicion is that HotSpot will already inline the method calls and you will see no performance gain.

Also, try Java 6 update 14 and use -XX:+DoEscapeAnalysis
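
As a rough illustration of "measure", here is a minimal timing sketch that runs the same random-index read pattern over an int[] and an ArrayList<Integer>. The class and method names (ReadBenchmark, sumArray, sumList) are made up for this example, and hand-rolled timing like this is only indicative; a dedicated harness such as JMH gives more trustworthy numbers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical micro-benchmark sketch; names are illustrative, not from the question.
class ReadBenchmark {
    static final int SIZE = 1_000_000;

    // Sums the values read at the given indices from a primitive array.
    static long sumArray(int[] data, int[] indices) {
        long sum = 0;
        for (int idx : indices) {
            sum += data[idx];
        }
        return sum;
    }

    // Same access pattern through a List<Integer>; every get() unboxes an Integer.
    static long sumList(List<Integer> data, int[] indices) {
        long sum = 0;
        for (int idx : indices) {
            sum += data.get(idx);
        }
        return sum;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        int[] array = new int[SIZE];
        List<Integer> list = new ArrayList<>(SIZE);
        int[] indices = new int[SIZE];
        for (int i = 0; i < SIZE; i++) {
            array[i] = i;
            list.add(i);
            indices[i] = rnd.nextInt(SIZE);
        }

        // Warm-up runs so the JIT has a chance to compile both loops before timing.
        long sink = 0;
        for (int r = 0; r < 20; r++) {
            sink += sumArray(array, indices) + sumList(list, indices);
        }

        long t0 = System.nanoTime();
        long a = sumArray(array, indices);
        long t1 = System.nanoTime();
        long b = sumList(list, indices);
        long t2 = System.nanoTime();

        System.out.println("int[]:              " + (t1 - t0) / 1_000_000.0 + " ms");
        System.out.println("ArrayList<Integer>: " + (t2 - t1) / 1_000_000.0 + " ms");
        // Print the sums so the JIT cannot discard the loops as dead code.
        System.out.println("checksums: " + a + " " + b + " " + sink);
    }
}
```

Run it a few times on the JVM you actually ship with; the ratio between the two lines matters more than the absolute numbers.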

Kevin Peterson
+1 for "measure".
Thorbjørn Ravn Andersen
A: 

I would go with Kevin's advice.

Stay with the lists first and measure your performance. If your program is too slow, compare it to a version with an array. If that gives you a measurable performance boost, go with the arrays; if not, stay with the lists, because they will make your life much, much easier.

Janusz
Ya, I've been using ArrayLists, but a lot of people have been requesting speed improvements.
Bryan Head
Same thing :) Get a profiler to measure the speed of your program, look for the real bottlenecks, and then optimize them. Many people I know recommend the NetBeans Profiler for Java.
Janusz
A: 

One possibility would be to re-implement ArrayList (it's not that hard), but expose the backing array via a lock/release call cycle. This gets you convenience for your writes, but exposes the array for a large series of read/write operations that you know in advance won't impact the array size. If the list is locked, add/delete is not allowed - just get/set.

for example:

  int[] directArray = myArrayList.lockArray();
  try {
    // While locked, myArrayList.add() and delete() throw an IllegalStateException
    for (int i = 0; i < 50000; i++) {
      directArray[i] += 1;
    }
  } finally {
    myArrayList.unlockArray();
  }

This approach continues to encapsulate the array growth/etc... behaviors of ArrayList.
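
A minimal sketch of what such a list might look like for ints follows. Everything here (the LockableIntList name, removeLast, the doubling growth policy) is an assumption made for illustration; only the lockArray/unlockArray idea comes from the snippet above.

```java
import java.util.Arrays;

// Hypothetical growable int list that can hand out its backing array.
class LockableIntList {
    private int[] data = new int[16];
    private int size = 0;
    private boolean locked = false;

    // Appends a value, doubling the backing array when it is full.
    void add(int value) {
        checkUnlocked();
        if (size == data.length) {
            data = Arrays.copyOf(data, data.length * 2);
        }
        data[size++] = value;
    }

    // Drops the last element; structural changes are refused while locked.
    void removeLast() {
        checkUnlocked();
        if (size == 0) {
            throw new IllegalStateException("list is empty");
        }
        size--;
    }

    int size() {
        return size;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return data[index];
    }

    // Exposes the backing array. Only indices [0, size()) hold valid data;
    // the array itself may be longer than size().
    int[] lockArray() {
        if (locked) {
            throw new IllegalStateException("already locked");
        }
        locked = true;
        return data;
    }

    void unlockArray() {
        locked = false;
    }

    private void checkUnlocked() {
        if (locked) {
            throw new IllegalStateException("list is locked");
        }
    }
}
```

Note that the caller has to loop up to size(), not directArray.length, since the backing array usually has spare capacity. The flag is not thread-safe; it only guards against accidental resizing from the same thread.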

Kevin Day
This is clever and not too hard. Especially because I'm using ints, the re-implementation could get a speed boost from using primitives instead of wrapper classes. Do most JVMs optimize away the performance loss in using wrapper classes for primitives?
Bryan Head
AFAIK, no they don't. In fact I don't think that they could. The fact that you are talking about "int[]" versus "ArrayList<Integer>" significantly changes the answers.
Stephen C
@stephen C - Exactly... arrays clearly win when dealing with primitives due to the object wrapper overhead required by ArrayList.
jsight
A: 

Java uses double indirection for its objects so that they can be moved about in memory while references to them stay valid. This means every reference lookup is equivalent to two pointer lookups. These extra lookups cannot be optimised away completely.

Perhaps even worse, your cache performance will be terrible. Accessing values in cache is going to be many times faster than accessing values in main memory (perhaps 10x). If you have an int[], you know the values will be consecutive in memory and thus load into cache readily. However, for Integer[], the individual Integer objects can appear randomly across your memory and are much more likely to be cache misses. Also, an Integer uses 24 bytes, which means Integers are much less likely to fit into your caches than 4-byte values.

If you update an Integer, this often results in a new object being created, which is many orders of magnitude more expensive than updating an int value.

Peter Lawrey
Rubbish. No reasonable Java implementation has used handles for many years (IIRC, very early versions of HotSpot did reintroduce handles, but that was around 1.2.2 - best part of a decade ago).
Tom Hawtin - tackline
Are all uses of the wrapper classes optimized away then? What is the performance penalty of using Integer instead of int?
Bryan Head
Hi @Tom, you may be right but I would be interested in how this is achieved. Do you know of any documents which explain how this is achieved without double indirection?
Peter Lawrey
Have a look at this presentation, http://www.azulsystems.com/events/javaone_2009/session/2009_J1_HardwareCrashCourse.pdf page 65. Perhaps I misinterpreted what it means.
Peter Lawrey
A: 

There will be an overhead from using an ArrayList instead of an array, but it is very likely to be small. In fact, the useful bit of data in the ArrayList can be stored in registers, although you will probably use more (List size for instance).

You mention in your edit that you are using wrapper objects. These do make a huge difference. If you are typically using the same value repeatedly, then a sensible cache policy may be useful (Integer.valueOf returns the same cached objects for -128 to 127). For primitives, primitive arrays usually win comfortably.
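
To make the caching behaviour concrete, here is a small sketch (the IntegerCacheDemo class and sameInstance helper are names invented for this example). Values from -128 to 127 are guaranteed to come from the cache; outside that range the result depends on JVM settings, so no output is claimed for that case.

```java
// Hypothetical demo of Integer.valueOf caching; names are illustrative.
class IntegerCacheDemo {
    // True when two separate valueOf calls hand back the very same object.
    static boolean sameInstance(int value) {
        return Integer.valueOf(value) == Integer.valueOf(value);
    }

    public static void main(String[] args) {
        System.out.println(sameInstance(127));   // inside the guaranteed cache range: true
        System.out.println(sameInstance(-128));  // inside the guaranteed cache range: true
        // Outside the guaranteed range a fresh object is usually allocated,
        // but a JVM may be configured with a larger cache.
        System.out.println(sameInstance(128));
    }
}
```

For cell states that stay within a small range this avoids most allocation, but each read still goes through a reference, so it does not match the memory layout of an int[].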

As a refinement, you might want to make sure the adjacent cells tend to be adjacent in the array (you can do better than rows of columns with a space filling curve).
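
One common space-filling curve is the Z-order (Morton) curve, which interleaves the bits of x and y. The sketch below is one possible way to compute such an index, not necessarily the curve meant above; it assumes a square grid whose side is a power of two and coordinates in the range 0 to 32767, and the MortonIndex name is made up for the example.

```java
// Hypothetical Z-order (Morton) indexing helper for a 2D grid stored in a 1D array.
class MortonIndex {
    // Spreads the lower 16 bits of v so they land on the even bit positions.
    static int spreadBits(int v) {
        v &= 0x0000FFFF;
        v = (v | (v << 8)) & 0x00FF00FF;
        v = (v | (v << 4)) & 0x0F0F0F0F;
        v = (v | (v << 2)) & 0x33333333;
        v = (v | (v << 1)) & 0x55555555;
        return v;
    }

    // Interleaves x (even bits) and y (odd bits) into a single array index.
    static int index(int x, int y) {
        return spreadBits(x) | (spreadBits(y) << 1);
    }
}
```

With this layout a cell and its neighbours in a 2x2 or 4x4 block sit close together in the array, for example cells[MortonIndex.index(x, y)]. Whether the extra bit-twiddling pays for itself is something to measure.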

Tom Hawtin - tackline
A: 

If you're creating the list once, and doing thousands of reads from it, the overhead from ArrayList may well be slight enough to ignore. If you're creating thousands of lists, go with the standard array. Object creation in a loop quickly goes quadratic, simply because of all the overhead of instantiating the member variables, calling the constructors up the inheritance chain, etc.

Because of this -- and to answer your second question -- stick with standard ints rather than the Integer class. Profile both and you'll quickly (or, rather, slowly) see why.

rtperson
+1  A: 

Now that you've mentioned that your arrays are actually arrays of primitive types, consider using the collection-of-primitive-type classes in the Trove library.

Stephen C
A: 

Primitives are much (much, much) faster. Always. Even with JIT escape analysis, etc. Skip wrapping things in java.lang.Integer. Also, skip the array bounds check most ArrayList implementations do on get(int). Most JITs can recognize simple loop patterns and remove the check, but there isn't much reason to muck with it if you're worried about performance.

You don't have to code primitive access yourself. I'd bet you could cut over to using IntArrayList from the COLT library (see http://acs.lbl.gov/~hoschek/colt/ - "Colt provides a set of Open Source Libraries for High Performance Scientific and Technical Computing in Java") in a few minutes of refactoring.

A: 

Which is faster, ArrayList or LinkedList? I'm having a hard time with my program.

daryl