views:

489

answers:

7

Hi, I have to create a large list of n elements (could be up to 100,000). each element in the list is an integer equivalent to the index of the list. After this I have to call Collections.shuffle on this list. My question is, which list implementation (either java collections or apache collections) should be used. My gut feeling is ArrayList can well be used here. All thoughts are appreciated. Thanks!

Thanks for the inputs. I think I am sticking to the ArrayList. I am currently using the ArrayList constructor with the initialCapacity param and I pass the size of the list. So if the original list is 100000, I create this new list with new ArrayList(100000); Hence I think I don't have the create an array and do an asList since there won't be any resizing. Also, most of the apache collections Lists like GrowthList & LazyList do not implement RandomAccess. This for sure would slow down the shuffle (as per javadocs). FastArrayList does implement RandomAccess but apache has a note for this class saying "This class is not cross-platform. Using it may cause unexpected failures on some architectures".

+8  A: 

ArrayList most probably has the least overhead per list element, so should be the best choice. It might be a worse choice if you frequently need to delete items in the middle of the list.

unwind
Actually, int[] would have less overhead, what to choose depends on what he needs it for.
rsp
@rsp: int[] doesn't implement List - you'd need a wrapper round it.
Jon Skeet
Only if you insist in using Collections.Shuffle :-) but point taken.
rsp
+1  A: 

ArrayList<T> would probably be fine, yes - but what criteria are you using for "best" anyway? And just how good does it have to be anyway? What are your trade-offs between complexity and "goodness" in whatever those criteria are?

Jon Skeet
+6  A: 

Quoted from the Collections.shuffle javadoc:

This method runs in linear time. If the specified list does not implement the RandomAccess interface and is large, this implementation dumps the specified list into an array before shuffling it, and dumps the shuffled array back into the list. This avoids the quadratic behavior that would result from shuffling a "sequential access" list in place.

So if you have no toher needs i would go with ArrayList which implements RandomAccess.

pgras
A: 

ArrayList will be the best List for this. As the array backing will be very efficent for swapping elements used in shuffle.

But if you are really chacing performance you may wish to consider using an int[] or a custom list based in int[] as with all the List and List standard implementations you will be boxing and unboxing ints to Integers.

This will not be an issue on the suffle as this will just be reordering pointers but you will be creating 100,000 objects when you may not need to. Assuming you know the size of your list before creation you can quite easily create a new List class that wraps a primitive array. If used as a java.util.List you will still need to box the return from any get method.

David Waters
+2  A: 

Making an Integer array and then wrapping it with Arrays.asList gives you even less overhead than a regular ArrayList.

List<Integer> makeList(int size){
    if (size < 0) throw new IllegalArgumentException();
    Integer[] arr = new Integer[size];
    for (int i = 0; i < arr.length; ++i) arr[i] = i;
    List<Integer> list = Arrays.asList(arr);
    Collection.shuffle(list);
    return list;
}

You save one entire int worth of space (... which admittedly is absolutely nothing in this context), but it does perform fewer range checks than the "real" ArrayList, so accessing will be slightly faster. Probably nothing you'll notice, though :)

gustafc
Presumably when you use a `List`, you don't allocate an array yourself. You just, well, use a `List`.
Pavel Minaev
Uhm, that depends on how you do it, which happens to be the essence of this question - how to create a large list with little overhead.
gustafc
+1  A: 

Javolution claims to have fastest List implementation in java. But I couldn't find any shuffle implementation in this library so you will have to do it by hand.

tulskiy
+1  A: 

Google's Guava library has some really nice primitive handling, including an Ints.asList() method returning a list that may be shuffled.

The Guava project is still at a preliminary stage of deployment, though the code has been carefully reviewed and heavily used at Google. You'll need to retrieve the code from SVN and build the com.google.common.primitive classes.

Jared Levy