views:

85

answers:

6

Hi,

Assume you need to store/retrieve items in a Collection, don't care about ordering, and duplicates are allowed, what type of Collection do you use?

By default, I've always used ArrayList, but I remember reading/hearing somewhere that a Queue implementation may be a better choice. A List allows items to be added/retrieved/removed at arbitrary positions, which incurs a performance penalty. As a Queue does not provide this facility it should in theory be faster when this facility is not required.

I realise that all discussions about performance are somewhat meaningless, the only thing that really matters is measurement. Nevertheless, I'm interested to know what others use for a Collection, when they don't care about ordering, and duplicates are allowed, and why?

Thanks, Don

A: 

As a default, I tend to prefer LinkedList to ArrayList. Obviously, I use them not through the List interface, but rather through the Collection interface.

Over the time, I've indeed found out that when I need a generic collection, it's more or less to put some things in, then iterate over it. If I need more evolved behaviour (say random access, sorting or unicity checks), I will then maybe change the used implementation, but before that I will change the used interface to the most appropriated. This way, I can ensure feature is provided before to concentrate on optimization and implementation.

Riduidel
Searching on a `LinkedList` can be expensive, or getting a value at a particular index - you need to traverse the list every time.
Noel M
Well, searching in a LinkedList is O(n), as is searching in an ArrayList, as none of these are sorted.And as I stated before I use them through a Collection interface, and as a consequence I don't have any access to the indexed access methods (which is, in that case, a feature).
Riduidel
+1  A: 

ArrayList basicly contains an array inside (that's why it is called ArrayList). And operations like addd/remove at arbitrary positions are done in a straightforward way, so if you don't use them - there is no harm to performance.

falagar
Straightforward means shifting. That is, add/remove at arbitrary position is `O(N)` with `ArrayList`, not `O(1)`. Just clarifying.
polygenelubricants
A: 

If ordering and duplicates is not a problem and case is only for storing,

I use ArrayList, As it implements all the list operations. Never felt any performance issues with these operations (Never impacted my projects either). Actually using these operations have simple usage & I don't need to care how its managed internally.

Only if multiple threads will be accessing this list I use Vector because its methods are synchronized.

Also ArrayList and Vector are collections which you learn first :).

YoK
For most usages, I believe CopyOnWriteArrayList is a better thread-safe implementation of List
Don
"I use ArrayList, As it implements all the list operations.". But the use case I'm describing doesn't require any List operations, just the Collection methods.
Don
Ya I made it bold by mistake. removed it. I meant with this line was that other than Collections methods these methods are useful.
YoK
Why do you prefer Vector to Collections.synchronizedList transformation over ArrayList ?
Riduidel
+3  A: 

"It depends". The question you really need to answer first is "What do I want to use the collection for?"

If you often insert / remove items on one of the ends (beginning, end) a Queue will be better than a ArrayList. However in many cases you create a Collection in order to just read from it. In this case a ArrayList is far more efficient: As it is implemented as an array, you can iterate over it quite efficient (same applies for a LinkedList). However a LinkedList uses references to link single items together. So if you do not need random removals of items (in the middle), a ArrayList is better: An ArrayList will use less memory as the items don't need the storage for the reference to the next/prev item.

To sum it up:

ArrayList = good if you insert once and read often (random access or sequential)

LinkedList = good if you insert/remove often at random positions and read only sequential

ArrayDeque (java6 only) = good if you insert/remove at start/end and read random or sequential

redCube
A: 

It depends on what you know about it.

If I have no clue, I tend to go for a linked list, since the penalty for adding/removing at the end is constant. If I have a rough idea of the maximum size of it, I go for an arraylist with the capacity specified, because it is faster if the estimation is good. If I really know the exact size I tend to go for a normal array; although that isn't really a collection type.

Joeri Hendrickx
A: 

I realise that all discussions about performance are somewhat meaningless, the only thing that really matters is measurement.

That's not necessarily true.

If your knowledge of how the application is going to work tells you that certain collections are going to be very large, then it is a good idea to pick the right collection type. But the right collection type depends crucially on how the collections are going to be used; i.e. on the algorithms.

For example, if your application is likely to be dominated by testing if a collection holds a given object, the fact that Collection.contains(Object) is O(N) for both LinkedList<T> and ArrayList<T> might mean that neither is an appropriate collection type. Instead, maybe you should represent the collection as a HashMap<T, Integer>, where the Integer represents the number of occurrences of a T in the "collection". That will give you O(1) testing and removal, at the cost of more space overheads and slower (though still O(1)) insertion.

But the thing to stress is that if you are likely to be dealing with really large collections, there should be no such thing as a "default" collection type. You need to think about the collection in the context of the algorithms. (And the flip side is that if the collections are always going to be small, it probably makes little difference which collection type you pick.)

Stephen C