views:

1833

answers:

3

In my code I default to using ArrayList for all Lists, HashMap for all maps, HashSet for all sets.

From a practical standpoint how much am I losing in flexibility, scalability, readability and performance by choosing the wrong implementation? When does it make sense to spend time to decide to use one rather than another?

I certainly see a very clear cut case for why someone would use a LinkedList instead of an ArrayList given certain circumstances. When does someone feel that it is critical they use a HashMap rather than a TreeMap or a HashTable? What about Sets?

Questions:

  1. What is the cost of choosing poorly?
  2. Does anyone have an disaster stories about choosing the wrong implementation and the datacenter catching fire?
  3. Any good rules of thumb?
  4. Are there any obscure collections implementations you can't live without?

I've read through:

I found this question to be related from a theoretical point of view, but I'm more interested in a real world, down in the trenches answer.

+4  A: 

This is a very general question, but i´ll throw in a couple of thougts.

If you are programming oriented to interfaces, then flexibility wont take a great hit. For example

void foo(List<E> list);

The cost of choosing poorly could be seen in performance penalties. For example, choosing a LinkedList when direct access (as in ArrayList) is what you are looking for.

Sets have a similiar issue. If you want to keep sorted collections with no duplicates, a SortedSet would be a wiser choice over a HashSet. In the latter one, you´d have to sort the entire Set manually (this is, a call to Collections.sort())

<EDIT>

As for maps, there are a lot of different implementations. Each one has a different purpose. For example, there´s SortedMap, analog to SortedSet. Then, there´s WeakHashMap, that doesn´t work like a HashMap, in the sense that keys can be removed by the garbage collector. As you can imagine, the choice between a HashMap and a WeakHashMap is not trivial. As always, depends on what you want to implement with them.

</EDIT>

Regarding the story, in my current project, we replaced a HashSet with a SortedSet because performance was being affected. DataCenter didn´t caught fire though.

My two cents.

Tom
+1  A: 

So long as you follow the good OO practice of depending on an abstract type, what does it matter?

If for example you find you've used the wrong Map you simply change the implementation you are using and because all of your dependencies are on Map everything works as before just with the different performance characteristics.

Nick Holt
I could see a problem arising if you relied on some of the performance characteristics of a ArrayList, but realized that you needed an LinkedList. You might have to rewrite the parts of your code that are now inefficient with an LinkedList.
e5
+1  A: 

I think you're fine to use HashMap, HashSet, and ArrayList as your primary implementations. When you need a sorted set, it's good to know that TreeSet is available; when you're doing recursive kinds of things, similarly, it's nice to have LinkedList in your back pocket. But program to interfaces and then you can swap the implementations out as needed. And if the same collection needs processing as (for instance) both a LinkedList and an ArrayList, it's no big deal to construct one from the other.

Work with the default implementations you've listed. When there are performance problems, and there's reason to believe an alternate implementation would be better, swap it out - and measure the difference. When you need special behavior (sorted sets, for instance), use the special classes.

This approach hasn't burned me yet.

Carl Manaster