views:

319

answers:

8

Why do some collection data structures not maintain the order of insertion? What is the special thing achieved compared to maintaining order of insertion? Do we gain something if we don't maintain the order?

+10  A: 

Performance. If you want the original insertion order there are the LinkedXXX classes, which maintain an additional linked list in insertion order. Most of the time you don't care, so you use a HashXXX, or you want a natural order, so you use TreeXXX. In either of those cases why should you pay the extra cost of the linked list?

EJP
Just want to add again that strictly speaking *Mappings* aren't `Collection` s.
FK82
@FK82 they are collections of key value pairs
josefx
@josefx: yes, but they don't inherit the `Collection` interface. I just wanted to add that, since more or less all posters seem to take the interface `Map` for an implementation/extension of the interface `Collection`, which it is not.
FK82
@FK82 +1 that's true.
josefx
What do these comments have to do with my post?
EJP
You are talking about classes that inherit the `Map` interface, not about those inheriting the `Collection` interface.
FK82
Complete and utter rubbish. I am talking about no such thing. Please read what I *actually* wrote before you comment next time, and next time someone queries you please double-check instead of just reiterating your mistake. I carefully and deliberately didn't talk about *any* specific classes at all.
EJP
Tone down. I wasn't aware that there are `HashSet`, `TreeSet`, `HashList` and even `TreeList` classes in the JDK. So I was assuming you were talking about `HashMap` and `TreeMap` which of course don't inherit from `Collection`.
FK82
I can't help that. These classes have existed for over ten years, apart from Hashlist which doesn't exist at all. There is an obligation to be accurate, and to attribute correctly.
EJP
+1  A: 

Depends on what you need the implementation to do well. Insertion order usually is not interesting so there is no need to maintain it so you can rearrange to get better performance.

For Maps it is usually HashMap and TreeMap that is used. By using hash codes, the entries can be put in small groups easy to search in. The TreeMap maintains a sorted order of the inserted entries at the cost of slower search, but easier to sort than a HashMap.

Thorbjørn Ravn Andersen
+1  A: 

When you use a HashSet (or a HashMap) data are stored in "buckets" based on the hash of your object. This way your data is easier to access because you don't have to look for this particular data in the whole Set, you just have to look in the right bucket.

This way you can increase performances on specific points.

Each Collection implementation have its particularity to make it better to use in a certain condition. Each of those particularities have a cost. So if you don't really need it (for example the insertion order) you better use an implementation which doesn't offer it and fits better to your requirements.

Colin Hebert
A: 

I can't cite a reference, but by design the List and Set implementations of the Collection interface are basically extendable Arrays. As Collections by default offer methods to dynamically add and remove elements at any point -- which Arrays don't -- insertion order might not be preserved. Thus, as there are more methods for content manipulation, there is a need for special implementations that do preserve order.

Another point is performance, as the most well performing Collection might not be that, which preserves its insertion order. I'm however not sure, how exactly Collections manage their content for performance increases.

So, in short, the two major reasons I can think of why there are order-preserving Collection implementations are:

  1. Class architecture
  2. Performance
FK82
Note that `Arrays` is an actual class, while arrays are a special type of container objects. I'm also pretty sure `LinkedList` actually does use a linked list but I haven't read the code. :-)
wds
Ok, point taken, I edited my post. About your `LinkedList` comment: where is the contradiction to what I posted?
FK82
To clarify: A `LinkedList` afaik is a `List` (read extendable `Array`) whose insertion order is maintained in another `List` (the two of which are *linked*, hence the name). Or, am I wrong on that one?
FK82
A very confused and confusing post. A LinkedList is not an extensible array, and neither is a List: it depends on the implementation. Neither of them contains 'another List'. I don't know what you mean by 'which Arrays don't.' Your second paragraph is basically meaningless. Your conclusions don't follow from your premisses.
EJP
Spiteful, are we? I'm not drawing conclusions and there can't be anything confusing about asking a question unless you don't understand it.
FK82
Also, the `Array` object does not offer methods to dynamically remove and add elements. That's why `List` exists in the first place. My second paragraph says what you say in your post. Don't flame away on your first impression, buddy.
FK82
'So' introduces your conclusion. You didn't ask a question. There is no 'Array object'. There is an ArrayList class which *does* offer those methods. My other statements remain true.
EJP
+5  A: 
  • The insertion order is inherently not maintained in hash tables - that's just how they work (read the linked-to article to understand the details). It's possible to add logic to maintain the insertion order (as in the LinkedHashMap), but that takes more code, and at runtime more memory and more time. The performance loss is usually not significant, but it can be.
  • For TreeSet/Map, the main reason to use them is the natural iteration order and other functionality added in the SortedSet/Map interface.
Michael Borgwardt
+1 For mentioning "but that takes more code".
Helper Method
Just on a quick side note: strictly speaking `Map` implementations are not `Collection`s as they do not implement the `Collection` interface. They do have similar methods, but that's it. Check: http://download.oracle.com/javase/1.4.2/docs/guide/collections/overview.html (#Collection Interfaces) Most likely the OP's question addresses maps too though.
FK82
+1  A: 

Why is it necessary to maintain the order of insertion? If you use HashMap, you can get the entry by key. It does not mean it does not provide classes that do what you want.

fastcodejava
+1  A: 

The collections don't maintain order of insertion. Some just default to add a new value at the end. Maintaining order of insertion is only useful if you prioritize the objects by it or use it to sort objects in some way.

As for why some collections maintain it by default and others don't, this is mostly caused by the implementation and only sometimes part of the collections definition.

  • Lists maintain insertion order as just adding a new entry at the end or the beginning is the fastest implementation of the add(Object ) method.

  • Sets The HashSet and TreeSet implementations don't maintain insertion order as the objects are sorted for fast lookup and maintaining insertion order would require additional memory. This results in a performance gain since insertion order is almost never interesting for Sets.

  • ArrayDeque a deque can used for simple que and stack so you want to have ''first in first out'' or ''first in last out'' behaviour, both require that the ArrayDeque maintains insertion order. In this case the insertion order is maintained as a central part of the classes contract.

josefx
A: 

Theres's a section in the O'Reilly Java Cookbook called "Avoiding the urge to sort" The question you should be asking is actually the opposite of your original question ... "Do we gain something by sorting?" It take a lot of effort to sort and maintain that order. Sure sorting is easy but it usually doesn't scale in most programs. If you're going to be handling thousands or tens of thousands of requests (insrt,del,get,etc) per second whether not you're using a sorted or non sorted data structure is seriously going to matter.

brown.2179