Why do some collection data structures not maintain the order of insertion? What is the special thing achieved compared to maintaining order of insertion? Do we gain something if we don't maintain the order?
Performance. If you want the original insertion order there are the LinkedXXX classes, which maintain an additional linked list in insertion order. Most of the time you don't care, so you use a HashXXX, or you want a natural order, so you use TreeXXX. In either of those cases why should you pay the extra cost of the linked list?
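To make the difference concrete, here is a small sketch that inserts the same three keys into a LinkedHashMap, a TreeMap and a HashMap and returns the iteration order. The class name OrderDemo, the helper keysAfterInsert and the sample keys are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class OrderDemo {
    // Inserts the same keys into the given map and returns them in iteration order.
    static List<String> keysAfterInsert(Map<String, Integer> map) {
        map.put("pear", 1);
        map.put("apple", 2);
        map.put("mango", 3);
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // Insertion order is kept by the extra linked list.
        System.out.println(keysAfterInsert(new LinkedHashMap<>())); // [pear, apple, mango]
        // Natural (sorted) order of the keys.
        System.out.println(keysAfterInsert(new TreeMap<>()));       // [apple, mango, pear]
        // No ordering guarantee at all; the output may differ between JVMs and versions.
        System.out.println(keysAfterInsert(new HashMap<>()));
    }
}
```

Only the first two orders are guaranteed by the class contracts; whatever the HashMap prints should not be relied on.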
It depends on what you need the implementation to do well. Insertion order is usually not interesting, so there is no need to maintain it, and the implementation is free to rearrange elements for better performance.
For Maps, HashMap and TreeMap are the usual choices. HashMap uses hash codes to put the entries into small groups that are quick to search. TreeMap keeps its entries sorted by key at the cost of slower lookup (O(log n) rather than the expected O(1) of a HashMap), but in return you can iterate in sorted order, which a HashMap cannot give you without an extra sort.
When you use a HashSet (or a HashMap) data are stored in "buckets" based on the hash of your object. This way your data is easier to access because you don't have to look for this particular data in the whole Set, you just have to look in the right bucket.
This way you can increase performance for specific operations.
Each Collection implementation has its own specialty that makes it better suited to certain conditions. Each of those specialties has a cost. So if you don't really need one (for example, the insertion order), you are better off with an implementation that doesn't offer it and fits your requirements better.
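The bucket idea can be sketched with a toy set. This is only an illustration under simplifying assumptions (a fixed number of buckets, String elements, no resizing), not how java.util.HashSet is actually written; BucketSet and bucketIndex are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Toy hash set with a fixed number of buckets (illustrative only).
class BucketSet {
    private final List<List<String>> buckets = new ArrayList<>();

    BucketSet(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new LinkedList<>());
        }
    }

    // Maps a value to a bucket index using its hash code.
    int bucketIndex(String value) {
        return Math.floorMod(value.hashCode(), buckets.size());
    }

    // Adds the value to its bucket; returns false if it was already present.
    boolean add(String value) {
        List<String> bucket = buckets.get(bucketIndex(value));
        if (bucket.contains(value)) {
            return false;
        }
        bucket.add(value);
        return true;
    }

    // Searches only the one bucket the value can be in, not the whole set.
    boolean contains(String value) {
        return buckets.get(bucketIndex(value)).contains(value);
    }

    public static void main(String[] args) {
        BucketSet set = new BucketSet(8);
        set.add("pear");
        set.add("apple");
        System.out.println(set.contains("apple")); // true
        System.out.println(set.contains("kiwi"));  // false
    }
}
```

Because the bucket is chosen by hash code, the position of an element says nothing about when it was inserted, which is why iteration order comes out unrelated to insertion order.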
I can't cite a reference, but by design the List and Set implementations of the Collection interface are basically extendable arrays. As Collections by default offer methods to dynamically add and remove elements at any point, which arrays don't, insertion order might not be preserved.
Thus, as there are more methods for content manipulation, there is a need for special implementations that do preserve order.
Another point is performance, as the best-performing Collection might not be the one that preserves its insertion order. However, I'm not sure how exactly Collections manage their content to gain performance.
So, in short, the two major reasons I can think of why there are order-preserving Collection implementations are:
- Class architecture
- Performance
- The insertion order is inherently not maintained in hash tables - that's just how they work (read the linked-to article to understand the details). It's possible to add logic to maintain the insertion order (as in the LinkedHashMap), but that takes more code, and at runtime more memory and more time. The performance loss is usually not significant, but it can be.
- For TreeSet/Map, the main reason to use them is the natural iteration order and other functionality added in the SortedSet/Map interface.
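As a rough illustration of the extra functionality that comes with SortedSet, the sketch below builds a TreeSet from unsorted input and uses a few of the interface's methods. The class name SortedDemo and the sample values are made up.

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

class SortedDemo {
    // Builds a sorted set from values given in arbitrary order.
    static SortedSet<Integer> sample() {
        return new TreeSet<>(Arrays.asList(42, 7, 19, 3));
    }

    public static void main(String[] args) {
        SortedSet<Integer> s = sample();
        System.out.println(s);             // [3, 7, 19, 42]
        System.out.println(s.first());     // 3
        System.out.println(s.last());      // 42
        System.out.println(s.headSet(19)); // [3, 7]   (strictly less than 19)
        System.out.println(s.tailSet(19)); // [19, 42] (19 and greater)
    }
}
```

None of these range queries are available on a HashSet or LinkedHashSet without sorting the elements first.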
Why is it necessary to maintain the order of insertion? If you use HashMap, you can get the entry by key. That does not mean Java does not provide classes that do what you want.
Collections in general don't maintain insertion order. Some just default to adding a new value at the end. Maintaining insertion order is only useful if you prioritize the objects by it or use it to sort objects in some way.
As for why some collections maintain it by default and others don't, this is mostly caused by the implementation and only sometimes part of the collection's definition.
Lists maintain insertion order because simply adding a new entry at the end or the beginning is the fastest implementation of the add(Object) method.
Sets: The HashSet and TreeSet implementations don't maintain insertion order because the objects are arranged for fast lookup (hashed or sorted), and maintaining insertion order as well would require additional memory. This results in a performance gain, since insertion order is almost never interesting for Sets.
ArrayDeque: A deque can be used as a simple queue or stack, so you want "first in, first out" or "first in, last out" behaviour; both require that the ArrayDeque maintains insertion order. In this case, maintaining insertion order is a central part of the class's contract.
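A minimal sketch of both uses of ArrayDeque follows. The class DequeDemo and its two helper methods are hypothetical names; only the java.util calls are real API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class DequeDemo {
    // Queue use: add at the tail, remove from the head (first in, first out).
    static List<Integer> drainAsQueue(int... values) {
        Deque<Integer> deque = new ArrayDeque<>();
        for (int v : values) {
            deque.addLast(v);
        }
        List<Integer> out = new ArrayList<>();
        while (!deque.isEmpty()) {
            out.add(deque.pollFirst());
        }
        return out;
    }

    // Stack use: push and pop both work on the head (first in, last out).
    static List<Integer> drainAsStack(int... values) {
        Deque<Integer> deque = new ArrayDeque<>();
        for (int v : values) {
            deque.push(v);
        }
        List<Integer> out = new ArrayList<>();
        while (!deque.isEmpty()) {
            out.add(deque.pop());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drainAsQueue(1, 2, 3)); // [1, 2, 3]
        System.out.println(drainAsStack(1, 2, 3)); // [3, 2, 1]
    }
}
```

Either behaviour only makes sense if the deque remembers the order in which elements arrived, which is why the order is part of the contract here rather than an implementation detail.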
There's a section in the O'Reilly Java Cookbook called "Avoiding the urge to sort". The question you should be asking is actually the opposite of your original question: "Do we gain something by sorting?" It takes a lot of effort to sort and to maintain that order. Sure, sorting is easy, but it usually doesn't scale in most programs. If you're going to be handling thousands or tens of thousands of requests (insert, delete, get, etc.) per second, whether or not you're using a sorted data structure is seriously going to matter.