views:

75

answers:

4

There are different collections are used in Java like hashtable, hashset, vector, treeset, treemap and hashmap. How are they implemented internally? What are the actual datastructure used for these collection? And, What is order?

It would be good if we discuss just a little bit about implementation of collection.

Thanks, -Abhishek

A: 

The collections implementation classes are named exactly after their data structures. So HashSet and HashMap used a hash as the underlying data structure, while TreeSet and TreeMap use a binary tree.

More info is available in the JavaDoc of those classes. For example, for TreeSet it states:

This implementation provides guaranteed log(n) time cost for the basic operations (add, remove and contains).

FRotthowe
A: 

Related to Order I find something useful:

I found below post very useful in respect to Big O notation: http://stackoverflow.com/questions/559839/big-o-summary-for-java-collections-framework-implementations

and in precise below webpage http://simplenotions.wordpress.com/2009/05/13/java-standard-data-structures-big-o-notation/

below details about experiment result on each collection http://java.sun.com/docs/books/performance/1st_edition/html/JPAlgorithms.fm.html

Abhishek Jain
A: 

You might find this illuminating.

bmargulies
+1  A: 

There is this method Collections.sort() which internally calls Arrays.sort(). This uses the merge sort to sort data which has the order O(nlogn). And one more information is, if the number of elements to be sorted is lesser than 7, it uses the insertion sort.

As with Vector and ArrayList following are the complexities since it uses a simple array

  • get(index) - O(1)
  • add(Object) - O(n)
  • insertAt(int pos, Object value) - O(n)
  • remove(Object) - O(n)

One more thing is that HashSet internally uses a HashMap only worrying about the keys of a map and the objects being ignored or unused so that there are no two different implementations. And HashMap uses the collision free or the ideal hashing by generating unique hashcodes for every object. So, the order would be ideally 1 for insertion and retrieval of data.

Often in Java Collections, especially in Lists, there is this usual comparison between its two concrete implementations, the ArrayList and LinkedList. The order of the LinkedList are as follows.

  • get(index) - O(n)
  • add(Object) - O(1) (provided a readymade last pointer is maintained in the linked list)
  • insertAt(int pos, Object value) - O(n)
  • remove(Object) - O(n)
Bragboy