views:

74

answers:

2

All,

Can anyone please let me know exactly what are the performance issues between the 2? The site : CodeRanch provides a brief overview of the internal calls that would be needed when using keySet() and get(). But it would be great if anyone can provide exact details about the flow when keySet() and get() methods are used. This would help me understand the performance issues better.

+7  A: 

The most common case where using entrySet is preferable over keySet is when you are iterating through all of the key/value pairs in a Map.

This is more efficient:

for (Map.Entry entry : map.entrySet()) {
    Object key = entry.getKey();
    Object value = entry.getValue();
}

than:

for (Object key : map.keySet()) {
    Object value = map.get(key);
}

Because in the second case, for every key in the keySet the map.get() method is called, which - in the case of a HashMap - requires that the hashCode() and equals() methods of the key object be evaluated in order to find the associated value*. In the first case that extra work is eliminated.

Edit: This is even worse if you consider a TreeMap, where a call to get is O(log2(n)), i.e. the comparator for will may need to run log2(n) times (n = size of the Map) before finding the associated value.

*Some Map implementations have internal optimisations that check the objects' identity before the hashCode() and equals() are called.

Michael Barker
Additionally, if the map is a TreeMap rather than a HashMap, `get()` is a `O(log(n))` operation.
ILMTitan
@ILMTitan, Good spot.
Michael Barker
@ILMIian and Michael: Why the difference between TreeMap and HashMap?
darkie15
TreeMap and HashMap are different data structures, TreeMap is based on a Red/Black tree. A HashMap is a bucket and list hashtable. In both cases the call to get() is not free and its cost depends on the type of data structure.
Michael Barker
+2  A: 

First of all, this depends entirely on which type of Map you're using. But since the JavaRanch thread talks about HashMap, I'll assume that that's the implementation you're referring to. And lets assume also that you're talking about the standard API implementation from Sun/Oracle.

Secondly, if you're concerned about performance when iterating through your hash map, I suggest you have a look at LinkedHashMap. From the docs:

Iteration over the collection-views of a LinkedHashMap requires time proportional to the size of the map, regardless of its capacity. Iteration over a HashMap is likely to be more expensive, requiring time proportional to its capacity.

HashMap.entrySet()

The source-code for this implementation is available. The implementation basically just returns a new HashMap.EntrySet. A class which looks like this:

private final class EntrySet extends AbstractSet<Map.Entry<K,V>> {
    public Iterator<Map.Entry<K,V>> iterator() {
        return newEntryIterator(); // returns a HashIterator...
    }
    // ...
}

and a HashIterator looks like

private abstract class HashIterator<E> implements Iterator<E> {
    Entry<K,V> next;    // next entry to return
    int expectedModCount;   // For fast-fail
    int index;      // current slot
    Entry<K,V> current; // current entry

    HashIterator() {
        expectedModCount = modCount;
        if (size > 0) { // advance to first entry
            Entry[] t = table;
            while (index < t.length && (next = t[index++]) == null)
                ;
        }
    }

    final Entry<K,V> nextEntry() {
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
        Entry<K,V> e = next;
        if (e == null)
            throw new NoSuchElementException();

        if ((next = e.next) == null) {
            Entry[] t = table;
            while (index < t.length && (next = t[index++]) == null)
                ;
        }
    current = e;
        return e;
    }

    // ...
}

So there you have it... Thats the code dictating what will happen when you iterate through an entrySet. It walks through the entire array which is as long as the maps capacity.

HashMap.keySet() and .get()

Here you first need to get hold of the set of keys. This takes time proportional to the capacity of the map (as opposed to size for the LinkedHashMap). After this is done, you call get() once for each key. Sure, in the average case, with a good hashCode-implementation this takes constant time. However, it will inevitably require lots of .hashCode and .equals calls, which will obviously take more time than just doing a entry.value() call.

aioobe