tags:

views:

223

answers:

4

Hi! I have a HashMap where the key is a word and the value is a number of occurrences of that string in a text. Now I'd like to reduce this HashMap to only 15 most used words (with greatest numbers of occurrences). Do you have any idea to do this efficiently?

A: 

You can use a LinkedHashMap and remove the least recently used items.

saugata
"the least recently used items", `LinkedHashMap` will not change element order when reinserting an entry. This will not work.
Pindatjuh
what if the repeats are bunched at the end?
GregS
+2  A: 

One way I think of to tackle this, but it's probably not the most efficient, is:

  • Create an array of hashMap.entrySet().toArray(new Entry[]{}).
  • Sort this using Arrays.sort, create your own Comparator which will compare only on Entry.getValue() (which casts it to an Integer). Make it order descending, i.e. most/highest first, less/lowest latest.
  • Iterate over the sorted array and break when you've reached the 15th value.
Pindatjuh
Dear downvoter, please explain.
Pindatjuh
+1 for the blueprint, some downvoters are ...
stacker
+3  A: 

Using an array instead of ArrayList as suggested by Pindatjuh could be better,

public class HashTest {
        public static void main(String[] args) {
            class hmComp implements Comparator<Map.Entry<String,Integer>> {
                public int compare(Entry<String, Integer> o1,
                        Entry<String, Integer> o2) {
                    return o2.getValue() - o1.getValue();
                }
            }
            HashMap<String, Integer> hm = new HashMap<String, Integer>();
            Random rand = new Random();
            for (int i = 0; i < 26; i++) {
                hm.put("Word" +i, rand.nextInt(100));
            }
            ArrayList list = new ArrayList( hm.entrySet() );
            Collections.sort(list, new hmComp() );
            for ( int i = 0  ; i < 15 ; i++ ) {
                System.out.println( list.get(i) );
            }

        }
    }

EDIT reversed sorting order

stacker
+1 for implementation! If I could do more, I would do +2.
Pindatjuh
A: 
Map<String, Integer> map = new HashMap<String, Integer>();

    // --- Put entries into map here ---

    // Get a list of the entries in the map
    List<Map.Entry<String, Integer>> list = new Vector<Map.Entry<String, Integer>>(map.entrySet());

    // Sort the list using an annonymous inner class implementing Comparator for the compare method
    java.util.Collections.sort(list, new Comparator<Map.Entry<String, Integer>>(){
        public int compare(Map.Entry<String, Integer> entry, Map.Entry<String, Integer> entry1)
        {
            // Return 0 for a match, -1 for less than and +1 for more then
            return (entry.getValue().equals(entry1.getValue()) ? 0 : (entry.getValue() > entry1.getValue() ? 1 : -1));
        }
    });

    // Clear the map
    map.clear();

    // Copy back the entries now in order
    for (Map.Entry<String, Integer> entry: list)
    {
        map.put(entry.getKey(), entry.getValue());
    }

Use first 15 entries of map. Or modify last 4 lines to put only 15 entries into map

Artic