I have a hash in Ruby that is storing the word frequency of a string, with the word as the key and the frequency as the value.

words = a_string.split(/ /)
freqs = Hash.new(0)
words.each { |word| freqs[word] += 1 }
freqs = freqs.sort_by {|x,y| y }
freqs.reverse!
freqs.each do |word, freq|
    puts word+' '+freq.to_s
end

I've read that hash iterators return the hash in a random order, but this seems to work so far.

Now I need the freqs hash to contain only the 20 most frequent words. How can I do this? Thanks for reading.

+2  A: 

As you've probably seen, it stays sorted because your call to Enumerable#sort_by turns your Hash into an Array of [word, frequency] pairs, and an Array keeps its order. From the documentation for that method:

The current implementation of sort_by generates an array of tuples containing the original collection element and the mapped value.

Once you have your sorted array, you could just use Array#first to get the top 20:

top20 = freqs.first(20)
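
Put together, a minimal sketch of the whole thing might look like the following. The method name top_words and the sample text are made up for illustration, and sorting on the negated count is a swapped-in alternative to calling sort_by and then reverse!. It also uses split with no argument, which splits on any run of whitespace rather than on single spaces.

```ruby
# Hypothetical helper: returns the n most frequent words as [word, count] pairs.
def top_words(text, n = 20)
  freqs = Hash.new(0)
  # split with no argument splits on runs of whitespace.
  text.split.each { |word| freqs[word] += 1 }
  # sort_by returns an Array of [word, count] pairs;
  # negating the count puts the highest counts first.
  freqs.sort_by { |_word, count| -count }.first(n)
end

sample = "the cat and the dog and the bird"
top_words(sample, 2).each do |word, freq|
  puts "#{word} #{freq}"
end
```

Words with equal counts may come out in any relative order, since sort_by is not guaranteed to be stable.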

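If you want, you could turn the result back into a Hash with Hash[top20] (or top20.to_h on Ruby 2.1 and later; Array has no to_hash method). On Ruby 1.8 a Hash has no guaranteed order, so it will no longer be sorted. From Ruby 1.9 on, a Hash preserves insertion order, so the sorted order survives the conversion.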
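
One way to do that conversion is sketched below, assuming Ruby 1.9 or later, where a Hash iterates in insertion order. The pairs are sample data for illustration, not output from the question.

```ruby
# Sample [word, count] pairs, already sorted by descending count.
top = [["the", 3], ["and", 2], ["cat", 1]]

# Hash[] accepts an array of [key, value] pairs.
as_hash = Hash[top]

# On Ruby 1.9+ iteration follows insertion order, so the sorted order is kept.
as_hash.each { |word, freq| puts "#{word} #{freq}" }
```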

Daniel Vandersluis
@Daniel - Ah, that makes sense. Thanks for your help.
ben