views:

499

answers:

9

I got this far:

def most_frequent(string):
    d = dict()
    for key in string:
        if key not in d:
            d[key] = 1
        else:
            d[key] += 1
    return d

print most_frequent('aabbbc')

Returning:

{'a': 2, 'c': 1, 'b': 3}

Now I need to:

  1. reverse the pair
  2. sort by number by decreasing order
  3. only print the letters out

Should I convert this dictionary to tuples or list?

+5  A: 

This should do it nicely.

def frequency_analysis(string):
    d = dict()
    for key in string:
        d[key] = d.get(key, 0) + 1
    return d

def letters_in_order_of_frequency(string):
    frequencies = frequency_analysis(string)
    # frequencies is of bounded size because number of letters is bounded by the dictionary, not the input size 
    frequency_list = [(freq, letter) for (letter, freq) in frequencies.iteritems()]
    frequency_list.sort(reverse=True)
    return [letter for freq, letter in frequency_list]

string = 'aabbbc'
print letters_in_order_of_frequency(string)
Jerub
+13  A: 

Here's a one line answer

sortedLetters = sorted(d.iteritems(), key=lambda (k,v): (v,k))
chills42
Wow... very nice. I didn't even think to look for a built-in that does the job directly!
D.Shawley
Note that reversing is still necessary here. Frequencies also need to be trimmed. If you import operator, you can do all of this in one line: map(operator.itemgetter(0), sorted(d.iteritems(), key=operator.itemgetter(1), reverse=True)).
Stephan202
@Stephan: How about adding your own post so I can vote for it?
Aaron Digulla
+3  A: 

Here is something that returns a list of tuples rather than a dictionary:

import operator

if __name__ == '__main__':

    test_string = 'cnaa'

    string_dict = dict()
    for letter in test_string:
        if letter not in string_dict:
            string_dict[letter] = test_string.count(letter)

    # Sort dictionary by values, credits go here http://stackoverflow.com/questions/613183/sort-a-dictionary-in-python-by-the-value/613218#613218
    ordered_answer = sorted(string_dict.items(), key=operator.itemgetter(1), reverse=True)
    print ordered_answer
hyperboreean
Dammit! I was just typing a response using operator! :-) +1 for being faster with the best answer
Jarret Hardie
Note that it's possible to give reverse=True as an argument to sorted(). Arguably that would be a little more efficient.I'd say that the use of count() here is less efficient than the alternative used by other answers, because this requires multiple iterations over the string.
Stephan202
+1  A: 

chills42 lambda function wins, I think but as an alternative, how about generating the dictionary with the counts as the keys instead?

def count_chars(string):
    distinct = set(string)
    dictionary = {}
    for s in distinct:
        num = len(string.split(s)) - 1
        dictionary[num] = s
    return dictionary

def print_dict_in_reverse_order(d):
    _list = d.keys()
    _list.sort()
    _list.reverse()
    for s in _list:
        print d[s]
mavnn
what if 2 chars has same count, your code will give incorrect result
Learner
Why? The order the two letters with the same count are printed in is not defined, either in the question or my solution.
mavnn
+2  A: 

EDIT This will do what you want. I'm stealing chills42 line and adding another:

sortedLetters = sorted(d.iteritems(), key=lambda (k,v): (v,k))
sortedString = ''.join([c[0] for c in reversed(sortedLetters)])

------------original answer------------

To print out the sorted string add another line to chills42 one-liner:

''.join(map(lambda c: str(c[0]*c[1]), reversed(sortedLetters)))

This prints out 'bbbaac'

If you want single letters, 'bac' use this:

''.join([c[0] for c in reversed(sortedLetters)])
jcoon
+2  A: 
from collections import defaultdict

def most_frequent(s):
    d = defaultdict(int)
    for c in s:
        d[c] += 1

    return "".join([
        k for k, v in sorted(
        d.iteritems(), reverse=True, key=lambda (k, v): v)
    ])

EDIT:

here is my one liner:

def most_frequent(s):
    return "".join([
        c for frequency, c in sorted(
            [(s.count(c), c) for c in set(s)], reverse=True
        )
    ])
Toni Ruža
Thanks for pointing out s.count()! upvoted just for that piece of information!
Daren Thomas
To be fair, hyperboreean mentioned the count method first. Personally, I don't like that approach because it iterates through the string for every letter. Although, it helps for making an elegant one liner :)
Toni Ruža
+1  A: 
def reversedSortedFrequency(string)
   from collections import defaultdict
   d = defaultdict(int)
   for c in string:
     d[c]+=1
   return sorted([(v,k) for k,v in d.items()], key=lambda (k,v): -k)
vartec
+1  A: 

Here is the fixed version (thank you for pointing out bugs)

def frequency(s):
    return ''.join(
        [k for k, v in
        sorted(
            reduce(
                lambda d, c: d.update([[c, d.get(c, 0) + 1]]) or d, 
                list(s), 
                dict()).items(),
            lambda a, b: cmp(a[1], b[1]),
            reverse=True)])

I think the use of reduce makes the difference in this sollution compared to the others...

In action:

>>> from frequency import frequency
>>> frequency('abbbccddddxxxyyyyyz')
'ydbxcaz'

This includes extracting the keys (and counting them) as well!!! Another nice property is the initialization of the dictionary on the same line :)

Also: no includes, just builtins.

The reduce function is kinda hard to wrap my head around, and setting dictionary values in a lambda is also a bit cumbersome in python, but, ah well, it works!

Daren Thomas
How can this be voted up? You sort by alphabet, not frequency. I do not have enough reputation to vote this down...
Stephan202
oops! sorting on the string was only done to tokenize it. Should have used list instead, so as not to confuse you and myself, since the output is random order, namely keys().
Daren Thomas
Fixed the function, adding sorting to the dictionary containing frequencies. This might also be a prime example of why one-liners are both fun to write and bad to have in production code!
Daren Thomas
A: 

Here's the code for your most_frequent function:

>>> a = 'aabbbc'
>>> {i: a.count(i) for i in set(a)}
{'a': 2, 'c': 1, 'b': 3}

this particular syntax is for py3k, but it's easy to write something similar using syntax of previous versions. it seems to me a bit more readable than yours.

SilentGhost