tags:

views:

577

answers:

6

Hi folks,

I have a set

set(['booklet', '4 sheets', '48 sheets', '12 sheets'])

After sorting I want it to look like

4 sheets,
12 sheets,
48 sheets,
booklet

Any idea please

+1  A: 
>>> a = set(['booklet', '4 sheets', '48 sheets', '12 sheets'])
>>> def ke(s):
    i, sp, _ = s.partition(' ')
    if i.isnumeric():
        return int(i)
    return float('inf')

>>> sorted(a, key=ke)
['4 sheets', '12 sheets', '48 sheets', 'booklet']
SilentGhost
+1  A: 

sets are inherently un-ordered. You'll need to create a list with the same content and sort that.

Rakis
Not true - the sorted() built-in will take any sequence and return a sorted list.
Paul McGuire
So instead of creating a list and sorting it, you instead use a builtin to create a sorted list.... Yeah, I was way off.
Rakis
+1  A: 

Based on SilentGhost's answer:

In [4]: a = set(['booklet', '4 sheets', '48 sheets', '12 sheets'])

In [5]: def f(x):
   ...:     num = x.split(None, 1)[0]
   ...:     if num.isdigit():
   ...:         return int(num)
   ...:     return x
   ...: 

In [6]: sorted(a, key=f)
Out[6]: ['4 sheets', '12 sheets', '48 sheets', 'booklet']
draebek
+10  A: 

Jeff Atwood talks about natural sort and gives an example of one way to do it in Python. Here is my variation on it:

import re 

def sorted_nicely( l ): 
    """ Sort the given iterable in the way that humans expect.""" 
    convert = lambda text: int(text) if text.isdigit() else text 
    alphanum_key = lambda key: [ convert(c) for c in re.split('([0-9]+)', key) ] 
    return sorted(l, key = alphanum_key)

Use like this:

s = set(['booklet', '4 sheets', '48 sheets', '12 sheets'])
for x in sorted_nicely(s):
    print(x)

Output:

4 sheets
12 sheets
48 sheets
booklet

One advantage of this method is that it doesn't just work when the strings are separated by spaces. It will also work for other separators such as the period in version numbers (for example 1.9.1 comes before 1.10.0).

Mark Byers
Hi Jeff, Thank you very much. That is exactly what I was looking for.Good luck.
mmrs151
A: 

A simple way is to split up the strings to numeric parts and non-numeric parts and use the python tuple sort order to sort the strings.

import re
tokenize = re.compile(r'(\d+)|(\D+)').findall
def natural_sortkey(string):          
    return tuple(int(num) if num else alpha for num, alpha in tokenize(string))

sorted(my_set, key=natural_sortkey)
Ants Aasma
+3  A: 

Short and sweet:

sorted(data, key=lambda item: (int(item.partition(' ')[0])
                               if item[0].isdigit() else float('inf'), item))

This version:

  • Works in Python 2 and Python 3, because:
    • It does not assume you compare strings and integers (which won't work in Python 3)
    • It doesn't use the cmp parameter to sorted (which doesn't exist in Python 3)
  • Will sort on the string part if the quantities are equal

If you want printed output exactly as described in your example, then:

data = set(['booklet', '4 sheets', '48 sheets', '12 sheets'])
r = sorted(data, key=lambda item: (int(item.partition(' ')[0])
                                   if item[0].isdigit() else float('inf'), item))
print ',\n'.join(r)
Daniel Stutzbach
Great answer Daniel
mmrs151