views:

93

answers:

2

I've got lots of address style strings and I want to sort them in a rational way.

I'm looking to pad all the numbers in a string so that: "Flat 12A High Rise" becomes "Flat 00012A High Rise", there may be multiple numbers in the string.

So far I've got:

def pad_numbers_in_string(string, padding=5):
    numbers = re.findall("\d+", string)
    padded_string = ''
    for number in numbers:
        parts = string.partition(number)
        string = parts[2]
        padded_string += "%s%s" % (parts[0], parts[1].zfill(padding))
    padded_string += string

return padded_string

Can that be improved - looks pugly to me!

+8  A: 

Instead of changing your data to accommodate your sorting algorithm, change your sorting algorithm to accommodate your data.

See Sorting For Humans: Natural Sort Order on Coding Horror:

import re 

def sort_nicely( l ): 
  """ Sort the given list in the way that humans expect. 
  """ 
  convert = lambda text: int(text) if text.isdigit() else text 
  alphanum_key = lambda key: [ convert(c) for c in re.split('([0-9]+)', key) ] 
  l.sort( key=alphanum_key ) 
John Kugelman
John - thats excellent and exactly what I initially was after but didn't know how to! Thank you!
Ross
+6  A: 

How about this?

re.sub('\d+', lambda x:x.group().zfill(padding), s)

Example:

>>> s = "Flat 12A High Rise 101B"
>>> padding = 5
>>> re.sub('\d+', lambda x:x.group().zfill(padding), s)
'Flat 00012A High Rise 00101B'
>>> 
Nick D
Excellent much cleaner!
Ross