I have a string that kind of looks like this:
"stuff . // : /// more-stuff .. .. ...$%$% stuff -> DD"
and I want to strip out all punctuation, make everything uppercase, and collapse all whitespace so that it looks like this:
"STUFF MORE STUFF STUFF DD"
Is this possible with one regex, or do I need to combine two or more? This is what I have so far:
    import re

    def normalize(string):
        string = string.upper()
        rex = re.compile(r'\W')
        rex_s = re.compile(r'\s{2,}')
        result = rex.sub(' ', string)  # this produces a string with tons of whitespace padding
        result = rex.sub('', result)   # this is meant to reduce all those spaces
        return result
The only thing that doesn't work is the whitespace collapsing. Any ideas?
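
For reference, the second `sub` call above reuses `rex` with an empty replacement instead of using `rex_s` with a single space, so it deletes every space rather than collapsing runs of them. Below is a minimal sketch of a single-regex alternative, assuming punctuation and whitespace can be treated the same way. The names `normalize_sketch` and `_NON_WORD_RUN` are made up for illustration. Note that `\W` does not match underscores, so those would survive.

```python
import re

# Hypothetical name: one pattern covering runs of punctuation and whitespace.
_NON_WORD_RUN = re.compile(r'\W+')

def normalize_sketch(text):
    # \W+ matches each run of non-word characters (punctuation and
    # whitespace alike), so every run becomes one space in a single pass.
    collapsed = _NON_WORD_RUN.sub(' ', text.upper())
    # Remove the space left over if the input started or ended with punctuation.
    return collapsed.strip()

print(normalize_sketch("stuff . // : /// more-stuff .. .. ...$%$% stuff -> DD"))
# prints: STUFF MORE STUFF STUFF DD
```

Using `\W+` rather than `\W` is what makes the second pass unnecessary, since each run is replaced as a whole instead of character by character.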