I'm a beginner with both Python and regular expressions, and I would like to know how to take a string and replace its symbols with spaces. Any help would be appreciated.

For example:

how much for the maple syrup? $20.99? That's ridiculous!!!

into:

how much for the maple syrup 20 99 That s ridiculous
+11  A: 

One way, using regular expressions:

>>> import re
>>> s = "how much for the maple syrup? $20.99? That's ridiculous!!!"
>>> re.sub(r'[^\w]', ' ', s)
'how much for the maple syrup   20 99  That s ridiculous   '
  • \w will match alphanumeric characters and underscores

  • [^\w] will match anything that's not alphanumeric or underscore
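If the goal is the exact output from the question (single spaces, nothing trailing), one possible variation is to match whole runs of non-word characters and strip the ends. This is only a sketch, not part of the original answer; the helper name `symbols_to_spaces` is made up for illustration.

```python
import re

def symbols_to_spaces(text):
    # Hypothetical helper: replace each run of non-word characters
    # (anything except letters, digits and underscore) with one space,
    # then drop leading and trailing spaces.
    return re.sub(r'[^\w]+', ' ', text).strip()

s = "how much for the maple syrup? $20.99? That's ridiculous!!!"
print(symbols_to_spaces(s))
```

Note that ordinary spaces are non-word characters too, so they are folded into the same runs.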

dF
It should be noted that ^\w outside of brackets means 'match a word character at the beginning of the string' (or of each line, in multiline mode). It's only within the brackets ( [^\w] ) that the caret negates the set, meaning 'match any character that is not listed here'.
cmptrgeekken
@cmptrgeekken: Thanks, fixed.
dF
Instead of [^\w] you can also use \W, which is the opposite of \w.
Ikke
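The points raised in these comments can be checked with a small experiment. The sample string below is made up for illustration; the last line also shows one way to include the underscore, which \w would otherwise keep.

```python
import re

s = "a-b_c d"
# Outside brackets, ^ is an anchor: \w has to sit at the start of the string.
print(re.findall(r'^\w', s))      # ['a']
# Inside brackets, ^ negates the set: every character that is not a word character.
print(re.findall(r'[^\w]', s))    # ['-', ' ']
# \W is shorthand for the same negated set.
print(re.findall(r'\W', s))       # ['-', ' ']
# The underscore counts as a word character, so list it explicitly if it should go too.
print(re.sub(r'[\W_]', ' ', s))   # a b c d
```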
+1  A: 

My advice is to read the documentation for the re module. It includes some pretty good examples.

Jason Baker
A: 

I often just open the console and look for the solution in the object's methods. Quite often it's already there:

>>> a = "hello ' s"
>>> dir(a)
[ (....) 'partition', 'replace' (....)]
>>> a.replace("'", " ")
'hello   s'

Short answer: use the string's replace() method (str.replace()).
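Since replace() handles one substring per call, replacing many different symbols needs either a loop or a translation table. A minimal sketch of the latter, assuming Python 3 (where `str.maketrans` exists) and assuming ASCII punctuation is all that needs to go:

```python
import string

s = "how much for the maple syrup? $20.99? That's ridiculous!!!"
# Map every ASCII punctuation character to a space (Python 3 only).
table = str.maketrans(string.punctuation, ' ' * len(string.punctuation))
cleaned = s.translate(table)
print(cleaned)
```

Each symbol becomes exactly one space, so the result has the same length as the input.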

buster
+1  A: 

Sometimes it takes longer to figure out the regex than to just write it out in Python:

import string
s = "how much for the maple syrup? $20.99? That's ridiculous!!!"
for char in string.punctuation:
    s = s.replace(char, ' ')

If you need other characters you can change it to use a white-list or extend your black-list.

Sample white-list:

whitelist = string.ascii_letters + string.digits + ' '  # string.letters exists only in Python 2
new_s = ''
for char in s:
    if char in whitelist:
        new_s += char
    else:
        new_s += ' '
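The same white-list idea can also be written more compactly with a generator expression and join, which avoids building the string one character at a time. This is a sketch of an alternative, not the answer's own code; it uses `string.ascii_letters` so that it runs on Python 3.

```python
import string

s = "how much for the maple syrup? $20.99? That's ridiculous!!!"
# A set makes each membership test cheap; only ASCII letters, digits and spaces are kept.
whitelist = set(string.ascii_letters + string.digits + ' ')
# Keep white-listed characters, turn everything else into a space.
new_s = ''.join(char if char in whitelist else ' ' for char in s)
print(new_s)
```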
monkut