I have a string like this:

somestring='in this/ string / i have many. interesting.occurrences of {different chars} that need     to .be removed  '

Here is the result I want:

somestring='in this string i have many interesting occurrences of different chars that need to be removed'

I started chaining .replace calls by hand, but there are so many different combinations that I think there must be a simpler way. Perhaps there's a library that already does this?

Does anyone know how I can clean up this string?

+3  A: 

I would use a regular expression to replace every run of non-alphanumeric characters with a single space, then strip the ends:

>>> import re
>>> somestring='in this/ string / i have many. interesting.occurrences of {different chars} that need     to .be removed  '
>>> rx = re.compile(r'\W+')
>>> res = rx.sub(' ', somestring).strip()
>>> res
'in this string i have many interesting occurrences of different chars that need to be removed'
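
One caveat with the session above: `\W` treats the underscore as a word character, so `_` is left in place. A minimal sketch of a wrapper that makes this choice explicit follows; the name `clean_text` and the `keep_underscore` flag are made up for illustration and are not part of the `re` module.

```python
import re

# Hypothetical helper; the name and the flag are assumptions, not an re API.
def clean_text(text, keep_underscore=False):
    # \W matches anything that is not a word character, and "_" counts as
    # a word character, so it has to be added to the class to be removed.
    pattern = r'\W+' if keep_underscore else r'[\W_]+'
    # Replace each run of unwanted characters with one space, then trim.
    return re.sub(pattern, ' ', text).strip()

somestring = 'in this/ string / i have many. interesting.occurrences of {different chars} that need     to .be removed  '
print(clean_text(somestring))
print(clean_text('snake_case_name'))
print(clean_text('snake_case_name', keep_underscore=True))
```

Note that for `str` patterns in Python 3, `\W` is Unicode-aware, so accented letters are kept as well.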
KennyTM
Wow, this is pretty amazing! Where can I read about this library?
i am a girl
@user: That is just a simple regular expression. The library is documented at http://docs.python.org/library/re.html. See http://www.regular-expressions.info/ for more about regex.
KennyTM
http://docs.python.org/library/re.html
leoluk
+1  A: 
import re
re.sub(r'[\[\]/{}.,]+', '', somestring)
leoluk
Note that `interesting.occurrences` needs to become `interesting occurrences` with a space.
KennyTM
And multiple spaces, as in `'need      to'`, need to be condensed to one: `'need to'`.
Nick T
Yes, you're all right, the above one is better.
leoluk
+1  A: 

You have two steps: replace the punctuation with spaces, then collapse the extra whitespace.

1) Use str.translate with a translation table

import string
# Python 3: str.maketrans replaces the Python 2 string.maketrans
trans_table = str.maketrans(string.punctuation, " " * len(string.punctuation))
new_string = somestring.translate(trans_table)

This builds and then applies a translation table that maps each punctuation character to a space.

2) Remove excess whitespace

new_string = " ".join(new_string.split())
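
The two steps above can be combined into one small function. This is only a sketch assuming Python 3, where the table is built with `str.maketrans`; the function name `strip_punctuation` is made up for illustration.

```python
import string

# Hypothetical helper; the name is an assumption for illustration.
def strip_punctuation(text):
    # Step 1: map every ASCII punctuation character to a space.
    table = str.maketrans(string.punctuation, ' ' * len(string.punctuation))
    translated = text.translate(table)
    # Step 2: split() with no arguments splits on any run of whitespace and
    # ignores leading/trailing whitespace, so join() leaves single spaces.
    return ' '.join(translated.split())

somestring = 'in this/ string / i have many. interesting.occurrences of {different chars} that need     to .be removed  '
print(strip_punctuation(somestring))
```

Unlike the `\W+` approach, this only touches the characters listed in `string.punctuation`, so non-ASCII punctuation is left as-is.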
Martin Thomas