I'm trying to remove specific characters from a string using Python. This is the code I'm using right now. Unfortunately, it appears to do nothing to the string.

for char in line:
    if char in " ?.!/;:":
        line.replace(char,'')
+1  A: 

Strings are immutable in Python. The replace method returns a new string after the replacement. Try:

for char in line:
    if char in " ?.!/;:":
        line = line.replace(char,'')
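
To make the difference concrete, here is a minimal sketch (the sample string is my own assumption, not from the question) showing that replace leaves the original string alone and hands back a new one:

```python
# Assumed sample input, not taken from the question.
line = "Hello, world! How are you?"

# str.replace never modifies the string it is called on; it returns a new one.
result = line.replace("!", "")
print(line)    # unchanged: still contains "!"
print(result)  # the new string, without "!"

# Rebinding the name is what makes the change visible through `line`.
line = line.replace("!", "")
print(line == result)
```

Without the assignment the new string is simply thrown away, which is why the original loop seemed to do nothing.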
Greg Hewgill
How can you iterate over line and modify it at the same time?
eumiro
@eumiro: The iteration proceeds over the *original* `line`.
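
A small sketch of that, assuming a made-up sample string and a hypothetical `seen` list just for bookkeeping: the `for` statement takes its iterator from the string object once, so rebinding the name `line` inside the loop does not affect which characters are visited.

```python
line = "a.b!c"      # assumed sample input
original = line     # second reference to the starting string object
seen = []           # characters the loop actually visits

for char in line:   # the iterator is created once, from the object `line` names now
    seen.append(char)
    if char in " ?.!/;:":
        # Rebinds the name `line`; the string being iterated is untouched.
        line = line.replace(char, "")

print("".join(seen))  # all characters of the original string
print(original)       # the original object is unchanged
print(line)           # the name now refers to the cleaned-up string
```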
Greg Hewgill
@Greg, good to know! So if I iterate over an array, I iterate over an original array. Iteration over an iterator wouldn't be possible.
eumiro
+9  A: 

Strings in Python are immutable (can't be changed). Because of this, the effect of line.replace(...) is just to create a new string, rather than changing the old one. You need to rebind (assign) it to line in order to have that variable take the new value, with those characters removed.

Also, the way you are doing it is going to be relatively slow: every replace call scans the whole string, and the loop can make one such call per character. It's also likely to be a bit confusing to experienced Python programmers, who will see a doubly-nested structure and think for a moment that something more complicated is going on.

You can instead use str.translate:

line = line.translate(None, '!@#$')

(passing None as the table only works on Python 2.6 and newer *; Python 3's str.translate takes a single table argument instead)

or regular expression replacement with re.sub:

import re
line = re.sub('[!@#$]', '', line)

The characters enclosed in brackets constitute a character class. Any characters in line which are in that class are replaced with the second parameter to sub: an empty string.
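
If the set of characters to remove is not a fixed literal, one way to build the character class safely is to run it through re.escape first. This is only a sketch; `strip_chars` is a hypothetical helper name, not something from the standard library:

```python
import re

def strip_chars(text, unwanted):
    """Remove every character in `unwanted` from `text` (hypothetical helper)."""
    if not unwanted:
        # An empty class "[]" is not a valid pattern, so there is nothing to do.
        return text
    # re.escape guards characters that are special inside a class, such as ] ^ - \
    pattern = "[" + re.escape(unwanted) + "]"
    return re.sub(pattern, "", text)

print(strip_chars("a!b@c#d$", "!@#$"))
print(strip_chars("x-y]z^w", "-]^"))
```

Escaping matters mostly for characters like `]`, `^`, `-` and the backslash, which would otherwise change the meaning of the class.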


* For compatibility with earlier Pythons, you can create a "null" translation table to pass in place of None:

import string
line = line.translate(string.maketrans('', ''), '!@#$')

Here string.maketrans is used to create a translation table, which is just a string containing the characters with ordinal values 0 to 255.
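
To see what such a "null" table looks like, here is a sketch using Python 3 bytes objects, where bytes.maketrans stands in for Python 2's string.maketrans (the sample data is my own assumption):

```python
# Python 3: bytes.maketrans plays the role that string.maketrans has in Python 2.
identity = bytes.maketrans(b"", b"")

print(len(identity))                  # one entry per possible byte value
print(identity == bytes(range(256)))  # entry i is simply byte i

data = b"a!b@c#d$"  # assumed sample input
# The second argument lists the bytes to delete before translating.
print(data.translate(identity, b"!@#$"))
# Passing None as the table means "no translation, only deletion".
print(data.translate(None, b"!@#$"))
```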


As kevpie mentions in a comment on one of the answers, and as noted in the documentation for str.translate, things work differently with Unicode strings.

When calling the translate method of a unicode string, you cannot pass the second parameter that we used up above. You also can't pass None as the first parameter, or even a translation table from string.maketrans. Instead, you pass a dictionary as the only parameter. This dictionary maps the ordinal values of characters (i.e. the result of calling ord on them) to the ordinal values of the characters which should replace them, or —usefully to us— None to indicate that they should be deleted.

So to do the above dance with a Unicode string you would call something like

translation_table = dict.fromkeys(map(ord, '!@#$'), None)
unicode_line = unicode_line.translate(translation_table)

Here dict.fromkeys and map are used to succinctly generate a dictionary containing

{ord('!'): None, ord('@'): None, ...}
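
On Python 3, where every str is a Unicode string, the same kind of dictionary can also be produced by str.maketrans with a third argument listing the characters to delete. A minimal sketch, assuming a made-up sample string:

```python
unicode_line = "naïve! café@ #1 $5"  # assumed sample input

# Hand-built table: code point -> None means "delete this character".
table_from_dict = dict.fromkeys(map(ord, "!@#$"), None)

# str.maketrans with three arguments: the third lists characters to delete.
table_from_maketrans = str.maketrans("", "", "!@#$")

print(table_from_dict == table_from_maketrans)
print(unicode_line.translate(table_from_maketrans))
```

Both tables map the same ordinals to None, so either one can be passed to translate.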
intuited
+1  A: 
line = line.translate(None, " ?.!/;:")
Muhammad Alkarouri
+1 With unicode strings, translate requires a translation table that maps the characters to None, rather than a string of characters to delete. http://docs.python.org/library/stdtypes.html#str.translate
kevpie
A: 
>>> line = "abc#@!?efg12;:?"
>>> ''.join( c for c in line if  c not in '?:!/;' )
'abc#@efg12'
ghostdog74
A: 

Am I missing the point here, or is it just the following:

>>> s = "ab1cd1ef"
>>> s.replace("1", "")
'abcdef'

Put it in a loop:

>>> a = "a!b@c#d$"
>>> b = "!@#$"
>>> for ch in b:
...     a = a.replace(ch, "")
...
>>> print(a)
abcd
Babil
oh, please, this is just embarrassing.
SilentGhost