ansaurus

Question

Answer 1

+2 A:

strip only strips characters from the very front and back of the string.

To delete a list of characters, you could use the string's translate method:

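import string
name = "Barack (of Washington)"
# Python 2: build an identity table; the second argument to translate() lists the characters to delete.
table = string.maketrans('', '')
print name.translate(table, "(){}<>")
# Barack of Washington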
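
The snippet above is Python 2. As a minimal sketch for Python 3 (an assumption about the reader's interpreter, not part of the original answer): string.maketrans no longer exists there, and the three-argument form of str.maketrans takes the characters to delete as its third argument.

```python
# Python 3 sketch: delete a set of characters with str.translate.
name = "Barack (of Washington)"

# The third argument to str.maketrans lists characters that map to None,
# which means they are removed from the result.
table = str.maketrans("", "", "(){}<>")

cleaned = name.translate(table)
print(cleaned)  # Barack of Washington
```

The table can be built once and reused for many strings.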

unutbu 2010-10-10 11:15:03

Answer 2

+1 A:

I did a time test here, using each method 100000 times in a loop. The results surprised me. (The results still surprise me after editing them in response to valid criticism in the comments.)

Here's the script:

import timeit

bad_chars = '(){}<>'

setup = """import re
import string
s = 'Barack (of Washington)'
bad_chars = '(){}<>'
rgx = re.compile('[%s]' % bad_chars)"""

timer = timeit.Timer('o = "".join(c for c in s if c not in bad_chars)', setup=setup)
print "List comprehension: ",  timer.timeit(100000)


timer = timeit.Timer("o= rgx.sub('', s)", setup=setup)
print "Regular expression: ", timer.timeit(100000)

timer = timeit.Timer('for c in bad_chars: s = s.replace(c, "")', setup=setup)
print "Replace in loop: ", timer.timeit(100000)

timer = timeit.Timer('s.translate(string.maketrans("", ""), bad_chars)', setup=setup)
print "string.translate: ", timer.timeit(100000)

Here are the results:

List comprehension:  0.631745100021
Regular expression:  0.155561923981
Replace in loop:  0.235936164856
string.translate:  0.0965719223022

Results on other runs follow a similar pattern: string.translate is fastest, and the generator-expression join is slowest. If speed is not the primary concern, however, I still think string.translate is not the most readable option; the other three are more obvious, though slower to varying degrees.
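
For readers on Python 3, here is a rough sketch of the same comparison. It is not the script that produced the numbers above. The function names are hypothetical, the translation table is built once outside the timed code, and each function works on a fresh copy of the input (the replace-loop statement above reassigns s, so later iterations see an already cleaned string). Timings will differ by machine and interpreter, so none are shown.

```python
import re
import timeit

s = "Barack (of Washington)"
bad_chars = "(){}<>"
rgx = re.compile("[%s]" % re.escape(bad_chars))
table = str.maketrans("", "", bad_chars)  # Python 3 deletion table


def with_join(text):
    # Generator expression that keeps only the wanted characters.
    return "".join(c for c in text if c not in bad_chars)


def with_regex(text):
    # Precompiled character class.
    return rgx.sub("", text)


def with_replace(text):
    # One str.replace call per unwanted character.
    for c in bad_chars:
        text = text.replace(c, "")
    return text


def with_translate(text):
    return text.translate(table)


candidates = [with_join, with_regex, with_replace, with_translate]

# Collect each method's output so they can be compared before timing.
results = {f.__name__: f(s) for f in candidates}

for f in candidates:
    seconds = timeit.timeit(lambda: f(s), number=100000)
    print("%-15s %.4f s" % (f.__name__, seconds))
```

Wrapping each method in a function adds the same call overhead to all four, so the relative order is what matters, not the absolute figures.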

JasonFruit 2010-10-10 11:17:10

Why don't you use timeit?

Tumbleweed 2010-10-10 14:05:41

Should have. Didn't.

JasonFruit 2010-10-10 14:09:48

Two things: (1) Your replace-in-loop solution does not actually do anything (you have to write o = o.replace(c, "")). (2) To test the regex you should cache the compiled pattern: char_re = re.compile('[%s]' % bad_chars) ...

Mike Axiak 2010-10-10 17:14:13

Also, I suspect the translate will be by far the fastest (see unutbu's answer)

Mike Axiak 2010-10-10 17:14:39

My mistake, Mike. That was pretty weak, but it was within 20 minutes of waking. I'll edit.

JasonFruit 2010-10-10 20:48:18

There, @Mike Axiak; parental duties prevented me from finishing my edit until now.

JasonFruit 2010-10-11 03:13:17

Thanks for this. It was an educational question: not only did I learn that strip() doesn't do what I thought, I also learned three other ways to achieve what I wanted, and which one was the fastest!

AP257 2010-10-11 15:53:40

Answer 3

+6 A:

Because that's not what strip() does. It removes leading and trailing characters that are present in the argument, but not those characters in the middle of the string.
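
A small illustration of that behaviour, written as a Python 3 sketch (the sample strings are chosen here for demonstration):

```python
name = "Barack (of Washington)"

# strip() treats its argument as a set of characters and trims them
# only from the two ends, stopping at the first character not in the set.
stripped = name.strip("(){}<>")
print(stripped)  # Barack (of Washington

# When the unwanted characters sit only at the ends, strip() is enough.
wrapped = "<<(Barack)>>".strip("(){}<>")
print(wrapped)  # Barack
```

Only the trailing ")" is removed in the first case; the "(" in the middle is untouched.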

You could do:

name = name.replace('(', '').replace(')', '').replace ...

or:

name = ''.join(c for c in name if c not in '(){}<>')

or maybe use a regex:

import re
name = re.sub('[(){}<>]', '', name)

bobince 2010-10-10 11:17:15

Answer 4

A:

Because strip() only removes leading and trailing characters that appear in the argument you pass. I suggest:

>>> import re
>>> name = "Barack (of Washington)"
>>> name = re.sub('[\(\)\{\}<>]', '', name)
>>> print(name)
Barack of Washington

Ruel 2010-10-10 11:20:07

In a regex character class you don't need to escape any of these characters, so '[(){}<>]' is fine.

Mike Axiak 2010-10-10 17:16:04
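
A few characters do keep a special meaning inside a character class (], \, ^ at the start, and - between two characters). When the set of characters to remove is not known in advance, one option is to pass it through re.escape before building the class. The sketch below is Python 3, and remove_chars is a hypothetical helper name, not something from the answers above.

```python
import re


def remove_chars(text, chars):
    # re.escape backslash-escapes regex metacharacters, including
    # ']', '^', '-' and the backslash itself, so they are matched
    # literally inside the character class.
    # Assumes chars is non-empty; an empty class "[]" is not a valid pattern.
    pattern = re.compile("[" + re.escape(chars) + "]")
    return pattern.sub("", text)


print(remove_chars("Barack (of Washington)", "(){}<>"))  # Barack of Washington
print(remove_chars("a[b]c^d-e", "[]^-"))                 # abcde
```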

Python strip() multiple characters?
