I need to remove all non-numeric characters from a string.
For example, "8-4545-225-144" needs to be "84545225144"; "$334fdf890==-" must be "334890".
How can I do this?
It is possible with regex.
import re
...
return re.sub(r'\D', '', theString)
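For reference, here is the same idea as a self-contained snippet, as a minimal sketch. The function name strip_non_digits is hypothetical (the answer above only shows the return line); the sample inputs are the two from the question.

```python
import re

def strip_non_digits(text):
    """Return text with every non-digit character removed."""
    # \D matches any character that is not a digit; each match is replaced with ''.
    return re.sub(r'\D', '', text)

print(strip_non_digits("8-4545-225-144"))  # 84545225144
print(strip_non_digits("$334fdf890==-"))   # 334890
```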
Although a little more complicated to set up, using the translate() string method to delete the characters, as shown below, can be as much as 4-6 times faster than using join() or re.sub(), according to timing tests I performed. If this is something done many times, you might want to consider using it instead.
# Python 2: build the string of all 256 byte values that are not digits
nonnumerics = ''.join(chr(i) for i in range(256) if not chr(i).isdigit())
astring = '123-$ab #6789'
print astring.translate(None, nonnumerics)
# 1236789
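The snippet above relies on the Python 2 form translate(None, deletechars). On Python 3 the same technique would probably look like the sketch below, which builds a deletion table with str.maketrans. The names _NON_DIGITS, _DELETE_TABLE and digits_only are made up for illustration, and the table only covers code points below 256, so characters outside that range would pass through unchanged.

```python
import string

# Characters to delete: every code point below 256 that is not an ASCII digit.
# Assumption: the input contains nothing outside this range.
_NON_DIGITS = ''.join(chr(i) for i in range(256) if chr(i) not in string.digits)

# With three arguments, str.maketrans maps every character of the third
# argument to None, which makes translate() delete it.
_DELETE_TABLE = str.maketrans('', '', _NON_DIGITS)

def digits_only(text):
    """Delete every character listed in _NON_DIGITS from text."""
    return text.translate(_DELETE_TABLE)

print(digits_only('123-$ab #6789'))  # 1236789
```

The table is built once at import time, which is where the setup cost goes; each call is then a single translate().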
I prefer regular expressions, so here's a way if you like:
import re
myStr = '$334fdf890==-'
digts = re.sub('[^0-9]','',myStr)
This replaces every non-numeric character with '', i.e. with nothing, so the digts variable ends up as '334890'.
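One difference between '[^0-9]' and '\D' is worth knowing on Python 3: for str patterns, \D is Unicode-aware, so digits from other scripts survive it, while [^0-9] keeps only ASCII digits. The sketch below illustrates this; the sample string is invented for the demonstration.

```python
import re

# U+0663 is ARABIC-INDIC DIGIT THREE, a decimal digit that is not ASCII.
sample = 'a' + '\u0663' + 'b1'

unicode_aware = re.sub(r'\D', '', sample)               # keeps any Unicode decimal digit
ascii_only = re.sub(r'[^0-9]', '', sample)              # keeps only 0-9
ascii_flag = re.sub(r'\D', '', sample, flags=re.ASCII)  # re.ASCII restricts \D to 0-9

print(unicode_aware == '\u0663' + '1')  # True
print(ascii_only)                       # 1
print(ascii_flag)                       # 1
```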
Let's time the join and the re versions:
In [3]: import re
In [4]: def withRe(theString): return re.sub(r'\D', '', theString)
...:
In [5]:
In [6]: def withJoin(S): return ''.join(c for c in S if c.isdigit())
...:
In [11]: s = "8-4545-225-144"
In [12]: %timeit withJoin(s)
100000 loops, best of 3: 6.89 us per loop
In [13]: %timeit withRe(s)
100000 loops, best of 3: 4.77 us per loop
The join version is much nicer than the re one, but unfortunately it is roughly 45% slower (6.89 us vs. 4.77 us per loop). So if performance is an issue, the elegance might need to be sacrificed.
EDIT
In [16]: def withFilter(s): return filter(str.isdigit, s)
....:
In [19]: %timeit withFilter(s)
100000 loops, best of 3: 2.75 us per loop
It looks like filter is both the performance and the readability winner.
filter(str.isdigit, s) is faster and IMO clearer than anything else listed here.
It will also throw a TypeError if s is a unicode type. Depending on which definition of "digits" you want, this can be more or less useful than the alternative filter(type(s).isdigit, s), which is slightly slower but was still faster than the re and comprehension versions for me.
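To make the "definition of digits" point concrete, here is a small Python 3 sketch. On Python 3 every str is Unicode, so str.isdigit accepts more than 0-9. The sample characters and the helper ascii_digits are chosen purely for illustration.

```python
# '7', SUPERSCRIPT TWO, ARABIC-INDIC DIGIT THREE
samples = ['7', '\u00b2', '\u0663']

for ch in samples:
    # Columns: character, isdigit(), isdecimal(), ASCII-only membership test
    print(repr(ch), ch.isdigit(), ch.isdecimal(), ch in '0123456789')

def ascii_digits(s):
    """Keep only the characters 0-9 (hypothetical helper, strictest definition)."""
    return ''.join(c for c in s if c in '0123456789')

mixed = 'x7' + '\u00b2' + '\u0663'
print(ascii_digits(mixed))                                  # 7
print(''.join(filter(str.isdigit, mixed)) == mixed[1:])     # True
```

If only 0-9 should survive, the membership test (or the '[^0-9]' regex) is the safer choice; isdigit() is the most permissive of the three.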
Edit: Although if you are a poor sucker stuck with Python 3, you will need to use "".join(filter(str.isdigit, s)) which puts you firmly in the realm of equivalently bad performance. Such progress.
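The timings quoted in this thread come from Python 2 sessions, so they are worth re-measuring on your own interpreter. Below is a rough sketch of a Python 3 harness using the standard timeit module. The snake_case function names are my own, and the loop count is an arbitrary choice. No results are shown because they will vary by machine and version.

```python
import re
import timeit

def with_re(s):
    return re.sub(r'\D', '', s)

def with_join(s):
    return ''.join(c for c in s if c.isdigit())

def with_filter(s):
    # In Python 3, filter() returns an iterator, so it is joined back into a str.
    return ''.join(filter(str.isdigit, s))

s = "8-4545-225-144"
LOOPS = 20_000  # arbitrary; raise it for more stable numbers

for fn in (with_re, with_join, with_filter):
    # Sanity check: every variant must produce the same answer.
    assert fn(s) == "84545225144"
    best = min(timeit.repeat(lambda: fn(s), number=LOOPS, repeat=3))
    print(f"{fn.__name__}: {best / LOOPS * 1e6:.2f} us per loop")
```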