I'm attempting to string match 5-digit coupon codes spread throughout a HTML web page. For example, 53232
, 21032
, 40021
etc... I can handle the simpler case of any string of 5 digits with [0-9]{5}
, though this also matches 6, 7, 8... n digit numbers. Can someone please suggest how I would modify this regular expression to match only 5 digit numbers?
views:
64answers:
6try this
^[0-9]{5}$
^ indicate the beginning of the string and $ the end
full string: ^[0-9]{5}$
within a string: [^0-9][0-9]{5}[^0-9]
A very simple way would be to match all groups of digits, like with r'\d+'
, and then skip every match that isn't five characters long when you process the results.
You probably want to match a non-digit before and after your string of 5 digits, like [^0-9]([0-9]{5})[^0-9]
. Then you can capture the inner group (the actual string you want).
>>> import re
>>> s="four digits 1234 five digits 56789 six digits 012345"
>>> re.findall(r"\D(\d{5})\D", s)
['56789']
if they can occur at the very beginning or the very end, it's easier to pad the string than mess with special cases
>>> re.findall(r"\D(\d{5})\D", " "+s+" ")
You could try
\D\d{5}\D
or maybe
\b\d{5}\b
I'm not sure how python treats line-endings and whitespace there though.
I believe ^\d{5}$
would not work for you, as you likely want to get numbers that are somewhere within other text.