tags:

views:

64

answers:

6

I'm attempting to string match 5-digit coupon codes spread throughout a HTML web page. For example, 53232, 21032, 40021 etc... I can handle the simpler case of any string of 5 digits with [0-9]{5}, though this also matches 6, 7, 8... n digit numbers. Can someone please suggest how I would modify this regular expression to match only 5 digit numbers?

+4  A: 

try this

^[0-9]{5}$

^ indicate the beginning of the string and $ the end

Xavier Combelle
That won't help much for searching such strings in a web page
sth
right the other answers are better
Xavier Combelle
+4  A: 

full string: ^[0-9]{5}$

within a string: [^0-9][0-9]{5}[^0-9]

Crayon Violent
+1  A: 

A very simple way would be to match all groups of digits, like with r'\d+', and then skip every match that isn't five characters long when you process the results.

sth
A: 

You probably want to match a non-digit before and after your string of 5 digits, like [^0-9]([0-9]{5})[^0-9]. Then you can capture the inner group (the actual string you want).

Bob
+6  A: 
>>> import re
>>> s="four digits 1234 five digits 56789 six digits 012345"
>>> re.findall(r"\D(\d{5})\D", s)
['56789']

if they can occur at the very beginning or the very end, it's easier to pad the string than mess with special cases

>>> re.findall(r"\D(\d{5})\D", " "+s+" ")
gnibbler
A: 

You could try

\D\d{5}\D

or maybe

\b\d{5}\b

I'm not sure how python treats line-endings and whitespace there though.

I believe ^\d{5}$ would not work for you, as you likely want to get numbers that are somewhere within other text.

Zaki