I need to find, process and remove (one by one) any substrings that match a rather long regex:
    # p is a compiled regex
    # s is a string
    while True:
        m = p.match(s)
        if m is None:
            break
        process(m.group(0))  # do something with the matched pattern
        s = re.sub(m.group(0), '', s)  # remove it from string s
The code above is not good for two reasons:
1. It doesn't work if m.group(0) happens to contain any regex-special characters (like *, +, etc.), because the matched text is then interpreted as a pattern rather than as literal text.
2. It feels like I'm duplicating the work: first I search the string for the regular expression, and then I have to search for the matched text all over again to remove it.
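To make the first problem concrete, here is a small sketch. The pattern and the sample string are made up for illustration and are not the real (long) regex. It shows the matched text being misread as a pattern, and that re.escape() avoids this but still rescans the string and removes every occurrence of the text.

```python
import re

# Hypothetical stand-ins for the real pattern and string.
p = re.compile(r"\d\+\d")   # a digit, a literal '+', a digit
s = "1+2, 3+4, 1+2"

m = p.match(s)
matched = m.group(0)        # the literal text "1+2"

# As a pattern, "1+2" means "one or more '1' followed by '2'",
# so it does not match the literal text "1+2" and nothing is removed.
unchanged = re.sub(matched, "", s)

# re.escape() makes the text literal, but the string is scanned again
# and all occurrences of "1+2" are removed, not only the one matched above.
escaped = re.sub(re.escape(matched), "", s)
```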
What's a good way to do this?
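One direction that might address both points, sketched under assumptions: the match object already knows where the match is, so the text can be cut out by position instead of being searched for again. The helper names below (consume_matches, consume_matches_single_pass) are hypothetical, and search() is used instead of match() on the assumption that matches may occur anywhere in the string, not only at the start.

```python
import re

def consume_matches(p, s, process):
    """Process matches of p one by one, cutting each out of s by position."""
    while True:
        m = p.search(s)  # assumption: matches may be anywhere in s
        # Stop when nothing is left to remove. An empty match would never
        # shrink s, so it also ends the loop to avoid spinning forever.
        if m is None or m.start() == m.end():
            break
        process(m.group(0))
        s = s[:m.start()] + s[m.end():]  # no second regex search needed
    return s

def consume_matches_single_pass(p, s, process):
    """Same idea in one pass: sub() calls the function once per match."""
    def handle(m):
        process(m.group(0))
        return ""  # replace the match with nothing
    return p.sub(handle, s)
```

The two helpers are not quite equivalent: the loop rescans after every removal, so it also catches matches that only appear once earlier text has been cut out, while the sub() version makes a single pass over the original string.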