Hi,
This continues my earlier question, where I wanted to compile many patterns into one regular expression. After that discussion I did something like this:

REGEX_PATTERN = '|'.join(self.error_patterns.keys())

where self.error_patterns.keys() would be patterns like

: error:  
: warning:
cc1plus: 
undefined reference to
Failure:  

and do

error_found = re.findall(REGEX_PATTERN,line) 

Now, when I run this against a file that may contain one or more of these patterns, how do I know exactly which pattern matched? I can always read the line and work it out by hand, but I would like to know whether, after calling re.findall, I can get the matching pattern from something like re.group().

Thank you

A: 

re.findall will return all portions of text that matched your expression.

If that is not sufficient to identify the pattern unambiguously, you can do a second pass with re.match or re.search against the individual subpatterns you join()ed. By the time the combined regular expression is applied, the matcher no longer knows that you composed it from several subpatterns, so it cannot tell you which subpattern matched.
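A minimal sketch of that two-pass idea, assuming the patterns are kept in a plain list (the names patterns, combined and which_patterns are made up for illustration and are not from the question's code):

```python
import re

# Hypothetical pattern list, based on the examples in the question.
patterns = [": error:", ": warning:", "cc1plus:", "Failure:"]
combined = re.compile("|".join(patterns))

def which_patterns(line):
    """Return every subpattern that matches somewhere in line."""
    # First pass: one cheap test with the joined expression.
    if not combined.search(line):
        return []
    # Second pass: try each subpattern on its own.
    return [p for p in patterns if re.search(p, line)]

print(which_patterns("foo.c:12: error: 'x' undeclared"))
```

The second pass only runs for lines that already matched, so the extra cost is limited to the lines you care about.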

Another, equally unwieldy option is to enclose each pattern in a group (...). Then re.findall returns a list of tuples, one per match, in which every group that did not participate is an empty string and only the group that matched holds text.
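A rough sketch of the grouping approach, assuming none of the subpatterns contains capturing groups of its own (patterns, grouped and matched_patterns are illustrative names). With more than one group in the expression, Python's re.findall yields one tuple per match, and the non-empty slot identifies the subpattern:

```python
import re

# Hypothetical pattern list; each entry becomes its own capturing group.
patterns = [": error:", ": warning:"]
grouped = re.compile("|".join("(%s)" % p for p in patterns))

def matched_patterns(line):
    """Return the subpattern responsible for each match, in match order."""
    result = []
    for groups in grouped.findall(line):
        # groups has one slot per subpattern; exactly one slot is non-empty.
        index = next(i for i, text in enumerate(groups) if text)
        result.append(patterns[index])
    return result

print(matched_patterns("a.c:1: warning: unused; a.c:2: error: bad"))
```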

relet
A: 

MatchObject has a lastindex property that contains the index of the last capturing group that participated in the match. If you enclose each pattern in its own capturing group, like this:

(: error:)|(: warning:)

...lastindex will tell you which one matched (assuming you know the order in which the patterns appear in the regex). You'll probably want to use finditer() (which creates an iterator of MatchObjects) instead of findall() (which returns a list of strings). Also, make sure there are no other capturing groups in the regex, since they would throw your indexing out of sync.
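Something along these lines should work, assuming the patterns live in an ordered list and contain no capturing groups of their own (patterns, regex and find_matches are names invented for this sketch). Note that lastindex counts groups from 1:

```python
import re

# Hypothetical ordered pattern list, based on the examples in the question.
patterns = [": error:", ": warning:", "cc1plus:", "Failure:"]
regex = re.compile("|".join("(%s)" % p for p in patterns))

def find_matches(line):
    """Return (pattern, matched_text) pairs for every match in line."""
    # lastindex is 1-based, so subtract 1 to index into the list.
    return [(patterns[m.lastindex - 1], m.group())
            for m in regex.finditer(line)]

print(find_matches("x.c:3: warning: unused variable"))
```

If a subpattern does need internal grouping, writing it as a non-capturing group (?:...) keeps the numbering intact.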

Alan Moore