In Python, I'm compiling a regular expression pattern like so:

rule_remark_pattern = re.compile('access-list shc-[(in)(out)] [(remark)(extended)].*')

I would expect it to match any of the following lines:

access-list shc-in remark C883101 Permit http from UPHC outside to Printers inside
access-list shc-in extended permit tcp object-group UPHC-Group-Outside object-group PRINTER-Group-Inside object-group http-https 
access-list shc-out remark C890264 - Permit (UDP 123) from UPHC-Group-Inside to metronome.usc.edu
access-list shc-out extended permit udp object-group UPHC-Group-Inside host 68.181.195.12 eq ntp 

Unfortunately, it doesn't match any of them. However, if I write the regular expression as:

rule_remark_pattern = re.compile('access-list shc-in [(remark)(extended)].*')

It matches the first 2 just fine.

Similarly, if I write:

rule_remark_pattern = re.compile('access-list shc-out [(remark)(extended)].*')

It matches the last 2.

Anybody know what's going on here?

+3  A: 

My regex-fu is not Python-based, but assuming Python's syntax is anything like the standard, I think you are misunderstanding the use of '[' and ']'. They denote a character class, which matches exactly one character, whereas what you need is an alternation.

Try replacing your "[(word1)(word2)]" constructs with "(word1|word2)".

EDIT: Just checked the Python docs (here: http://docs.python.org/library/re.html) and I don't see any relevant differences between Python regexen and what I'm used to (i.e. nothing that should affect the accuracy of this answer).
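
As a quick check of the suggested replacement, here is a minimal sketch that compiles the alternation form and runs it against the four sample lines from the question. The variable names lines and matched are only illustrative, and the raw-string prefix is a habit rather than something this pattern strictly needs.

```python
import re

# Alternation: (in|out) matches the whole word "in" or the whole word "out".
rule_remark_pattern = re.compile(r'access-list shc-(in|out) (remark|extended).*')

# The four sample lines from the question.
lines = [
    'access-list shc-in remark C883101 Permit http from UPHC outside to Printers inside',
    'access-list shc-in extended permit tcp object-group UPHC-Group-Outside object-group PRINTER-Group-Inside object-group http-https',
    'access-list shc-out remark C890264 - Permit (UDP 123) from UPHC-Group-Inside to metronome.usc.edu',
    'access-list shc-out extended permit udp object-group UPHC-Group-Inside host 68.181.195.12 eq ntp',
]

# One boolean per line: True if the pattern matches at the start of the line.
matched = [rule_remark_pattern.match(line) is not None for line in lines]
print(matched)
```

If the captured groups are not needed, (?:in|out) should behave the same way without capturing.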

fd
+2  A: 

That's mainly because you misunderstood how defining alternatives works in regular expressions. Alternatives go inside a group, separated by '|':

access-list shc-(in|out) (remark|extended).*

Your attempt creates character classes. Every character in a character class stands on its own, and the class itself really only matches a single character from the allowed list. So your try:

[(in)(out)]

really is the same as

[intou(())]

which in turn is the same as [intou()], because repeated characters in a character class are ignored.
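
To make the single-character behaviour concrete, here is a small sketch (the variable names are made up for illustration) that checks the two classes accept the same ASCII characters and shows where the original pattern fails on a real line.

```python
import re

# The pattern from the question, unchanged.
original = re.compile('access-list shc-[(in)(out)] [(remark)(extended)].*')
line = 'access-list shc-in remark C883101 Permit http from UPHC outside to Printers inside'

# The class consumes only the "i" of "in"; the pattern then needs a space but finds "n".
original_matches = original.match(line) is not None

# Both spellings of the class accept exactly the same single ASCII characters.
class_a = re.compile('[(in)(out)]')
class_b = re.compile('[intou()]')
same_class = all(
    bool(class_a.fullmatch(ch)) == bool(class_b.fullmatch(ch))
    for ch in map(chr, range(128))
)

# A one-letter name does satisfy the class, which shows it works per character.
single_char_matches = original.match('access-list shc-i remark anything') is not None

print(original_matches, same_class, single_char_matches)
```

This also explains why the shc-in and shc-out variants in the question appeared to work: [(remark)(extended)] matches just the first letter of "remark" or "extended", and the trailing .* swallows the rest of the line.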

Tomalak