ansaurus

Question

Help building a regular expression in python using the re module

Answer 1

A:

I think you can solve this simply by reordering your checking to match equivalences first, and then implications. However, this seems to work :

>>> IMPLICATION = re.compile(r'\s*[^\<]\-\>\s*')
>>> EQUIVALENCE = re.compile(r'\s*\<\-\>\s*')

sykora 2009-02-05 04:00:42

Answer 2

+4 A:

As far as I can tell, your regular expressions are equivalent to the following:

# This is bad, because IMPLICATION also will match every
# string that EQUIVALENCE matches
IMPLICATION = re.compile("->")
EQUIVALENCE = re.compile("<->")

As you've written it, you're also matching for zero or more whitespace characters before the -> and <-> literal. But you're not capturing the spaces, so it's useless to specify "match whether spaces are present or not". Also, note that - and > do not need to be escaped in these regular expressions.

You have two options as I see it. The first is to make sure that IMPLICATION does not match the same strings as EQUIVALENCE

# This ought to work just fine.
IMPLICATION = re.compile("[^<]->")
EQUIVALENCE = re.compile("<->")

Another option is to use the maximal munch method; i.e., match against all regular expressions, and choose the longest match. This would resolve ambiguity by giving EQUIVALENCE a higher precedence than IMPLICATION.

Triptych 2009-02-05 04:59:24

+1 for cutting down on the cruft. since the op uses elif, though, simply reversing the order of the two tests should do the same trick (although implicitly, not explicitly)

hop 2009-02-05 11:22:14

ansaurus

tags:

views:

answers:

Help building a regular expression in python using the re module

related questions