Hi,

Trying to use a reasonably long regex and it comes down to this small section that doesn't match how I'd expect it to.

>>> re.search(r'(foo)((?<==)bar)?', 'foo').groups()
('foo', None)

>>> re.search(r'(foo)((?<==)bar)?', 'foo=bar').groups()
('foo', None)

The first one is what I'm after, the second should be returning ('foo', 'bar').

I suspect I'm just misunderstanding how lookbehinds are meant to work, so some explanation would be great.

A: 

Why don't you just use:

(foo)=?(bar)?

Alternatively, the following expression seems more correct, since it consumes the '=' as part of the overall match, which your original expression never does:

(foo).?((?<==)bar)?
Superfilin
The first option though would also match "foobar" I believe, whereas I want key value pairs.
pjrharley
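To make the concern in this comment concrete, here is a small sketch (the variable names are just for illustration) showing that making the '=' optional on its own lets "foobar" through as if it were a key/value pair:

```python
import re

# '=' is optional independently of 'bar', so 'bar' can follow 'foo' directly.
loose = re.compile(r'(foo)=?(bar)?')

plain = loose.search('foo').groups()        # key only
pair = loose.search('foo=bar').groups()     # proper key=value pair
glued = loose.search('foobar').groups()     # no '=' at all, still captured

print(plain, pair, glued)
```

On 'foobar' the second group still captures 'bar', which is probably not what is wanted for key/value input.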
+1  A: 

The look behind target is never included in the match - it's supposed to serve as an anchor, but not actually be consumed by the regex.

The look behind pattern is only supposed to match if the current position is preceded by the target. In your case, after matching the "foo" in the string, the current position is at the "=", which is not preceded by a "=" - it's preceded by an "o".
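A rough way to check this in the interpreter (a sketch only, the variable names are mine): look at where the engine stands after matching "foo", and at where the lookbehind version of "bar" can actually match.

```python
import re

text = 'foo=bar'

# Position right after 'foo' has been matched.
pos = re.search(r'(foo)', text).end()
before = text[pos - 1]   # the character a lookbehind would inspect here
here = text[pos]         # the character that still has to be consumed

# The lookbehind only succeeds once the '=' is already behind the position.
lb = re.search(r'(?<==)bar', text)

# Consuming the '=' first (here with '.?') moves the position past it.
fixed = re.search(r'(foo).?((?<==)bar)?', text)

print(pos, before, here, lb.start(), fixed.groups())
```

At index 3 the preceding character is 'o', so `(?<==)` fails there; it can only succeed at index 4, and something has to consume the '=' to get the engine that far.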

Another way to see this is by looking at the re documentation and reading

Note that patterns which start with positive lookbehind assertions will never match at the beginning of the string being searched;

After you match the foo, your look behind is trying to match at the beginning of (the remainder of) the string - this will never work.

Others have suggested regexes that may serve you better, but I think you're looking for

>>> re.search('(foo)(=(bar))?', 'foo=bar').groups()
('foo', '=bar', 'bar')

If you find the extra group is a little annoying, you could omit the inner "()"s and just chop the first character off the matched group...
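A minimal sketch of that "chop the first character" idea, assuming a hypothetical helper called `parse_pair` (not part of `re`, just a name for this example):

```python
import re

def parse_pair(text):
    """Return (key, value); value is None when there is no '=bar' part."""
    m = re.search(r'(foo)(=bar)?', text)
    if m is None:
        return None
    key, raw = m.groups()
    # raw is '=bar' or None; drop the leading '=' when it is present.
    value = raw[1:] if raw is not None else None
    return key, value

print(parse_pair('foo=bar'))
print(parse_pair('foo'))
```

It works, but the slicing lives outside the regex, which gets awkward when the pattern is only one piece of a much longer expression.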

Blair Conrad
I've actually just used a non matching group, but I now understand that I basically needed "=?" between the two groups to make what I was doing work. I did temporarily have it just matching the = and chopping it off the start, but it seemed a bit hackish seeing as I was already doing a massive regex, I might as well get it to match what I really want!
pjrharley
Of course! Non-matching group. Mind like a sieve, doncha know.
Blair Conrad
+1  A: 

You probably just want (foo)(?:=(bar))? using a non-capturing group (?:).

A lookbehind assertion just looks to the left of the current position and checks whether the supplied expression matches or not. Hence your expression matches foo and then checks if the input to the left - the second o in foo - matches =. This of course always fails.
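As a quick check of the non-capturing version (a sketch; the inputs are just the ones from this thread plus "foobar"):

```python
import re

# (?:...) groups '=' and (bar) together so both are optional as a unit,
# without adding an extra capture group for the '=' itself.
pattern = re.compile(r'(foo)(?:=(bar))?')

results = {s: pattern.search(s).groups() for s in ('foo', 'foo=bar', 'foobar')}

for s, groups in results.items():
    print(s, groups)
```

Because the '=' and "bar" are optional only together, "foobar" yields no value, while "foo=bar" yields 'bar' without the '=' attached.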

Daniel Brückner
Don't know why I didn't think of this; the regex already has tons of the same construct in it! It does end up looking rather messy, but I guess that's just regexes! Afraid I can't vote you up yet, though...
pjrharley