Hi

I have a problem with the lookahead assertion (?=...). For example, I have this expression:

/Win(?=2000)/

It matches Win if the string is something like Win2000 or Win2000fgF. My next expression is:
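A quick way to check this behaviour, sketched with Python's `re` module (an assumption, since the question does not name a language; `matched_text` is only an illustrative helper):

```python
import re

# "Win" only if it is immediately followed by "2000"
pattern = re.compile(r"Win(?=2000)")

def matched_text(s):
    """Hypothetical helper: return the text the regex matched, or None."""
    m = pattern.search(s)
    return m.group(0) if m else None

print(matched_text("Win2000"))     # Win
print(matched_text("Win2000fgF"))  # Win
print(matched_text("Win98"))       # None
```

The matched text is only `Win`: the `2000` is checked by the lookahead but is not part of the match.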

^(?=.*\d)(?=.*[a-z]).*$

It matches strings that contain a digit and a lowercase letter, for example 45dF or 4Dd. But I don't understand why it works and matches all the characters :) There is nothing in the pattern before (?=.*\d) to consume characters. I thought only this expression should work:

^.*(?=.*\d)(?=.*[a-z]).*$

(with .* before the lookaheads).

Could you explain it?

Regards

A: 

Lookaheads don't match, they assert.

This means that a lookahead consumes no characters. If you want the match to extend any further, you need something after the lookahead that actually matches the text you want.
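To illustrate the difference between asserting and matching, here is a small sketch in Python's `re` module (the choice of language and the variable names are assumptions made for illustration):

```python
import re

text = "2a"

# A lookahead on its own succeeds but consumes nothing:
# the match is the empty string at position 0.
lookahead_only = re.match(r"(?=.*\d)", text)
print(lookahead_only.span())    # (0, 0)
print(lookahead_only.group(0) == "")  # True

# Adding a consuming part after the lookahead gives an actual match.
consuming = re.match(r"(?=.*\d).*", text)
print(consuming.group(0))       # 2a
```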

sreservoir
+2  A: 

Let's say we are the regex engine and apply the regex ^(?=.*\d)(?=.*[a-z]).*$ to the string 2a.

Starting at position 0 (before the first character):

  1. ^: Make sure we're at the start of the string: OK
  2. (?=: Let's check if the following regex could match...
  3. .*: match any number of characters -> 2a. OK.
  4. \d: Nope, we're already at the end. Let's go back one character: a --> No, doesn't match. Go back another one: 2 --> MATCH!
  5. ): End of lookahead, match successful. We're still at position 0!
  6. (?=: Let's check if the following regex could match...
  7. .*: match any number of characters -> 2a. OK.
  8. [a-z]: Nope, we're already at the end. Let's go back one character: a --> MATCH!
  9. ): End of lookahead, match successful. We're still at position 0!
  10. .*: match any number of characters -> 2a --> MATCH!
  11. $: Let's see - are we at the end of the string? Yes, we are! --> MATCH!
  12. Hey, we've reached the end of the regex. Great. Match completed!
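The outcome of the walkthrough above can be checked with a short script. This is a sketch using Python's `re` module (an assumption, since the thread does not name a language); `has_digit_and_lowercase` and the sample strings are illustrative only. It also compares against the variant from the question with `.*` in front of the lookaheads, which gives the same results on these samples:

```python
import re

# Both lookaheads are checked at position 0; the final .* does the consuming.
pattern = re.compile(r"^(?=.*\d)(?=.*[a-z]).*$")
# Variant from the question, with an extra .* before the lookaheads.
prefixed = re.compile(r"^.*(?=.*\d)(?=.*[a-z]).*$")

def has_digit_and_lowercase(s):
    """Hypothetical helper: True if the whole string passes both lookaheads."""
    return pattern.match(s) is not None

samples = ["2a", "45dF", "4Dd", "ABC1", "abc", ""]
for s in samples:
    same = (prefixed.match(s) is not None) == has_digit_and_lowercase(s)
    print(repr(s), has_digit_and_lowercase(s), same)
```

Only `2a`, `45dF` and `4Dd` pass; `ABC1` has no lowercase letter, `abc` has no digit, and the empty string has neither.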
Tim Pietzcker
Thanks for the great explanation. Could you tell me why ^(?=.*\d)(?=.*[a-z])$ (without .* at the end) doesn't work for 2a? Shouldn't this regex match?
luk4443
Well, imagine you leave out step 10: the lookaheads have consumed nothing, so the regex engine is still at position 0 and fails to match the `$`.
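A minimal check of this, again sketched in Python's `re` module (the language and the names `without_tail` and `with_tail` are assumptions for illustration):

```python
import re

# Without the trailing .*, nothing consumes "2a", so $ is tested at position 0.
without_tail = re.compile(r"^(?=.*\d)(?=.*[a-z])$")
# With the trailing .*, the engine advances to the end before testing $.
with_tail = re.compile(r"^(?=.*\d)(?=.*[a-z]).*$")

print(without_tail.match("2a"))         # None
print(with_tail.match("2a").group(0))   # 2a
```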
Tim Pietzcker
Ok, thank you :)
luk4443