Let's say I'm trying to match /dog.*lab/ against this text:

"I have a dog. My dog is a black lab. He was created in a laboratory."

Greedily, it would match "dog. My dog is a black lab. He was created in a lab".

I want to find the smallest possible match from both sides. If I make it ungreedy, either with a lazy quantifier like /dog.*?lab/ or with the U modifier like /dog.*lab/U, it matches less but still too much:
"dog. My dog is a black lab"

Is there a way to make my search ungreedy from the left also, thus matching only "dog is a black lab"?
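
For reference, the two behaviours described above can be reproduced with a short PHP sketch. The variable names are only illustrative.

```php
<?php
$text = "I have a dog. My dog is a black lab. He was created in a laboratory.";

// Greedy: .* runs on to the last "lab" (the one inside "laboratory").
preg_match('/dog.*lab/', $text, $greedy);

// Lazy: .*? stops at the first "lab", but the match still begins
// at the first "dog", because the engine scans from left to right.
preg_match('/dog.*?lab/', $text, $lazy);

echo $greedy[0], "\n"; // dog. My dog is a black lab. He was created in a lab
echo $lazy[0], "\n";   // dog. My dog is a black lab
```

Laziness only shortens the right-hand end of a match; the start is always the leftmost position at which any match is possible.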

Much thanks. Sorry for the contrived example.

+1  A: 

One idea is to use a negated character class, like [^.!?], which matches any character except ., ! and ?. That way the match cannot cross a sentence boundary:

$string = "I have a dog. My dog is a black lab. He was created in a laboratory.";
preg_match('/dog[^.!?]*?lab/', $string, $match);
echo $match[0]; // Echoes "dog is a black lab"
Frxstrem
Thanks. This is true, but I cannot guarantee in my application that the requisite match will be within one sentence or line.
Wiseguy
+2  A: 

This works for me:

$str = "I have a dog. My dog is a black lab. He was created in a laboratory.";
if(preg_match('/.*(dog.*?lab)/',$str,$m)) {
    var_dump($m);
}
codaddict
I'm not sure why this was down-voted. This idea is sound. To force the non-greedy match to be as small as possible, put a greedy search immediately in front of it and then inspect the capture group.
Andrew
@Andrew, @codaddict: the downvote didn't come from me, but see the OP's latest comment about wanting multiple matches. This will only find the last match.
polygenelubricants
The last match isn't always the smallest match. It is for *this* test string, but it's not difficult to come up with a string in which the smallest match is actually in the middle or at the beginning.
Frank Farmer
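
The point in the last comment can be checked with a made-up string in which the shortest candidate comes first. This is only a sketch; the test string is an assumption chosen for illustration, not something from the original question.

```php
<?php
// Hypothetical input: the shortest "dog...lab" span ("dog lab") is at the start.
$str = "A dog lab. My dog is a big black lab.";

// The greedy .* backtracks only as far as the LAST "dog" that is followed by "lab".
preg_match('/.*(dog.*?lab)/', $str, $m);

echo $m[1], "\n"; // dog is a big black lab
```

The capture group holds the last match in the string, which here is longer than the earlier "dog lab".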
+7  A: 

You could use a negative look-ahead assertion that prevents dog from occurring again between dog and lab:

/dog(?:(?!dog).)*?lab/
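
A minimal sketch of how this pattern might be used to collect every such match with preg_match_all. The s modifier and the multi-line test string are additions made here, on the assumption that matches may span lines as the OP mentions; they are not part of the pattern above.

```php
<?php
// Hypothetical input containing two "dog ... lab" spans, one crossing a line break.
$text = "I have a dog. My dog is a black lab.\nHer dog is a\nyellow lab.";

// (?!dog). consumes one character only if "dog" does not start at that position,
// so a match can never contain a second "dog".
// The s modifier (an addition here) lets . match newlines as well.
$pattern = '/dog(?:(?!dog).)*?lab/s';

preg_match_all($pattern, $text, $matches);
print_r($matches[0]); // "dog is a black lab" and "dog is a\nyellow lab"
```

The first "dog" in the text produces no match: the scan from it reaches the second "dog" before any "lab", and the look-ahead refuses to step over it.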
Gumbo
@polygenelubricants: Oh yes, of course. Thanks for the remark.
Gumbo
This takes a bit more preprocessing logic to prepare my keyword(s) for regex search, but it works perfectly. Thank you for the info.
Wiseguy
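
The keyword preprocessing mentioned in the last comment might look roughly like the sketch below. The helper name shortestSpanPattern is hypothetical, and the s modifier is an assumption made so that matches can span lines; the OP's actual code is not shown in the thread.

```php
<?php
// Hypothetical helper: builds /START(?:(?!START).)*?END/s from two literal keywords.
function shortestSpanPattern(string $start, string $end): string
{
    // preg_quote escapes regex metacharacters and, via the second
    // argument, the "/" delimiter, so keywords are matched literally.
    $s = preg_quote($start, '/');
    $e = preg_quote($end, '/');
    return '/' . $s . '(?:(?!' . $s . ').)*?' . $e . '/s';
}

$text = "I have a dog. My dog is a black lab. He was created in a laboratory.";
preg_match(shortestSpanPattern('dog', 'lab'), $text, $m);
echo $m[0], "\n"; // dog is a black lab
```

Escaping matters as soon as a keyword contains characters such as . or /, which would otherwise change the meaning of the pattern or end it early.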