First, this is using PHP's preg functions.

String I'm trying to match:

aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa b c d xp

My regexes and their matches:

(\S*\s*){0,1}\S*p = "d xp"
(\S*\s*){0,2}\S*p = "c d xp"
(\S*\s*){0,3}\S*p = NO MATCH (expecting "b c d xp")
(\S*\s*){0,4}\S*p = entire string
(\S*\s*){0,5}\S*p = entire string

Oddly, if I remove a single "a" it works. Also, (\S*\s*){0,3}\Sp or (\S*\s){0,3}\S*p both work.

Can someone explain why the third case results in no matches instead of "b c d xp"?

TIA!

A: 

Works for me.

What language are you using?

jitter
sorry, that info would have been helpful. :\
robgmills
+7  A: 

Good question.

I tried another language that also has Perl RE syntax, Ruby, and it returned the expected string:

$ irb
>> s='aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa b c d xp'
=> "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa b c d xp"
>> s[/(\S*\s*){0,3}\S*p/]
=> "b c d xp"

This made me think you found an interpreter bug...

But no, thanks to stereofrog, we now know that

  • Your RE was correct, as was your expectation of its results
  • PHP has a limit on backtracks (the pcre.backtrack_limit setting), and your expression hit that limit. Ruby either doesn't check or has a different limit.
DigitalRoss
+2  A: 

preg_last_error() returns PREG_BACKTRACK_LIMIT_ERROR, so increasing the backtrack limit should probably fix the issue. Try:

 ini_set('pcre.backtrack_limit', 500000);
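
To make the check explicit, here is a minimal sketch of how the failure could be detected and retried. The helper name match_or_error is hypothetical (not part of PHP), and whether the limit is actually hit depends on your PHP/PCRE version and configured limit, so no particular output is claimed here.

```php
<?php
// Hypothetical helper (an assumption for illustration): run a pattern and
// return either the matched text or the PCRE error code.
function match_or_error($pattern, $subject)
{
    $result = preg_match($pattern, $subject, $m);
    if ($result === false) {
        // preg_match() returns false on failure; preg_last_error() says why.
        return array('match' => null, 'error' => preg_last_error());
    }
    // 1 means a match was found, 0 means no match (and no error).
    return array(
        'match' => $result === 1 ? $m[0] : null,
        'error' => PREG_NO_ERROR,
    );
}

$subject = 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa b c d xp';
$pattern = '/(\S*\s*){0,3}\S*p/';

$r = match_or_error($pattern, $subject);
if ($r['error'] === PREG_BACKTRACK_LIMIT_ERROR) {
    // Raise the limit for this request only, then retry once.
    ini_set('pcre.backtrack_limit', '500000');
    $r = match_or_error($pattern, $subject);
}

if ($r['error'] !== PREG_NO_ERROR) {
    echo "PCRE error code: ", $r['error'], "\n";
} else {
    var_export($r['match']);
    echo "\n";
}
```

The point of the sketch is that a false return from preg_match() is an error, not "no match", so it is worth checking preg_last_error() before concluding that the pattern is wrong.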
stereofrog
Aha, someone who knows the real answer...
DigitalRoss
awesome. Thank you very much! this was driving me nuts.
robgmills