If I enter the expression in Regexbuddy, it presents following message
The match attempt was aborted early
because the regular expression is too
complex. The regex engine you plan to
use it with may not be able to handle
it at all and crash. Look up
"catastrophic backtracking" in the
help file to learn how to avoid this
situation.
Looking up catastrophic backtracking gives the following explanation
Runaway Regular Expressions: Catastrophic Backtracking
Consider the regular expression (x+x+)+y.
Before you scream in horror and say
this contrived example should be
written as xx+y to match exactly the
same without those terribly nested
quantifiers: just assume that each "x"
represents something more complex,
with certain strings being matched by
both "x". See the section on HTML
files below for a real example.
Let's see what happens when you apply
this regex to xxxxxxxxxxy. The first
x+ will match all 10 x characters. The
second x+ fails. The first x+ then
backtracks to 9 matches, and the
second one picks up the remaining x.
The group has now matched once. The
group repeats, but fails at the first
x+. Since one repetition was
sufficient, the group matches. y
matches y and an overall match is
found. The regex is declared
functional, the code is shipped to the
customer, and his computer explodes.
Almost.
The above regex turns ugly when the y
is missing from the subject string.
When y fails, the regex engine
backtracks. The group has one
iteration it can backtrack into. The
second x+ matched only one x, so it
can't backtrack. But the first x+ can
give up one x. The second x+ promptly
matches xx. The group again has one
iteration, fails the next one, and the
y fails. Backtracking again, the
second x+ now has one backtracking
position, reducing itself to match x.
The group tries a second iteration.
The first x+ matches but the second is
stuck at the end of the string.
Backtracking again, the first x+ in
the group's first iteration reduces
itself to 7 characters. The second x+
matches xxx. Failing y, the second x+
is reduced to xx and then x. Now, the
group can match a second iteration,
with one x for each x+. But this
(7,1),(1,1) combination fails too. So
it goes to (6,4) and then (6,2)(1,1)
and then (6,1),(2,1) and then
(6,1),(1,2) and then I think you start
to get the drift.
If you try this regex on a 10x string
in RegexBuddy's debugger, it'll take
2558 steps to figure out the final y
is missing. For an 11x string, it
needs 5118 steps. For 12, it takes
10238 steps. Clearly we have an
exponential complexity of O(2^n) here.
At 21x the debugger bows out at 2.8
million steps, diagnosing a bad case
of catastrophic backtracking.
RegexBuddy is forgiving in that it
detects it's going in circles, and
aborts the match attempt. Other regex
engines (like .NET) will keep going
forever, while others will crash with
a stack overflow (like Perl, before
version 5.10). Stack overflows are
particularly nasty on Windows, since
they tend to make your application
vanish without a trace or explanation.
Be very careful if you run a web
service that allows users to supply
their own regular expressions. People
with little regex experience have
surprising skill at coming up with
exponentially complex regular
expressions.
I assume you are going to have to handle it in code. I'd suggest you contact the author of Regexbuddy and ask what is needed to detect this scenario.