Hi,

I've run into a problem and wondered whether anyone can see why the following code crashes my .NET worker process (aspnet_wp.exe):

  Dim pattern As String = "\{\{IF\(((?!\)}})(.))+,,,((\s)*(?!\)}})(.))+\)}}"
  Dim mc As RegularExpressions.MatchCollection = Regex.Matches(txtContent.Text, pattern)

It works absolutely fine if a match is found, e.g.:

<h3>Title</h3>
<p>Top paragraph.</p>

{{IF(1=2,,, <p></p>)}}

But if no match is found, it seems to peg my CPU and run for quite a while, e.g. if the closing parenthesis is missing before the final two curly brackets:

<h3>Title</h3>
<p>Top paragraph.</p>

{{IF(1=2,,, <p></p>}}

Is it so greedy that it searches forever? Thanks!

+3  A: 

The problem is easily identified: "Catastrophic Backtracking".

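Whenever you see the "if a match exists, it works; if no match exists, it takes forever" phenomenon, that is very likely the cause. On a failed match the engine has to try every possible way of dividing the input between the nested quantifiers before it can give up, and the number of combinations grows very quickly with the length of the input.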

I suggest a different regex that backtracks less. Atomic grouping can help keep backtracking steps to a minimum:

Dim pattern As String = "\{\{IF\((?>(?:(?!,,,).)+),,,(?>(?:(?!\)\}\}).)+)\)\}\}"
Dim mc As RegularExpressions.MatchCollection = Regex.Matches(txtContent.Text, pattern)

The pattern, broken down (I don't know whether it captures everything you need; add capturing parentheses where you see fit):

\{\{IF\(                # "{{IF("
(?>(?:(?!,,,).)+)       # atomic group: any char up to the ",,,"
,,,                     # ",,,"
(?>(?:(?!\)\}\}).)+)    # atomic group: any char up to the ")}}"
\)\}\}                  # ")}}"
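
As a rough sketch of how this could be used, the snippet below wraps the two atomic groups in named capturing groups and runs the pattern against both inputs from the question. The group names "cond" and "body", the module name, and the one-second timeout are all made up for illustration. The timeout overload of the Regex constructor assumes .NET Framework 4.5 or later, so it may not be available on an older aspnet_wp.exe setup.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module AtomicGroupDemo
    ' The pattern from above, with two named capturing groups added.
    ' The names "cond" and "body" are hypothetical, chosen for this example.
    Private Const Pattern As String =
        "\{\{IF\((?<cond>(?>(?:(?!,,,).)+)),,,(?<body>(?>(?:(?!\)\}\}).)+))\)\}\}"

    Sub Main()
        ' Assumption: .NET Framework 4.5+, where Regex accepts a match timeout.
        ' The one-second limit is an arbitrary value picked for the demo.
        Dim rx As New Regex(Pattern, RegexOptions.None, TimeSpan.FromSeconds(1))

        Dim samples As String() = {
            "<p>Top paragraph.</p> {{IF(1=2,,, <p></p>)}}",
            "<p>Top paragraph.</p> {{IF(1=2,,, <p></p>}}"
        }

        For Each sample As String In samples
            Try
                ' Matches is evaluated lazily; reading Count forces the search.
                Dim mc As MatchCollection = rx.Matches(sample)
                Console.WriteLine("{0} match(es)", mc.Count)
                For Each m As Match In mc
                    Console.WriteLine("  cond=[{0}] body=[{1}]",
                                      m.Groups("cond").Value, m.Groups("body").Value)
                Next
            Catch ex As RegexMatchTimeoutException
                ' Reached only if the engine exceeds the timeout.
                Console.WriteLine("Regex timed out")
            End Try
        Next
    End Sub
End Module
```

The first sample should produce one match and the second none. Because the groups are atomic, the second sample fails as soon as the engine cannot find ")" after the body, instead of retrying shorter bodies. The timeout is only a safety net for patterns that still backtrack heavily; it does not replace fixing the pattern.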
Tomalak
+1, great article
JaredPar
Thanks Tomalak, i'll read through now!
stibstibstib