tags:

views:

55

answers:

1

Hi,

I would like to know if:

/.*(Set-Cookie: (.*))?;.*(<\?xml.*)/

is an exponential regexp.

Thanks

+4  A: 

It really depends on the regex engine, but in most engines, that pattern is probably polynomial of a high degree (maybe cubic or higher) when there's no match.

You can use e.g. RegexBuddy to see how many steps it takes to match, and more importantly, to not match certain input. You can use this to benchmark how complex the backtracking process may be in certain engines.

It's not clear exactly what you are trying to do, but that pattern really doesn't do much with the Set-Cookie subpattern allowed to be optional (e.g. the group may not match that string even if it exists, since it's optional to begin with).

If you are trying to parse XML, then please, please, please do not use regular expressions. There are many XML parsers available in most modern languages, and they would not only be appropriate for the job, but they'd also be correct and much more pleasant to work with than regex.

References

Related questions


The pattern, debunked

To point out why that pattern doesn't "work" (which would render irrelevant whether or not it's fast or slow), consider the following input:

Set-Cookie: NOMNOMNOM;<?xml

With the pattern /.*(Set-Cookie: (.*))?;.*(<\?xml.*)/, the entire string is a match, but group 1 doesn't capture Set-Cookie: NOMNOMNOM, and group 2 doesn't capture NOMNOMNOM (as seen on rubular.com). That's because the leading .* gobbled up the cookie, and since the cookie subpattern is optional, it's still a match anyway.

We can try to "fix" this by making the leading .* reluctant as .*?. Now, group 1 can match Set-Cookie, which is perhaps the intent all along (as seen on rubular.com).

However, this is hardly an improvement. You really do not want to go down this direction. There are still many problems with the regex, and just getting it to work right will be very difficult, if not nearly impossible.

It should be noted that the pattern as given does match ";<?xml" (as seen on rubular.com). That is, as long as there's a ; anywhere in the string, and then later a <?xml, the pattern will match. It's not clear if this pattern really does anything useful.

Related questions

polygenelubricants
Thanks for your response. Here what we try to to do is to fetch the cookie header and the content of the query. No XMl parsing. At end we manage to do it with `/.*?(?:Set-Cookie: (.*);)?.*?(<\?xml.*)/`
mathk