ansaurus

Question

Regex matching too much

Answer 1

+1 A:

This happens because the quantifier in your RE is greedy: it consumes as much input as it can while still letting the overall pattern match.

Most RE engines let you make a quantifier non-greedy (lazy); see the linked document for tips on what to try.
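
To make the greedy/non-greedy difference concrete, here is a minimal sketch. It uses Python's re module rather than .NET (an assumption made so the snippet is easy to run; the `*` versus `*?` behaviour shown is the same in both engines), and the input string and simplified pattern are made up for illustration.

```python
import re

# Hypothetical input with two colour tags on one line.
text = "[c=#f00]red[/c] and [c=#00f]blue[/c]"

# Greedy: .* runs on to the LAST [/c], so both tags collapse into one match.
greedy = re.findall(r"\[c=(#?\w+)\](.*)\[/c\]", text)

# Lazy: .*? stops at the FIRST [/c] that lets the pattern succeed.
lazy = re.findall(r"\[c=(#?\w+)\](.*?)\[/c\]", text)

print(greedy)  # one match whose content spans both tags
print(lazy)    # two separate matches
```

The only change between the two patterns is the extra `?` after `.*`.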

unwind 2009-09-03 15:07:28

Answer 2

+3 A:

If you don't care about nested tags, you can do this:

(\[[cC]=)(#?([a-fA-F0-9]{3}){1,2})\](.*?)\[/[cC]\]
//                                     ^- lazy match

If you want to handle nested tags with regex, see this article on CodeProject.

ybo 2009-09-03 15:13:47

Answer 3

+2 A:

Dot matches newline characters if you set the option RegexOptions.Singleline (more on that here).
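
A small sketch of why this matters when the tag content spans several lines. It is written in Python, where re.DOTALL plays the role of RegexOptions.Singleline; the sample text and the simplified pattern are assumptions for illustration only.

```python
import re

# Hypothetical tag whose content contains a line break.
text = "[c=#f00]line one\nline two[/c]"
pattern = r"\[c=(#?\w+)\](.*?)\[/c\]"

# Without the flag, "." refuses to cross the newline, so nothing matches.
no_flag = re.search(pattern, text)

# With DOTALL (Singleline in .NET), "." matches the newline as well.
with_flag = re.search(pattern, text, re.DOTALL)

print(no_flag)
print(with_flag.group(2))
```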

acezanne 2009-09-03 15:19:03

Answer 4

A:

You need a lazy regular expression so that a single match does not swallow all of the [c] tags.

Try this:

\[c=(#?.*?)\](.*?)\[/c\]
or
\[c=(#?\w*?)\](\w*?)\[/c\]

You should set the options on your regex object to ignore case.
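
As a rough illustration of the two patterns with case-insensitive matching, here is a sketch in Python (re.IGNORECASE stands in for the .NET ignore-case option; the sample text is made up). It also shows a difference worth knowing: \w does not match spaces, so the second pattern only accepts content made of word characters.

```python
import re

# Hypothetical input: upper-case tag, content containing a space.
text = "[C=#FF0000]two words[/C]"

dot_version = re.compile(r"\[c=(#?.*?)\](.*?)\[/c\]", re.IGNORECASE)
word_version = re.compile(r"\[c=(#?\w*?)\](\w*?)\[/c\]", re.IGNORECASE)

dot_match = dot_version.search(text)    # matches: "." accepts the space
word_match = word_version.search(text)  # None: \w cannot match the space

print(dot_match.groups())
print(word_match)
```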

skyfoot 2009-09-03 15:34:16

Answer 5

A:

Regex is a quick and dirty way to do this, and the fix here is to use .*? rather than just .*. However, if you want a more robust solution, it is probably easier without regex. C# regexes happen to be able to handle nested structures, but that doesn't mean it's actually easy. It would be better to use a lexical parser and construct a DOM. Most likely the code will be easier to read and maintain.
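
One possible shape for that approach, as a sketch only: split the input into tokens, then build a tree with a stack so that nesting falls out naturally. It is in Python rather than C#, and the names TOKEN, Node and parse are hypothetical; the colour syntax (optional # followed by 3 or 6 hex digits) is assumed from the pattern in the other answer.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Union

# One token per opening tag, closing tag, or run of plain text.
# A lone "[" that does not start a tag is treated as plain text.
TOKEN = re.compile(
    r"\[c=(#?(?:[0-9a-f]{3}){1,2})\]|(\[/c\])|([^\[]+|\[)",
    re.IGNORECASE,
)

@dataclass
class Node:
    color: Optional[str]  # None for the root node
    children: List[Union[str, "Node"]] = field(default_factory=list)

def parse(text: str) -> Node:
    root = Node(None)
    stack = [root]  # innermost open tag is last
    for m in TOKEN.finditer(text):
        color, close, plain = m.groups()
        if color is not None:
            # Opening tag: attach a new node and descend into it.
            node = Node(color)
            stack[-1].children.append(node)
            stack.append(node)
        elif close is not None:
            # Closing tag: go back up; a stray [/c] at top level is ignored.
            if len(stack) > 1:
                stack.pop()
        else:
            stack[-1].children.append(plain)
    return root

tree = parse("a[c=#f00]b[c=#00f]c[/c]d[/c]e")
print(tree)
```

Unclosed tags are simply left open until the end of the input here; a real implementation would have to decide how to report them.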

Adam Luter 2009-09-03 15:42:03
