I've got a regex that seemed alright, but as it turned out, it doesn't work well in some situations.

Keep an eye on the message preview, because the message editor does some tricky things with "\".

[\[]?[\^%#\$\*@\-;].*?[\^%#\$\*@\-;][\]]

Its task is to find a pattern which in general looks like this:

[ABA]

  • A - char from set ^,%,#,$,*,@,-,;
  • B - some text
  • [ and ] are included in pattern

It is expected to find all occurrences of this pattern in the test string

Black fox [#sample1#] [%sample2%] - [#sample3#] eats blocks.

but instead of expected list of matches

  • "[#sample1#]"
  • "[%sample2%]"
  • "[#sample3#]"

I get this

  • "[#sample1#]"
  • "[%sample2%]"
  • "- [#sample3#]"

And it seems that this problem will also occur with the other chars in set "A". So could somebody suggest changes to my regex to make it work as I need?

And a less important thing: how can I make my regex exclude patterns which look like this

[ABC]

  • A - char from set ^,%,#,$,*,@,-,;
  • B - some text
  • C - char from set ^,%,#,$,*,@,-,; other than A
  • [ and ] are included in pattern

for example

[$sample1#] [%sample2@] [%sample3;]

thanks in advance

MTH

+1  A: 

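Why the first "?" in "[\[]?"? It makes the opening bracket optional, so a match can start at a bare delimiter such as the "-" in "- [#sample3#]". With the bracket required: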

\[[\^%#\$\*@\-;].*?[\^%#\$\*@\-;]\]

would detect your different strings just fine
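
A minimal sketch of the difference, written with Python's `re` module rather than the .NET engine (an assumption made for illustration; the constructs involved behave the same way in both). The variable names are hypothetical.

```python
import re

text = "Black fox [#sample1#] [%sample2%] - [#sample3#] eats blocks."

# Original pattern: the opening bracket is optional, so a bare "-" can start a match.
original = re.compile(r"[\[]?[\^%#\$\*@\-;].*?[\^%#\$\*@\-;][\]]")

# Corrected pattern: the opening bracket is required.
fixed = re.compile(r"\[[\^%#\$\*@\-;].*?[\^%#\$\*@\-;]\]")

original_matches = original.findall(text)
fixed_matches = fixed.findall(text)

print(original_matches)  # third entry is "- [#sample3#]"
print(fixed_matches)     # third entry is "[#sample3#]"
```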

To be more precise:

\[([\^%#\$\*@\-;])([^\]]*?)(?=\1)([\^%#\$\*@\-;])\]

would detect [ABA]

\[([\^%#\$\*@\-;])([^\]]*?)(?!\1)([\^%#\$\*@\-;])\]

would detect [ABC]
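
As a rough check of the two lookahead variants, here is a sketch in Python's `re` module (an assumption; the thread targets C#, but `(?=\1)` and `(?!\1)` work the same way there). The sample string and variable names are made up for illustration.

```python
import re

# Character class for the delimiter set A, as written in the patterns above.
DELIM = r"[\^%#\$\*@\-;]"

# (?=\1) requires the closing delimiter to equal the opening one: [ABA].
aba = re.compile(r"\[(" + DELIM + r")([^\]]*?)(?=\1)(" + DELIM + r")\]")

# (?!\1) requires the closing delimiter to differ from the opening one: [ABC].
abc = re.compile(r"\[(" + DELIM + r")([^\]]*?)(?!\1)(" + DELIM + r")\]")

text = "[#sample1#] [$sample1#] [%sample2@] [%sample3;] [%sample2%]"

# finditer + group(0) yields whole matches; findall would return group tuples.
aba_matches = [m.group(0) for m in aba.finditer(text)]
abc_matches = [m.group(0) for m in abc.finditer(text)]

print(aba_matches)
print(abc_matches)
```

Because B is written as `[^\]]*?`, neither pattern can run past a closing bracket into the next token.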

VonC
Well, it seems I was making so many changes that I missed this.
MoreThanChaos
+1  A: 

Your pattern matches the opening square bracket only optionally:

[\[]?

For the second part of your question (and perhaps to simplify), try this:

\[\%[^\%]+\%\]|\[\#[^\#]+\#\]|\[\$[^\$]+\$\]

In this case there is a sub pattern for each possible delimiter. The | character is "OR", so it will match if any of the 3 sub expressions match.

Each subexpression matches, in order:

  • Opening bracket
  • Special char
  • Everything that is not that special char (1)
  • The same special char
  • Closing bracket

(1) You may need to add extra exclusions like ']' or '[' so it doesn't accidentally match across a large body of text like:

[%MyVar#] blah blah [$OtherVar%]
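
One way to sketch this idea for the full delimiter set, with the bracket exclusions from note (1) added, is to build the alternation programmatically. This is Python's `re` module rather than C# (an assumption), and `build_pattern`, `strict` and `loose` are hypothetical names.

```python
import re

# The delimiter set from the question.
DELIMITERS = "^%#$*@-;"

def build_pattern(delimiters):
    """Build one sub-pattern per delimiter and join them with '|'."""
    parts = []
    for d in delimiters:
        e = re.escape(d)
        # "[", delimiter, one or more chars that are neither the delimiter
        # nor a square bracket, the same delimiter, "]".
        parts.append(r"\[" + e + r"[^" + e + r"\[\]]+" + e + r"\]")
    return re.compile("|".join(parts))

strict = build_pattern(DELIMITERS)

# The three-delimiter pattern from above, without the bracket exclusions.
loose = re.compile(r"\[\%[^\%]+\%\]|\[\#[^\#]+\#\]|\[\$[^\$]+\$\]")

text = "[%MyVar#] blah blah [$OtherVar%] [#ok#]"
loose_matches = loose.findall(text)
strict_matches = strict.findall(text)

print(loose_matches)   # first entry spans "[%MyVar#] blah blah [$OtherVar%]"
print(strict_matches)  # only "[#ok#]"
```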

Rob

Robert Wagner
+3  A: 
\[([%#$*@;^-]).+?\1\]

applied to text:

Black fox [#sample1#] [%sample2%] - [#sample3#] [%sample4;] eats blocks.

matches

  • [#sample1#]
  • [%sample2%]
  • [#sample3#]
  • but not [%sample4;]
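
The same pattern can be tried outside .NET. This sketch uses Python's `re` module (an assumption; the backreference `\1` means the same thing there) and also reads capture group 1 to show which delimiter each match used. The variable names are illustrative only.

```python
import re

# Group 1 captures the opening delimiter; \1 demands the same char before "]".
pattern = re.compile(r"\[([%#$*@;^-]).+?\1\]")

text = "Black fox [#sample1#] [%sample2%] - [#sample3#] [%sample4;] eats blocks."

# Collect whole matches and the delimiter captured for each of them.
matches = [m.group(0) for m in pattern.finditer(text)]
delimiters = [m.group(1) for m in pattern.finditer(text)]

print(matches)
print(delimiters)
```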

EDIT

This works for me (Output as expected, regex accepted by C# as expected):

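using System;
using System.Text.RegularExpressions;

// Group 1 captures the opening delimiter; \1 requires the same char before "]".
Regex re = new Regex(@"\[([%#$*@;^-]).+?\1\]");
string s = "Black fox [#sample1#] [%sample2%] - [#sample3#] [%sample4;] eats blocks.";

// Print every [ABA] match; [%sample4;] is skipped.
MatchCollection mc = re.Matches(s);
foreach (Match m in mc)
{
  Console.WriteLine(m.Value);
}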
Tomalak
Yes, but how would you detect ABC without using lookahead? [^\1] does not work...
VonC
Well, the C# Regex engine doesn't seem to like this expression. Perhaps something was wrongly interpreted by the message editor on this page?
MoreThanChaos
Well, reading it again - *Not* matching ABC is a requirement. My regex matches ABA exclusively. No lookahead needed.
Tomalak
Of course you're right. In my testing app I had "ExplicitCapture" turned on for the regex, so the group numbers were just not right. The diagnostic message didn't give me a clue, so I reviewed my code and changed it. Now everything works just fine. Thanks for your help.
MoreThanChaos
So is that what you've been after or is it just one more way to do it?
Tomalak