i have strings in the form [abc].[some other string].[can.also.contain.periods].[our match]

i now want to match the string "our match" (i.e. without the brackets), so i played around with lookarounds and whatnot. i now get the correct match, but i don't think this is a clean solution.

(?<=\.?\[)     starts with '[' or '.['
([^\[]*)      our match, i couldn't find a way to not use a negated character group
              `.*?` non-greedy did not work as expected with lookarounds,
              it would still match from the first match
              (matches might contain escaped brackets)
(?=\]$)       string ends with an ]

language is .net/c#. if there is an easier solution not involving a regex i'd be also happy to know

what really irritates me is the fact, that i cannot use (.*?) to capture the string, as it seems non-greedy does not work with lookbehinds.

i also tried: Regex.Split(str, @"\]\.\[").Last().TrimEnd(']');, but i'm not really proud of this solution either

A: 

With String.Split():

string input = "[abc].[some other string].[can.also.contain.periods].[our match]";
char[] seps = {'[',']','\\'};
string[] splitted = input.Split(seps,StringSplitOptions.RemoveEmptyEntries);

you get "our match" in splitted[6] and can.also.contain.periods is left as one string (splitted[4])

Edit: the array will have the string inside [] and then . and so on, so if you have a variable number of groups, you can use that to get the value you want (or remove the strings that are just '.')

Edited to add the backslash to the separator to treat cases like '\[abc\]'

Edit2: for nested []:

string input = @"[abc].[some other string].[can.also.contain.periods].[our [the] match]";
string[] seps2 = { "].["};
string[] splitted = input.Split(seps2, StringSplitOptions.RemoveEmptyEntries);

you get our [the] match] in the last element (index 3) and you'd have to remove the extra ]

Rox
will this work with escaped brackets inside the last element?
knittl
@knittl: edited the answer. If you add the backslash to the separators, it will return the clean value at the same index if you meant something like '\[abc\]'
Rox
@rox, i mean `[…].[…].[this is \[the\] last value]`, then i want `"this is \[the\] last value"`
knittl
@knittl: added the code - I've split it by '].[' as the inside brackets don't have the . and now the split returns the entire value.
Rox
well, this is basically what i have with my `Regex.Split(…).Last().TrimEnd(']');` but i think there is still some misunderstanding of string.split from my side
knittl
A: 

You have several options:

  • RegexOptions.RightToLeft - yes, .NET regex can do this! Use it!
  • Match the whole thing with greedy prefix, use brackets to capture the suffix that you're interested in
    • So generally, pattern becomes .*(pattern)
    • In this case, .*\[([^\]]*)\], then extract what \1 captures (see this on rubular.com)
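As a rough illustration of the greedy-prefix option in C#, here is a minimal sketch. The class and method names (GreedyPrefixDemo, LastGroup) are made up for this example; the pattern is the one given above, and what the answer calls \1 is read through Groups[1] in .NET. Note that this pattern does not account for escaped brackets inside the last group.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the original answer.
static class GreedyPrefixDemo
{
    // Greedy .* consumes as much as possible, then backtracks just far
    // enough for \[([^\]]*)\] to match, so the capture lands on the last [...].
    static readonly Regex LastGroupPattern = new Regex(@".*\[([^\]]*)\]");

    // Returns the content of the last bracketed group, or null if nothing matches.
    public static string LastGroup(string input)
    {
        Match m = LastGroupPattern.Match(input);
        return m.Success ? m.Groups[1].Value : null;
    }

    static void Main()
    {
        string input = "[abc].[some other string].[can.also.contain.periods].[our match]";
        Console.WriteLine(LastGroup(input)); // prints "our match"
    }
}
```

The greedy prefix is what makes this work: the regex engine always prefers the leftmost starting position, so pushing the "skip everything before" part into .* lets the capture group sit at the end.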

polygenelubricants
+1  A: 

Assuming you can guarantee the input format, and it's just the last entry you want, LastIndexOf could be used:

string input = "[abc].[some other string].[can.also.contain.periods].[our match]";

int lastBracket = input.LastIndexOf("[");
string result = input.Substring(lastBracket + 1, input.Length - lastBracket - 2);
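If the input format cannot be fully guaranteed, the same idea can be wrapped with a couple of checks. This is only a sketch: BracketHelper and LastBracketed are hypothetical names, and returning null on malformed input is an assumption about what the caller wants. Like the snippet above, it does not handle escaped or nested brackets in the last group.

```csharp
using System;

// Hypothetical helper, not part of the original answer.
static class BracketHelper
{
    // Returns the text inside the last [...] group, or null when the
    // input is empty, does not end with ']', or contains no '['.
    public static string LastBracketed(string input)
    {
        if (string.IsNullOrEmpty(input) || !input.EndsWith("]", StringComparison.Ordinal))
            return null;

        int lastBracket = input.LastIndexOf('[');
        if (lastBracket < 0)
            return null;

        // Skip the '[' itself and drop the trailing ']'.
        return input.Substring(lastBracket + 1, input.Length - lastBracket - 2);
    }
}
```

Usage would look like: string result = BracketHelper.LastBracketed(input);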
David_001
+2  A: 

The following should do the trick, assuming the string ends after the last match.

string input = "[abc].[some other string].[can.also.contain.periods].[our match]";

var search = new Regex("\\.\\[(.*?)\\]$", RegexOptions.RightToLeft);

string ourMatch = search.Match(input).Groups[1].Value;
Lillemanden