ansaurus

Question

Answer 1

+5 A:

Dale Halliwell 2009-09-29 09:50:00

Answer 2

+1 A:

Try this instead:

new Regex(@"\[" + tagKey + @":[^\]+]");

The only thing I changed was to add + to the [^\] pattern, meaning that you match one or more characters that are not a backslash.

Drew Noakes 2009-09-29 09:57:47

Answer 3

+2 A:

If you want to do this without a Regex it isn't difficult. You're already searching for a specific tag key, so you can just search for "[" + tagKey, then search from there for the closing "]", and remove everything between those offsets. Something like...

int posStart = message.IndexOf("[" + tagKey + ":");
if(posStart >= 0)
{
    int posEnd = message.IndexOf("]", posStart);
    if(posEnd > posStart)
    {
        message = message.Remove(posStart, posEnd - posStart);
    }
}

Is that better than a Regex solution? Since you're only looking for a specific key I think it probably is, on the grounds of simplicity. I love Regexes but they're not always the clearest answer.

Edit: Another reason the IndexOf() solution could be seen as better is that it means there is only one rule for finding the start of the tag, whereas the original code uses a Contains() which searches for something like '[tag:' and then uses a regex which uses a slightly different expression to do the substitution / removal. In theory you could have text which matches one criterion but not the other.

AAT 2009-09-29 11:04:52

It looks like the original question is trying to cater for escaped `[` or `]` within the tag too.

Drew Noakes 2009-09-29 11:59:28

I ended up with a variation of this because it turned out I needed more extension methods for other stuff (eg: Dictionary<string, string> GetTags(this message m), string GetTagValue(this message m, string tagKey), etc...). The refactoring meant that there was no need for regexes.

grenade 2009-09-29 12:10:23

@Drew Noakes: I don't think that 'Regex(@"\[" + tagKey + @":[^\]]");' is doing quite what you thought (maybe not what grenade thought!). If you imagine that tagKey is say "Zippy" then the regex becomes '\[Zippy:[^\]]' which means "match Zippy followed by a colon followed by one character which isn't a "]".' (Or as you noted in your answer you can put a + on the end to make it match one or more characters which aren't a "]".) But it doesn't do anything clever about avoiding escaped brackets as far as I can work out (using my mental regex parser...).

AAT 2009-09-29 12:34:09

Answer 4

+1 A:

I think this is the regex you're looking for:

string regex = @"\[" + tag + @":[^\]+]\]";

Also, you don't need to do a separate check to see if there are tags of that type. Just do a regex replace; if there are no matches, the original string is returned.

public static string RemoveTagWithKey(string message, string tagKey) {
    string regex = @"\[" + tag + @":[^\]+]\]";
    return Regex.Replace(message, regex, string.Empty);
}

You seem to be writing an extension method, but I wrote this as a static utility method to keep things simple.

Alan Moore 2009-09-29 12:57:00

ansaurus

tags:

views:

answers:

help with a tag removal regex

related questions