Hi All,

I have this simple regex-replace-based routine. Is there any way to improve its performance (and maybe also its elegance)?

public static string stripshrapnel(string str)
{
        string newstr = str.Trim();
        newstr = Regex.Replace(newstr, @"-", "");
        newstr = Regex.Replace(newstr, @"'", "");
        newstr = Regex.Replace(newstr, @",", "");
        newstr = Regex.Replace(newstr, @"""", "");
        newstr = Regex.Replace(newstr, @"\?", "");
        newstr = Regex.Replace(newstr, @"\#", "");
        newstr = Regex.Replace(newstr, @"\;", "");
        newstr = Regex.Replace(newstr, @"\:", "");
        //newstr = Regex.Replace(newstr, @"\(", "");
        //newstr = Regex.Replace(newstr, @"\)", "");
        newstr = Regex.Replace(newstr, @"\+", "");
        newstr = Regex.Replace(newstr, @"\%", "");
        newstr = Regex.Replace(newstr, @"\[", "");
        newstr = Regex.Replace(newstr, @"\]", "");
        newstr = Regex.Replace(newstr, @"\*", "");
        newstr = Regex.Replace(newstr, @"\/", "");
        newstr = Regex.Replace(newstr, @"\\", "");
        newstr = Regex.Replace(newstr, @"&amp;", "&");
        newstr = Regex.Replace(newstr, @"&amp", "&");
        newstr = Regex.Replace(newstr, @"&nbsp;", " ");
        newstr = Regex.Replace(newstr, @"&nbsp", " ");
        return newstr;
}

Thank you, Matt

+9  A: 

You can combine most of the expressions until you end up with only three:

public static string stripshrapnel(string str)
{
        string newstr = str.Trim();
        newstr = Regex.Replace(newstr, @"[-',""?#;:+%[\]*/\\\\]", "");
        newstr = Regex.Replace(newstr, @"&amp;?", "&");
        newstr = Regex.Replace(newstr, @"&nbsp;?", " ");
        return newstr;
}
Gumbo
Fantastic, Gumbo, thank you! Does anyone have any idea how much faster this would be (a rough %)?
WickedW
@WickedW: Is the current performance unacceptable? If so, have you profiled the application to determine if this is a bottleneck? It's generally best to avoid premature optimization. (Although I would definitely look into replacing your original code with Gumbo's.)
TrueWill
@WickedW: I don’t expect it to be that much faster since it does the same work but only in a different way. But why don’t you benchmark it yourself?
Gumbo
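
For anyone who wants to follow the benchmarking suggestion above, here is a minimal sketch using `System.Diagnostics.Stopwatch`. The class name `ShrapnelBenchmark`, the helper `Time`, the sample input, and the iteration count are all made up for illustration; the chained version is written as a loop over the question's patterns rather than copied line for line. Treat the numbers it prints as a rough indication on your own machine, not a definitive measurement.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class ShrapnelBenchmark
{
    // The single-character patterns from the question, one Regex.Replace call each.
    static readonly string[] SinglePatterns =
    {
        "-", "'", ",", "\"", @"\?", "#", ";", ":", @"\+", "%",
        @"\[", @"\]", @"\*", "/", @"\\"
    };

    // Roughly the question's approach: many separate regex passes.
    public static string Chained(string str)
    {
        string s = str.Trim();
        foreach (string pattern in SinglePatterns)
            s = Regex.Replace(s, pattern, "");
        s = Regex.Replace(s, "&amp;", "&");
        s = Regex.Replace(s, "&amp", "&");
        s = Regex.Replace(s, "&nbsp;", " ");
        s = Regex.Replace(s, "&nbsp", " ");
        return s;
    }

    // The answer's approach: three regex passes, one character class.
    public static string Combined(string str)
    {
        string s = str.Trim();
        s = Regex.Replace(s, @"[-',""?#;:+%[\]*/\\]", "");
        s = Regex.Replace(s, "&amp;?", "&");
        s = Regex.Replace(s, "&nbsp;?", " ");
        return s;
    }

    // Hypothetical helper: runs f once to warm up, then times the loop.
    public static TimeSpan Time(Func<string, string> f, string input, int iterations)
    {
        f(input);
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            f(input);
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Main()
    {
        // Made-up sample; use strings representative of your real data.
        string sample = "  Tom's \"R&amp;D\" -- 50% [draft]; a/b\\c&nbsp;ok?  ";
        const int iterations = 100000;

        Console.WriteLine("Chained:  " + Time(Chained, sample, iterations).TotalMilliseconds + " ms");
        Console.WriteLine("Combined: " + Time(Combined, sample, iterations).TotalMilliseconds + " ms");
    }
}
```

Run it in a Release build, and check first that both versions return the same string for your inputs before comparing timings.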
+3  A: 

Since you are using zero regex features, maybe there is another way. C# has a Replace method for strings; use that instead. I imagine regex carries a lot of extra overhead compared with a simple replace.

adamse
+1 for using Replace over regex where possible
bguiz
String.Replace does not support character sets, so it would need code similar to that in the question. Gumbo's solution is much faster and more readable.
Henk Holterman
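
One possible middle ground between the two answers, sketched here as an illustration rather than as anything posted in the thread: filter the unwanted characters in a single pass with a `StringBuilder`, then use its plain `Replace` for the two entities. The names `ShrapnelNoRegex` and `Unwanted` are made up. Because `;` is already stripped by the filter (as it is in the original routine), only `&amp` and `&nbsp` need replacing afterwards. Whether this is actually faster than the regex version would need measuring.

```csharp
using System;
using System.Text;

public static class ShrapnelNoRegex
{
    // Characters the question's routine removes (parentheses were commented out there).
    const string Unwanted = "-',\"?#;:+%[]*/\\";

    public static string StripShrapnel(string str)
    {
        string trimmed = str.Trim();
        StringBuilder sb = new StringBuilder(trimmed.Length);

        // Single pass: keep every character that is not in the unwanted set.
        foreach (char c in trimmed)
        {
            if (Unwanted.IndexOf(c) < 0)
                sb.Append(c);
        }

        // Any ';' is gone by now, so "&amp;" has become "&amp" and "&nbsp;" has become "&nbsp".
        sb.Replace("&amp", "&");
        sb.Replace("&nbsp", " ");
        return sb.ToString();
    }
}
```

This avoids the regex engine entirely, at the cost of a little more code than the three-line regex version.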