I am creating a regex library to work with HTML (I'll post it on MSDN Code when it's done). One of the methods removes any whitespace before a closing tag.

<p>See the dog run </p>

It would eliminate the space before the closing paragraph tag. I am using this:

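    // Requires: using System.Text.RegularExpressions;
    public static string RemoveWhiteSpaceBeforeClosingTag(string text)
    {
        // Matches the whitespace AND the "</" after it, so the replacement must put "</" back.
        string pattern = @"(\s+)(?:</)";
        return Regex.Replace(text, pattern, "</", RegexOptions.Singleline | RegexOptions.IgnoreCase);
    }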

As you can see I am replacing the spaces with </ since I cannot seem to match just the space and exclude the closing tag. I know there's a way - I just haven't figured it out.

+10  A: 
cletus
That was it - thanks. I wish there was an alternative to processing the HTML I'm getting. You should have seen the IndexOf and LastIndexOf code that this is replacing 8-\
Tony Basallo
+3  A: 

You want a lookahead (?=) pattern:

\s+(?=</)

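The lookahead checks that </ follows but does not consume it, so the match is only the whitespace. That can be replaced with "" (the empty string).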

Daniel Martin
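
For reference, here is a minimal sketch of how the lookahead pattern might slot into the method from the question. The class name HtmlWhitespace is hypothetical and only there to make the snippet compile on its own; the method name comes from the question. The RegexOptions flags are left out on the assumption that they are not needed here, since the pattern contains no `.` and no letters.

```csharp
using System.Text.RegularExpressions;

// Hypothetical wrapper class, used only so the sketch is self-contained.
public static class HtmlWhitespace
{
    // \s+    one or more whitespace characters (this is what gets removed)
    // (?=</) positive lookahead: "</" must follow, but it is not part of the match
    private const string Pattern = @"\s+(?=</)";

    public static string RemoveWhiteSpaceBeforeClosingTag(string text)
    {
        // Only the whitespace is matched, so replacing with the empty
        // string leaves the closing tag itself untouched.
        return Regex.Replace(text, Pattern, string.Empty);
    }
}
```

With the example from the question, HtmlWhitespace.RemoveWhiteSpaceBeforeClosingTag("<p>See the dog run </p>") should return "<p>See the dog run</p>". Because \s also matches newlines, line breaks and indentation directly before a closing tag are removed as well.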