Is there a comprehensive HTML cleaner/anti-XSS library for .NET that also ships with a defined whitelist? I know that Microsoft's Anti-XSS is a good place to start, but it needs a good whitelist of allowed HTML tags and CSS. Does anyone know of something?

+3  A: 

According to MSDN (see "Allowing Restricted HTML Input"), the best way to sanitize HTML input is to call HttpUtility.HtmlEncode() on it and then selectively restore the tags on your whitelist, like so:

<%@ Page Language="C#" ValidateRequest="false"%>    
<script runat="server">    
  void submitBtn_Click(object sender, EventArgs e)
  {
    // Encode the string input
    StringBuilder sb = new StringBuilder(
                            HttpUtility.HtmlEncode(htmlInputTxt.Text));
    // Selectively allow <b> and <i>
    sb.Replace("&lt;b&gt;", "<b>");
    sb.Replace("&lt;/b&gt;", "</b>");
    sb.Replace("&lt;i&gt;", "<i>");
    sb.Replace("&lt;/i&gt;", "</i>");
    Response.Write(sb.ToString());
  }
</script>

See also this article.
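
The same encode-then-restore idea can be written as a small reusable helper. This is only a sketch: `WhitelistEncoder`, `Sanitize`, and `AllowedTags` are hypothetical names, and it uses `System.Net.WebUtility.HtmlEncode` instead of `HttpUtility.HtmlEncode` so that it runs outside ASP.NET. It only restores attribute-free tags, so anything like `<a href="...">` stays encoded.

```csharp
using System;
using System.Net;
using System.Text;

public static class WhitelistEncoder
{
    // Hypothetical whitelist: attribute-free tags only.
    private static readonly string[] AllowedTags = { "b", "i" };

    public static string Sanitize(string input)
    {
        // Encode everything first, so all markup becomes inert text.
        var sb = new StringBuilder(WebUtility.HtmlEncode(input ?? string.Empty));

        // Then restore only the exact opening and closing forms of each allowed tag.
        foreach (var tag in AllowedTags)
        {
            sb.Replace("&lt;" + tag + "&gt;", "<" + tag + ">");
            sb.Replace("&lt;/" + tag + "&gt;", "</" + tag + ">");
        }

        return sb.ToString();
    }
}
```

For example, `WhitelistEncoder.Sanitize("<b>hi</b><script>alert(1)</script>")` returns `<b>hi</b>&lt;script&gt;alert(1)&lt;/script&gt;`. Because the match is on the exact encoded string, a tag carrying attributes (say `<b onclick="...">`) is never restored.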

Repo Man
A: 

What's wrong with Microsoft's Anti-XSS library (which you've mentioned)?

It has comprehensive HTML sanitizing: it parses the HTML, filters the nodes against a whitelist, and then regenerates the (safe) HTML. You can change the whitelists (since the code is open), but I'm not sure you'd want to.

Usage is simple too:

var sanitizedHtml = AntiXss.GetSafeHtmlFragment(inputHtml);
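
To make the parse, filter, regenerate idea concrete, here is a rough sketch of that pipeline. It is not the Anti-XSS library's implementation: it swaps in `System.Xml.Linq` as the parser, so it only accepts well-formed XML fragments (real-world HTML, or entities such as `&nbsp;`, will throw an `XmlException`). `NodeWhitelistSketch`, `Sanitize`, and the tag list are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class NodeWhitelistSketch
{
    // Hypothetical whitelist of element names.
    private static readonly HashSet<string> AllowedTags =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "b", "i", "p" };

    public static string Sanitize(string fragment)
    {
        // 1. Parse: wrap the fragment in a dummy root so it forms one XML tree.
        XElement root = XElement.Parse(
            "<root>" + fragment + "</root>", LoadOptions.PreserveWhitespace);

        // 2. Filter: drop comments and processing instructions,
        //    keeping only elements and text (CDATA counts as text here).
        root.DescendantNodes()
            .Where(n => !(n is XElement) && !(n is XText))
            .Remove();

        //    Drop every element that is namespaced or not on the whitelist,
        //    together with its content.
        root.Descendants()
            .Where(e => e.Name.Namespace != XNamespace.None
                        || !AllowedTags.Contains(e.Name.LocalName))
            .Remove();

        //    Drop all attributes from the elements that survived.
        root.Descendants().Attributes().Remove();

        // 3. Regenerate: serialize the remaining child nodes back to markup.
        return string.Concat(
            root.Nodes().Select(n => n.ToString(SaveOptions.DisableFormatting)));
    }
}
```

With this sketch, `NodeWhitelistSketch.Sanitize("<b onclick=\"x()\">hi</b><script>alert(1)</script>")` returns `<b>hi</b>`. The point of working on nodes rather than strings is that the output is rebuilt from the filtered tree, so nothing the parser did not understand can slip through unchanged.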
orip