Is there a good, actively maintained Python library for filtering malicious input such as XSS?

A: 

The Strip-o-Gram library looks quite nice. I haven't checked it out properly, but it appears to do the right things: it can whitelist the HTML tags you specify and HTML-escape anything nasty.

Here's the example usage snippet, quoted from that page:

  from stripogram import html2text, html2safehtml

  # A lump of dodgy HTML ;-) (sample value added for illustration)
  mylumpofdodgyhtml = '<p>Hello <b>world</b><script>alert("xss")</script></p>'

  # Only allow <b>, <a>, <i>, <br>, and <p> tags
  mylumpofcoolcleancollectedhtml = html2safehtml(
      mylumpofdodgyhtml, valid_tags=("b", "a", "i", "br", "p"))

  # Don't process <img> tags, just strip them out. Use an indent of 4 spaces
  # and a page that's 80 characters wide.
  mylumpoftext = html2text(
      mylumpofcoolcleancollectedhtml,
      ignore_tags=("img",), indent_width=4, page_width=80)

Hope that helps.

Noldorin
You can't trust attackers to submit well-formed tags. Unless Strip-o-Gram also handles heavily encoded tags (see RSnake's list: http://ha.ckers.org/xss.html), this won't work.
Mystic
A: 

You can easily code XSS-defense in Python, see for example http://code.activestate.com/recipes/496942/ for an instructive and usable piece of code.
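
To give a feel for what hand-rolled XSS defense can look like, here is a minimal sketch. It is not the code from the recipe linked above; it assumes Python 3 and only the standard library. The idea is to escape everything first with html.escape, then restore a small whitelist of attribute-free tags. The names ALLOWED_TAGS and sanitize are hypothetical, chosen just for this illustration.

```python
import html
import re

# Hypothetical whitelist for illustration: attribute-free tags only.
ALLOWED_TAGS = ("b", "i", "p", "br")

# Matches an already-escaped bare tag such as &lt;b&gt; or &lt;/b&gt;.
# Tags carrying attributes never match, so they stay escaped.
_ALLOWED_RE = re.compile(
    r"&lt;(/?)(%s)&gt;" % "|".join(ALLOWED_TAGS), re.IGNORECASE
)


def sanitize(text):
    """Escape all markup, then restore bare whitelisted tags."""
    escaped = html.escape(text, quote=True)
    return _ALLOWED_RE.sub(
        lambda m: "<%s%s>" % (m.group(1), m.group(2).lower()), escaped
    )


print(sanitize('<b>hi</b><script>alert(1)</script>'))
```

Escaping first and un-escaping second means anything the whitelist pattern does not recognise stays inert text. This only covers output placed in an HTML body; values that end up inside attributes, URLs, or scripts need their own context-specific encoding.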

Alex Martelli