Is there a good, actively maintained Python library for filtering malicious input such as XSS?
A:
The Strip-o-Gram library looks quite nice. I haven't tested it properly, but it appears to do the right things: it can whitelist the HTML tags you specify and HTML-escape anything nasty.
Here's the example usage snippet from that page (with a placeholder input added so that it runs):
from stripogram import html2text, html2safehtml

# A lump of dodgy html ;-) (placeholder value for illustration)
mylumpofdodgyhtml = "<b>Hi</b><script>alert('x')</script>"

# Only allow <b>, <a>, <i>, <br>, and <p> tags
mylumpofcoolcleancollectedhtml = html2safehtml(mylumpofdodgyhtml, valid_tags=("b", "a", "i", "br", "p"))

# Don't process <img> tags, just strip them out. Use an indent of 4 spaces
# and a page that's 80 characters wide.
mylumpoftext = html2text(mylumpofcoolcleancollectedhtml, ignore_tags=("img",), indent_width=4, page_width=80)
Hope that helps.
Noldorin
2009-05-23 12:10:36
You can't just trust attackers to put in nice tags. Unless Strip-o-Gram also handles heavily encoded tags (see RSnake's list: http://ha.ckers.org/xss.html), this won't work.
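To illustrate the point, here is a minimal sketch (Python 3, standard library only). The `naive_strip` filter and the payload are hypothetical examples, not anything taken from Strip-o-Gram: a blacklist that only looks for `<script>` blocks misses a payload that hides its script in an attribute, while escaping every markup character on output neutralises it.

```python
import html
import re

# Hypothetical blacklist filter: removes only literal <script>...</script> blocks.
SCRIPT_RE = re.compile(r"<script.*?>.*?</script>", re.IGNORECASE | re.DOTALL)

def naive_strip(markup):
    """Strip <script> blocks and nothing else (deliberately too weak)."""
    return SCRIPT_RE.sub("", markup)

# This payload contains no <script> tag; the script lives in an attribute.
payload = '<img src="x" onerror="alert(1)">'

print(naive_strip(payload))  # comes back unchanged: the blacklist never matches
print(html.escape(payload))  # <, > and quotes are replaced by entities
```

Escaping everything is the safe default when no markup needs to survive; whitelisting is only needed when some tags must be kept.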
Mystic
2010-07-23 14:02:39
+1
A:
You can easily code an XSS defense in Python; see, for example, http://code.activestate.com/recipes/496942/ for an instructive and usable piece of code.
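As a rough idea of what such a hand-rolled defense can look like, here is a minimal whitelist sketch for Python 3 using only the standard library. It is not the linked recipe's code; the `WhitelistSanitizer` class, the `sanitize` helper and the default tag list are names and choices made up for this illustration. Whitelisted tags are re-emitted without any attributes, and everything else is HTML-escaped.

```python
from html import escape
from html.parser import HTMLParser

class WhitelistSanitizer(HTMLParser):
    """Keep whitelisted tags (minus their attributes); escape everything else."""

    def __init__(self, valid_tags=("b", "i", "p")):
        super().__init__(convert_charrefs=True)
        self.valid_tags = frozenset(valid_tags)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.valid_tags:
            # Re-emit a bare tag so that no attribute (onclick, style, ...) survives.
            self.parts.append("<%s>" % tag)
        else:
            # Show the disallowed tag as inert text.
            self.parts.append(escape(self.get_starttag_text()))

    def handle_startendtag(self, tag, attrs):
        # Treat <tag/> like <tag>, so no extra end tag is generated.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag in self.valid_tags:
            self.parts.append("</%s>" % tag)
        else:
            self.parts.append(escape("</%s>" % tag))

    def handle_data(self, data):
        # Text (including decoded character references) is always escaped.
        self.parts.append(escape(data))

def sanitize(markup, valid_tags=("b", "i", "p")):
    """Return markup with only the whitelisted tags left active."""
    parser = WhitelistSanitizer(valid_tags)
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)

print(sanitize("<b>hi</b><script>alert(1)</script>"))
print(sanitize('<p onclick="evil()">x</p>'))
```

This sketch drops all attributes, so links (`<a href=...>`) would need separate URL validation, and comments and declarations are silently discarded. It also does not balance unclosed tags.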
Alex Martelli
2009-05-23 16:00:05