Hello!

I've been wondering how to accomplish the following, and so far I haven't been able to find any answers online.

Let's say I have a string that contains the following:

my_string = "<strong>Hello</strong>, I am a <i>string</i>." (the preview window renders this as bold and italic instead of showing the "strong" and "i" tags)

Now, I would like to make this secure using the html_escape() (or h()) helper. I'd like to prevent users from inserting any JavaScript and/or stylesheets, but I still want the word "Hello" shown in bold and the word "string" shown in italic.

As far as I can see, the h() method does not take any additional arguments, other than the piece of text itself.

Is there a way to escape only certain HTML tags instead of all of them, for example by whitelisting or blacklisting tags?

An example of what I have in mind:

h(my_string, :except => [:strong, :i]) # => so basically, escape everything, but leave "strong" and "i" tags alone, do not escape these.

Is there any method or way I could accomplish this?

Thanks in advance!

+3  A: 

Excluding specific tags is actually a pretty hard problem. The script tag in particular can be inserted in many different ways, and detecting them all is very tricky.

If at all possible, don't implement this yourself.

hrnt
+1  A: 

Have you considered using RedCloth or BlueCloth instead of actually allowing HTML? These libraries provide quite a few formatting options and handle the parsing for you.

Edit 1: I found this message when browsing around for how to remove HTML using RedCloth, might be of some use. Also, this page shows you how version 2.0.5 allows you to remove HTML. Can't seem to find any newer information, but a forum post found a vulnerability. Hopefully it has been fixed since that was from 2006, but I can't seem to find a RedCloth manual or documentation...

Topher Fangio
RedCloth is great, but it will _not_ strip out any html tags; I can insert <script></script> and it will not be escaped. I'm not sure about how BlueCloth strips HTML; I've not used it before.
zgchurch
+1  A: 

Use the white list plugin or a modified version of it. It's superb! You can have a look at Sanitize as well (it seems better, though I've never tried it).

khelll
A: 

I would second Sanitize for removing HTML tags. It works really well. It removes everything by default and you can specify a whitelist for tags you want to allow.
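
To make the whitelist idea concrete, here is a minimal sketch using only Ruby's standard library. It is not how Sanitize works internally: Sanitize parses the HTML and removes what is not allowed, whereas this sketch escapes everything and then restores a short list of bare tags. The helper name `escape_except` is hypothetical, made up for this illustration, and mirrors the `h(my_string, :except => [...])` call from the question. It only restores tags that carry no attributes, and it makes no attempt to guarantee well-formed output (an unclosed "strong" would still get through), so treat it as an illustration rather than a replacement for a vetted library.

```ruby
require 'cgi'

# Hypothetical helper (not part of Rails or Sanitize): escape everything,
# then restore only bare, attribute-free tags named in the allow-list.
def escape_except(text, allowed_tags)
  escaped = CGI.escapeHTML(text.to_s)
  return escaped if allowed_tags.empty?

  names = allowed_tags.map { |tag| Regexp.escape(tag.to_s) }.join('|')
  # Matches only "&lt;tag&gt;" and "&lt;/tag&gt;"; a tag with attributes
  # (e.g. onclick=...) never matches and therefore stays escaped.
  escaped.gsub(%r{&lt;(/?)(#{names})&gt;}i) { "<#{$1}#{$2}>" }
end

my_string = '<strong>Hello</strong>, I am a <i>string</i>.<script>alert(1)</script>'
puts escape_except(my_string, [:strong, :i])
```

The design choice here is to start from fully escaped text and opt specific tags back in, so anything the pattern does not recognize stays harmless by default. That is the opposite of trying to detect and strip dangerous tags.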

Kris
A: 

Preventing XSS attacks is serious business. Follow hrnt's advice, and consider that there are probably an order of magnitude more exploits possible than you would expect, due to obscure browser quirks. Although html_escape will lock things down pretty tightly, I think it's a mistake to use anything homegrown for this type of thing. You simply need more eyeballs and peer review for any kind of robustness guarantee.

I'm in the process of evaluating Sanitize vs. XssTerminate at the moment. I prefer the xss_terminate approach for its robustness: scrubbing at the model level will be quite reliable in a regular Rails app where all user input goes through ActiveRecord. However, Nokogiri and specifically Loofah seem to be a little more performant, more actively maintained, and definitely more flexible and Ruby-ish.

Update: I've just implemented a fork of ActsAsTextiled called ActsAsSanitiled that uses Sanitize (which has recently been updated to use Nokogiri, by the way) to guarantee safety and well-formedness of the RedCloth output, all without needing any helpers in your templates.

dasil003