We want to allow "normal" href links to other webpages, but we don't want to allow anyone to sneak in client-side scripting.

Is searching for "javascript:" within the href and the onclick/onmouseover/etc. event attributes good enough, or are there other things to check?

+3  A: 

You'll have to use a whitelist of allowed protocols to be completely safe. If you use a blacklist, sooner or later you'll miss something like "telnet://" or "shell:" or some exploitable browser-specific thing you've never heard of...
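
As a rough sketch of the whitelist idea (Python purely for illustration; the helper name and the set of allowed schemes are assumptions, not something from the question), you parse out the scheme and accept the link only if it is on the list:

```python
from urllib.parse import urlsplit

# Assumed allowlist; adjust it to the schemes your site actually needs.
ALLOWED_SCHEMES = {"http", "https", "mailto"}


def has_allowed_scheme(url: str) -> bool:
    """Return True only if the URL names a scheme that is on the allowlist."""
    try:
        scheme = urlsplit(url.strip()).scheme.lower()
    except ValueError:
        # urlsplit raises ValueError on some malformed URLs; treat those as unsafe.
        return False
    return scheme in ALLOWED_SCHEMES
```

With this, "telnet://example.com" and "shell:startup" are both rejected even though neither scheme is listed anywhere. Relative links have no scheme, so they are rejected too; allow them explicitly if you need them.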

Ant P.
+2  A: 

Nope, there's a lot more that you need to check.

First off, the URL could be encoded (using HTML entities, URL encoding, or a mixture of both).

Secondly, you need to check for malformed HTML, which the browser may try to repair and, in doing so, let some script through.

Thirdly, you need to check for CSS-based script, e.g. background: url(javascript:...) or width: expression(...)

There's probably more that I've missed - you need to be careful!
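
To illustrate the encoding point, here is a hedged Python sketch that peels off HTML-entity and percent-encoding layers before looking at the scheme. The function name, the round limit, and the allowed schemes are assumptions made for the example, and it covers only the first of the three points above; it does nothing about malformed HTML or CSS.

```python
import html
import re
from urllib.parse import unquote, urlsplit

ALLOWED_SCHEMES = {"http", "https"}  # assumed allowlist
MAX_ROUNDS = 5  # assumed limit on nested encoding layers


def is_safe_href(raw: str) -> bool:
    """Decode entity and percent-encoding layers, then check the scheme."""
    current = raw
    for _ in range(MAX_ROUNDS):
        decoded = unquote(html.unescape(current))
        if decoded == current:
            break  # nothing left to decode
        current = decoded
    else:
        # Still changing after MAX_ROUNDS: treat deeply nested encoding as hostile.
        return False
    # Drop whitespace and control characters, which could otherwise split a scheme.
    cleaned = re.sub(r"[\x00-\x20\x7f]", "", current)
    try:
        scheme = urlsplit(cleaned).scheme.lower()
    except ValueError:
        return False
    return scheme in ALLOWED_SCHEMES
```

The decoded form is used only to make the decision; what you write into the page should still be the submitted value, properly escaped for an attribute.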

Greg
+4  A: 

It sounds like you're allowing users to submit content with markup. As such, I would recommend taking a look at a few articles about preventing cross-site scripting, which cover a bit more than simply preventing JavaScript from being inserted into an href attribute. Below is one I found that might be useful:

http://weblogs.java.net/blog/gmurray71/archive/2006/09/preventing_cros.html

Anne Porosoff
A: 

You have to be extremely careful when taking user input. You'll want to use a whitelist as mentioned, but not just for the href, because event-handler attributes can carry script too. Example:

<img src="nosuchimage.blahblah" onerror="alert('Haxored!!!');" />

or

<a href="about:blank;" onclick="alert('Haxored again!!!');">click meh</a>

Timothy Khouri
A: 

One option would be to disallow HTML altogether and use the same sort of formatting that some forums use. Just replace

[url="xxx"]yyy[/url]

with

<a href="xxx">yyy</a>

That gets you around the issues with mouseover and other event attributes. Then just make sure the link starts with a whitelisted protocol and doesn't contain a quote, including an encoded one (&quot; or some such that might be decoded by PHP or the browser).
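
A minimal sketch of that replacement, written in Python rather than PHP just to keep it short. The function names, the regular expression, and the allowed schemes are all assumptions for illustration. Everything the user typed is HTML-escaped, and only a [url] tag whose target passes the scheme check becomes a real link:

```python
import html
import re
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}  # assumed allowlist

# [url="target"]label[/url]; the target may not contain quotes, brackets, or whitespace.
URL_TAG = re.compile(r'\[url="([^"\s\[\]]+)"\](.*?)\[/url\]', re.IGNORECASE | re.DOTALL)


def scheme_allowed(url: str) -> bool:
    try:
        return urlsplit(url).scheme.lower() in ALLOWED_SCHEMES
    except ValueError:
        return False


def render_bbcode(text: str) -> str:
    """Escape all user text and turn allowlisted [url] tags into anchors."""
    out = []
    pos = 0
    for match in URL_TAG.finditer(text):
        # Text between tags is escaped, so raw HTML becomes inert.
        out.append(html.escape(text[pos:match.start()]))
        url, label = match.group(1), match.group(2)
        if scheme_allowed(url):
            out.append('<a href="%s">%s</a>' % (html.escape(url), html.escape(label)))
        else:
            # Rejected tags are shown as plain, escaped text.
            out.append(html.escape(match.group(0)))
        pos = match.end()
    out.append(html.escape(text[pos:]))
    return "".join(out)
```

Because the only markup in the output is the anchor the function builds itself, there is no attribute list to police: onmouseover and friends never make it into a tag.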

Ed Marty
A: 

Sounds like you're looking for the companion function to PHP's strip_tags, which is strip_attributes. Unfortunately, it hasn't been written yet. (Hint, hint.)

There is, however, an interesting-looking suggestion in the strip_tags documentation, here:

http://www.php.net/manual/en/function.strip-tags.php#85718

In theory this will strip anything that isn't an href, class, or id from submitted links; it seems like you probably want to lock it down even further and allow only hrefs.
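
For comparison, here is what the "just take hrefs" version could look like as a sketch in Python, using the standard library's HTML parser. This is not the PHP snippet from the linked comment; the class name, function name, and allowed schemes are assumptions, and a hand-rolled filter like this should be treated as a starting point rather than a vetted sanitizer.

```python
import html
from html.parser import HTMLParser
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}  # assumed allowlist


class LinkOnlySanitizer(HTMLParser):
    """Rebuild the input, keeping text and <a href="..."> and dropping all else."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.open_links = 0

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return  # every other tag is dropped
        href = dict(attrs).get("href") or ""
        try:
            scheme = urlsplit(href).scheme.lower()
        except ValueError:
            return
        if scheme in ALLOWED_SCHEMES:
            # Only href is re-emitted; onclick, class, id, style are discarded.
            self.parts.append('<a href="%s">' % html.escape(href))
            self.open_links += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.open_links:
            self.parts.append("</a>")
            self.open_links -= 1

    def handle_data(self, data):
        self.parts.append(html.escape(data))


def sanitize(markup: str) -> str:
    parser = LinkOnlySanitizer()
    parser.feed(markup)
    parser.close()
    # Close any anchors the user left open.
    return "".join(parser.parts) + "</a>" * parser.open_links
```

The parser hands over attribute values with entities already decoded, so an href written as &#106;avascript:... is checked in its decoded form.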

Kent Brewster