I need to check whether user-submitted HTML contains any JavaScript. I'm using PHP for validation.
Thanks for any help!
Scan for script tags, event-handler attributes (as Tom Haigh commented) and href="javascript:...".
You could remove the script tags as Pawka states using regular expressions. I found a thread on this here.
Basically it's:
$list=preg_replace('#<script[^>]*>.*?</script>#is','',$list);
Code is from that page, not written by me.
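A quick way to see what that pattern does and does not catch is to run it on a small sample. This is only a sketch: the helper name strip_script_tags and the sample input are made up for illustration.

```php
<?php
// Hypothetical helper wrapping the same pattern shown above.
function strip_script_tags(string $html): string
{
    return preg_replace('#<script[^>]*>.*?</script>#is', '', $html);
}

// Sample input with three different ways of carrying JavaScript.
$input = '<p onclick="alert(1)">hi</p>'
       . '<script>alert(2)</script>'
       . '<a href="javascript:alert(3)">x</a>';

$output = strip_script_tags($input);
echo $output, PHP_EOL;
// Prints: <p onclick="alert(1)">hi</p><a href="javascript:alert(3)">x</a>
```

Only the script element is removed; the onclick attribute and the javascript: link are left in place.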
You'll need to scan for <script> tags, but you'll also need to scan for attributes like onclick="" or onmouseover="" etc. that can contain JavaScript without the need for script tags.
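The checks described above could be sketched with PHP's DOM extension rather than regular expressions, so that the markup is actually parsed first. The function name contains_javascript is hypothetical, and this is a conservative detector under stated assumptions, not a complete XSS filter: it treats any attribute starting with "on" as an event handler and any attribute value starting with "javascript:" as script, and it ignores other vectors such as data: URLs or CSS.

```php
<?php
// Hypothetical helper: returns true if the HTML fragment appears to contain JavaScript.
// ASSUMPTION: the DOM extension is available and the input is an HTML fragment.
function contains_javascript(string $html): bool
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence warnings about sloppy markup
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('*') as $element) {
        if (strtolower($element->nodeName) === 'script') {
            return true; // <script> element
        }
        foreach ($element->attributes as $attr) {
            if (strpos(strtolower($attr->nodeName), 'on') === 0) {
                return true; // onclick, onmouseover, onerror, ...
            }
            // Drop whitespace and control characters before comparing the scheme.
            $value = strtolower(preg_replace('/[\x00-\x20]+/', '', $attr->nodeValue));
            if (strpos($value, 'javascript:') === 0) {
                return true; // href="javascript:..." and similar
            }
        }
    }
    return false;
}

var_dump(contains_javascript('<p>Hello <b>world</b></p>'));       // bool(false)
var_dump(contains_javascript('<img src="x.png" onerror="a()">')); // bool(true)
```

Because the parser decodes entities and normalises attribute names, obfuscated spellings such as mixed case are compared in one form.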
It might be better to take a different approach and use something like HTML Purifier to filter out anything that you don't want. I think it would be very difficult to safely remove every possibility of JavaScript without actually parsing the HTML properly.
If you want to protect yourself against Cross-Site Scripting (XSS), you are better off using a whitelist than a blacklist, because there are too many aspects you need to consider when looking for XSS attacks.
Just make a list of all HTML tags and attributes you want to allow and remove/escape all other tags/attributes. And for those attributes that can be used for XSS attacks, validate the values to only allow harmless values.
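A minimal sketch of that whitelist idea, using PHP's DOM extension. Everything named here (ALLOWED_TAGS, is_safe_url, render_allowed, sanitize_html) is hypothetical, and the list of tags is just an example. It assumes ASCII or entity-encoded input, since loadHTML() otherwise guesses ISO-8859-1. For production use a maintained library such as HTML Purifier is the safer choice.

```php
<?php
// Hypothetical allowlist: tag name => attributes permitted on that tag.
const ALLOWED_TAGS = [
    'p' => [], 'b' => [], 'i' => [], 'em' => [], 'strong' => [],
    'ul' => [], 'ol' => [], 'li' => [], 'br' => [],
    'a' => ['href'],
];

// Validate attribute values that could carry script: only http(s) and mailto URLs pass.
function is_safe_url(string $url): bool
{
    return preg_match('#^(https?://|mailto:)#i', trim($url)) === 1;
}

// Re-serialise a node, keeping only allowlisted tags and attributes.
function render_allowed(DOMNode $node): string
{
    if ($node instanceof DOMText) {
        return htmlspecialchars($node->nodeValue, ENT_QUOTES, 'UTF-8');
    }
    if (!$node instanceof DOMElement) {
        return ''; // comments and other node types are dropped
    }
    $tag = strtolower($node->nodeName);
    if ($tag === 'script' || $tag === 'style') {
        return ''; // drop these together with their contents
    }
    $inner = '';
    foreach ($node->childNodes as $child) {
        $inner .= render_allowed($child);
    }
    if (!array_key_exists($tag, ALLOWED_TAGS)) {
        return $inner; // unknown tag: keep its (filtered) contents, lose the tag
    }
    if ($tag === 'br') {
        return '<br>';
    }
    $attrs = '';
    foreach (ALLOWED_TAGS[$tag] as $name) {
        if (!$node->hasAttribute($name)) {
            continue;
        }
        $value = $node->getAttribute($name);
        if ($name === 'href' && !is_safe_url($value)) {
            continue; // e.g. javascript: URLs
        }
        $attrs .= ' ' . $name . '="' . htmlspecialchars($value, ENT_QUOTES, 'UTF-8') . '"';
    }
    return "<$tag$attrs>$inner</$tag>";
}

function sanitize_html(string $html): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    $body = $doc->getElementsByTagName('body')->item(0);
    return $body === null ? '' : render_allowed($body);
}

echo sanitize_html('<p onclick="alert(1)">Hi <b>there</b></p><script>alert(2)</script>'), PHP_EOL;
// Prints: <p>Hi <b>there</b></p>
```

The output is rebuilt from the parsed tree rather than edited in place, so anything not explicitly listed never reaches the result.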
OK, let's not all be naive here:
<script> "<!-- </script> -->"; document.write("hello world"); </script>
(should pass the filters suggested by regexadvice)
Filtering out JavaScript is security-critical, which means you need to do it thoroughly and properly, not with some quick hack.