I have text stored in SQL as HTML. I'm not guaranteed that this data is well-formed, as users can copy/paste from anywhere into the editor control I'm using, or manually edit the HTML that's generated.
The question is: what's the best way of going about removing or somehow ignoring <script/>
and <form/>
tags so that, when the user's text is displayed elsewhere in the Web Application, it doesn't disrupt the normal operation of the containing page.
I've toyed with the idea of simply doing a "Find and Replace" for <script>
/<form>
with <div>
(obviously taking into account whitespace and closing tags, if they exist). I'm also open to any way to somehow "ignore" certain tags. For all I know, there could be some built-in way of saying (in HTML, CSS, or JavaScript) "for all elements in <div id="MyContent">
, treat <form>
and <script>
as <div>
.
Any help or advice would be greatly appreciated!