I am trying to remove all the HTML tags from a string in JavaScript. Here's what I have. I can't figure out why it's not working. Does anyone know what I am doing wrong?

<script type="text/javascript">

var regex = "/<(.|\n)*?>/";
var body = "<p>test</p>";
var result = body.replace(regex, "");
alert(result);

</script>

Thanks a lot!

+3  A: 

Try this, noting that the grammar of HTML is too complex for regular expressions to be correct 100% of the time:

var regex = /(<([^>]+)>)/ig;
var body = "<p>test</p>";
var result = body.replace(regex, "");
alert(result);
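To see why the code in the question leaves the string untouched, here is a minimal sketch that compares the three variants side by side. When the first argument to replace() is a string, it is searched for literally rather than compiled as a pattern, and a regex without the g flag replaces only the first match. The helper name stripTags is an assumption made for illustration, and console.log stands in for alert so the snippet also runs outside a browser.

```javascript
// stripTags is a hypothetical helper name, not part of the original post.
function stripTags(html) {
  // Regex literal with the g flag: removes every "<...>" run, not just the first.
  return html.replace(/<[^>]+>/g, "");
}

var body = "<p>test</p>";

// A string pattern is matched literally, so nothing is found and nothing changes.
var unchanged = body.replace("/<(.|\n)*?>/", "");

// A regex literal without the g flag replaces only the first tag.
var firstOnly = body.replace(/<(.|\n)*?>/, "");

// Regex literal with the g flag: all tags are removed.
var stripped = stripTags(body);

console.log(unchanged); // <p>test</p>
console.log(firstOnly); // test</p>
console.log(stripped);  // test
```

The same caveat applies: this is fine for simple, trusted markup, but it is not a sanitizer for untrusted HTML.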

If you're willing to use a library such as jQuery, you could simply do this:

alert($('<p>test</p>').text());
karim79
AWESOME, I didn't think about the jQuery option. That is way preferred! Thanks so much!
gmcalab
Why are you wrapping the regex in a string? var regex = /(<([^>]+)>)/ig;
brianary
@brianary - because I'm an idiot. Corrected.
karim79
This won't work. Specifically, it will fail on short tags: http://www.is-thought.co.uk/book/sgml-9.htm#SHORTTAG
Mike Samuel
A: 

For a proper HTML sanitizer in JS, see http://code.google.com/p/google-caja/wiki/JsHtmlSanitizer

Mike Samuel