Using PHP, given a string such as: this is a <strong>string</strong>; I need a function to strip out ALL HTML tags so that the output is: this is a string. Any ideas? Thanks in advance.
views: 250
answers: 4
+11
A:
PHP has a built-in function that does exactly what you want: strip_tags
$text = '<b>Hello</b> World';
print strip_tags($text); // outputs Hello World
If you expect broken HTML, you are going to need to load it into a DOM parser and then extract the text.
Paolo Bergantino
2009-08-10 17:32:30
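For the broken-HTML case mentioned above, a minimal sketch of the DOM-parser route could look like the following. The helper name html_to_text is hypothetical (it is not from the answer), and the sketch assumes the dom extension is available and that the input is plain ASCII; non-ASCII input may need an explicit charset declaration before parsing.

```php
<?php
// Hypothetical helper (not from the thread): parse possibly broken HTML with
// DOMDocument and return only the text it contains.
function html_to_text(string $html): string
{
    // loadHTML() rejects an empty string, so handle that case up front.
    if (trim($html) === '') {
        return '';
    }

    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The parser wraps fragments in <html><body>...; read the text from the root.
    $root = $doc->documentElement;
    return $root === null ? '' : $root->textContent;
}

echo html_to_text('this is a <strong>string</strong>'), "\n"; // this is a string
```

Note that textContent also includes the text inside script and style elements, so those would still need to be removed separately if they can occur in the input.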
+1, but be careful: strip_tags may not strip invalid HTML tags, so depending on the application you may need to do some extra processing afterwards.
Miky Dinescu
2009-08-10 17:35:00
strip_tags() is very bad for XSS protection, as it only defends against a couple of XSS attack vectors. Use htmlspecialchars($var, ENT_QUOTES) instead.
Rook
2010-02-09 18:48:30
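As a rough illustration of the htmlspecialchars() suggestion in the comment above: escaping keeps the markup visible as text instead of removing it. The wrapper name escape_html and the explicit 'UTF-8' argument are assumptions added for this sketch.

```php
<?php
// Hypothetical wrapper (not from the thread): escape a value for output in HTML.
// ENT_QUOTES escapes both double and single quotes; UTF-8 is assumed here.
function escape_html(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

echo escape_html('<b>Hello</b> World'), "\n";
// &lt;b&gt;Hello&lt;/b&gt; World
```

This addresses a different goal than the question: the tags are neutralised rather than stripped, so it suits displaying untrusted input, not extracting plain text.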
+5
A:
What about using strip_tags, which should do the job?
For instance (quoting the doc):
<?php
$text = '<p>Test paragraph.</p><!-- Comment --> <a href="#fragment">Other text</a>';
echo strip_tags($text);
echo "\n";
This will give you:
Test paragraph. Other text
Edit: note that strip_tags doesn't validate what you give it, which means that this code:
$text = "this is <10 a test";
var_dump(strip_tags($text));
will get you:
string 'this is ' (length=8)
(Everything after the thing that looks like a starting tag gets removed).
Pascal MARTIN
2009-08-10 17:33:44
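One possible workaround for the "<10" case above, sketched under the assumption that a "<" not followed by a letter, "/", "!" or "?" is meant as a literal character: escape those stray "<" characters before calling strip_tags. The function name strip_tags_keep_lt is hypothetical.

```php
<?php
// Hypothetical helper (not from the thread): protect stray "<" characters so
// strip_tags() does not treat them as the start of a tag.
function strip_tags_keep_lt(string $text): string
{
    // Assumption: a real tag starts with "<" followed by a letter, "/", "!" or "?".
    // Any other "<" is turned into the entity &lt; before stripping.
    $escaped = preg_replace('~<(?![a-zA-Z/!?])~', '&lt;', $text);

    // preg_replace() returns null on failure; fall back to the raw input then.
    return strip_tags($escaped ?? $text);
}

echo strip_tags_keep_lt('this is <10 a test'), "\n";
// this is &lt;10 a test
```

The stray character comes back as the entity &lt;, which matches how strip_tags leaves other entities in the text untouched.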
+1
A:
strip_tags is the function you're after. You'd use it something like this:
$text = '<strong>Strong</strong>';
$text = strip_tags($text);
// Now $text = 'Strong'
Mez
2009-08-10 17:34:23
A:
I find this to be a little more effective than strip_tags() alone, since strip_tags() will not zap javascript or css:
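// $html is assumed to hold the input markup.
// Remove <head>, <script> and <style> elements along with their contents first;
// strip_tags() on its own would remove the tags but leave the JavaScript/CSS text.
$search = array(
"'<head[^>]*?>.*?</head>'si",
"'<script[^>]*?>.*?</script>'si",
"'<style[^>]*?>.*?</style>'si",
);
// A single empty string is used as the replacement for every pattern.
$text = strip_tags(preg_replace($search, '', $html));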
Stephen J. Fuhry
2010-02-09 18:41:54