I use nicEdit to write rich-text content in my CMS. The problem is that it generates strings like this:
hello first line<br><br />this is a second line<br />this is a 3rd line
Since this is for a news site, I would much prefer the final HTML to look like this:
<p>hello first line</p><p>this is a second line<br />this is a 3rd line</p>
So my current plan is this:
- trim any <br /> from the start/end of $data
- replace every run of 2 or more <br /> with </p><p> (a single <br /> is allowed)
- finally, add <p> at the start and </p> at the end
I only have steps 1 and 3 so far. Can someone give me a hand with step 2?
function replace_br($data) {
    # step 1
    $data = trim($data,'<p>');
    $data = trim($data,'</p>');
    $data = trim($data,'<br />');
    # step 2 ???
    // preg_replace() ?
    # step 3
    $data = '<p>'.$data.'</p>';
    return $data;
}
Thanks!
PS: It would be even better to handle longer runs of line breaks the same way. Example: "hello<br /><br /><br /><br /><br />too much space". Those 5 line breaks should also be converted to just one "</p><p>".
Final solution (special thanks to kemp!):
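function sanitize_content($data) {
    // keep only the tags the editor is allowed to produce
    $data = strip_tags($data,'<p><br><img><a><strong><u><em><blockquote><ol><ul><li><span>');
    // strip whole <p>, </p> and <br> tags from both ends; trim($data,'<p>') treats
    // '<p>' as a list of characters and can also eat a leading or trailing "p"
    $data = preg_replace('#^(?:\s*(?:</?p>|<br\s*/?>))+|(?:(?:</?p>|<br\s*/?>)\s*)+$#i','',$data);
    // collapse every run of 2 or more <br> into one paragraph boundary
    $data = preg_replace('#(?:<br\s*/?>\s*?){2,}#','</p><p>',$data);
    // wrap the result in a single paragraph
    $data = '<p>'.$data.'</p>';
    return $data;
}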
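A note for anyone reusing this: PHP's trim() treats its second argument as a list of characters, not as a substring, so trimming with '<p>' can also remove a "p" that belongs to the text. Below is a minimal sketch of that pitfall, together with the run-collapsing regex applied to the two examples from the question. The helper names strip_outer_tags and collapse_breaks are hypothetical, made up here for illustration only, and the anchored pattern is just one possible way to strip whole tags from the ends.

```php
<?php
// Hypothetical helper (not from the original post): strips whole <p>, </p>
// and <br> tags from both ends, using anchored patterns instead of trim().
function strip_outer_tags(string $data): string
{
    return preg_replace('#^(?:\s*(?:</?p>|<br\s*/?>))+|(?:(?:</?p>|<br\s*/?>)\s*)+$#i', '', $data);
}

// Hypothetical helper: the run-collapsing regex from the solution, in isolation.
function collapse_breaks(string $data): string
{
    return preg_replace('#(?:<br\s*/?>\s*?){2,}#', '</p><p>', $data);
}

// trim() removes any of the characters '<', 'p', '>' from both ends.
echo trim('<p>pop</p>', '<p>'), "\n";        // prints: op</
echo strip_outer_tags('<p>pop</p>'), "\n";   // prints: pop

// A double break becomes a paragraph boundary; the single break is kept.
echo collapse_breaks('hello first line<br><br />this is a second line<br />this is a 3rd line'), "\n";
// prints: hello first line</p><p>this is a second line<br />this is a 3rd line

// A run of five breaks still collapses to one boundary.
echo collapse_breaks('hello<br /><br /><br /><br /><br />too much space'), "\n";
// prints: hello</p><p>too much space
```

Because the {2,} quantifier matches the whole run at once, any number of consecutive breaks yields exactly one </p><p>, which covers the case from the PS.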