I currently use the following expression to put paragraph tags around textarea input before storing it in a MySQL database.

$inputText = str_replace('<p></p>', '', '<p>' . preg_replace('#([\r\n]\s*?[\r\n]){2,}#', '</p>$0<p>', $inputText) . '</p>');

This works well, except when I wish to use header tags, which then end up surrounded by unwanted paragraph tags:

<p><h3>Test Header</h3></p>

While this displays as expected, it is not great from a validation point of view.

Can anyone suggest an improved expression and/or method to catch header tags and only apply the paragraph tags to actual paragraphs? Alternatively, is there an expression which I can apply to my input prior to the expression I'm currently using to produce the same desired effect?

As a side note, I would like to be able to enter stand-alone hyperlink 'a' tags and still have them surrounded with paragraph tags as before.

I have considered that it may just be easier to manually edit the details after they are entered into the database to remove the unwanted paragraph tags.

A: 

You can use the strip_tags function like this:

<?php
$text = '<p><h3>Test Header</h3></p>';
echo strip_tags($text);
echo "\n";

// Allow <p> and <h3>
echo strip_tags($text, '<p><h3>');
?>

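The first call strips every tag and prints only the text. The second call keeps the tags listed in the second argument and strips everything else.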
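Because strip_tags works on the whole string, removing `<p>` with it would also remove the tags from genuine paragraphs. A narrower alternative is a second pass that only unwraps a `<p>` whose entire content is a single block element. The following is a minimal sketch under assumptions: `unwrap_block_tags` is a hypothetical helper name, the list of block tags is my own and can be extended, and nested elements of the same tag are not handled.

```php
<?php
// Hypothetical helper (not from the question or from any library).
// Removes a <p>...</p> pair when it wraps exactly one block-level element.
function unwrap_block_tags(string $html): string
{
    // Assumed list of tags that should never sit inside <p>.
    $blocks = 'h[1-6]|ul|ol|pre|blockquote|table';

    // <p>, optional whitespace, one complete block element, optional
    // whitespace, </p>. \2 is the tag name captured by the inner group.
    $pattern = '#<p>\s*(<(' . $blocks . ')\b[^>]*>.*?</\2>)\s*</p>#is';

    return preg_replace($pattern, '$1', $html);
}

// Textarea input normally arrives with \r\n line endings.
$inputText = "<h3>Test Header</h3>\r\n\r\nSome text.";

// The expression from the question, unchanged.
$wrapped = str_replace('<p></p>', '', '<p>' . preg_replace('#([\r\n]\s*?[\r\n]){2,}#', '</p>$0<p>', $inputText) . '</p>');

// Second pass: the heading loses its wrapper, the paragraph keeps it.
$clean = unwrap_block_tags($wrapped);
```

Inline tags such as `a` are not in the list, so a stand-alone link stays wrapped in paragraph tags.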

Deepesh
This method is certainly useful, but not as comprehensive as the function stolen from WordPress. Thanks!
Darren
A: 

I use this function from WordPress. It wraps paragraphs in p tags and converts line breaks, whilst preserving existing HTML:

function wpautop($pee, $br = 1) {
    $pee = $pee . "\n"; // just to make things a little easier, pad the end
    $pee = preg_replace('|<br />\s*<br />|', "\n\n", $pee);
    // Space things out a little
    $allblocks = '(?:table|thead|tfoot|caption|colgroup|tbody|tr|td|th|div|dl|dd|dt|ul|ol|li|pre|select|form|map|area|blockquote|address|math|style|input|p|h[1-6]|hr)';
    $pee = preg_replace('!(<' . $allblocks . '[^>]*>)!', "\n$1", $pee);
    $pee = preg_replace('!(</' . $allblocks . '>)!', "$1\n\n", $pee);
    $pee = str_replace(array("\r\n", "\r"), "\n", $pee); // cross-platform newlines
    $pee = preg_replace("/\n\n+/", "\n\n", $pee); // take care of duplicates
    $pee = preg_replace('/\n?(.+?)(?:\n\s*\n|\z)/s', "<p>$1</p>\n", $pee); // make paragraphs, including one at the end
    $pee = preg_replace('|<p>\s*?</p>|', '', $pee); // under certain strange conditions it could create a P of entirely whitespace
    $pee = preg_replace('!<p>([^<]+)\s*?(</(?:div|address|form)[^>]*>)!', "<p>$1</p>$2", $pee);
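    // Kept as in the source; $1 refers to a group this pattern does not capture, so the substitution leaves <p> unchanged
    $pee = preg_replace( '|<p>|', "$1<p>", $pee );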
    $pee = preg_replace('!<p>\s*(</?' . $allblocks . '[^>]*>)\s*</p>!', "$1", $pee); // don't pee all over a tag
    $pee = preg_replace("|<p>(<li.+?)</p>|", "$1", $pee); // problem with nested lists
    $pee = preg_replace('|<p><blockquote([^>]*)>|i', "<blockquote$1><p>", $pee);
    $pee = str_replace('</blockquote></p>', '</p></blockquote>', $pee);
    $pee = preg_replace('!<p>\s*(</?' . $allblocks . '[^>]*>)!', "$1", $pee);
    $pee = preg_replace('!(</?' . $allblocks . '[^>]*>)\s*</p>!', "$1", $pee);
    if ($br) {
        $pee = preg_replace('|(?<!<br />)\s*\n|', "<br />\n", $pee); // optionally make line breaks
    }
    $pee = preg_replace('!(</?' . $allblocks . '[^>]*>)\s*<br />!', "$1", $pee);
    $pee = preg_replace('!<br />(\s*</?(?:p|li|div|dl|dd|dt|th|pre|td|ul|ol)[^>]*>)!', '$1', $pee);
    $pee = preg_replace( "|\n</p>$|", '</p>', $pee );
    return $pee;
}
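To see why headings come out unwrapped, the function above can be reduced to its core steps: put blank lines around block tags, wrap every blank-line-separated chunk in `<p>`, then strip the `<p>` and `</p>` that ended up next to a block tag. The sketch below is a cut-down illustration under assumptions, not WordPress code: `mini_autop` is a hypothetical name, the block list is shortened, and the line-break, blockquote and list handling is left out.

```php
<?php
// Cut-down illustration of the approach; NOT the WordPress function.
function mini_autop(string $text): string
{
    // Shortened, assumed list of block-level tags.
    $blocks = '(?:h[1-6]|ul|ol|li|pre|blockquote|div|table)';

    // Normalise line endings to \n.
    $text = str_replace(array("\r\n", "\r"), "\n", $text);

    // Put a newline before each opening block tag and a blank line
    // after each closing block tag, then collapse runs of newlines.
    $text = preg_replace('!(<' . $blocks . '[^>]*>)!', "\n\$1", $text);
    $text = preg_replace('!(</' . $blocks . '>)!', "\$1\n\n", $text);
    $text = preg_replace("/\n\n+/", "\n\n", $text);

    // Wrap every chunk separated by a blank line in <p>...</p>.
    $text = preg_replace('/\n?(.+?)(?:\n\s*\n|\z)/s', "<p>\$1</p>\n", $text);
    $text = preg_replace('|<p>\s*?</p>|', '', $text);

    // Undo the wrapping where <p> or </p> sits directly next to a block tag.
    $text = preg_replace('!<p>\s*(</?' . $blocks . '[^>]*>)!', '$1', $text);
    $text = preg_replace('!(</?' . $blocks . '[^>]*>)\s*</p>!', '$1', $text);

    return trim($text);
}

echo mini_autop("<h3>Test Header</h3>\n\nFirst paragraph.\n\nSecond paragraph.");
```

With that input the heading is left bare and each of the two paragraphs gets its own pair of p tags. An input consisting only of an `a` tag is still wrapped, because `a` is not in the block list.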
fire
I'm not sure how this function works, but it certainly does what I'm looking for. Have you changed it at all from the WordPress version?
Darren
Nope, it's from a slightly old version, but the most recent version's function is almost identical and does the same job.
fire