ansaurus

Question

Regex match all newline characters within <p> in PHP

Answer 1

+2 A:

You could split this into two regexes. First split on your <p> tags (<p>.*?</p>), then match on newlines in the result.

Divide and conquer. Several small regexes will often perform faster than one huge one.
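A minimal sketch of that two-step approach, assuming simple, well-formed, non-nested <p> tags. The sample string and the variable names are made up for illustration, and the first pattern is a slightly stricter variant of the one suggested above (it also accepts attributes and ignores case):

```php
<?php
// Hypothetical sample input; assumes simple, well-formed, non-nested <p> tags.
$html = "<p>one\ntwo</p>\n<div>skip\nme</div>\n<p class=\"x\">three</p>";

// Step 1: capture the body of each <p> element.
// The s modifier lets . match newlines; i makes the tag name case-insensitive.
preg_match_all('#<p\b[^>]*>(.*?)</p>#is', $html, $matches);

// Step 2: count the newlines inside each captured body.
$newlineCounts = array();
foreach ($matches[1] as $body) {
    $newlineCounts[] = preg_match_all('/\n/', $body, $unused);
}
```

Here $newlineCounts holds one entry per <p> body, so the newlines inside the <div> and between the tags are never counted.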

I assume you have total control over the HTML and know it's well formed, because using regex on HTML is a no-no in most cases. Use a DOM parser instead.

Mikael Svenson 2010-06-11 20:35:39

Answer 2

+1 A:

Well, regexes are not well suited to parsing HTML (use DomDocument for that). You also said that you want to "match on" newlines. Does that mean capture? Replace? "Check for"? Assuming "check for", here's a crude one:

$regex = '#(?i:<p[^>]*>[^\\n]*)(\\n)(?i:[^<]*</p>)#';

It won't match a newline that sits inside a child element (for example <p><b>foo\n</b></p>), but it will match the case where there is a newline inside a basic <p> tag (with no HTML children).
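As a quick check of that behaviour, the pattern above can be run against a few inputs with preg_match. The three sample strings are hypothetical and chosen only for illustration:

```php
<?php
$regex = '#(?i:<p[^>]*>[^\\n]*)(\\n)(?i:[^<]*</p>)#';

// Hypothetical inputs chosen for illustration.
$plain  = "<p>foo\nbar</p>";      // newline directly inside the <p>
$nested = "<p><b>foo\n</b></p>";  // newline inside a child element
$none   = "<p>foo bar</p>";       // no newline at all

// preg_match returns 1 on a match and 0 when nothing matches.
$plainHit  = preg_match($regex, $plain);
$nestedHit = preg_match($regex, $nested);
$noneHit   = preg_match($regex, $none);
```

Only $plain matches. In $nested, the text after the newline runs into </b> rather than </p>, so the last group fails.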

What I'd suggest is grabbing DomDocument and doing something like this:

$dom = new DomDocument();
$dom->loadHTML($html);
$pTags = $dom->getElementsByTagName('p');
foreach ($pTags as $p) { 
    $txt = $p->textContent;
    if (strpos($txt, "\n") !== false) {
        //You found a \n within a P tag
    }
}

ircmaxell 2010-06-11 20:41:46
