Hello, I would like to use preg_match in PHP to parse the "Desired text" out of the following snippet from an HTML document:

<p class="review"> Desired text </p>

Ordinarily I would use simple_html_dom for such things, but on this occasion it cannot be used on its own. The above element doesn't appear in every desired div tag, so I need preg_match to keep track of exactly when it doesn't appear and then adjust my array from simple_html_dom accordingly.

Anyway, this would solve my problem.

Thanks so much.

+4  A: 
preg_match("'<p class=\"review\">(.*?)</p>'si", $source, $match);
if($match) echo "result=".$match[1];
serg
Works perfectly. You have saved me several hours there, thanks a lot for that.
David Willis
You are welcome :)
serg
Isn't this likely to overmatch? See my answer below.
beamrider9
It won't overmatch because of lazy quantification. `.*?` will grab as little as possible, while `.*` would grab as much as possible.
serg
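To make the lazy versus greedy point concrete, here is a minimal sketch. The three-paragraph `$source` is sample input borrowed from elsewhere in this thread, the `~` delimiter is just a readability choice, and `$lazy` and `$greedy` are illustrative variable names.

```php
<?php
// Sample input: three review paragraphs on separate lines (an assumption for this demo).
$source = '<p class="review"> Desired text1 </p>' . "\n"
        . '<p class="review"> Desired text2 </p>' . "\n"
        . '<p class="review"> Desired text3 </p>';

// Lazy: (.*?) stops at the first closing </p> it can reach.
preg_match('~<p class="review">(.*?)</p>~si', $source, $lazy);

// Greedy: (.*) runs on to the last </p> in the string
// (the s flag lets . match newlines as well).
preg_match('~<p class="review">(.*)</p>~si', $source, $greedy);

echo "lazy:   [", $lazy[1], "]\n";
echo "greedy: [", $greedy[1], "]\n";
```

With this input the lazy capture holds only the first paragraph's text, while the greedy capture spans all three paragraphs.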
A: 

Excellent, thanks a lot.

sharaz
A: 

What if the string you're matching has multiple lines and is:

<p class="review"> Desired text1 </p>
<p class="review"> Desired text2 </p>
<p class="review"> Desired text3 </p>

That pattern would match once, and the match would be everything in the string.

I think a better pattern is:

"'<p class=\"review\">([^<]*)</p>'si"
beamrider9
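One trade-off worth checking before choosing between the two patterns: `[^<]*` cannot cross any `<`, so a paragraph containing nested markup does not match at all. A small sketch, where the input with a nested `<b>` tag is made up for illustration:

```php
<?php
// Hypothetical input: the second paragraph contains a nested <b> tag.
$source = '<p class="review"> plain text </p>'
        . '<p class="review"> some <b>bold</b> text </p>';

// Lazy dot: matches both paragraphs, nested tag included.
$lazyCount = preg_match_all('~<p class="review">(.*?)</p>~si', $source, $lazy);

// Negated class: stops at the first "<", so the second paragraph fails to match.
$negatedCount = preg_match_all('~<p class="review">([^<]*)</p>~si', $source, $negated);

echo "lazy matches:    $lazyCount\n";
echo "negated matches: $negatedCount\n";
```

If the review text is known to be plain text, both patterns behave the same; they differ only when inner tags are present.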
A: 

If you want to return multiple matches, then you need to use preg_match_all(). You then loop through the second element of the result array ($match[1]), which holds the first capture group, to get just the content between the tags.

$source = "<p class=\"review\"> Desired text1 </p>".
          "<p class=\"review\"> Desired text2 </p>".
          "<p class=\"review\"> Desired text3 </p>";

preg_match_all("'<p class=\"review\">(.*?)</p>'si", $source, $match);

foreach ($match[1] as $val) {
    echo $val."<br>";
}

Outputs:

Desired text1
Desired text2
Desired text3 
Andy
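For the original question's case, where some divs lack the review paragraph, one possible approach is to run preg_match once per div and record a placeholder when nothing matches, so the results stay aligned with the array from simple_html_dom. This is only a sketch: `$blocks` is a hypothetical array of per-div HTML strings, and using `null` as the placeholder is an assumption.

```php
<?php
// Hypothetical per-div HTML fragments; in practice these might come from simple_html_dom.
$blocks = [
    '<div><p class="review"> Desired text1 </p></div>',
    '<div><p class="other"> no review here </p></div>',
    '<div><p class="review"> Desired text3 </p></div>',
];

$reviews = [];
foreach ($blocks as $i => $html) {
    // preg_match returns 1 when the pattern matches and 0 when it does not.
    if (preg_match('~<p class="review">(.*?)</p>~si', $html, $m) === 1) {
        $reviews[$i] = trim($m[1]); // drop the padding spaces around the text
    } else {
        $reviews[$i] = null;        // keep the slot so indexes stay aligned
    }
}

var_dump($reviews);
```

Here the second entry ends up as `null`, which marks exactly which div had no review.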