Hi, is there any special syntax in PHP regular expressions for matching a nested div element with a particular class name? Suppose I have markup like this:

<div style="demo">
<div class="row">
    <div title="[email protected]" class="text">ABC</div>
</div>
<div class="row">
    <div title="[email protected]" class="text">PQR</div>
</div></div>

How can I retrieve all the email IDs here using a regular expression and preg_match_all()? Thanks.

+4  A: 

Regexes are bad at parsing HTML. Use a DOM parser and this XPath:

//div[@style="demo"]/div[@class="row"]/div[@class="text"]/@title

If class="text" is exclusive to the divs you want to match, you can also do

//div[@class="text"]/@title

Gordon
A: 
preg_match_all("/<div title=\"(.*)\" class=\"text\">/", $subject, $matches);

If the emails are the only data you want, there are better regexes for matching email addresses only. See http://fightingforalostcause.net/misc/2006/compare-email-regex.php

qnrq
Note that this will only work if the div is written exactly as shown in the example. Any deviation from that format will cause the regex to no longer match anything.
Gordon
This won’t work as `.*` is greedy and will match as much as possible.
Gumbo
As much as possible between <div title=" and " class="text", which is what we want, yes?
qnrq
@qnrq: no, because it can match all the way from the previous occurrence of <div title=", right through this one.
Avi
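To make the greedy-matching point concrete, here is a minimal sketch. It assumes both divs sit on a single line, since without the `s` modifier `.` does not cross newlines and the problem would not show up in the multi-line sample from the question. The `example.com` addresses are placeholders, not the asker's data.

```php
<?php
// Hypothetical input: two matching divs on ONE line, which is where greedy matching goes wrong.
$subject = '<div title="first@example.com" class="text">ABC</div>'
         . '<div title="second@example.com" class="text">PQR</div>';

// Greedy: .* runs to the LAST '" class="text">' on the line, swallowing both divs.
preg_match_all('/<div title="(.*)" class="text">/', $subject, $greedy);

// Lazy: .*? stops at the first '" class="text">' it can reach.
preg_match_all('/<div title="(.*?)" class="text">/', $subject, $lazy);

// Negated class: [^"]* can never run past the closing quote of the attribute.
preg_match_all('/<div title="([^"]*)" class="text">/', $subject, $negated);

print_r($greedy[1]);  // one oversized capture spanning both divs
print_r($lazy[1]);    // the two title values
print_r($negated[1]); // the two title values
```

The negated character class is usually the safer of the two fixes, because a lazy `.*?` can still run across tags when an earlier div has a title but a different class. Either way, the pattern remains tied to the exact attribute order and spacing.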
A: 
<?php

$html = '<div style="demo">
<div class="row">
    <div title="[email protected]" class="text">ABC</div>
</div>
<div class="row">
    <div title="[email protected]" class="text">PQR</div>
</div></div>
';

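// loadHTML() is an instance method; calling it statically fails on PHP 8.
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// The query selects the title attribute nodes, so each result is a DOMAttr.
foreach ($xpath->query('//div[@style="demo"]/div[@class="row"]/div[@class="text"]/@title') as $attr) {
    echo $attr->value . PHP_EOL;
}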

This assumes that the class attributes are exactly those (verbatim), but I hope you get the idea.
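If the class attributes are not verbatim, for example when an element carries several classes, one common XPath 1.0 idiom is to match `text` as a whole token in the class list. The sketch below assumes the DOM extension is available; the extra class names and the `example.com` addresses are made up for illustration.

```php
<?php
// Hypothetical markup: the divs carry extra classes, so @class="text" would not match.
$html = '<div style="demo">
<div class="row">
    <div title="first@example.com" class="text highlight">ABC</div>
</div>
<div class="row odd">
    <div title="second@example.com" class="small text">PQR</div>
</div></div>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Pad the class list with spaces so " text " matches as a whole token,
// not as a substring of a longer name such as "textarea".
$query = '//div[contains(concat(" ", normalize-space(@class), " "), " text ")]/@title';

$titles = [];
foreach ($xpath->query($query) as $attr) {
    $titles[] = $attr->value; // each result is a DOMAttr node
}

print_r($titles);
```

This trades the strict parent chain for a looser match on any div with the `text` class, so it only fits when that class is not used elsewhere on the page.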

Álvaro G. Vicario
Thank you for your idea
Ajith