Hello. I'm trying to parse an HTML file.

The idea is to find every div with class='thebest' and, within each one, read the text of the spans with the title and desc classes.

Here is my code:

<?php

$example=<<<KFIR
<html>
<head>
<title>test</title>
</head>
<body>
 <div class="a">moshe1
<div class="aa">haim</div>
 </div>
 <div class="a">moshe2</div>
 <div class="b">moshe3</div>

<div class="thebest">
<span class="title">title1</span>
<span class="desc">desc1</span>
</div>
<div class="thebest">
<span class="title">title2</span>
<span class="desc">desc2</span>
</div>

</body>
</html>
KFIR;


$doc = new DOMDocument();
@$doc->loadHTML($example);
$xpath = new DOMXPath($doc);
$expression="//div[@class='thebest']";
$arts = $xpath->query($expression);

foreach ($arts as $art) {
    $arts2=$xpath->query("//span[@class='title']",$art);
    echo $arts2->item(0)->nodeValue;
    $arts2=$xpath->query("//span[@class='desc']",$art);
    echo $arts2->item(0)->nodeValue;
}
echo "done";

The expected result is:

title1desc1title2desc2done 

The result I'm receiving is:

title1desc1title1desc1done
+1  A: 

Instead of running the second query, try textContent:

foreach ($arts as $art) {
    echo $art->textContent;
}

textContent returns the text content of this node and its descendants.
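
One thing to watch for: textContent also includes the line breaks between the spans, so the raw output will not be exactly "title1desc1". A minimal sketch of normalizing that whitespace, assuming a trimmed-down copy of the markup from the question (the $html and $results names are only for illustration):

```php
<?php
// Hypothetical, trimmed-down markup based on the question's example.
$html = <<<HTML
<div class="thebest">
<span class="title">title1</span>
<span class="desc">desc1</span>
</div>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$results = [];
foreach ($xpath->query("//div[@class='thebest']") as $div) {
    // textContent keeps the newlines between the spans;
    // collapse runs of whitespace into single spaces and trim the ends.
    $results[] = trim(preg_replace('/\s+/', ' ', $div->textContent));
}

echo implode("\n", $results), "\n";
```

If the title and desc need to be handled separately, querying the spans individually is probably the better fit.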

As an alternative, change the XPath to

$expression="//div[@class='thebest']/span[@class='title' or @class='desc']";
$arts = $xpath->query($expression);

foreach ($arts as $art) {
    echo $art->nodeValue;
}

That fetches the span children of each div with class thebest, keeping only the spans whose class is title or desc.

Gordon
+1  A: 

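Make the queries relative: start them with a dot (e.g. ".//…"). A path that begins with // is evaluated from the document root even when a context node is passed, which is why every iteration returned the first title and desc.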

foreach ($arts as $art) {
    // Note: single slash (direct child)
    $titles = $xpath->query("./span[@class='title']", $art);
    if ($titles->length > 0) {
        $title = $titles->item(0)->nodeValue;
        echo $title;
    }

    $descs = $xpath->query("./span[@class='desc']", $art);
    if ($descs->length > 0) {
        $desc = $descs->item(0)->nodeValue;
        echo $desc;
    }
}
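
To see the difference side by side, here is a small self-contained sketch. It assumes a cut-down version of the question's markup, and the $absolute and $relative arrays are just illustrative names:

```php
<?php
// Hypothetical, cut-down markup: two divs, one title span each.
$html = <<<HTML
<div class="thebest"><span class="title">title1</span></div>
<div class="thebest"><span class="title">title2</span></div>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$absolute = [];
$relative = [];
foreach ($xpath->query("//div[@class='thebest']") as $div) {
    // "//span" starts at the document root, so the context node is ignored.
    $absolute[] = $xpath->query("//span[@class='title']", $div)->item(0)->nodeValue;
    // ".//span" starts at the context node ($div).
    $relative[] = $xpath->query(".//span[@class='title']", $div)->item(0)->nodeValue;
}

echo implode(',', $absolute), "\n"; // title1,title1
echo implode(',', $relative), "\n"; // title1,title2
```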
salathe