Hi,

I am trying to write a web scraper using simplehtmldom. I want to find a tag by searching its contents, meaning the plain text inside it rather than the tag type. Once I have found that tag, I want to get the tag that follows it.

How do I find a tag based on its contents? And once I have it how do I find the following tag?

Any help would be appreciated.

Thanks.

A: 

The following will enable you to search all text nodes, then get the next tag:

// Use Simple_HTML_DOM special selector 'text'
// to retrieve all text nodes from the document
$textNodes = $html->find('text');
$foundTag = null;

foreach($textNodes as $textNode) {
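    // Trim surrounding whitespace so that indentation or newlines
    // in the markup do not prevent a match
    if(trim($textNode->plaintext) === 'Hello World') {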
        // Get the parent of the text node
        // (A text node is always a child of
        //  its container)
        $foundTag = $textNode->parent();
        break;
    }
}

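// Stays null if no match was found; next_sibling() also
// returns null when no tag follows the match
$nextTagAfter = $foundTag ? $foundTag->next_sibling() : null;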
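For comparison, here is a minimal self-contained sketch of the same idea (scan the text nodes, take the parent of the match, then take the following tag) written against PHP's bundled DOM extension rather than Simple_HTML_DOM. The helper name findTagAfterText and the sample markup are made up for illustration, and it assumes PHP 7.1 or later with the DOM extension enabled.

```php
<?php
// Sketch only: uses PHP's built-in DOMDocument/DOMXPath, not Simple_HTML_DOM.

/**
 * Hypothetical helper: find the first text node whose trimmed content
 * equals $needle, then return the element that follows its parent tag.
 * Returns null when there is no match or no following tag.
 */
function findTagAfterText(string $markup, string $needle): ?DOMElement
{
    $doc = new DOMDocument();
    $doc->loadHTML($markup);
    $xpath = new DOMXPath($doc);

    // Walk every text node in document order
    foreach ($xpath->query('//text()') as $textNode) {
        if (trim($textNode->nodeValue) !== $needle) {
            continue;
        }

        // The tag containing the matching text
        $foundTag = $textNode->parentNode;

        // First element sibling after that tag, if any
        $next = $xpath->query('following-sibling::*[1]', $foundTag)->item(0);

        return $next instanceof DOMElement ? $next : null;
    }

    return null;
}

// Illustrative markup, not taken from any real page
$markup = '<div><h2>Hello World</h2><p>Next paragraph</p></div>';
$next = findTagAfterText($markup, 'Hello World');

echo $next ? $next->textContent : 'not found';
```

The comparison is done in PHP instead of inside the XPath expression so that search text containing quotes does not need escaping.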

This is not your first question about basic Simple_HTML_DOM usage. You might want to read the official documentation.

Andrew Moore