How can I scan an HTML page for text within a certain div?

+1  A: 

The simplest way to do this would be to use the Simple HTML DOM parser:

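// Load the library (adjust the path to wherever simple_html_dom.php lives)
include 'simple_html_dom.php';

// Create a DOM object from a URL
$html = file_get_html('http://www.google.com/');

// Find all <div> elements whose id attribute is "foo"
$ret = $html->find('div[id=foo]');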
codaddict
A: 

Use preg_match to match the substring you want, or use the DOM/XML extensions.
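
As a rough sketch of the preg_match route, the snippet below pulls the text out of a div with id "foo". The sample markup and the id are assumptions made up for illustration; in practice the HTML would come from something like file_get_contents($url). The pattern stops at the first closing </div>, so it will not cope with nested divs, which is the usual reason to prefer a DOM parser.

```php
<?php
// Hypothetical sample markup, used here only so the example is self-contained.
$html = '<html><body><div id="foo">Hello, <b>world</b>!</div><div id="bar">Other</div></body></html>';

// Match the opening <div> whose id is "foo", then lazily capture up to the first </div>.
// "s" lets "." span newlines; "i" makes the match case-insensitive.
// Limitation: this breaks if the target div contains nested <div> elements.
$pattern = '/<div\b[^>]*\bid\s*=\s*["\']foo["\'][^>]*>(.*?)<\/div>/is';

$text = null;
if (preg_match($pattern, $html, $matches)) {
    // $matches[1] holds the inner HTML; strip_tags() reduces it to plain text.
    $text = trim(strip_tags($matches[1]));
}

echo $text, PHP_EOL;
```

If the div is not found, $text stays null, so check for that before using the result.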

jspcal
A: 

You can also do this using the DOMDocument class.

Usage is pretty straightforward:

$dom = new DOMDocument();
$dom->loadHTML(file_get_contents($url));

// Example: look up the element with id "foo"; its text is in $div->textContent
$div = $dom->getElementById('foo');

Documentation is here.

An example of real-world usage can be found here.
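
For a fuller picture, here is a minimal sketch of the same approach that actually reads the text of the div. The sample markup and the id "foo" are assumptions for illustration; the call to libxml_use_internal_errors() is an addition to keep warnings about sloppy real-world HTML from being printed.

```php
<?php
// Hypothetical sample markup; replace with file_get_contents($url) for a live page.
$html = '<html><body><div id="foo">Hello, <b>world</b>!</div><div id="bar">Other</div></body></html>';

$dom = new DOMDocument();

// Collect libxml parse warnings internally instead of emitting them.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

// getElementById() returns null when no element has the given id.
$div = $dom->getElementById('foo');

// textContent concatenates the text of the element and all its descendants.
$text = $div !== null ? trim($div->textContent) : null;

echo $text, PHP_EOL;
```

Unlike a regular expression, this keeps working when the target div contains nested elements, since the parser tracks the tree structure.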

Koraktor
A: 

You could use the built-in functionality as suggested by others, or you could try the Simple HTML DOM Parser, which is implemented as a simple PHP class and a few helper functions. It supports CSS-selector-style screen scraping (as in jQuery), can handle invalid HTML, and even provides a familiar interface for manipulating a DOM.

It's worth checking out at http://simplehtmldom.sourceforge.net/

andreas
Didn't the person above you already say that?
tarnfeld