ansaurus

Question

Answer 1

+3 A:

SimpleHtmlDom example (isn't it pretty?):

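// Load the library (assumes simple_html_dom.php sits next to this script)
require_once 'simple_html_dom.php';

// Create DOM from URL or file
$html = file_get_html('http://www.google.com/');

// Find all links
foreach ($html->find('a') as $element) {
    echo $element->href . '<br>';
    echo $element->plaintext; // the link text: this is what you want
}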

karim79 2009-08-17 08:43:07

Answer 2

A:

If the HTML page you're reading is very regular (for instance, machine-generated according to predictable patterns), something like this would work:

preg_match('|<a\s+href="http://www.example.com/search\?la=en&amp;q=(\w+)"\s*>\1</a>|', $page)

But if it gets any more complicated than that, regular expressions probably won't be enough for the job - you'd be better off using a full HTML parser to extract the links and check them one-by-one to find the text you want.
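
As a rough sketch of the parser route, assuming PHP's bundled DOM extension is available: the helper name findLinksByText and the sample page below are made up for illustration and are not part of the answer above. The helper walks every link and keeps the ones whose visible text equals the search string.

```php
<?php
// Hypothetical helper (not from the answers here): returns the href of every
// <a> element whose visible text equals $text.
function findLinksByText(string $html, string $text): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $matches = [];
    foreach ($doc->getElementsByTagName('a') as $link) {
        // textContent is the link text with any nested tags stripped.
        if (trim($link->textContent) === $text) {
            $matches[] = $link->getAttribute('href');
        }
    }
    return $matches;
}

// Made-up sample page for illustration.
$page = '<p><a href="http://www.example.com/search?la=en&amp;q=foo">foo</a> '
      . '<a href="http://www.example.com/about">About <b>us</b></a></p>';

print_r(findLinksByText($page, 'foo'));
```

Note that getAttribute() returns the decoded attribute value, so an &amp; in the page source comes back as a plain &.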

David Zaslavsky 2009-08-17 08:44:59

I believe you should escape the dots in the URL? http://www\.example\.com/

Håkon 2009-08-17 11:32:13
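
To illustrate the point about dots, here is a small sketch with made-up URLs. It compares an unescaped and an escaped pattern, and shows that preg_quote() can do the escaping for you when the literal part of the pattern comes from a string. The '|' delimiter is passed to preg_quote() so that it would be escaped too.

```php
<?php
// Made-up URL prefix for illustration.
$prefix  = 'http://www.example.com/search?la=en';
// Escape every regex metacharacter in the prefix, plus the '|' delimiter.
$escaped = preg_quote($prefix, '|');

// Unescaped: each '.' matches any character, so a look-alike host slips through.
$loose  = '|^http://www.example.com/|';
// Escaped: each '.' only matches a literal dot.
$strict = '|^http://www\.example\.com/|';

var_dump(preg_match($loose,  'http://wwwXexample.com/'));
var_dump(preg_match($strict, 'http://wwwXexample.com/'));
var_dump(preg_match('|^' . $escaped . '$|', $prefix));
```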
