I have an arbitrary number of links that look like this:

<a href="http://url.com/?foo=bar&amp;p=20" title="foo">Foo</a><a href="http://url2.com/?foo=bar&amp;p=30" title="foo">Foo</a>

I'm trying to extract the parameter p from each href found, so in this case the end result would be the array: array(20, 30)

What would be a good regex for this? Thanks

+5  A: 

Don’t try to parse HTML with regular expressions; use an HTML parser like PHP’s DOM library or the PHP Simple HTML DOM Parser instead. Then parse the URL with parse_url and the query string with parse_str.

Here’s an example:

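$html = str_get_html('…');
$p = array();
foreach ($html->find('a[href]') as $a) {
    // Decode entities in case the raw attribute is returned, so "&amp;p=20" yields "p", not "amp;p".
    $href = html_entity_decode($a->getAttribute('href'));
    $query = parse_url($href, PHP_URL_QUERY);
    if (!is_string($query)) continue; // no query string, or a malformed URL
    parse_str($query, $args);
    if (isset($args['p'])) $p[] = $args['p'];
}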
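
For comparison, the same approach can be sketched with PHP's built-in DOM extension, which needs no third-party library. This is only a minimal sketch: the function name extract_param_from_links is made up for illustration, and the sample markup is the one from the question. DOMDocument decodes HTML entities in attribute values, so no separate decoding step should be needed here.

```php
<?php
// Hypothetical helper (the name is an assumption, not part of any library):
// collects the value of one query parameter from every <a href> in an HTML string.
function extract_param_from_links(string $html, string $param): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally; real-world HTML is rarely well-formed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $values = array();
    foreach ($doc->getElementsByTagName('a') as $a) {
        if (!$a->hasAttribute('href')) continue;
        // Entities are already decoded, so "&amp;" arrives here as a plain "&".
        $query = parse_url($a->getAttribute('href'), PHP_URL_QUERY);
        if (!is_string($query)) continue; // no query string, or a malformed URL
        parse_str($query, $args);
        if (isset($args[$param])) $values[] = $args[$param];
    }
    return $values;
}

$html = '<a href="http://url.com/?foo=bar&amp;p=20" title="foo">Foo</a>'
      . '<a href="http://url2.com/?foo=bar&amp;p=30" title="foo">Foo</a>';
print_r(extract_param_from_links($html, 'p'));
```

Note that parse_str returns the values as strings; cast them with (int) if integers are needed.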
Gumbo