I'm trying to grab the destination (dynamic) of a link (static) with PHP.

I'm not sure what the best way to do this is.

The link is:

<a href=page.php?XXYYYYYYY>LinkName</a>

The X's are letters and the Y's are numbers (both can vary in length). 'LinkName' always stays the same, though.

Is regex the best option here? Or is there a better way?

+2  A: 

Regex is not the best way. Use an HTML parser such as DOMDocument.
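
As a rough illustration of that suggestion (a sketch, not the answerer's own code), the snippet below loads a sample fragment with DOMDocument and picks out the anchor whose text is 'LinkName'. The sample markup and the value `ab1234567` are made up for the example; it assumes the link text is unique on the page.

```php
<?php
// Hypothetical sample markup mirroring the question, including the unquoted href.
$html = '<p><a href=page.php?ab1234567>LinkName</a> <a href=other.php>Other</a></p>';

// Collect libxml warnings about imperfect markup instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$href = null;
foreach ($doc->getElementsByTagName('a') as $anchor) {
    // The link text is the static part, so match on it.
    if (trim($anchor->textContent) === 'LinkName') {
        $href = $anchor->getAttribute('href');
        break;
    }
}

var_dump($href);
```

If no anchor matches, `$href` stays `null`, so the caller can tell "not found" apart from an empty attribute.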

Blair McMillan
+5  A: 

I would use a DOM parser, such as PHP's built-in one or Simple HTML DOM Parser, to extract the link, and parse_url() to analyze the URL:

This function parses a URL and returns an associative array containing any of the various components of the URL that are present.
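
A minimal sketch of the parse_url() step, assuming the href has already been extracted. The value `page.php?ab1234567` is a made-up example in the question's format, and the letter/digit split at the end is an optional extra, not something the answer prescribes.

```php
<?php
// Hypothetical href value: letters followed by digits after the "?".
$href = 'page.php?ab1234567';

// parse_url() also accepts relative URLs; only the components present are returned.
$parts = parse_url($href);
$path  = $parts['path'] ?? '';
$query = $parts['query'] ?? '';

// Optional: separate the letters from the digits if both are needed.
$letters = $digits = null;
if (preg_match('/^([A-Za-z]+)(\d+)$/', $query, $m)) {
    $letters = $m[1];
    $digits  = $m[2];
}

var_dump($path, $query, $letters, $digits);
```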

Pekka
A: 

This pattern captures only what follows href=, up to the closing > before the link name. It assumes the attribute value is unquoted, as in the question:

href=([\w.?]+)
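
If regex is used anyway, a stricter variant might look like the sketch below. It is an assumption-laden illustration rather than the pattern above: it tolerates optional quotes, requires the literal `page.php?` prefix, and anchors on the fixed text 'LinkName' so other links are ignored. The sample input is made up.

```php
<?php
// Hypothetical input in the question's format.
$html = '<a href=page.php?ab1234567>LinkName</a>';

// Optional quote, literal "page.php?", then letters and digits captured separately,
// followed by the fixed link text.
$pattern = '~<a\s+href=["\']?page\.php\?([A-Za-z]+)(\d+)["\']?\s*>LinkName</a>~i';

$letters = $digits = null;
if (preg_match($pattern, $html, $m)) {
    $letters = $m[1];
    $digits  = $m[2];
}

var_dump($letters, $digits);
```

This breaks as soon as the anchor gains other attributes or the markup changes shape, which is the main argument for a parser.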
Marcos Placona
+2  A: 

If your HTML were well-formed XML (attribute values quoted, tags closed), you could do this easily with SimpleXML:

$html = <<< HTML
<ul>
    <li><a href="page.php?XX">Link1</a></li>
    <li><a href="page.php?YY">Link2</a></li>
    <li><a href="page.php?ZZ">Link3</a></li>
</ul>
HTML;

and then

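// Parse the markup; this only works if it is well-formed XML.
$doc = simplexml_load_string($html);
// Select the href attribute of every <a> element.
$links = $doc->xpath('//a/@href');
foreach ($links as $link) {
    // Each result is a SimpleXMLElement, so cast it to a string first.
    $url = parse_url((string) $link);
    var_dump($url['query']);
}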

for output

string(2) "XX"
string(2) "YY"
string(2) "ZZ"

If valid HTML is not an option, try XMLReader, DOM, or Simple HTML DOM (as Pekka suggested).

Gordon