I would like to get the links from the search results. Can someone please help me with the regular expression to do this? I've got this, and it doesn't work:
preg_match_all("/<h3(.*)><a href=\"(.*)\"(.*)<\/h3>/", $result, $matches);
Your pattern is most likely failing because its quantifiers are greedy: each (.*) matches as much text as it can, so one match can swallow several results at once. Making the quantifiers lazy should solve that issue...
preg_match_all('#<h3.*?><a href="(.*?)".*?</h3>#', $result, $matches);
print_r($matches[1]);
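To make the greedy vs lazy difference concrete, here is a minimal sketch that runs both patterns over the same input. The markup in $result is made up for illustration (two headings on a single line) and is not real search output, so treat the exact strings as assumptions.

```php
<?php
// Hypothetical sample markup standing in for $result; not real search output.
$result = '<h3 class="r"><a href="http://example.com/a" class="l">First</a></h3>'
        . '<h3 class="r"><a href="http://example.com/b" class="l">Second</a></h3>';

// Greedy: each (.*) grabs as much as it can, so one match spans both headings.
preg_match_all('#<h3(.*)><a href="(.*)"(.*)</h3>#', $result, $greedy);

// Lazy: each .*? stops at the first point where the rest of the pattern can match.
preg_match_all('#<h3.*?><a href="(.*?)".*?</h3>#', $result, $lazy);

print_r($greedy[2]); // captured "href" text from the greedy pattern
print_r($lazy[1]);   // captured URLs from the lazy pattern
```

On this sample the greedy pattern produces a single match whose captured "URL" runs past the closing quote into the next attribute, while the lazy pattern returns both URLs separately.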
There are possibly a few rare URLs that could trip the pattern up, but chances are you won't run into one. That said, stillstanding makes a good point: using the API would be a better option.
As for people who give the blanket answer "You can't parse HTML with regex, use a DOM"... While you cannot build a generic HTML parser with regex (and should be using DOM for that task), you can match patterns in text that you know follows a certain structure; the fact that the structure is HTML is irrelevant. Yes, if Google changes its layout the pattern will probably break, but the same is probably true of a DOM parser. (P.S. I'm well aware this will probably get down-voted by the sheeple.)
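For comparison, this is roughly what the DOM route would look like, as a sketch using PHP's bundled DOMDocument and DOMXPath classes. The sample markup and the XPath expression //h3/a[@href] are assumptions for illustration; like the regex, the query depends on the page keeping its links inside h3 elements.

```php
<?php
// Hypothetical sample markup standing in for $result; not real search output.
$result = '<h3 class="r"><a href="http://example.com/a" class="l">First</a></h3>'
        . '<h3 class="r"><a href="http://example.com/b" class="l">Second</a></h3>';

// Collect parser warnings internally instead of printing them for sloppy markup.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($result);
libxml_clear_errors();

// Select every <a> with an href that sits directly inside an <h3>.
$xpath = new DOMXPath($doc);
$links = [];
foreach ($xpath->query('//h3/a[@href]') as $anchor) {
    $links[] = $anchor->getAttribute('href');
}

print_r($links);
```

Either way the extraction is tied to the current layout: if the links move out of the h3 elements, both the pattern and the XPath query need updating.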