+2  Q: 

Simple RegEx PHP

Since I am completely useless at regex and this has been bugging me for the past half an hour, I think I'll post this up here as it's probably quite simple.

<a href="/folder/files/hey/">hey.exe</a>
<a href="/folder/files/hey2/">hey2.dll</a>
<a href="/folder/files/pomp/">pomp.jpg</a>

In PHP, I need to extract the text between the <a> tags. For example:

hey.exe
hey2.dll
pomp.jpg
+2  A: 

<a href="[^"]*">([^<]*)</a>
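To try this pattern from PHP, it has to be wrapped in delimiters first. The following is only a minimal sketch: the variable names `$html` and `$names` are illustrative, the input is the sample from the question, and `#` is chosen as the delimiter (an assumption, any valid delimiter would do) so that the `/` in `</a>` needs no escaping.

```php
<?php
// Sample input: the three links from the question (variable name is illustrative).
$html = '<a href="/folder/files/hey/">hey.exe</a>' . "\n"
      . '<a href="/folder/files/hey2/">hey2.dll</a>' . "\n"
      . '<a href="/folder/files/pomp/">pomp.jpg</a>';

// '#' is used as the delimiter so the '/' in '</a>' does not have to be escaped.
$pattern = '#<a href="[^"]*">([^<]*)</a>#';

preg_match_all($pattern, $html, $matches);

// $matches[1] holds the text captured by the first (and only) group.
$names = $matches[1];
print_r($names);
```

With this input, `$names` should contain hey.exe, hey2.dll and pomp.jpg, in that order.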

Douglas Leeder
+2  A: 

I found this regular expression tester to be helpful.

CheGueVerra
Even better: http://gskinner.com/RegExr/ (Flash implementation, interactive and all)
Tomalak
My favorite is http://rubular.com/
Chad Birch
The ICG tester is based on .NET, RegExr is ActionScript, and Rubular is Ruby. Given that the OP is using PHP, it would probably be more helpful to recommend a PHP-based tester. http://www.google.com/search?q=PHP+regex+tester
Alan Moore
Another vote for RegExr
Rytis
+2  A: 

Here is a very simple one:

<a.*>(.*)</a>

However, you should be careful if you have several matches in the same line, e.g.

<a href="/folder/hey">hey.exe</a><a href="/folder/hey2/">hey2.dll</a>

In this case, the correct regex would be:

<a.*?>(.*?)</a>

Note the '?' after the '*' quantifier. By default, quantifiers are greedy, which means they consume as many characters as they can (so the first pattern would return only "hey2.dll" in this example). By appending a question mark, you make them ungreedy (lazy), so they match as few characters as possible, which should better fit your needs.
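The difference can be checked with a small sketch. It assumes the two-link line shown above as input; the variable names `$line`, `$greedy` and `$lazy` are illustrative.

```php
<?php
// Two links on a single line, as in the example above.
$line = '<a href="/folder/hey">hey.exe</a><a href="/folder/hey2/">hey2.dll</a>';

// Greedy quantifiers: '<a.*>' runs on to the last '>' that still allows a match.
preg_match_all('/<a.*>(.*)<\/a>/', $line, $greedy);

// Lazy quantifiers: each '.*?' stops at the first position that allows a match.
preg_match_all('/<a.*?>(.*?)<\/a>/', $line, $lazy);

print_r($greedy[1]); // a single capture: hey2.dll
print_r($lazy[1]);   // two captures: hey.exe and hey2.dll
```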

Luc Touraille
+2  A: 

This appears to work:

$pattern = '/<a.*?>(.*?)<\/a>/';
Chad Birch
+6  A: 

Avoid using '.*' even if you make it ungreedy, until you have some more practice with RegEx. I think a good solution for you would be:

'/<a[^>]+>([^<]+)<\/a>/i'

Note the '/' delimiters: they are required by PHP's preg suite of regex functions, which is what you should use here. The trailing 'i' makes the match case-insensitive. It would look like this:

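$pattern = '/<a[^>]+>([^<]+)<\/a>/i';
$string  = '<a href="/folder/files/hey/">hey.exe</a>'; // sample input from the question
preg_match_all($pattern, $string, $matches);
// the results are stored in the $matches variable as an array:
// $matches[0] holds the full matches, $matches[1] the text between the <a></a> tags
print_r($matches);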
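To illustrate why the negated character classes are the safer choice, here is a hedged sketch comparing the two styles. The input is hypothetical: it puts an `<abbr>` tag (which also starts with `<a`) in front of one of the links from the question. The variable names are illustrative.

```php
<?php
// Hypothetical input: an <abbr> tag precedes the link we actually want.
$string = '<abbr title="x">tip</abbr> <a href="/folder/files/hey/">hey.exe</a>';

// Lazy '.*?' version: '<a.*?>' happily matches '<abbr title="x">',
// and '(.*?)' then has to stretch across tags until it finds '</a>'.
preg_match_all('/<a.*?>(.*?)<\/a>/', $string, $lazy);

// Negated-class version: '[^<]+' cannot cross a '<', so the attempt that
// starts at '<abbr ...>' fails and only the real link is matched.
preg_match_all('/<a[^>]+>([^<]+)<\/a>/i', $string, $negated);

print_r($lazy[1]);    // one capture that contains leftover markup
print_r($negated[1]); // one capture: hey.exe
```

Note that neither pattern is a full HTML parser; the sketch only shows that the character-class version fails more predictably on input like this.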
robmerica
+1 for recommending against (.*) and using exclusive character classes instead.
Tomalak