This regex comes from Atwood and is used to filter out anchor tags with anything other than the href and a title:

 <a\shref="(\#\d+|(https?|ftp)://[-A-Za-z0-9+&@#/%?=~_|!:,.;]+)"(\stitle="[^"]+")?\s?>

I need to allow an additional attribute that specifically matches: target="_blank". So the following tag should be allowed:

 <a href="http://www.google.com" target="_blank">

I tried changing the pattern to these:

 <a\shref="(\#\d+|(https?|ftp)://[-A-Za-z0-9+&@#/%?=~_|!:,.;]+)"(\stitle="[^"]+")(\starget="_blank")?\s?>
 <a\shref="(\#\d+|(https?|ftp)://[-A-Za-z0-9+&@#/%?=~_|!:,.;]+)"(\stitle="[^"]+")(\starget=\"_blank\")?\s?>

Clearly I don't know regex very well. How should the pattern be adjusted to allow the blank target and no other targets?

+1  A: 
<a\shref="(\#\d+|(https?|ftp)://[-A-Za-z0-9+&@#/%?=~_|!:,.;]+)"\s(target=\"_blank\")>

Will do what you are asking.
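As a quick sanity check, here is a minimal Python sketch that runs the pattern above against a few sample tags. The `is_allowed` helper and the sample list are illustrative assumptions, not part of the original sanitizer, and the use of `fullmatch` (whole-string matching) is also an assumption about how the pattern is applied. Note that, as written, the pattern requires the target attribute and has no title group.

```python
import re

# Pattern copied verbatim from the answer above. The \" escapes are harmless:
# in a regex they simply match a literal double quote.
ANSWER_PATTERN = re.compile(
    r'<a\shref="(\#\d+|(https?|ftp)://[-A-Za-z0-9+&@#/%?=~_|!:,.;]+)"'
    r'\s(target=\"_blank\")>'
)


def is_allowed(tag: str) -> bool:
    """Hypothetical helper: True only if the entire tag matches the pattern."""
    return ANSWER_PATTERN.fullmatch(tag) is not None


samples = [
    '<a href="http://www.google.com" target="_blank">',                 # from the question
    '<a href="http://www.google.com">',                                 # no target
    '<a href="http://www.google.com" target="_self">',                  # other target
    '<a href="http://www.google.com" title="Google" target="_blank">',  # title present
]

for tag in samples:
    print(is_allowed(tag), tag)
```

Only the first sample is accepted; the tag without a target and the tag with a title are both rejected, which may or may not be what you want.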

If you are new to regex, let me recommend RegexBuddy. It is a program that lets you test your regexes on sample text or sample files.

Saves a lot of time.

http://www.regular-expressions.info/regexbuddy.html (Regex Buddy)

http://www.regular-expressions.info is also a good resource.

Blankasaurus
Note that this solution imposes that the said attributes (href, target and title) have a specific order.
Felix
I was using this url to test with but hadn't come up with a pattern that worked. http://derekslager.com/blog/posts/2007/09/a-better-dotnet-regular-expression-tester.ashx
Sailing Judo
This worked with the example I had... thanks.
Sailing Judo
+1  A: 
<a\shref="(\#\d+|(https?|ftp)://[-A-Za-z0-9+&@#/%?=~_|!:,.;]+)"(\stitle="[^"]+")?(\starget="_blank")?\s?>
Mimisbrunnr
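For anyone who wants to keep the optional title from the original pattern and also allow an optional target="_blank", one possible variant marks both groups with `?`. The Python sketch below is only an illustration under that assumption: the `URL` and `PATTERN` names, the `is_allowed` helper, and the choice of `fullmatch` are mine, not from the original sanitizer. As Felix's comment points out, the attribute order is still fixed (href, then title, then target).

```python
import re

# URL part taken unchanged from the original pattern in the question.
URL = r'(\#\d+|(https?|ftp)://[-A-Za-z0-9+&@#/%?=~_|!:,.;]+)'

# Assumed variant: title and target are each optional, in that fixed order.
PATTERN = re.compile(
    r'<a\shref="' + URL + r'"(\stitle="[^"]+")?(\starget="_blank")?\s?>'
)


def is_allowed(tag: str) -> bool:
    """Hypothetical helper: True only if the entire tag matches the pattern."""
    return PATTERN.fullmatch(tag) is not None


samples = [
    '<a href="http://www.google.com">',
    '<a href="http://www.google.com" title="Google">',
    '<a href="http://www.google.com" target="_blank">',
    '<a href="http://www.google.com" title="Google" target="_blank">',
    '<a href="#12" target="_blank">',
    '<a href="http://www.google.com" target="_self">',                  # rejected
    '<a href="http://www.google.com" target="_blank" title="Google">',  # rejected: order
    '<a href="http://www.google.com" onclick="x()">',                   # rejected
]

for tag in samples:
    print(is_allowed(tag), tag)
```

The first five samples are accepted and the last three are rejected. Accepting the attributes in either order would need an alternation or two separate checks, which this sketch does not attempt.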