There are many regexes out there for matching a URL. However, I'm trying to match only URLs that do not appear anywhere within an <a> hyperlink tag (href attribute, inner text, etc.). So NONE of the URLs in these should match:
<a href="http://www.example.com/">something</a>
<a href="http://www.example.com/">http://www.example2.com</a>
<a href="http://www.example.com/"><b>something</b>http://www.example.com/<span>test</span></a>
Any URL outside of <a></a> should be matched.
One approach I tried was to use a negative lookahead to check whether the first anchor tag after the URL is an opening <a> or a closing </a>. If it is a closing </a>, then the URL must be inside a hyperlink. I think the idea is sound, but my negative lookahead didn't work (or more accurately, I didn't write the regex correctly). Any tips are much appreciated.
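To make the lookahead idea concrete, here is a minimal sketch in Python of what I have in mind. The URL pattern is deliberately simplistic, and the names `URL_OUTSIDE_ANCHOR` and `find_urls_outside_anchors` are made up for illustration. It assumes reasonably well-formed HTML in which anchors are not nested.

```python
import re

# Hypothetical pattern name; the URL part is a simplified assumption,
# not a complete URL grammar.
URL_OUTSIDE_ANCHOR = re.compile(
    r"""https?://[^\s<>"']+      # a simple URL: scheme, then no spaces, quotes or angle brackets
        (?!                      # reject the URL if, scanning forward...
            (?:(?!</?a\b).)*     # ...over text that contains no anchor tag...
            </a\s*>              # ...the first anchor tag reached is a closing one
        )""",
    re.IGNORECASE | re.DOTALL | re.VERBOSE,
)


def find_urls_outside_anchors(html: str) -> list[str]:
    """Return URLs that are not inside an <a>...</a> element (attributes or text)."""
    return URL_OUTSIDE_ANCHOR.findall(html)


sample = (
    'See http://outside.example.org/page and '
    '<a href="http://www.example.com/">something</a> '
    '<a href="http://www.example.com/">http://www.example2.com</a> '
    '<a href="http://www.example.com/"><b>something</b>'
    'http://www.example.com/<span>test</span></a> '
    'then http://after.example.net/'
)

print(find_urls_outside_anchors(sample))
# ['http://outside.example.org/page', 'http://after.example.net/']
```

The `\b` after `a` keeps tags such as `<abbr>` from being treated as anchors. A regex like this can still be fooled by comments, scripts, or malformed markup, so a real HTML parser is likely the more robust route for arbitrary input.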