Hello,
I am having a problem with my regular expression: <a.*href=[\"'](.*?)[\"'].*>(.*?)</a>
. It is, as you can probably tell, supposed to take all the links from a string of HTML and return the link text in group 2, and the link target in group 1. But I am having a problem. If I try it in Javascript (using http://www.regextester.com/, with all the flags on), it works fine, but in Java, like this:
Pattern myPattern = Pattern.compile("<a.*href=[\"'](.*?)[\"'].*>(.*?)</a>", Pattern.CASE_INSENSITIVE);
Matcher match = myPattern.matcher(htmlData);
while(match.find()) {
String linkText = match.group(2);
String linkTarget = match.group(1);
}
I don't get all the matches I expect. With regex tester, I get many more and it works exactly like it is supposed to, but with the Java version, it just get 1 or 2 links per page.
Sorry if this is obvious, but I am new to regular expressions.
Thanks,
Isaac Waller
Edit: I think it might be something wrong with my regex. See, from this Apache indexof page:
<tr><td valign="top"><img src="/icons/sound2.gif" alt="[SND]"></td><td><a href="Bryan%20Adams%20-%20Here%20I%20Am.mp3">Bryan Adams - Here I Am.mp3</a></td><td align="right">27-Aug-2008 11:48 </td><td align="right">170K</td></tr>
<tr><td valign="top"><img src="/icons/sound2.gif" alt="[SND]"></td><td><a href="Cars%20-%20Drive.mp3">Cars - Drive.mp3</a></td><td align="right">26-Aug-2008 19:04 </td><td align="right">149K</td></tr>
<tr><td valign="top"><img src="/icons/sound2.gif" alt="[SND]"></td><td><a href="Cock%20Robin%20-%20When%20Your%20Heart%20Is%20Weak.mp3">Cock Robin - When Your Heart Is Weak.mp3</a></td><td align="right">26-Aug-2008 19:04 </td><td align="right">124K</td></tr>
<tr><td valign="top"><img src="/icons/sound2.gif" alt="[SND]"></td><td><a href="Colbie%20Caillat%20-%20Bubbly.mp3">Colbie Caillat - Bubbly.mp3</a></td><td align="right">27-Aug-2008 11:49 </td><td align="right">215K</td></tr>
<tr><td valign="top"><img src="/icons/sound2.gif" alt="[SND]"></td><td><a href="Colbie%20Caillat%20-%20The%20Little%20Things.mp3">Colbie Caillat - The Little Things.mp3</a></td><td align="right">27-Aug-2008 11:49 </td><td align="right">176K</td></tr>
<tr><td valign="top"><img src="/icons/sound2.gif" alt="[SND]"></td><td><a href="Coldplay%20-%20Violet%20Hill.mp3">Coldplay - Violet Hill.mp3</a></td><td align="right">27-Aug-2008 11:49 </td><td align="right">136K</td></tr>
<tr><td valign="top"><img src="/icons/sound2.gif" alt="[SND]"></td><td><a href="Corrs%20-%20Radio.mp3">Corrs - Radio.mp3</a></td><td align="right">26-Aug-2008 19:04 </td><td align="right">112K</td></tr>
<tr><td valign="top"><img src="/icons/sound2.gif" alt="[SND]"></td><td><a href="Corrs%20-%20What%20Can%20I%20Do.mp3">Corrs - What Can I Do.mp3</a></td><td align="right">26-Aug-2008 19:04 </td><td align="right">146K</td></tr>
<tr><td valign="top"><img src="/icons/sound2.gif" alt="[SND]"></td><td><a href="Counting%20Crows%20-%20Big%20Yellow%20Taxi.mp3">Counting Crows - Big Yellow Taxi.mp3</a></td><td align="right">26-Aug-2008 19:04 </td><td align="right">135K</td></tr>
<tr><td valign="top"><img src="/icons/sound2.gif" alt="[SND]"></td><td><a href="Curtis%20Stigers%20-%20I%20Wonder%20Why.mp3">Curtis Stigers - I Wonder Why.mp3</a></td><td align="right">26-Aug-2008 19:03 </td><td align="right">213K</td></tr>
<tr><td valign="top"><img src="/icons/sound2.gif" alt="[SND]"></td><td><a href="Cyndi%20Lauper%20-%20Time%20After%20Time.mp3">Cyndi Lauper - Time After Time.mp3</a></td><td align="right">26-Aug-2008 19:03 </td><td align="right">193K</td></tr>
<tr><td valign="top"><img src="/icons/sound2.gif" alt="[SND]"></td><td><a href="David%20Bowie%20-%20Absolute%20Beginners.mp3">David Bowie - Absolute Beginners.mp3</a></td><td align="right">26-Aug-2008 19:04 </td><td align="right">155K</td></tr>
<tr><td valign="top"><img src="/icons/sound2.gif" alt="[SND]"></td><td><a href="Depeche%20Mode%20-%20Enjoy%20The%20Silence.mp3">Depeche Mode - Enjoy The Silence.mp3</a></td><td align="right">26-Aug-2008 19:03 </td><td align="right">230K</td></tr>
<tr><td valign="top"><img src="/icons/sound2.gif" alt="[SND]"></td><td><a href="Dido%20-%20White%20Flag.mp3">Dido - White Flag.mp3</a></td><td align="right">27-Aug-2008 11:48 </td><td align="right">158K</td></tr>
I should get:
1: Bryan%20Adams%20-%20Here%20I%20Am.mp3
2: Bryan Adams - Here I Am.mp3
and many more like that. With Regex tester, I get all the results I want. With Java, I get none.