If my HTML is:

<tr><td>....</td><hr></tr>
<tr><td>....</td><hr></tr>
<tr><td>....</td><hr></tr>
<tr><td>....</td><hr></tr>
<tr><td>....</td><hr></tr>
<tr><td>....</td><hr></tr>

If my regex is:

Pattern p = Pattern.compile("<tr>(.*)<hr></tr>");

Should this get 1 result or all the individual rows?

Is there a way to force it to match each row separately, rather than everything from the first <tr> to the last <hr></tr>?

+11  A: 

Your regex uses .*, which is greedy. Try .*? instead. A greedy quantifier grabs as much as it can while still letting the following tokens match, so it runs all the way to the last <hr></tr> in your source text. A non-greedy (reluctant) quantifier grabs as little as it can before matching the next token(s).

Then, see this answer for more information about parsing HTML with regular expressions.
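
To make the greedy versus non-greedy difference concrete, here is a minimal sketch. The class name RowMatchDemo, the helper captures, and the sample cell contents a, b, c are made up for illustration. The rows are deliberately placed on one line: in Java, . does not match line terminators unless Pattern.DOTALL is set, so if every row sits on its own line the greedy pattern would already stop at the end of each line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RowMatchDemo {

    // Hypothetical helper: collects capture group 1 of every match of regex in input.
    static List<String> captures(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        // Three rows on a single line (sample data, not from the question).
        String html = "<tr><td>a</td><hr></tr>"
                    + "<tr><td>b</td><hr></tr>"
                    + "<tr><td>c</td><hr></tr>";

        // Greedy: .* runs to the last <hr></tr>, so there is one big match.
        System.out.println(captures("<tr>(.*)<hr></tr>", html).size());

        // Reluctant: .*? stops at the first <hr></tr>, so each row matches separately.
        System.out.println(captures("<tr>(.*?)<hr></tr>", html));
    }
}
```

With this input the first line printed is 1 and the second is [<td>a</td>, <td>b</td>, <td>c</td>]. For anything beyond a quick extraction like this, an HTML parser is probably the safer choice.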

Greg Hewgill
+1 - for the link. Hey @Blankman ... ARE YOU PAYING ATTENTION???
Stephen C
Are you guys familiar with HtmlParser? It's in Java. If so, I can post the issue I'm having.
Blankman
Post the question (as a new question of course!!!). I'm sure someone will be familiar enough to answer it.
Stephen C
http://stackoverflow.com/questions/2660866/parsing-html-using-htmlparser
Blankman