I have an HTML page and I want to extract the text between two tags, <b> and <BR>:

<b>Defendants Name:</b>Donahue, Leah A                                  <BR>

What is the regular expression to fetch the words between these two tags?

+3  A: 

See http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags

Juha Syrjälä
I was just about to go find this post. Though if the input is (and always will be) as basic as the example, a regex may work here. It's probably still a bad idea.
derivation
A: 

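I think this could work, as long as the input stays as simple as the example:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String str = "<b>Defendants Name:</b>Donahue, Leah A                                                    <BR>";
    // Capture the text after the closing </b>, up to (but not including) <BR>,
    // leaving the surrounding whitespace outside the group.
    Pattern pattern = Pattern.compile("</b>\\s*(.*?)\\s*<BR>", Pattern.CASE_INSENSITIVE);
    Matcher m = pattern.matcher(str);
    if (m.find())
    {
        System.out.println(m.group(1));
    }

It should print

"Donahue, Leah A" (excluding the quotes).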

npinti
A: 

You shouldn't use regexps for parsing HTML; use an HTML parser instead. Have a look at jTidy or NekoHTML.
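
To give a rough idea of the parser-based approach, here is a minimal sketch. It does not use jTidy or NekoHTML; to stay dependency-free it swaps in the HTML parser that ships with the JDK (`javax.swing.text.html.parser.ParserDelegator`). The class name `DefendantName` and the method `extract` are made up for illustration, and the sketch assumes the value always sits between a closing `</b>` and the next `<BR>`, as in the question.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper class, named here only for the example.
public class DefendantName {

    // Collects the text that follows a closing </b> until the next <BR>.
    static String extract(String html) throws IOException {
        StringBuilder result = new StringBuilder();

        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            boolean inBold = false;      // currently inside <b>...</b>
            boolean afterLabel = false;  // a </b> was seen, no <BR> yet

            @Override
            public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
                if (t == HTML.Tag.B) {
                    inBold = true;
                }
            }

            @Override
            public void handleEndTag(HTML.Tag t, int pos) {
                if (t == HTML.Tag.B) {
                    inBold = false;
                    afterLabel = true;
                }
            }

            @Override
            public void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos) {
                // <BR> has no closing tag, so it arrives as a "simple" tag.
                if (t == HTML.Tag.BR) {
                    afterLabel = false;
                }
            }

            @Override
            public void handleText(char[] data, int pos) {
                if (afterLabel && !inBold) {
                    result.append(data);
                }
            }
        };

        new ParserDelegator().parse(new StringReader(html), callback, true);
        return result.toString().trim();
    }

    public static void main(String[] args) throws IOException {
        String html = "<b>Defendants Name:</b>Donahue, Leah A                                  <BR>";
        System.out.println(extract(html));
    }
}
```

The callback only tracks two flags, so it would need extending for pages with several `<b>` labels; a full parser library such as the ones named above is likely the better fit for real-world markup.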

markusk