Hi, I'd like to know how to write a regex for the following code.

<a href="/search?q=user:111111+[apple]" class="post-tag" title="show all posts by this user in 'apple'">Apple</a><span class="item-multiplier">&times;&nbsp;171</span><br>

I just need to fetch "Apple" from the above source code. Can anybody help me write the regex? Thanks.

+1  A: 

There is an excellent tool, txt2re, that makes it easy to generate regular expressions in various languages. I used it to generate the following Java code:

import java.util.regex.*;

class Main
{
  public static void main(String[] args)
  {
    String txt="<a href=\"/search?q=user:111111+[apple]\" class=\"post-tag\" title=\"show all posts by this user in 'apple'\">Apple</a><span class=\"item-multiplier\">&times;&nbsp;171</span><br>";

    String re1=".*?";   // Non-greedy match on filler
    String re2="(?:[a-z][a-z]+)";   // Uninteresting: word
    String re3=".*?";   // Non-greedy match on filler
    String re4="(?:[a-z][a-z]+)";   // Uninteresting: word
    String re5=".*?";   // Non-greedy match on filler
    String re6="(?:[a-z][a-z]+)";   // Uninteresting: word
    String re7=".*?";   // Non-greedy match on filler
    String re8="((?:[a-z][a-z]+))"; // Word 1

    Pattern p = Pattern.compile(re1+re2+re3+re4+re5+re6+re7+re8,Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    Matcher m = p.matcher(txt);
    if (m.find())
    {
        String word1=m.group(1);
        System.out.println("("+word1+")"); // prints "(apple)": the fourth word, which comes from the href, not the link text
    }
  }
}
Omry
A: 

Normally you'll want to use a parser to convert the HTML into a traversable DOM.

But in this simple case, /<a href="[^\[]+\[([^\]]+)/ should do the trick: "apple" (the text between the square brackets in the href) will be in the first capture group.
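
For anyone who wants to try this from Java, here is a minimal sketch of how the pattern above could be applied. The class name TagExtractor and both method names are made up for illustration. The second pattern is not from this thread: it is an assumed variant that captures the visible link text ("Apple") instead of the bracketed value in the href, and it relies on the anchor carrying class="post-tag" as in the question's snippet.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative, not from the thread.
class TagExtractor {
    // The pattern from the answer: skip to the first '[' in the href,
    // then capture everything up to the closing ']'.
    private static final Pattern HREF_TAG =
        Pattern.compile("<a href=\"[^\\[]+\\[([^\\]]+)");

    // Assumed variant: capture the text between the opening and closing
    // tags of an anchor that has class="post-tag".
    private static final Pattern LINK_TEXT =
        Pattern.compile("<a\\b[^>]*class=\"post-tag\"[^>]*>([^<]+)</a>");

    // Returns the bracketed value from the href, or null if nothing matches.
    static String tagFromHref(String html) {
        Matcher m = HREF_TAG.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    // Returns the visible link text, or null if nothing matches.
    static String linkText(String html) {
        Matcher m = LINK_TEXT.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String html = "<a href=\"/search?q=user:111111+[apple]\" class=\"post-tag\" "
            + "title=\"show all posts by this user in 'apple'\">Apple</a>"
            + "<span class=\"item-multiplier\">&times;&nbsp;171</span><br>";

        System.out.println(tagFromHref(html)); // apple
        System.out.println(linkText(html));    // Apple
    }
}
```

Both patterns depend on the exact attribute order and quoting shown in the question, so they will break if the markup changes; that is the main reason a real HTML parser is the safer choice for anything beyond a one-off.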

Matt