Hi guys, I'm trying to capture text with a regular expression. My code is as follows.

Pattern p=Pattern.compile("<@a>(?:.|\\s)+?</@a>"); 
Matcher m = p.matcher(fileContents.toString());
while(m.find()) {
    //Error will be thrown at this point
    System.out.println(m.group());
}

If the text I want to capture is too long, a StackOverflowError is thrown. Otherwise, the code works well. How can I solve this problem?

+3  A: 

The dot and \s both match whitespace characters (the dot matches everything except line terminators), so the two alternatives overlap. That might lead to unnecessary backtracking. What do you want to match? Probably any character, including linebreaks?

Then just use the lazy dot with the dot-matches-newlines option enabled:

Pattern p=Pattern.compile("<@a>.+?</@a>", Pattern.DOTALL);

You are aware that you'll run into trouble if <@a> tags can be nested in your input?
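
To make the suggestion concrete, here is a minimal, self-contained sketch of the DOTALL approach. The class name `TagExtractor`, the method `extract`, and the generated test input are hypothetical and only for illustration; the pattern itself is the one given above. The input is built large on purpose, on the assumption that a body of that size is the kind of text that triggered the error in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagExtractor {
    // Lazy dot with DOTALL: '.' also matches line terminators,
    // so no (?:.|\s) alternation is needed.
    private static final Pattern TAG =
            Pattern.compile("<@a>.+?</@a>", Pattern.DOTALL);

    // Hypothetical helper: returns every non-nested <@a>...</@a> block.
    public static List<String> extract(CharSequence input) {
        List<String> matches = new ArrayList<>();
        Matcher m = TAG.matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Build a long, multi-line tag body followed by a short one.
        StringBuilder sb = new StringBuilder("<@a>");
        for (int i = 0; i < 200_000; i++) {
            sb.append("line ").append(i).append('\n');
        }
        sb.append("</@a> filler <@a>short</@a>");

        List<String> matches = extract(sb);
        System.out.println(matches.size());
        System.out.println(matches.get(1));
    }
}
```

Because the match is lazy, each block ends at the first `</@a>` that follows its opening tag, which is exactly why nested `<@a>` tags would be cut short.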

Tim Pietzcker
In fact, the backtrack points are probably what is causing the stack overflow.
Stephen C
That's what I meant to imply (without having seen the text that is to be matched). Thanks for the clarification :)
Tim Pietzcker
A: 

Thanks for the help! I got it!

Captain Kidd
BZZZT ... not an answer! Comments should be in comments.
Stephen C