ansaurus

Question

My regex is causing a stack overflow in Java; what am I missing?

Answer 1

+3 A:

Try this simplified version of your regex. It removes some unnecessary | operators (which might have been causing the regex engine to do a lot of branching) and adds beginning-of-line and end-of-line anchors.

static final String ANIMAL_INFO_REGEX = "^([a-zA-Z]+) *= *\"([a-zA-Z_. ]+)\"$";
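As a rough sketch of how the anchored pattern above could be applied to a single line: the class name, the parseLine helper, and the sample input are made up for illustration and are not from the original question, which presumably parses lines of the form key = "value".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnimalInfoRegexDemo {
    // The simplified regex from the answer, compiled once and reused.
    static final String ANIMAL_INFO_REGEX = "^([a-zA-Z]+) *= *\"([a-zA-Z_. ]+)\"$";
    static final Pattern ANIMAL_INFO_PATTERN = Pattern.compile(ANIMAL_INFO_REGEX);

    // Hypothetical helper: returns {key, value} when the whole line has the
    // form key = "value", or null when it does not.
    static String[] parseLine(String line) {
        Matcher m = ANIMAL_INFO_PATTERN.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        // Assumed sample input, not taken from the original question.
        String[] parsed = parseLine("species = \"Canis lupus\"");
        System.out.println(parsed[0] + " -> " + parsed[1]);

        // A line without the expected shape is rejected as a whole.
        System.out.println(parseLine("species: Canis lupus") == null);
    }
}
```

Because matches() already requires the entire input to match, the ^ and $ anchors are redundant here, but they keep the pattern safe if it is later used with find().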

Asaph 2010-09-10 02:40:03

Answer 2

+3 A:

This looks like bug 5050507. I agree with Asaph that removing the alternation should help; the bug specifically says "Avoid alternation whenever possible". I think you can probably go even simpler:

"^([a-zA-Z]+) *= *\"([^\"]+)"

Matthew Flaschen 2010-09-10 02:43:08

+1, but I want to emphasize that the bug **report** is bogus. The remarks in the Evaluation apply to *any* regex-directed (or NFA) regex engine, not just Java's. (That includes Perl, Python, PHP, .NET, JavaScript, and many others.)
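To make the alternation advice concrete, here is a small sketch comparing a quantified alternation with an equivalent character class. The two patterns and the input length are invented for illustration and are not the asker's regex. Whether the alternation version actually overflows depends on the JVM version and thread stack size, so the code only reports what happened.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class AlternationDemo {
    // Hypothetical patterns: both accept strings made only of 'a' and 'b'.
    static final Pattern WITH_ALTERNATION = Pattern.compile("^(a|b)*$");
    static final Pattern WITH_CHAR_CLASS = Pattern.compile("^[ab]*$");

    // Builds a string of n copies of c (works on older Java versions too).
    static String repeat(char c, int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    // Returns "matched", "no match", or "stack overflow".
    static String tryMatch(Pattern p, String input) {
        try {
            return p.matcher(input).matches() ? "matched" : "no match";
        } catch (StackOverflowError e) {
            // A quantified alternation may be matched recursively,
            // using stack depth proportional to the input length.
            return "stack overflow";
        }
    }

    public static void main(String[] args) {
        String input = repeat('a', 200_000);
        System.out.println("alternation: " + tryMatch(WITH_ALTERNATION, input));
        System.out.println("char class:  " + tryMatch(WITH_CHAR_CLASS, input));
    }
}
```

The character class version is expected to handle the long input without deep recursion, which is the practical reason for replacing alternations of single characters with a class.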

Alan Moore 2010-09-10 07:08:09

Answer 3

A:

Read this to understand the problem: http://www.regular-expressions.info/catastrophic.html ... and then use one of the other suggestions.

Zac Thompson 2010-09-10 03:06:11

Answer 4

+1 A:

As the others have said, your regex is much less efficient than it should be. I'd take it a step further and use possessive quantifiers:

"^([a-zA-Z]++) *+= *+\"([^\"]++)\"$"

But the way you're using the Scanner doesn't make much sense, either. There's no need to use findInLine(".*") to read the line; that's what nextLine() does. And you don't need to create another Scanner to apply your regex; just use a Matcher.

static final Pattern ANIMAL_INFO_PATTERN = 
    Pattern.compile("^([a-zA-Z]++) *+= *+\"([^\"]++)\"$");

...

  Matcher lineMatcher = ANIMAL_INFO_PATTERN.matcher("");
  while (scanFile.hasNextLine()) {
    String currentLine = scanFile.nextLine();
    if (lineMatcher.reset(currentLine).matches()) {
      matches.put(lineMatcher.group(1), lineMatcher.group(2));
    }
  }
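For anyone who wants to try the loop above in isolation, here is one possible self-contained version. The class name, the readAnimalInfo method, the LinkedHashMap, and the sample text are assumptions added for illustration; the original code elides how scanFile and matches are declared.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnimalInfoScanDemo {
    // Possessive quantifiers (++, *+) never give back what they matched,
    // so a non-matching line fails without backtracking.
    static final Pattern ANIMAL_INFO_PATTERN =
        Pattern.compile("^([a-zA-Z]++) *+= *+\"([^\"]++)\"$");

    // Hypothetical wrapper around the loop from the answer.
    static Map<String, String> readAnimalInfo(Scanner scanFile) {
        Map<String, String> matches = new LinkedHashMap<>();
        // One Matcher is created up front and reset for each line.
        Matcher lineMatcher = ANIMAL_INFO_PATTERN.matcher("");
        while (scanFile.hasNextLine()) {
            String currentLine = scanFile.nextLine();
            if (lineMatcher.reset(currentLine).matches()) {
                matches.put(lineMatcher.group(1), lineMatcher.group(2));
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Assumed sample data standing in for the file's contents.
        String sample = "name = \"Rex\"\nnot a key/value line\nsound=\"woof woof\"\n";
        try (Scanner scanFile = new Scanner(sample)) {
            System.out.println(readAnimalInfo(scanFile)); // prints {name=Rex, sound=woof woof}
        }
    }
}
```

Reusing a single Matcher via reset() avoids allocating a new one per line; the Scanner is only responsible for splitting the input into lines.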

Alan Moore 2010-09-10 06:57:16
