As I understand regular expressions, the string "00###" should match "[0-9]" but not "^[0-9]$". But that is not how it works with Java's regular expressions.

After investigating the problem, I found the following information (http://www.wellho.net/solutions/java-regular-expressions-in-java.html):

It might appear that Java regular expressions are default anchored with both a ^ and $ character.

Can we be sure that this is true for all versions of the JDK? And can this mode be turned off (i.e., can the default anchoring with ^ and $ be disabled)?

+8  A: 

As the article you linked to explains, it depends on which method you call. If you want the pattern to behave as if it were anchored with ^ and $, use matches. If you don't want that, create a Matcher and use its find method.

import java.util.regex.*;

public class Example
{
    public static void main(String[] args)
    {
        // String.matches() succeeds only if the whole string matches.
        System.out.println("Matches: " + "abc".matches("a+"));

        // Matcher.find() succeeds if any substring matches.
        Matcher matcher = Pattern.compile("a+").matcher("abc");
        System.out.println("Find: " + matcher.find());
    }
}

Output:

Matches: false
Find: true
Mark Byers
This isn't unique to Java, BTW. Python, and I believe also JavaScript, have some regex methods that anchor by default, and others that don't.
Laurence Gonsalves
@Laurence: JavaScript's `test()`, `exec()` and `match()` methods all work the same: if you want the match anchored, you have to anchor it yourself. On the other hand, Python's `match()` method works like Java's `lookingAt()`; the match is anchored to the beginning of the string but not to the end.
Alan Moore
+1  A: 

In addition to Mr. Byers's answer, note too that Matcher#find() picks up where its last successful match left off. That only matters for repeated use of a Matcher instance, but that's the feature that allows emulation of Perl's \G assertion. It's also useful in concert with Matcher#usePattern(Pattern), where you use one pattern to find some prefix and then swap in a repeating pattern (including \G) to loop over repeated matches with Matcher#find().
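
A minimal sketch of both points, with made-up names (FindResumeDemo, findAll, itemsAfterPrefix) and a made-up "items:" input format that exist only for illustration. The first method loops over find() to collect successive matches; the second finds a prefix, then swaps in a \G-anchored pattern so that only the items directly following the prefix are read.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindResumeDemo {
    // Collects every match by calling find() repeatedly; each call
    // resumes where the previous successful match ended.
    static List<String> findAll(String regex, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    // Finds the (hypothetical) "items:" prefix, then switches patterns.
    // usePattern() keeps the matcher's position, and \G ties each new
    // match to the end of the previous one, so the loop stops at the
    // first gap instead of skipping ahead.
    static List<String> itemsAfterPrefix(String input) {
        List<String> items = new ArrayList<>();
        Matcher m = Pattern.compile("items:").matcher(input);
        if (m.find()) {
            m.usePattern(Pattern.compile("\\G\\s*(\\d+),?"));
            while (m.find()) {
                items.add(m.group(1));
            }
        }
        return items;
    }

    public static void main(String[] args) {
        System.out.println(findAll("a+", "aXaaXaaa"));
        System.out.println(itemsAfterPrefix("skip 9, items: 1, 2, 3 end 4"));
    }
}
```

With this input the second call should collect only 1, 2 and 3: the 9 comes before the prefix, and the 4 is not contiguous with the previous match, so \G rejects it.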

There's also Matcher#lookingAt(), which is implicitly bounded at the beginning (^) but not at the end. I prefer to think that name was inspired by the Emacs function looking-at.
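
To put the three methods side by side, here is a small sketch that runs matches(), lookingAt() and find() on the string from the question. The class and method names (AnchoringDemo, probe) are invented for this example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class AnchoringDemo {
    // Returns { matches(), lookingAt(), find() } for the given regex and
    // input, using a fresh Matcher for each call so they don't interfere.
    static boolean[] probe(String regex, String input) {
        Pattern p = Pattern.compile(regex);
        return new boolean[] {
            p.matcher(input).matches(),   // whole input must match
            p.matcher(input).lookingAt(), // match must start at index 0
            p.matcher(input).find()       // match may start anywhere
        };
    }

    public static void main(String[] args) {
        // Digit at the start only: prints [false, true, true]
        System.out.println(Arrays.toString(probe("[0-9]", "00###")));
        // Digit not at the start: prints [false, false, true]
        System.out.println(Arrays.toString(probe("[0-9]", "##00")));
        // Input is exactly one digit: prints [true, true, true]
        System.out.println(Arrays.toString(probe("[0-9]", "7")));
    }
}
```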

seh
A: 

Yes, matches() always acts as if the regex were anchored at both ends. To get the traditional behavior, which is to match any substring of the target, you have to use find() (as others have already pointed out). Very few regex tools offer anything equivalent to Java's matches() methods, so your confusion is justified. The only other one I can think of offhand is the XML Schema flavor.

Alan Moore