My understanding is that Java's implementation of regular expressions is based on Perl's. However, in the following example, if I execute the same regex with the same string, Java and Perl return different results.

Here's the Java example:

public class RegexTest {
    public static void main( String[] args ) {
        String sentence = "This is a test of regular expressions.";
        System.out.println( sentence.matches( "\\w" ) ? "Matches" : "Doesn't match" );
    }
}

This returns: Doesn't match

Here's the Perl example:

my $sentence = 'This is a test of regular expressions.';
print( ( $sentence =~ /\w/ ? "Matches" : "Doesn't match" ) . "\n" );

This returns: Matches

To me, the Perl result makes sense. It looks for a match for a single word character. I don't understand why Java doesn't consider it a match. What's the reason for the difference?

+25  A: 

The Java matches method is testing whether the regex matches the entire String. To test whether a regex can be found anywhere in a string, create a Matcher and use its find method.
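As a minimal sketch of that difference, the example below runs the question's pattern through both String.matches and Matcher.find. The class name RegexFindTest and the helper containsMatch are made up for illustration; Pattern, Matcher, and String.matches are the standard java.util.regex and java.lang APIs.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, used only for this illustration.
class RegexFindTest {
    // Returns true if the regex matches anywhere inside the input,
    // which is roughly what Perl's =~ operator checks.
    static boolean containsMatch(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        return matcher.find();
    }

    public static void main(String[] args) {
        String sentence = "This is a test of regular expressions.";

        // matches() requires the pattern to cover the whole string.
        System.out.println(sentence.matches("\\w") ? "Matches" : "Doesn't match");

        // find() succeeds as soon as any substring matches.
        System.out.println(containsMatch("\\w", sentence) ? "Matches" : "Doesn't match");

        // matches() also succeeds once the pattern spans the entire string.
        System.out.println(sentence.matches(".*\\w.*") ? "Matches" : "Doesn't match");
    }
}
```

The first line printed should be "Doesn't match" and the other two "Matches". If you need the same pattern many times, compiling the Pattern once and reusing it is usually preferable to calling String.matches repeatedly.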

erickson
+6  A: 

Additionally, the Perl regex syntax is NOT the Java Regex Syntax.

It doesn't necessarily apply in this case, but it answers your more general question.

Java's regular expression syntax is often described as "Perl compatible", in the same vein as PCRE (Perl Compatible Regular Expressions).

That name is, however, misleading, because Perl's own regular expressions go well beyond what these "Perl compatible" flavours support.

For instance, Perl regular expressions permit executing code within the expression itself, along with lots of other advanced operators, and some syntax differs between Perl and other languages (e.g. many languages use \< and \> as word boundary markers, but Perl just uses \b).
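To make the word boundary point concrete on the Java side, here is a small sketch. It assumes the standard java.util.regex behaviour, where \b is a word boundary (as in Perl) and a backslash before a non-alphabetic character such as < simply makes it a literal. The class name WordBoundaryTest and the helper found are invented for this example.

```java
import java.util.regex.Pattern;

// Hypothetical class name, used only for this illustration.
class WordBoundaryTest {
    // Returns true if the regex can be found anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // \b marks a word boundary in Java, just as it does in Perl.
        System.out.println(found("\\bcat\\b", "a cat sat"));     // whole word "cat"
        System.out.println(found("\\bcat\\b", "concatenate"));   // "cat" inside a longer word

        // \< and \> are not boundary markers in Java: they match literal < and >.
        System.out.println(found("\\<cat\\>", "a cat sat"));     // no angle brackets in the input
        System.out.println(found("\\<cat\\>", "a <cat> sat"));   // literal "<cat>" is present
    }
}
```

This should print true, false, false, true, so a pattern ported from a tool that uses \< and \> would compile in Java but quietly mean something else.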

Spend a few minutes reading the perlre documentation and you'll discover lots of awesome tricks that Perl's regular expression engine can do that nothing else seems to do.

Kent Fredric
PCRE was compatible with Perl in the early days of Perl version 5 (or was it Perl 4? I'm not sure). Since then, Perl's regex engine has moved forward a lot.
Mathieu Longtin
+1 for answering the more general question (the one in the title). I'd like to add that the book 'Mastering Regular Expressions' by Jeffrey Friedl is great for understanding the similarities and differences between common regex flavours.
Jonik