views:

285

answers:

1

This is the third part in a series of educational regex articles. It follows How does this regex find triangular numbers? (where nested references is first introduced) and How can we match a^n b^n with Java regex? (where the lookahead "counting" mechanism is further elaborated upon). This part introduces a specific form of nested assertion, which when combined with nested references allows Java regex to match what most people believe is "impossible": palindromes!!

The language of palindromes is non-regular; it's actually context-free (for a given alphabet). That said, modern regex implementation recognizes more than just regular languages, and Perl/PCRE's recursive patterns and .NET's balancing groups can readily recognize palindromes (see: Related Questions).

However, Java's regex engine supports neither of these "advanced" features. And yet "someone" (*wink*) managed to write the following regex which seems to do the job just fine (see also on ideone.com):

public class Palindrome {
    // asserts that the entirety of the string matches the given pattern
    static String assertEntirety(String pattern) {
        return "(?<=(?=^pattern$).*)".replace("pattern", pattern);
    }

    public static void main(String[] args) {
        final String PALINDROME =
            "(?x) | (?:(.) add)+ chk"
                .replace("add", assertEntirety(".*? (\\1 \\2?)"))
                .replace("chk", assertEntirety("\\2"));

        System.out.println(PALINDROME);
        // (?x) | (?:(.) (?<=(?=^.*? (\1 \2?)$).*))+ (?<=(?=^\2$).*)

        String[] tests = {
            "",     // true
            "x",    // true
            "xx",   // true
            "xy",   // false
            "xyx",  // true
            "xxx",  // true
            "xxyx", // false
            "racecar",                // true
            "step on no pets",        // true
            "aManaPlanaCanalPanaMa",  // true
            "this is impossible",     // FALSE!!!
        };
        for (String test : tests) {
            System.out.printf("[%s] %s%n", test, test.matches(PALINDROME));
        }
    }
}

So this seems to work, but how?

References


COMMON SENSE ALERT!!!

This is not the best way to detect palindromes; it's O(N^3)at best. Performing this detection in a more general purpose programming language is both more efficient and more straightforward.

You wouldn't want to use regex to detect palindromes for the same reasons you wouldn't want to use regex to find prime numbers. That said, you would study how a non-recursive non-balancing group regex can detect palindromes for the same reasons you would study how a regex can be used for primality testing: it's fun, it's challenging, it's educational.

Related questions

+5  A: 
polygenelubricants
+1 Wow! Excellent work.
Shakedown
Can you help me with this.http://stackoverflow.com/questions/3682587/split-string-of-varying-length-using-regex
Emil