views:

4972

answers:

15

That was an interview question that I was unable to answer:

How to check that a string is a palindrome using regular expressions?

p.s. There is already a question "How to check if the given string is palindrome?" and it gives a lot of answers in different languages, but no answer that uses regular expressions.

+4  A: 

Depending on how confident you are, I'd give this answer:

I wouldn't do it with a regular expression. It's not an appropriate use of regular expressions.

Jon Skeet
I would hope you'd give a little more explanation, to show that you really understand the limitations of regex. Your simple answer might be taken as "I'm stumped".
Scott Wegner
Hence the depend clause he gave.
Will Bickford
Intriguing... two recent downvotes on an answer over a year old, and with no explanation. Downvoters, could you please leave a comment so that your vote has more meaning?
Jon Skeet
+5  A: 

Here's one to detect 4-letter palindromes (e.g.: deed), for any type of character:

\(.\)\(.\)\2\1

Here's one to detect 5-letter palindromes (e.g.: radar), checking for letters only:

\([a-z]\)\([a-z]\)[a-z]\2\1

So it seems we need a different regex for each possible word length. This post on a Python mailing list includes some details as to why (Finite State Automata and pumping lemma).

FOR
+20  A: 

It's not possible. Palindromes aren't defined by a regular language. (See, I DID learn something in computational theory)

ZCHudson
Most regular expressions engines capture more than regular languages (net can capture matching parenthesis for example). Only standard regexes are limited to regular langs.
Santiago Palladino
The question did use the term "regular expression" though... so ZCHudson's answer is correct.
ceretullis
@austirg: ZCHudson's answer is correct but incomplete. Regular expressions used in modern programming languages and regular expressions used in theoretical CS classes are different beasts. The term is just a historical heritage. See http://stackoverflow.com/questions/233243#235199 and my answer.
J.F. Sebastian
@J.F. Sebastian - I'd have to agree with austirg on this. When the term regular expression is used without a specific programming language mentioned than the comp sci defintion applies. Not all languages that support regexes can do this, so we shouldn't assume the one used here does.
Rontologist
@Rontologist: I see no restrictions on the choice of programming language in the question thus any language is allowed. Look at the right: what is the meaning of "regular expression" in the related questions? Are specific programming language being mentioned in any of them?
J.F. Sebastian
@J.F. Sebastian: They are not asking for specific languages, but none of the regex specific questions on the right are asking anything beyond the comp sci regular expressions either, so the distinction is not as important.
Rontologist
In my experience, when people post regex questions without specifying a flavor, it's usually because they don't realize there _are_ different regex flavors.
Alan Moore
+1  A: 

It's actually easier to do it with string manipulation rather than regular expressions:

bool isPalindrome(String s1)

{

    String s2 = s1.reverse;

    return s2 == s1;
}

I realize this doesn't really answer the interview question, but you could use it to show how you know a better way of doing a task, and you aren't the typical "person with a hammer, who sees every problem as a nail."

Dan
+3  A: 

Depends what they're looking for... This detects any palindrome, but does require a loop (which will be required because regular expressions can't count).

I don't think "That's not possible" is really what the interviewer is looking for.

$a = "teststring";
while(length $a > 1)
{
   $a =~ /(.)(.*)(.)/;
   die "Not a palindrome: $a" unless $1 eq $3;
   $a = $2;
}
print "Palindrome";
Airsource Ltd
Good answer. The question didn't ask for a single regexp that detects a palindrome straight out of the box - it simply asked for a method of detecting palindromes that makes use of regexps. Congrats on your insight to look at it this way.
Stewart
+3  A: 

With Perl regex:

/^((.)(?1)\2|.?)$/

Though, as many have pointed out, this can't be considered a regular expression if you want to be strict. Regular expressions does not support recursion.

MizardX
this doesn't work in PCRE (it doesn't match "ababa"), but it does work in Perl 5.10
newacct
You are right. PCRE does seem treat the recursion as an atomic group, while Perl allows backtracking within it. I don't think it is possible to do this check in PCRE.
MizardX
+1  A: 

As pointed out by ZCHudson, determine if something is a palindrome cannot be done with an usual regexp, as the set of palindrome is not a regular language.

I totally disagree with Airsource Ltd when he says that "it's not possibles" is not the kind of answer the interviewer is looking for. During my interview, I come to this kind of question when I face a good candidate, to check if he can find the right argument when we proposed to him to do something wrong. I do not want to hire someone who will try to do something the wrong way if he knows better one.

Nicolas
A: 

something you can do with perl: http://www.perlmonks.org/?node_id=577368

Zsolt Botykai
+21  A: 

The answer to this question is that "it is impossible". More specifically, the interviewer is wondering if you paid attention in your computational theory class.

In your computational theory class you learned about finite state machines. A finite state machine is composed of nodes and edges. Each edge is annotated with a letter from a finite alphabet. One or more nodes are special "accepting" nodes and one node is the "start" node. As each letter is read from a given word we traverse the given edge in the machine. If we end up in an accepting state then we say that the machine "accepts" that word.

A regular expression can always be translated into an equivalent finite state machine. That is, one that accepts and rejects the same words as the regular expression (in the real world, some regexp languages allow for arbitrary functions, these don't count).

It is impossible to build a finite state machine that accepts all palindromes. The proof relies on the facts that we can easily build a string that requires an arbitrarily large number of nodes, namely the string

a^x b a^x (eg., aba, aabaa, aaabaaa, aaaabaaaa, ....)

where a^x is a repeated x times. This requires at least x nodes because, after seeing the 'b' we have to count back x times to make sure it is a palindrome.

Finally, getting back to the original question, you could tell the interviewer that you can write a regular expression that accepts all palindromes that are smaller than some finite fixed length. If there is ever a real-world application that requires identifying palindromes then it will almost certainly not include arbitrarily long ones, thus this answer would show that you can differentiate theoretical impossibilities from real-world applications. Still, the actual regexp would be quite long, much longer than equivalent 4-line program (easy exercise for the reader: write a program that identifies palindromes).

Jose M Vidal
+1  A: 

In Perl (see also Zsolt Botykai's answer):

$re = qr/
  .                 # single letter is a palindrome
  |
  (.)               # first letter
  (??{ $re })??     # apply recursivly (not interpolated yet)
  \1                # last letter
/x;

while(<>) {
    chomp;
    say if /^$re$/; # print palindromes
}
J.F. Sebastian
+1  A: 

I would explain to the interviewer that the language consisting of palindromes is not a regular language but instead context-free.

The regular expression that would match all palindromes would be infinite. Instead I would suggest he restrict himself to either a maximum size of palindromes to accept; or if all palindromes are needed use at minimum some type of NDPA, or just use the simple string reversal/equals technique.

Flame
A: 

Regarding the PCRE expression (from MizardX):

/^((.)(?1)\2|.?)$/

Have you tested it? On my PHP 5.3 under Win XP Pro it fails on: aaaba Actually, I modified the expression expression slightly, to read:

/^((.)(?1)*\2|.?)$/

I think what is happening is that while the outer pair of characters are anchored, the remaining inner ones are not. This is not quite the whole answer because while it incorrectly passes on "aaaba" and "aabaacaa", it does fail correctly on "aabaaca".

I wonder whether there a fixup for this, and also, Does the Perl example (by JF Sebastian / Zsolt) pass my tests correctly?

Csaba Gabor from Vienna

A: 

A slight refinement of Airsource Ltd's method, in pseudocode:

WHILE string.length > 1
    IF /(.)(.*)\1/ matches string
        string = \2
    ELSE
        REJECT
ACCEPT
Stewart
+1  A: 

As a few have already said, there's no single regexp that'll detect a general palindrome out of the box, but if you want to detect palindromes up to a certain length, you can use something like

(.?)(.?)(.?)(.?)(.?).?\5\4\3\2\1
Stewart
A: 

Yes, you can do it in .Net!

(?<N>.)+.?(?<-N>\k<N>)+(?(N)(?!))

You can check it here! It's a wonderful post!

Kev
Cool! If you happen to be interviewing for a .NET job, you're all set. Just hope they don't ask you to explain it. ;)
Alan Moore