views:

349

answers:

3

Given a string (assume only English characters) S of length n, we can count the number of palindromic substrings with the following algorithm:

for i = 0 to |S| do
    p1 = number of palindromes centered in i (odd length)
    p2 = number of palindromes centered in i and i+1 (even length)

    add p1 + p2 to total number of palindromic substrings of S

The above code is O(n^2) however.

I am interested in an algorithm that solves this problem in O(n). I know for sure that one exists as I've heard multiple people say that it does, and the problem exists on a local online judge site with an upper bound of 1 000 000 on n, however I've never seen the algorithm and can't seem to be able to come up with it.

Update:

The general idea I have is to compute len[i] = length of the longest palindrome centered at the character 2i + 1 and a similar array for even-length palindromes. With good bookkeeping, it should be possible to compute this in O(1) for each character, which will allow us to count a lot of palindromes all at once. I'm stuck on how exactly to compute this however.

I will accept a solution that uses O(n) and maybe even O(n log n) extra memory. I think this is impossible without it.

Any good ideas or references are appreciated.

A: 

For "normal" strings it should be rather efficient to look at each character as the potential "center" of a palindrome and then check if the surrounding characters actually build one:

# check odd palindromes
for center in range(len(ls)):
   # check how many characters to the left and right of |center|
   # build a palindrome
   maxoffs = min(center, len(ls)-center-1)
   offs = 0
   while offs <= maxoffs and ls[center-offs] == ls[center+offs]:
      offs += 1
   offs -= 1
   print ls[center-offs : center+offs+1]                                    

# check for even palindromes
for center in range(len(ls)-1):
   maxoffs = min(center, len(ls)-center-2)
   offs = 0
   while offs <= maxoffs and ls[center-offs] == ls[center+offs+1]:
      offs += 1
   offs -= 1
   if offs >= 0:
      print ls[center-offs : center+offs+2]

For normal strings this should be about O(n), though in the worst case, for example if the string consists of only one character repeated over and over again, it will still take O(n2) time.

sth
You can indeed stop the search early, which will be good enough for random strings. I'm interested in something that's always `O(n)` however. It's very easy to break this: a string made up of a single character.
IVlad
A: 

Fundamentally, such an algorithm is impossible. Consider the string which is all one characters- say, "aaaaa". The number of palindromes contained within is proportional to n^2. How can you count n^2 potential palindromes in n time? You can't, it's impossible. Therefore, no such algorithm exists for this case, and thus, the case in general.

DeadMG
-1 Check to see if all the characters in the string are the same (O(n)). If so, calculate the number of palindromes in the string based on its length (O(1)).
Vuntic
This line of reasoning is completely false. Do you need `n!` operations to compute `n!`? Do you need `Catalan(n)` operations to count valid parenthesizations? There's no theorem saying that you can't count to a certain `K` without doing `K` operations. Take dynamic programming for example, which allows you to solve a lot of counting problems efficiently. You don't have to count the palindromes one by one, there might be a way to add more at once.
IVlad
@Vuntic: Great, so now I've got "aaaaa b aaaaa". The problem still exists. Handling the one special case in which it's very easy to notice isn't going to help. @IVlad: The difference is that you don't have to compute every factorial up to n!. In this, you do. There is no way to solve the problem except by counting every existent palindrome. Even if you could count it in constant time, the fact that the palindromes are n^2 in quantity makes it an n^2 operation. Besides, the OP asked for ideas- this is mine.
DeadMG
When the number of outputs is potentially large, such as in line-line intersections, you usually separate it out. So the algorithm could be O(n + t), where t could be O(n^2). Also, in this case, there is no need to return smaller palindromes which are centered on the same character.
Strilanc
@DeadMG - I'll edit to clarify that I want good ideas :). Your argument is flawed, you are making a statement and failing to provide adequate proof. What makes you say you have to count every existent palindrome **one at a time**? You can count more at once with intelligent bookkeeping. The question is how, not if it's possible. I don't want to print every palindrome, I just want to count them. If you want to prove the lower bound of `n^2` you're going to have to do better.
IVlad
@Strilanc - it's possible to do in O(n) as well. If we can find the longest palindrome centered at each `i` in O(1) based on previously computed values (by using O(n) extra memory), then we can count a bunch of palindromes all at once that are subsequences of that longest one.
IVlad
In "aaaaa b aaaaa", you can take the instances of aaaaa and treat each as you would the entire string aaaaa, and then treat each as a single character as relates to the surrounding string. That's hardly n^2 calculations.
Eli
+4  A: 

The following site shows an algorithm for computing the longest palindromic substring in O(n) time, and does so by computing the longest palindromic substring at every possible center and then taking the maximum. So, you should be able to easily modify it for your purposes.

http://www.akalin.cx/2007/11/28/finding-the-longest-palindromic-substring-in-linear-time/

EDIT: The first link looks a little shaky upon closer inspection, so here's another one:

http://zhuhcheng.spaces.live.com/Blog/cns!DE38E96268C49F28!311.entry?wa=wsignin1.0&amp;sa=707413829

Paul Accisano
I don't really understand how they calculate P[i] in your second link. Can you clarify on that? All I see are a couple inequalities, but nothing on how to actually compute P. Your first link is a lot clearer in this regard, but some people say it's actually quadratic. I will write my own implementation and test for myself.
IVlad
I translated the python code in your first link to C++ and it looks like it's O(n). It runs instantly for a string made up of a single character and it also passes every test I tried. Looks like that's it, thanks!
IVlad