views:

301

answers:

2

Hi,

I'm trying to calculate the amount of longest possible subsequences that exist between two strings.

e.g. String X = "efgefg"; String Y = "efegf";

output: The Number of longest common sequences is: 3 (i.e.: efeg, efef, efgf - this doesn't need to be calculated by the algorithm, just shown here for demonstration)

I've managed to do this in O(|X|*|Y|) using dynamic programming based on the general idea here: http://stackoverflow.com/questions/2207987/cheapest-path-algorithm.

Can anyone think of a way to do this calculation with better runtime efficiently?

--Edited in response to Jason's comment.

A: 

I don't know but here are some attempts at thinking aloud:

The worst case I was able to construct has an exponential - 2**(0.5 |X|) - number of longest common subsequences:

X = "aAbBcCdD..."
Y = "AaBbCcDd..."

where the longest common subsequences include exactly one of {A, a}, exactly one of {B, b} and so forth... (nitpicking: if you alphabet is limited to 256 chars, this breaks down eventually - but 2**128 is already huge.)

However, you don't necessarily have to generate all subsequences to count them. If you've got O(|X| * |Y|), you are already better than that! What we learn from this is that any algorithm better than yours must not attempt to generate the actual subsequences.

Beni Cherniavsky-Paskin
+1  A: 

Longest common subsequence problem is a well studied CS problem.

You may want to read up on it here: http://en.wikipedia.org/wiki/Longest_common_subsequence_problem

KaptajnKold