For 2 strings, I want to find the number of distinct LCS's. I read on wiki on how to print all LCS's but how to check that they are distinct? The hash table is not feasible as my input string each can be 1500-2000 characters long so maximum number of LCS's can be 2000 choose 1000
Can you specify.
- Do substrings have to be consecutive, or any sub sequence would be ok? For example, in "abcde" would "abde" be a valid substring?
- What do you mean by distinct?
You can use a hash table, but instead of storing the whole substring, you just store (a list of) the beginning and end of it relative to the original string. This way, you can do a string compare in the original string if there are any collisions.
Throw the two strings to a suffix tree. This is time and space linear in length of the concatenation of the two strings.
Hi, Good god I found a posting on this topic. Actually, I faced this problem in Amazon interview recently and I sent a code to them for review. They did not tell me whether its correct or wrong but I did not get any further inputs from them (so I assumed i have erred somewhere). I dont want the efficient implementation as you guys discuss. Can anyone just review the code I've written to check its correct or not.
http://www.technicalypto.com/2010/02/find-common-sequence-that-is-out-of.html
Many thanks