ansaurus

Question

Find out what percentage of one string is contained in another

Answer 1

A:

Uhh... can't you just use the number of characters that need to change then?

(length(destination)-changed_character_count)/ length(source)
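As a rough sketch of that formula in Python: the "number of characters that need to change" is taken here to be the Levenshtein edit distance, and the names `levenshtein` and `contained_percent` are made up for illustration, not part of any library.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def contained_percent(source: str, destination: str) -> float:
    """(length(destination) - changed_character_count) / length(source), in percent."""
    if not source:
        return 0.0  # assumption: an empty source counts as 0%
    changed = levenshtein(source, destination)
    return 100.0 * (len(destination) - changed) / len(source)


print(contained_percent("Ivan", "This is Ivan Jovanov"))
```

Turning "Ivan" into "This is Ivan Jovanov" takes 16 insertions and the destination has 20 characters, so the ratio is (20 - 16) / 4.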

EDIT: based on the revised question, treat both strings as sets, compute the set intersection, and base the percentage off the size of that set and the source string as a set.
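A minimal sketch of the set-based idea, assuming the percentage is the share of distinct source characters that also occur in the destination (the function name `set_overlap_percent` is hypothetical):

```python
def set_overlap_percent(source: str, destination: str) -> float:
    """Percentage of distinct characters of source that also appear in destination."""
    source_chars = set(source)
    if not source_chars:
        return 0.0  # assumption: an empty source counts as 0%
    shared = source_chars & set(destination)  # set intersection
    return 100.0 * len(shared) / len(source_chars)


print(set_overlap_percent("Ivan", "This is Ivan Jovanov"))
```

Because sets discard order and repetition, this would also score "navI" against "Ivan" as a full match.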

MSN 2010-06-18 23:18:16

I need to know how much of one string is contained in another; for example, "Ivan" is 100% contained in "This is Ivan Jovanov".

Pece 2010-06-18 23:35:12

@Pece: the Levenshtein distance would tell you that. That's why you compare the length of the destination string minus the size of the edits to the length of the source string. In your test case, it should end up being 100% because you don't actually delete any characters from the source string.

MSN 2010-06-18 23:37:50

The problem is that if I compare "Ivan" with "Ivaxxxn" using "(length(destination)-changed_character_count)/ length(source)", it returns 100%.

Pece 2010-06-18 23:58:05

That's an additional constraint you should probably specify.

MSN 2010-06-19 00:04:58

Answer 2

+2 A:

It sounds like you might want the longest common subsequence, which is the basis for diff algorithms. Unfortunately, the general problem for an arbitrary number of sequences is NP-hard, which means no efficient (polynomial-time) solution is known for that case. The Wikipedia page has some suggestions.

Mark Byers 2010-06-18 23:19:44

Here the problem considers only two strings, so it can be solved in quadratic time.
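For two strings, the quadratic-time approach is the standard dynamic program over prefix pairs. A hedged sketch follows; dividing the result by the source length to get a percentage is an assumption made here, and the names `lcs_length` and `lcs_percent` are illustrative only.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence, in O(len(a) * len(b)) time."""
    prev = [0] * (len(b) + 1)  # LCS lengths for the previous row (prefix of a)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                curr.append(prev[j - 1] + 1)            # extend a common subsequence
            else:
                curr.append(max(prev[j], curr[j - 1]))  # skip a character of a or b
        prev = curr
    return prev[-1]


def lcs_percent(source: str, destination: str) -> float:
    """Share of source covered by the LCS with destination, in percent."""
    if not source:
        return 0.0  # assumption: an empty source counts as 0%
    return 100.0 * lcs_length(source, destination) / len(source)


print(lcs_percent("Ivan", "This is Ivan Jovanov"))
```

A subsequence does not have to be contiguous, so "Ivan" against "Ivaxxxn" also scores 100% with this measure.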

Mgccl 2010-06-18 23:57:47

Right now I'm testing this, so I'll post the results in a few minutes.

Pece 2010-06-19 00:06:55

Yep, the tests went well, thanks. I'll edit the question with the C# algorithm.

Pece 2010-06-19 00:23:03
