Does anyone know of a simple way to compare two strings and get the "amount of difference" between them as a numeric value? I have been searching Google with little luck, and after writing some code I found it's not as simple as I had thought. Any clues?

+5  A: 

Are you talking about the "edit distance"? Search for "Levenshtein distance" on SO or Google. I use the version posted on Stephen Toub's blog.
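For reference, here is a minimal Python sketch of the Levenshtein distance using the usual two-row dynamic-programming approach. It is not the version from Stephen Toub's blog, and the function name `levenshtein` is only an illustrative choice.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the rows short

    # previous[j] holds the distance between the first i-1 chars of a
    # and the first j chars of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

Keeping only two rows makes the memory use proportional to the shorter string, while the running time stays proportional to the product of the two lengths.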

Danimal
Upmod for being faster than me. :)
Bill the Lizard
You win for being first....exactly what I was looking for! Thanks :)
Adam Driscoll
that's one of the great things about SO -- I saw an earlier post on the topic, and it was immediately useful for me. Glad I could return the favor!
Danimal
+2  A: 

You're looking for the Levenshtein distance.

Bill the Lizard
By 10 seconds. That's actually a lot considering the Levenshtein distance between what we each typed.
Bill the Lizard
A: 

You would need to very clearly define "amount of difference". There's a lot of wiggle room in there.

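For example, the old C/C++ strcmp() function compares character by character and stops at the first mismatch. The standard only guarantees the sign of the result, although many implementations return the difference between the two mismatched characters.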

On the other hand, the diff program provides a comprehensive list of differences between two files (which, in one sense, are also strings). How would you quantify that?
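One possible way to put a number on it, offered only as a sketch: Python's standard `difflib.SequenceMatcher` reports a similarity ratio between 0 and 1, and subtracting that from 1 gives a rough "amount of difference". The helper name `difference_score` is hypothetical, and this is just one of many reasonable definitions.

```python
from difflib import SequenceMatcher


def difference_score(a: str, b: str) -> float:
    """Return 0.0 for identical strings, rising to 1.0 for strings
    that share no matching characters (per difflib's matching)."""
    return 1.0 - SequenceMatcher(None, a, b).ratio()


print(difference_score("same text", "same text"))  # 0.0
print(difference_score("abc", "xyz"))              # 1.0
```

Unlike an edit distance, this score is normalized by the combined length of the inputs, so it is easier to compare across string pairs of different sizes.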

Joel Coehoorn
+1  A: 

You might want to look into the Levenshtein and Hamming distances. The first counts edits (insert, delete, modify); the second counts the positions at which two equal-length strings differ (bit flips, in the case of binary strings).
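As a quick illustration of the second one, here is a minimal Python sketch of the Hamming distance over characters. The function name `hamming` and the choice to raise `ValueError` on unequal lengths are assumptions made for this example.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        # Hamming distance is only defined for equal-length inputs.
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


print(hamming("karolin", "kathrin"))  # 3
```

Because it never inserts or deletes, a single dropped character can make every following position count as a mismatch, which is where Levenshtein is usually the better fit.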

Aaron Maenpaa