Does anyone know of a simple way to compare two strings and get the "amount of difference" between them as a numeric value? I have been searching Google with little luck, and after doing some coding I found it's not as simple as I had thought. Any clues?
+5
A:
Are you talking about the "Edit Distance"? Do a search on "Levenshtein Distance", on SO or Google. I use the version posted on Stephen Toub's blog.
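For anyone who just wants to see the idea, here is a minimal Python sketch of the textbook dynamic-programming approach to Levenshtein distance. It is not the version from Stephen Toub's blog; the function name `levenshtein` is made up for illustration, and it assumes every insert, delete, and substitution costs 1.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    # Keep the shorter string as b so each row is as small as possible.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] is the distance between the empty prefix of a and b[:j].
    previous = list(range(len(b) + 1))

    for i, ca in enumerate(a, start=1):
        # Turning a[:i] into the empty string takes i deletions.
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (free if the characters match)
            ))
        previous = current

    return previous[-1]


print(levenshtein("kitten", "sitting"))
```

Only two rows of the table are kept at a time, so memory use grows with the shorter string rather than with the product of the two lengths.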
Danimal
2008-10-09 18:16:10
Upmod for being faster than me. :)
Bill the Lizard
2008-10-09 18:18:03
You win for being first....exactly what I was looking for! Thanks :)
Adam Driscoll
2008-10-09 18:18:46
that's one of the great things about SO -- I saw an earlier post on the topic, and it was immediately useful for me. Glad I could return the favor!
Danimal
2008-10-09 18:20:31
By 10 seconds. That's actually a lot considering the Levenshtein distance between what we each typed.
Bill the Lizard
2008-10-09 18:21:11
A:
You would need to very clearly define "amount of difference". There's a lot of wiggle room in there.
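For example, the old C/C++ strcmp() function compares two strings character by character and stops at the first position where they differ. Its return value is negative, zero, or positive, and in many implementations it is simply the difference between the two mismatched characters.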
On the other hand, the diff program provides a comprehensive list of differences between two files (which, in one sense, are also strings). How would you quantify that?
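To make the wiggle room concrete, here is a rough Python sketch of two very different "amounts of difference". The helper names `first_mismatch_difference` and `diff_similarity` are hypothetical. The first mimics what many strcmp() implementations return (the C standard only guarantees the sign). The second is just one possible way to boil a diff down to a number, using `difflib.SequenceMatcher` from the standard library; it is not how the diff program itself reports anything.

```python
from difflib import SequenceMatcher
from itertools import zip_longest


def first_mismatch_difference(a: str, b: str) -> int:
    """Difference between the first pair of characters that don't match."""
    # Pad the shorter string with NUL, like a C string terminator.
    for ca, cb in zip_longest(a, b, fillvalue="\0"):
        if ca != cb:
            return ord(ca) - ord(cb)
    return 0  # no mismatch found: the strings are equal


def diff_similarity(a: str, b: str) -> float:
    """Similarity in [0.0, 1.0] based on matching blocks; 1.0 means identical."""
    return SequenceMatcher(None, a, b).ratio()


print(first_mismatch_difference("abc", "abd"))
print(diff_similarity("abc", "abd"))
```

The two numbers answer different questions: the first only says which string sorts first and by how much at one position, while the second reflects how much of the two strings lines up overall.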
Joel Coehoorn
2008-10-09 18:17:58
+1
A:
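You might want to look into the Levenshtein and Hamming distances. The first calculates edit distance (the number of inserts, deletes, and substitutions needed to turn one string into the other); the second counts the positions at which two equal-length strings differ, which for binary data means bit flips.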
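Hamming distance is the simpler of the two to compute. A minimal Python sketch follows, with hypothetical function names `hamming` and `hamming_bits`; it assumes you want an error for strings of unequal length, since the distance is only defined when the lengths match.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(ca != cb for ca, cb in zip(a, b))


def hamming_bits(x: int, y: int) -> int:
    """Count the bit positions at which two non-negative integers differ."""
    # XOR leaves a 1 exactly where the two bit patterns disagree.
    return bin(x ^ y).count("1")


print(hamming("karolin", "kathrin"))
print(hamming_bits(0b1011101, 0b1001001))
```

If the strings can differ in length, Levenshtein distance is the one to reach for instead.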
Aaron Maenpaa
2008-10-09 18:18:05