Will the Levenshtein distance algorithm work well for non-English language strings too?
Update: Would this work automatically in a language like Java when comparing Asian characters?
Yes, but you have to treat each non-English character as one character, not as several (which is what happens if you compare the raw bytes of a UTF-8 encoding, for example). In Python you would use the unicode class (plain str in Python 3) to represent the string and its characters.
Levenshtein doesn't care about languages; it just tells you how many characters need to be changed (added, removed, or exchanged) to get from one string to the other.
So: yes, but you'll have to check your charset, because some foreign "single" characters may otherwise be treated as two (or more) characters.
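To make the charset point concrete, here is a minimal sketch, assuming Python 3 (where str is a sequence of Unicode code points). The function name `levenshtein` and the sample strings are only illustrative; this is a plain dynamic-programming version, not any particular library's implementation.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (str, bytes, list, ...)."""
    # prev[j] = distance between the items of a processed so far and the first j items of b
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning i items of a into an empty prefix takes i deletions
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


s1 = "東京"
s2 = "北京"

# Compared as text: one character differs.
print(levenshtein(s1, s2))  # 1

# Compared as UTF-8 bytes: each character is three bytes here,
# so the same one-character change is counted as three edits.
print(levenshtein(s1.encode("utf-8"), s2.encode("utf-8")))  # 3
```

The algorithm is identical in both calls; only the unit being compared changes, which is why the encoding matters more than the language.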
Only if the language is letter-based, for example Russian or German. For logographic scripts (Chinese, for example) or syllabic ones (like Lao), it will not.