What is the best algorithm in C# to match two strings, or to compute the distance between them, when neither word order nor the number of times a word appears matters?
Best means:
- Would mostly agree with a human match
- Elegant
- Efficient
- Scalable, so that an input string could be matched to a potentially large collection of other strings
Some notes:
- Because order and occurrence count don't matter, the inputs can be thought of as sets of unique words, not strings in the sense of arrays of characters
- Not specifically looking for a database solution, although one would be interesting
- I'm way too old for this to be a homework problem ;)
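
To make the "sets of unique words" framing in the notes concrete, here is a minimal sketch of the kind of measure I have in mind. It uses the Jaccard index (size of the intersection divided by size of the union) purely as an illustration, not as a claim that it is the best answer. The class name `WordSetSimilarity`, the separator list, and the case-insensitive comparison are all my own assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; names and tokenization are assumptions.
public static class WordSetSimilarity
{
    // Assumed word separators: whitespace and common punctuation.
    private static readonly char[] Separators =
        { ' ', '\t', '\r', '\n', ',', '.', ';', ':', '!', '?' };

    // Reduce a string to its set of unique words, ignoring case,
    // so that word order and repetition no longer matter.
    public static HashSet<string> ToWordSet(string text)
    {
        return new HashSet<string>(
            text.Split(Separators, StringSplitOptions.RemoveEmptyEntries),
            StringComparer.OrdinalIgnoreCase);
    }

    // Jaccard index: |A ∩ B| / |A ∪ B|.
    // 1.0 means the word sets are identical, 0.0 means they share no words.
    public static double Jaccard(string a, string b)
    {
        HashSet<string> setA = ToWordSet(a);
        HashSet<string> setB = ToWordSet(b);

        // Two empty inputs are treated as identical (an assumption).
        if (setA.Count == 0 && setB.Count == 0) return 1.0;

        int intersection = setA.Count(word => setB.Contains(word));
        int union = setA.Count + setB.Count - intersection;
        return (double)intersection / union;
    }
}
```

With this sketch, `WordSetSimilarity.Jaccard("the quick brown fox", "fox brown the the quick")` returns 1.0, because both inputs reduce to the same four-word set. The distance would then be one minus that value. What it does not address is the scalability requirement: comparing one input against every string in a large collection is still a linear scan.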