I have an array of strings: not many (maybe a few hundred), but often long (a few hundred characters each).
These strings are generally nonsense and differ from one another, but within a small group of them, maybe 5 out of 300, there is a great similarity. In fact they are the same string; what differs is the formatting, the punctuation, and a few words.
How can I identify that group of strings?
By the way, I'm writing in Ruby, but failing that, an algorithm in pseudocode would be fine.
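To make the problem concrete, here is a rough sketch of one possible direction, not something known to be the right method. It assumes that ignoring case and punctuation and comparing the sets of words (Jaccard similarity) is enough to spot the near-duplicates. The helper names (`word_set`, `jaccard`, `similar_groups`) and the 0.6 threshold are made up for illustration and would need tuning on real data.

```ruby
require 'set'

# Lowercase the string, replace anything that is not a letter, digit,
# or whitespace with a space, and return the set of remaining words.
# (Assumes mostly ASCII text; other alphabets need a different regex.)
def word_set(str)
  str.downcase.gsub(/[^a-z0-9\s]/, ' ').split.to_set
end

# Jaccard similarity of two sets: |A & B| / |A | B|, between 0.0 and 1.0.
def jaccard(a, b)
  union_size = (a | b).size
  return 0.0 if union_size.zero?
  (a & b).size.to_f / union_size
end

# Returns arrays of indices into `strings`; each array is a group of
# strings linked (directly or through another member) by a similarity
# of at least `threshold`. Strings with no similar partner are omitted.
def similar_groups(strings, threshold = 0.6)
  sets = strings.map { |s| word_set(s) }
  parent = (0...sets.size).to_a # union-find forest, each index its own root

  find = lambda do |i|
    i = parent[i] while parent[i] != i
    i
  end

  # Compare every pair; a few hundred strings means tens of thousands of pairs.
  (0...sets.size).to_a.combination(2) do |i, j|
    next if jaccard(sets[i], sets[j]) < threshold
    parent[find.call(i)] = find.call(j)
  end

  (0...sets.size).group_by { |i| find.call(i) }.values.select { |g| g.size > 1 }
end

strings = [
  'The quick brown fox jumps over the lazy dog',
  'the quick, brown fox jumps over the lazy dog!!',
  'The quick brown fox leaps over the lazy dog.',
  'zxcv qwer asdf',
  'lorem ipsum dolor sit amet'
]
p similar_groups(strings)
```

With only a few hundred strings, the all-pairs comparison should be cheap enough. If word order matters or the strings differ by more than a few words, an edit-distance measure might be a better fit than word sets.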
Thanks.