Does anybody know of a library or piece of software that will locate irregularities in text? For example, let's say I have...
1. Name 1, Comment 2. Name 2, Comment 3. Name 3 , Comment 5. Name 10, Comment
This software or library would first cut the text into portions that it finds similar, much like compression software detects repeated, similar portions of text in order to encode them compactly. The difference is that it would use a variable for error tolerance, so that portions only need to be approximately alike to be grouped together. Then, much like a text comparison application or diff/merge tool, it would highlight what it sees as different within each group.

I'm thinking about making this tool myself, but I do not wish to reinvent the wheel. If there is anything out there even remotely capable of this, I would really like to know, either so I can help with that project or at least so I know not to build my own. This answer could also help other people hunting for the same thing. I would think there is enough demand for a tool like this, which is why it boggles my mind that I can't find anything at all.
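To make the idea concrete, here is a rough Python sketch of the kind of behaviour I mean, using only the standard library (`re`, `collections.Counter`, `difflib.SequenceMatcher`). It is not an existing tool. The function names, the splitting rule (a record starts at `<digits>. `), the default `tolerance` of 0.8, and the numbering checks are all assumptions I made up for the sample above.

```python
import re
from collections import Counter
from difflib import SequenceMatcher


def split_records(text):
    # Assumption: each record starts with "<digits>. ", as in the sample.
    return re.split(r"\s+(?=\d+\.\s)", text.strip())


def mask(record):
    # Replace every run of digits with "#" so records are compared by shape.
    return re.sub(r"\d+", "#", record)


def shape_irregularities(records, tolerance=0.8):
    """Find records that are similar to the most common shape
    (similarity ratio >= tolerance) but not identical to it."""
    shapes = [mask(r) for r in records]
    template = Counter(shapes).most_common(1)[0][0]
    found = []
    for record, shape in zip(records, shapes):
        matcher = SequenceMatcher(None, template, shape)
        ratio = matcher.ratio()
        if shape != template and ratio >= tolerance:
            # Keep only the non-matching opcodes: these are the "highlights".
            diffs = [
                (tag, template[i1:i2], shape[j1:j2])
                for tag, i1, i2, j1, j2 in matcher.get_opcodes()
                if tag != "equal"
            ]
            found.append((record, ratio, diffs))
    return found


def number_irregularities(records):
    """Hypothetical checks: the leading number should increase by one,
    and all numbers inside one record should agree."""
    found = []
    previous = None
    for record in records:
        numbers = [int(n) for n in re.findall(r"\d+", record)]
        if not numbers:
            continue
        if previous is not None and numbers[0] != previous + 1:
            found.append((record, f"expected {previous + 1}, got {numbers[0]}"))
        if len(set(numbers)) > 1:
            found.append((record, f"numbers disagree: {numbers}"))
        previous = numbers[0]
    return found


if __name__ == "__main__":
    text = ("1. Name 1, Comment 2. Name 2, Comment "
            "3. Name 3 , Comment 5. Name 10, Comment")
    records = split_records(text)
    for record, ratio, diffs in shape_irregularities(records):
        print(f"shape:  {record!r} ratio={ratio:.2f} diffs={diffs}")
    for record, message in number_irregularities(records):
        print(f"number: {record!r} {message}")
```

On the sample, the shape check should flag `3. Name 3 , Comment` (the stray space before the comma), and the number checks should flag `5. Name 10, Comment` (the skipped 4 and the mismatched 10). What I am really after is something more general than this: a tool that discovers the repeating pattern on its own instead of relying on a hard-coded splitting rule.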