All the diff tools I've found compare line by line instead of character by character. Is there a library that reports the differences within single-line strings? Ideally it would also give a percentage difference, though I guess there are separate functions for that.

A: 

You could split both strings so that each character sits on its own line, and then run diff on the result. It's a dirty hack, but at least it should work and is quite easy to implement.

Alternatively, you can split each string into a list of characters in Python and use difflib. See the Python difflib reference.
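A minimal sketch of the difflib route, for illustration only. The helper names `char_diff` and `similarity` are made up for this example. Since `difflib.SequenceMatcher` accepts any sequence, the strings can be passed in directly without splitting them first. Passing `autojunk=False` is a choice made here so that frequent characters in long strings are not treated as junk.

```python
import difflib


def char_diff(a, b):
    """Describe, character by character, how to turn string a into string b.

    Returns a list of (tag, a_part, b_part) tuples, where tag is one of
    'equal', 'replace', 'delete' or 'insert'.
    """
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    ]


def similarity(a, b):
    """Return a similarity ratio between 0.0 (nothing shared) and 1.0 (identical)."""
    return difflib.SequenceMatcher(None, a, b, autojunk=False).ratio()


if __name__ == "__main__":
    # A tab replaced by a space shows up as a 'replace' operation.
    for op in char_diff("a\tb", "a b"):
        print(op)
    print("similarity: {:.0%}".format(similarity("a\tb", "a b")))
```

The ratio is a similarity measure, so a "percentage difference" would be one minus that value. Whitespace is compared like any other character, so a tab swapped for a space is reported.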

JPvdMerwe
I thought of this, and it looks like the "best" option so far. I've also considered looking into the line diff tools and trying to make them treat chars as lines instead... but I thought I'd check first.
Tor Valamo
+1  A: 

I was looking for something similar recently, and came across wdiff. It operates on words, not characters, but is this close to what you're looking for?

Michael Williamson
Good, but whitespace should matter too. A tab replaced by a space would be a difference not picked up by this (if split by whitespace).
Tor Valamo
wdiff seems to have been abandoned since 1994. It does work, mostly.
lhf
+2  A: 

This algorithm diffs word-by-word:

http://github.com/paulgb/simplediff

It is available in Python and PHP, and it can even emit HTML-formatted output using the <ins> and <del> tags.

slebetman
Good, but whitespace should matter too. A tab replaced by a space would be a difference not picked up by this.
Tor Valamo
The source code looks simple enough. You can easily change it to split on empty string instead of whitespace so you can diff character-by-character.
slebetman
Actually this one works awesomely if you pass the strings directly to diff() instead of going through stringDiff(). It works nicely on a char-by-char basis, because strings are sequences in Python, and the output of the function is easy to work with too. I'm wondering about the overhead of looking for the largest common substring, though, when each item is only one char... but I may be misunderstanding the code...
Tor Valamo
+1  A: 

You can implement a simple Needleman–Wunsch algorithm, which aligns two sequences character by character. The pseudocode is available on Wikipedia: http://en.wikipedia.org/wiki/Needleman%E2%80%93Wunsch_algorithm
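As a rough sketch of what that could look like in Python: the function name `needleman_wunsch` and the scoring values (match = 1, mismatch = -1, gap = -1) are assumptions chosen for illustration, not something prescribed by the answer. Gaps are represented as `None` so that no real character can be mistaken for a gap marker.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align sequences a and b.

    Returns (pairs, score), where pairs is a list of (x, y) tuples and
    None on either side marks a gap (an insertion or a deletion).
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i - 1] with b[j - 1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two characters
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            pairs.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            pairs.append((a[i - 1], None))
            i -= 1
        else:
            pairs.append((None, b[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs, score[n][m]


if __name__ == "__main__":
    pairs, total = needleman_wunsch("abc", "ac")
    print(pairs, total)
```

Each pair whose two sides differ is a change: `(x, None)` is a deletion, `(None, y)` an insertion, and `(x, y)` with `x != y` a substitution. The table takes time and memory proportional to `len(a) * len(b)`, which is fine for single-line strings but may be slow for long inputs.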

Pierre