+1  Q: 

Compare two files

I have 2 text files to compare on their first column. The following comm command shows the common lines from the 2 files correctly.

comm develop1.txt qa1.txt -12

But the following diff command does not show the difference as expected.

diff develop1.txt qa1.txt --side-by-side

Expected output is as follows:

mysql-data/webservice 280292 | mysql-data/webservice 28684

But these two values are not on the same line because the numbers in the last column are different. I do actually need to compare the numbers in side-by-side format. How can this be achieved?

+1  A: 

I'm not sure I understand exactly what you want to do. Did you try using the join command? Given two files that contain only the example lines you've shown, the result would be this:

$ join develop1.txt qa1.txt
mysql-data/webservice 280292 28684

You can tell join to output unpairable lines with:

$ join -a1 -a2 develop1.txt qa1.txt
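
To get a fixed side-by-side layout where the column position tells you which file a value came from, something like the following sketch might help. The sample rows and the `compare_side_by_side` helper are made up for illustration; only the file names come from the question. It assumes each line has exactly two whitespace-separated columns, and it sorts copies of the inputs first because join expects both files sorted on the join field.

```shell
#!/bin/sh
# Hypothetical sample data: file names match the question, contents are invented.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' 'mysql-data/support16 5880' 'mysql-data/webservice 280292' > "$tmp/develop1.txt"
printf '%s\n' 'mysql-data/newdb 100' 'mysql-data/webservice 28684' > "$tmp/qa1.txt"

# Illustrative helper (not from the thread): key, value from file 1, value from file 2.
compare_side_by_side() {
    # join requires both inputs sorted on the join field, so sort copies first.
    LC_ALL=C sort -k1,1 "$1" > "$tmp/a.sorted"
    LC_ALL=C sort -k1,1 "$2" > "$tmp/b.sorted"
    # -a1 -a2 : also print lines that have no partner in the other file
    # -o      : output the join field (0), then column 2 of each file
    # -e '-'  : placeholder for a value that is missing on one side
    LC_ALL=C join -a1 -a2 -e '-' -o '0,1.2,2.2' "$tmp/a.sorted" "$tmp/b.sorted"
}

compare_side_by_side "$tmp/develop1.txt" "$tmp/qa1.txt"
# mysql-data/newdb - 100
# mysql-data/support16 5880 -
# mysql-data/webservice 280292 28684
```

The second column is always develop1.txt and the third is always qa1.txt, so a `-` shows at a glance which file lacks the key.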
abyx
Thanks. In other words, what I am trying to do is:
mylist=`join develop1.txt qa1.txt | awk '{print $1}'`
join -a1 -a2 develop1.txt qa1.txt | grep -v `$mylist`
This is not working!
shantanuo
@shantanuo - if each line has only two columns then try `join -a1 -a2 file1 file2 | awk 'NF == 2 { print; }'`
abyx
This is not helping me to understand if the values shown are from the first file or the second one.
shantanuo
@shantanuo then run it once with `-a1` and once with `-a2`
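
If only the unpaired lines are of interest, join's `-v` option may be closer to what is wanted than `-a`, since it suppresses the matched lines entirely. A minimal sketch, with invented sample rows and hypothetical helper names, assuming both files are already sorted on the first column:

```shell
#!/bin/sh
# Hypothetical sample data, already sorted on the first column.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' 'mysql-data/support16 5880' 'mysql-data/webservice 280292' > "$tmp/develop1.txt"
printf '%s\n' 'mysql-data/newdb 100' 'mysql-data/webservice 28684' > "$tmp/qa1.txt"

# Illustrative helpers: -v N prints only the unpairable lines from file N.
only_in_first()  { LC_ALL=C join -v1 "$1" "$2"; }
only_in_second() { LC_ALL=C join -v2 "$1" "$2"; }

echo "only in develop1.txt:"
only_in_first "$tmp/develop1.txt" "$tmp/qa1.txt"
echo "only in qa1.txt:"
only_in_second "$tmp/develop1.txt" "$tmp/qa1.txt"
```

Running the two variants separately keeps it unambiguous which file each line came from.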
abyx
+2  A: 

If you're up for something quick and dirty (not something I'd release into production but certainly okay for my own purposes):

for key in $(cat develop1.txt qa1.txt | awk '{print $1}' | sort -u) ; do
    devval=$(grep "^${key} " develop1.txt | awk '{print $2}')
    qa1val=$(grep "^${key} " qa1.txt | awk '{print $2}')
    if [[ "${devval}" != "${qa1val}" ]] ; then
        echo "$key: dev=[${devval}], qa=[${qa1val}]"
    fi
done

The first line retrieves all the unique keys from both files into a list (won't work if your keys have spaces but that's likely to make any solution harder to implement, and it doesn't appear to be the case here).

The second and third lines simply get the values for each key from the two files.

The if statement then prints out the key and the two values but only where the values are different.

Not pretty, not even thoroughly tested, but it may be adequate for your purposes. You do have to watch out for edge cases, like the possibility a key might exist multiple times in a file, or where the key may not be at the start of a line.
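
For larger files, the same comparison could probably be done in a single awk pass instead of running grep twice per key. This is only a sketch under the same assumptions as the loop above (two whitespace-separated columns, each key at most once per file, first file not empty); the `diff_values` name and the sample rows are invented for illustration.

```shell
#!/bin/sh
# Hypothetical sample data: one key in both files with equal values,
# one with different values, and one key unique to each file.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' 'mysql-data/same 42' 'mysql-data/support16 5880' \
    'mysql-data/webservice 280292' > "$tmp/develop1.txt"
printf '%s\n' 'mysql-data/newdb 100' 'mysql-data/same 42' \
    'mysql-data/webservice 28684' > "$tmp/qa1.txt"

# Illustrative helper: print every key whose values differ between the two files.
diff_values() {
    awk '
        NR == FNR { dev[$1] = $2; seen[$1] = 1; next }  # lines of the first file
                  { qa[$1]  = $2; seen[$1] = 1 }        # lines of the second file
        END {
            for (key in seen)
                # Appending "" forces a string comparison, so a missing value
                # is never treated as equal to a literal 0.
                if ((dev[key] "") != (qa[key] ""))
                    printf "%s: dev=[%s], qa=[%s]\n", key, dev[key], qa[key]
        }
    ' "$1" "$2" | LC_ALL=C sort   # awk iterates keys in no particular order
}

diff_values "$tmp/develop1.txt" "$tmp/qa1.txt"
# mysql-data/newdb: dev=[], qa=[100]
# mysql-data/support16: dev=[5880], qa=[]
# mysql-data/webservice: dev=[280292], qa=[28684]
```

Unlike the grep-based loop, the keys are used as plain strings here, so characters such as `.` in a key are not interpreted as regular expression syntax.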

paxdiablo
/mysql-data/support16 : /mysql-data/support16 5880 : /mysql-data/support16 438748
Is it possible to suppress lines like this? I would like to see only the databases from one file that are not present in the other.
shantanuo
@shantanuo, I've fixed it so that you don't get the key out three times (that was an oversight on my part, sorry). Now the ones with only one key should contain the text `[]` (like `/mysql-data/support16: dev=[5880], qa=[]`) so you can run the script through a `| grep '\[\]'` to show you just those.
paxdiablo
A: 

I know that what I will say is not exactly what you asked, but have you tried a visual diff program, such as WinMerge (for Windows) or Meld (for Linux)? A preview of their interfaces is below (taken from Google Images):

WinMerge: [screenshot]

Meld: [screenshot]

Emanuel Vianna