ansaurus

Question

"Minus" operation on two files using Linux commands

Answer 1

A:

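grep -F -x -v -f B A | grep -F -x -v -f C | grep -F -x -v -f D

The -v switch is an inverse match (i.e. match all except). The -f switch takes a file with a list of patterns to match. The -x switch forces it to match whole lines (so that lines that are substrings of other lines don't cause the longer lines to be removed). The -F switch treats each pattern as a fixed string rather than a regular expression, so lines containing characters such as . or * are compared literally.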
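A minimal sketch of the whole-line behaviour, using throwaway sample files (the file contents and the scratch directory are assumptions made up for illustration). It uses only two files and passes -F so the lines of B are taken literally rather than as regular expressions. Without -x, the line "app" in B also removes "apple" from A.

```shell
# Scratch directory so no real files are touched.
dir=$(mktemp -d)

# Hypothetical sample data: A is the full set, B holds the lines to subtract.
printf '%s\n' apple app banana cherry > "$dir/A"
printf '%s\n' app cherry > "$dir/B"

# With -x only whole-line matches are removed, so "apple" survives.
with_x=$(grep -F -x -v -f "$dir/B" "$dir/A")
echo "$with_x"

# Without -x, "app" matches as a substring and "apple" is removed too.
without_x=$(grep -F -v -f "$dir/B" "$dir/A")
echo "$without_x"
```

The first command prints apple and banana; the second prints only banana.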

Tyler McHenry 2009-09-03 01:23:25

Answer 2

+1 A:

Look at the join command. Read its man page and you should find what you seek.

Michael E 2009-09-03 01:39:12

Answer 3

A:

join A B | join - C | join - D

biznez 2009-09-03 01:57:15

Doesn't that do pretty much the opposite of what you want? That would give you lines that exist in all four files. Plus, it doesn't work if any of your lines have spaces in them.

Tyler McHenry 2009-09-03 02:00:22

Yea. Sorry a straight join should do it.

biznez 2009-09-03 02:03:02

But still... I'm not an expert on join but from reading the man page, join A B will give you the lines that A and B have in common, not the lines in A but not B, which is what you asked about. From what I can tell the join-based answer to your original question would be something like the following (the files must be sorted, and with GNU join an empty -t argument makes the whole line the join field; an unquoted \n would reach join as a plain n): `join -t '' -v 1 A B | join -t '' -v 1 - C | join -t '' -v 1 - D`
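A small sketch of the -v 1 form on sample data, assuming GNU coreutils join (the file contents and the .sorted file names are hypothetical). An empty -t argument is a GNU behaviour that treats the whole line as the join field, which is what lets lines with spaces work. Both inputs are sorted first because join expects sorted input.

```shell
# Assumes GNU coreutils; LC_ALL=C keeps sort and join agreeing on line order.
export LC_ALL=C
dir=$(mktemp -d)

# Hypothetical sample data; note the line containing a space.
printf '%s\n' 'red apple' banana cherry > "$dir/A"
printf '%s\n' cherry 'red apple' > "$dir/B"

# join needs both inputs sorted on the join field.
sort "$dir/A" > "$dir/A.sorted"
sort "$dir/B" > "$dir/B.sorted"

# -t '' : use the whole line as the join field (GNU behaviour)
# -v 1  : print only lines of the first file that have no match in the second
only_in_a=$(join -t '' -v 1 "$dir/A.sorted" "$dir/B.sorted")
echo "$only_in_a"
```

With these files the only line left is banana.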

Tyler McHenry 2009-09-03 02:07:54

Answer 4

+2 A:

comm is good for this, either:

cat B C D | sort | comm -2 -3 A -

or:

comm -2 -3 A B | comm -2 -3 - C | comm -2 -3 - D

depending on what's easier/clearer for your script.
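Note that comm expects both inputs to be sorted. As a sketch of the first form on unsorted sample data (the file contents and the A.sorted name are assumptions for illustration), A is sorted up front and sort is given B, C and D directly instead of going through cat:

```shell
# LC_ALL=C keeps sort and comm agreeing on line order.
export LC_ALL=C
dir=$(mktemp -d)

# Hypothetical, deliberately unsorted sample data.
printf '%s\n' delta alpha charlie bravo echo > "$dir/A"
printf '%s\n' bravo > "$dir/B"
printf '%s\n' echo alpha > "$dir/C"
printf '%s\n' zulu > "$dir/D"

# comm compares two sorted inputs; "-" makes it read the second from stdin.
# -2 drops lines found only in the second input, -3 drops lines common to both,
# so only the lines unique to A remain.
sort "$dir/A" > "$dir/A.sorted"
result=$(sort "$dir/B" "$dir/C" "$dir/D" | comm -2 -3 "$dir/A.sorted" -)
echo "$result"
```

Here the result is charlie and delta, the two lines of A that appear in none of B, C or D.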

caf 2009-09-03 02:02:33

I'd say this is easily the simplest of the answers that have been given so far.

Tyler McHenry 2009-09-03 02:10:48
