I need to manage lists of files with the find command. Suppose two lists contain arbitrary file names and are not disjoint (i.e. their intersection is not the empty set). How can I compute the following?

A \ B

find files in the list A except the files in the list B

A intersection B

find files common to the lists A and B

A union B

find all files in the two lists

EXAMPLES

$ find . | awk -F"/" '{ print $2 }'

.zcompdump
.zshrc
.bashrc
.emacs

$ find ~/bin/FilesDvorak/.* -maxdepth 0 | awk -F"/" '{ print $6 }'

.bashrc
.emacs
.gdbinit
.git

I want:

A \ B:

.zcompdump
.zshrc

A Intersection B:

.bashrc
.emacs

A Union B:

.zcompdump
.zshrc
.bashrc
.emacs
.bashrc
.emacs
.gdbinit
.git

A try for the Intersection

I saved the two outputs to separate list files, but I cannot understand why the following command does not return the common entries, i.e. the intersection above:

find -f all_files -and -f right_files .

Questions emerged from the question:

  1. find ~/bin/FilesDvorak/.* -maxdepth 0 -and ~/.PAST_RC_files/.*

  2. Please, consult for recursive find Click here!

  3. find ~/bin/FilesDvorak/.* -maxdepth 0 -and list

+2  A: 

Seriously, this is what comm(1) is for. I don't think the man page could be much clearer: http://linux.die.net/man/1/comm

Dave
@Dave: Which keywords did you use to find the command? I could not find it by the keywords: union, intersection, exclude and find.
Masi
+1  A: 

There are several tools that can help you find the intersection of two file lists. 'find' isn't one of them. Find is for finding files on the filesystem that match certain criteria.

Here are some ways of finding your answer.

To generate your two file lists

find . -maxdepth 1 | sort > a
(cd ~/bin/FilesDvorak/ && find . -maxdepth 1) | sort > b

Now you have two files a and b that contain directory entries without recursing into subdirectories. (To remove the leading ./ you can add "sed -e 's/^\.\///'" or your first awk command between the find and the sort.)
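As a rough sketch of the list-building step, the find, sed and sort stages can be wrapped in one small helper. The name list_names is made up for this example, and -mindepth is assumed to be available (GNU and BSD find have it, but it is not in POSIX). File names that contain newlines are not handled.

```shell
# list_names DIR: print the names of the entries directly inside DIR,
# one per line, sorted so that comm can consume the result.
# (list_names is a hypothetical helper, not a standard command.)
list_names() {
    # -mindepth 1 skips DIR itself, -maxdepth 1 prevents recursion,
    # and sed strips everything up to and including the last slash.
    find "$1" -mindepth 1 -maxdepth 1 | sed -e 's|.*/||' | sort
}
```

It could then be used as `list_names . > /tmp/a` and `list_names ~/bin/FilesDvorak > /tmp/b`. Writing the lists outside the directories being listed keeps the list files out of their own contents.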

To find the Union

cat a b | sort -u

To find A \ B

comm -23 a b

To find the intersection

comm -12 a b
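For illustration, here are the same three commands run on the sample names from the question. This is only a sketch: the lists are written to a temporary directory instead of being produced by find, and the variable names are chosen for this example.

```shell
# Recreate the two sample lists from the question; comm needs sorted input.
work=$(mktemp -d)
printf '%s\n' .zcompdump .zshrc .bashrc .emacs | sort > "$work/a"
printf '%s\n' .bashrc .emacs .gdbinit .git | sort > "$work/b"

# Union: every name that appears in a or b, with duplicates removed.
union=$(sort -u "$work/a" "$work/b")

# A \ B: suppress column 2 (only in b) and column 3 (in both).
difference=$(comm -23 "$work/a" "$work/b")

# Intersection: suppress column 1 (only in a) and column 2 (only in b).
intersection=$(comm -12 "$work/a" "$work/b")

echo "difference:";   echo "$difference"
echo "intersection:"; echo "$intersection"
echo "union:";        echo "$union"
```

The difference comes out as .zcompdump and .zshrc, and the intersection as .bashrc and .emacs, matching what the question asks for. The union lists each of the six names once.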

'man comm' and 'man find' for more information.

jabbie
@jabbie: Could you give a concrete example of how you use comm? I could not think of a situation that I could not handle with vimdiff and sdiff alone.
Masi
Comm is a very simple command compared to diff tools. It only works on lexically sorted files. It compares the two files and tells you which lines exist only in the first file, only in the second file, or in both files. As a concrete example, you could use it to compare two dict files. On OS X I have a connectives file and a words file in /usr/share/dict. I can run 'comm -23 connectives words' to see the lines in the connectives file that aren't in the words file. sdiff with some grep or awk magic could do the same. Comm is just a simpler tool, and it doesn't work properly on unsorted files.
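A small sketch of the three-column output described above, using two tiny made-up word lists in place of the real /usr/share/dict files (the contents here are invented for the example):

```shell
# Two small, already sorted stand-ins for the dict files.
work=$(mktemp -d)
printf '%s\n' and but or > "$work/connectives"
printf '%s\n' and apple or > "$work/words"

# With no options comm prints three columns: lines only in the first file
# (no indent), lines only in the second file (one tab), and lines in both
# files (two tabs).
comm "$work/connectives" "$work/words"

# -23 hides columns 2 and 3, leaving the lines unique to the first file.
only_in_connectives=$(comm -23 "$work/connectives" "$work/words")
echo "$only_in_connectives"
```

Here "and" and "or" land in the third column, "apple" in the second, and "but" in the first, so the last command prints just "but".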
jabbie