views:

29118

answers:

14

I'm looking for the string "foo=" (without quotes) in text files in a directory tree. This is on an ordinary Linux machine with a bash shell:

grep -ircl "foo=" *

In the directories are also many binary files which match "foo=". As these results are not relevant and slow down the search, I want grep to skip searching these files (mostly JPEG and PNG images): how would I do that?

I know there are the --exclude=PATTERN and --include=PATTERN options, but what is the pattern format? The grep man page says:

--include=PATTERN     Recurse in directories only searching file matching PATTERN.
--exclude=PATTERN     Recurse in directories skip file matching PATTERN.

Searching for "grep include", "grep include exclude", "grep exclude" and variants did not turn up anything relevant.

If there's a better way of grepping only in certain files, I'm all for it. Moving the offending files is not an option, and I can't search only certain directories (the directory structure is a big mess, with everything everywhere). Also, I can't install anything, so I have to make do with common tools (like grep or the suggested find).

UPDATES: @Adam Rosenfield's answer is just what I was looking for:

grep -ircl --exclude=*.{png,jpg} "foo=" *

@rmeador's answer is also a good solution:

grep -Ir --exclude="*\.svn*" "pattern" *

It searches recursively, ignores binary files, and doesn't look inside Subversion hidden folders.(...)

+16  A: 

I use the following to grep source trees successfully:

grep pattern -r --include=*.{cpp,h} rootdir

I haven't used --exclude, but I assume the syntax is the same.

Adam Rosenfield
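
For what it's worth, the PATTERN appears to be a shell-style glob that grep compares against file names, and the {png,jpg} part is expanded by bash before grep even starts. A minimal sketch in a throwaway directory (the file names and the `demo` variable are made up for illustration; it assumes GNU grep and that bash is installed):

```shell
# Build a scratch tree: two text files and one fake "image" that also contains foo=.
demo=$(mktemp -d)
mkdir -p "$demo/src/deep"
printf 'foo=1\n' > "$demo/src/a.txt"
printf 'FOO=2\n' > "$demo/src/deep/b.cfg"
printf 'foo=\000\001\n' > "$demo/src/deep/pic.png"

# Quoted globs reach grep untouched; grep compares them with each file's
# base name while it recurses, so pic.png is never opened.
matches=$(cd "$demo" && grep -irl --exclude='*.png' --exclude='*.jpg' 'foo=' . | LC_ALL=C sort)
printf '%s\n' "$matches"

# What bash turns the unquoted brace form into before grep sees it.
expanded=$(bash -c 'echo --exclude=*.{png,jpg}')
printf '%s\n' "$expanded"

rm -rf "$demo"
```

The brace form is just shorthand for repeating the option, so the same trick should work for --include.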
That's exactly what I was looking for, thanks =)
Piskvor
+1  A: 

I find grepping grep's output to be very helpful sometimes:

grep -rn "foo=" . | grep -v "Binary file"

Though, that doesn't actually stop it from searching the binary files.

Aaron Maenpaa
You can use `grep -I` to skip binary files.
Nathan Fellman
+1  A: 

find and xargs are your friends. Use them to filter the file list rather than grep's --exclude

Try something like

find . -type f -not -name '*.png' -print | xargs grep -icl "foo="
Andrew Stein
This doesn't work on filenames with spaces, but that problem is easily solved by using print0 instead of print and adding the -0 option to xargs.
Adam Rosenfield
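
The NUL-separated variant suggested in the comment above could look roughly like this. It is a sketch only: the scratch directory and file names are invented for the demo, and -print0 / -0 are assumed to be available (they are in GNU findutils).

```shell
# Scratch files, two of them with a space in the name.
demo=$(mktemp -d)
printf 'foo=1\n' > "$demo/my notes.txt"
printf 'foo=2\n' > "$demo/plain.txt"
printf 'foo=3\n' > "$demo/skip me.png"

# -print0 and -0 use NUL as the separator, so names with spaces survive
# the trip from find to grep. The .png file is filtered out by find.
found=$(cd "$demo" && find . -type f ! -name '*.png' -print0 | xargs -0 grep -il 'foo=' | LC_ALL=C sort)
printf '%s\n' "$found"

rm -rf "$demo"
```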
+1  A: 

Try this one:

find . -name "*.txt" -type f -print | xargs file | grep -i "text" | cut -d: -f1 | xargs grep -l "foo="

Found here: http://www.unix.com/shell-programming-scripting/42573-search-files-excluding-binary-files.html

Gravstar
This doesn't work on filenames with spaces, but that problem is easily solved by using print0 instead of print and adding the -0 option to xargs.
Adam Rosenfield
+14  A: 

If you just want to skip binary files, I suggest you look at the -I option. It ignores binary files. I regularly use the following command:

grep -rI --exclude-dir="\.svn" "pattern" *

It searches recursively, ignores binary files, and doesn't look inside Subversion hidden folders, for whatever pattern I want. I have it aliased as "grepsvn" on my box at work.

rmeador
Thanks, that's very useful for some other scenarios I've encountered.
Piskvor
+10  A: 

Please take a look at ack, which is designed for exactly these situations. Your example of

grep -ircl --exclude=*.{png,jpg} "foo=" *

is done with ack as

ack -icl "foo="

because ack never looks in binary files by default, and -r is on by default. And if you want only CPP and H files, then just do

ack -icl --cpp "foo="
Andy Lester
Looks nice, will try the standalone Perl version next time, thanks.
Piskvor
A: 

Hi! Those scripts don't solve the whole problem. Try this instead:

du -ha | grep -i -o "\./.*" | grep -v "\.svn\|another_file\|another_folder" | xargs grep -i -n "$1"

This script is better because it uses "real" regular expressions to exclude directories from the search. Just separate folder or file names with "\|" in the grep -v pattern.

Enjoy it! Found on my Linux shell! XD

A: 

The suggested command:

grep -Ir --exclude="*\.svn*" "pattern" *

is conceptually wrong, because --exclude works on the basename. In other words, it will skip only the .svn in the current directory.
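
A quick way to see the basename point for yourself, assuming a reasonably recent GNU grep (older releases may behave differently). The scratch tree and variable names below are made up for the demo:

```shell
# Scratch tree with a fake .svn directory next to real source.
demo=$(mktemp -d)
mkdir -p "$demo/.svn" "$demo/src"
printf 'foo=meta\n' > "$demo/.svn/entries"
printf 'foo=code\n' > "$demo/src/main.c"

# --exclude compares the glob with file base names, so files inside .svn
# are still searched ("entries" does not match '*.svn*').
with_exclude=$(cd "$demo" && grep -rIl --exclude='*.svn*' 'foo=' . | LC_ALL=C sort)
printf 'exclude:\n%s\n' "$with_exclude"

# --exclude-dir prunes whole directories whose base name matches.
with_exclude_dir=$(cd "$demo" && grep -rIl --exclude-dir='.svn' 'foo=' . | LC_ALL=C sort)
printf 'exclude-dir:\n%s\n' "$with_exclude_dir"

rm -rf "$demo"
```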

+5  A: 

grep 2.5.3 introduced the --exclude-dir parameter which will work the way you want.

grep -rI --exclude-dir=.svn PATTERN .

You can also set an environment variable: GREP_OPTIONS="--exclude-dir=.svn"

I'll second Andy's vote for ack though, it's the best.

Corey
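
One caveat on the environment variable: as far as I know, GNU grep has treated GREP_OPTIONS as obsolescent since 2.21 and ignores it from 3.6 on. A small wrapper function gets a similar effect without it. This is only a sketch; the name `grepsvn` is borrowed from the alias mentioned in an earlier answer, and the option set is just an example.

```shell
# Wrapper used in place of GREP_OPTIONS: default options first, then
# whatever the caller passes.
grepsvn() {
    grep -rI --exclude-dir=.svn "$@"
}

# Scratch tree to try it on (names are illustrative).
demo=$(mktemp -d)
mkdir -p "$demo/.svn" "$demo/src"
printf 'foo=meta\n' > "$demo/.svn/entries"
printf 'foo=code\n' > "$demo/src/main.c"

hits=$(cd "$demo" && grepsvn -l 'foo=' .)
printf '%s\n' "$hits"

rm -rf "$demo"
```

Putting the function in ~/.bashrc would make it available in every interactive shell.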
A: 

The --binary-files=without-match option to GNU grep gets it to skip binary files. (Equivalent to the -I switch mentioned elsewhere.)

(This might require a recent version of grep; 2.5.3 has it, at least.)

mjs
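
To check that the long and short spellings really do the same thing, something like the following should do. It assumes GNU grep; the scratch files are invented, and the NUL bytes are only there to make grep classify blob.bin as binary.

```shell
# One text file and one file with NUL bytes, both containing foo=.
demo=$(mktemp -d)
printf 'foo=text\n' > "$demo/notes.txt"
printf 'foo=\000\001\002\n' > "$demo/blob.bin"

# Without -I the binary file is listed as a match too.
plain=$(cd "$demo" && grep -rl 'foo=' . | LC_ALL=C sort)
# Short and long forms of "treat binary files as non-matching".
short=$(cd "$demo" && grep -rIl 'foo=' .)
long=$(cd "$demo" && grep -rl --binary-files=without-match 'foo=' .)

printf 'plain:\n%s\n-I:\n%s\nlong form:\n%s\n' "$plain" "$short" "$long"

rm -rf "$demo"
```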
A: 

I'm a dilettante, granted, but here's how my ~/.bash_profile looks:

export GREP_OPTIONS="-orl --exclude-dir=.svn --exclude-dir=.cache --color=auto" GREP_COLOR='1;32'

Note that to exclude two directories, I had to use --exclude-dir twice.

4D4M
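
Repeating the option does seem to be the straightforward way. A small sketch with two directories to skip, assuming GNU grep 2.5.3 or later (the scratch tree is made up for the demo):

```shell
# Scratch tree with two directories that should be skipped.
demo=$(mktemp -d)
mkdir -p "$demo/.svn" "$demo/.cache" "$demo/src"
printf 'foo=1\n' > "$demo/.svn/entries"
printf 'foo=2\n' > "$demo/.cache/index"
printf 'foo=3\n' > "$demo/src/app.conf"

# One --exclude-dir per directory name.
kept=$(cd "$demo" && grep -rl --exclude-dir=.svn --exclude-dir=.cache 'foo=' .)
printf '%s\n' "$kept"

rm -rf "$demo"
```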
+4  A: 

In grep 2.5.1 you have to add this line to ~/.bashrc or ~/.bash_profile:

export GREP_OPTIONS="--exclude=\*.svn\*"
deric
A: 

To ignore all binary results from grep:

grep -Ri "pattern" * | awk '{if($1 != "Binary") print $0}'

The awk part filters out all of the "Binary file foo matches" lines.

lathomas64
+2  A: 

http://bmmkevin.blogspot.com/2010/10/exclude-svn-folder-and-backup-files.html

Solution summarized in my blog.

Kevin
`export GREP_OPTIONS="--exclude-dir=\\*/.svn/\\* --exclude=\\*~"` - interesting, good to know that grep reads $GREP_OPTIONS. Thanks.
Piskvor