I've been using the following command to grep for a string in all the python source files in and below my current directory:

find . -name '*.py' -exec grep -nHr <string> {} \;

I'd like to simplify things so that I can just type something like

findpy <string>

And get the exact same result. Aliases don't seem sufficient since they only do a string expansion, and the argument I need to specify is not the last argument. It sounds like functions are suitable for the task, so I have several questions:

  • How do I write it?
  • Where do I put it?
+1  A: 

Put the following three lines in a file named findpy

#!/bin/bash

find . -name '*.py' -exec grep -nHr "$1" {} \;

Then say

chmod u+x findpy

I normally have a directory called bin in my home directory where I put little shell scripts like this. Make sure to add the directory to your PATH.

Chas. Owens
+1  A: 

The script:

#!/bin/bash
find . -name '*.py' -exec grep -nHr "$1" {} ';'

is how I'd do it.

You write it with an editor like vim and put it somewhere on your path. My normal approach is to have a ~/bin directory and make sure my .profile file (or equivalent) contains:

PATH=$PATH:~/bin
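
A minimal sketch of a slightly more defensive version of that .profile line, assuming a POSIX-style shell. It appends the directory only when it is not already on PATH, so re-sourcing the file does not keep growing PATH. It uses $HOME/bin rather than ~/bin because a tilde is not expanded inside double quotes:

```shell
# Sketch for ~/.profile: add ~/bin to PATH once.
case ":$PATH:" in
    *":$HOME/bin:"*) ;;                 # already present, nothing to do
    *) PATH="$PATH:$HOME/bin" ;;        # append at the end, as in the line above
esac
export PATH
```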
paxdiablo
I don't think you need the -r on your grep there ;-)
lhunath
+5  A: 

If you don't want to create an entire script for this, you can do it with just a shell function:

findpy() { find . -name '*.py' -exec grep -nHr "$1" {} \; ; }

...but then you may have to define it in both ~/.bashrc and ~/.bash_profile, so it gets defined for both login and interactive shells (see the INVOCATION section of bash's man page).
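
Spelled out over several lines, the same function might look like the sketch below. It assumes a find that supports the POSIX "-exec ... {} +" form (which hands many files to each grep), and the usage check and the -e flag are additions for illustration, not part of the one-liner above:

```shell
# Sketch: search every *.py file below the current directory for a pattern.
findpy() {
    if [ "$#" -ne 1 ]; then
        echo "usage: findpy <string>" >&2
        return 2
    fi
    # -n: print line numbers, -H: always print the file name,
    # -e: treat "$1" as the pattern even if it starts with a dash.
    find . -type f -name '*.py' -exec grep -nH -e "$1" {} +
}
```

After defining it (for example in ~/.bashrc), "findpy foo" prints matches as file:line:text.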

Gordon Davisson
My distribution has it set up that the only action that .bash_profile takes is to source .bashrc, so it's a non-issue. Thanks!
saffsd
Functions and aliases should go in ~/.bashrc, and ~/.bash_profile or ~/.profile should source ~/.bashrc.
lhunath
Yah, sourcing one from the other makes life much easier. For safety, I have my .bash_profile check first, like this: if [ -f ~/.bashrc ]; then . ~/.bashrc; fi
Gordon Davisson
+3  A: 

All the "find ... -exec" solutions above are OK in the sense that they work, but they are horribly inefficient and will be extremely slow for large trees. The reason is that they launch a new process for every single *.py file. Instead, use xargs(1), and run grep only on files (not directories):

#! /bin/sh
find . -name \*.py -type f | xargs grep -nHr "$1"

For example:

$ time sh -c 'find . -name \*.cpp -type f -exec grep foo {} \; >/dev/null'
real    0m3.747s
$ time sh -c 'find . -name \*.cpp -type f | xargs grep foo >/dev/null'
real    0m0.278s
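
One caveat: a plain find | xargs pipeline splits file names on whitespace. A sketch of a NUL-delimited variant follows, assuming find -print0 and xargs -0 are available (the GNU and BSD tools have them). The name findpy0 is only illustrative, and grep's -r is dropped because find already passes regular files only:

```shell
# Sketch: NUL-delimited find | xargs, safe for names with spaces or quotes.
findpy0() {
    # -print0 ends each name with a NUL byte; -0 makes xargs split on NUL only.
    find . -type f -name '*.py' -print0 | xargs -0 grep -nH -e "$1"
}
```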
Idelic
How does time sh -c 'find . -name \*.cpp -type f -exec grep foo {} + >/dev/null' compare? For me it was a little faster than xargs, but xargs gave me some "no such file or directory errors" on a few Python files with spaces in their names (thanks cmu). The -exec versions didn't complain.
Dennis Williamson
Using "-exec ... +" should be equivalent to xargs in terms of performance, but it's not portable and not as flexible as xargs. Spaces in file names can be easily handled by passing -print0 to find and -0 to xargs, so file names are delimited by NUL characters instead of blanks, i.e. "find -name \*.cpp -print0 | xargs -0 grep foo".
Idelic
Actually, I just checked that "-exec ... +" is in POSIX, hence it can be considered portable. That leaves just flexibility as argument for xargs :-)
Idelic
I ran into some problems with exotic file names (ones that contained ':' characters). Change it to the following: find . -type f -name \*.py -print0 | xargs --null grep -nHr "$1"
Roalt
+4  A: 

On a side note, you should take a look at Ack for what you are doing. It is written in Perl and designed as a replacement for grep. It can filter files based on the target language and ignores .svn directories and the like.

Example (snippet from Trac source):

$ ack --python foo ./mysource
ticket/tests/wikisyntax.py
139:milestone:foo
144:<a class="missing milestone" href="/milestone/foo" rel="nofollow">milestone:foo</a>

ticket/tests/conversion.py
34:        ticket['foo'] = 'This is a custom field'

ticket/query.py
239:        count_sql = 'SELECT COUNT(*) FROM (' + sql + ') AS foo'
Danny
A: 

Many versions of grep have options to do recursion, specify filename pattern, etc.

grep --perl-regexp --recursive --include='*.py' --regexp="$1" .

This recurses starting from the current directory (.), looks only at files whose names end in '.py', and uses Perl-style regular expressions.

If your version of grep doesn't support --recursive and --include, then you can still use find and xargs, but be sure to allow for pathnames with embedded spaces by passing -print0 to find and --null to xargs.

find . -type f -name '*.py' -print0 | xargs --null grep "$1"

should work.
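
The two approaches can be combined into one function that prefers grep's own recursion and falls back to find and xargs. This is only a sketch: findpy_auto is a made-up name, and the probe assumes grep exits with status 2 when it rejects an option (0 and 1 mean match and no match), as GNU and BSD grep do:

```shell
# Sketch: use grep --include when available, else find + xargs.
findpy_auto() {
    probe_status=0
    # Probe: an unsupported option makes grep fail with status 2.
    grep -r --include='*.py' -e x /dev/null >/dev/null 2>&1 || probe_status=$?
    if [ "$probe_status" -le 1 ]; then
        grep -rnH --include='*.py' -e "$1" .
    else
        # Fallback: NUL-delimited names so embedded spaces are handled.
        find . -type f -name '*.py' -print0 | xargs -0 grep -nH -e "$1"
    fi
}
```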

Harold Bamford
On my system, find is faster than grep --recursive. Also, grep --recursive returns "recursive directory loop" errors under some circumstances when find doesn't.
Dennis Williamson
Interesting! I've never gotten the "recursive directory loop" as I usually run on a Windows box which doesn't have real symbolic links. I assume that is how you got that error?
Harold Bamford