How do I use grep to perform a search which, when a match is found, prints the file name as well as the first n characters of that file? Note that n is a parameter that can be specified, and it is irrelevant whether the first n characters actually contain the matching string.

+1  A: 

You need to pipe the output of grep to sed to accomplish what you want. Here is an example:

grep mypattern *.txt /dev/null | sed 's/^\([^:]*:.......\).*/\1/'

The number of dots is the number of characters you want to print. Many versions of sed provide an option, such as -r (GNU/Linux) or -E (FreeBSD), that enables modern-style (extended) regular expressions. This makes it possible to specify numerically the number of characters you want to print.

N=7
grep mypattern *.txt /dev/null | sed -r "s/^([^:]*:.{$N}).*/\1/"

Note that this solution is a lot more efficient than the others proposed, which invoke multiple processes.
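
For illustration, the pipeline above can be wrapped in a small shell function. This is only a sketch: the name match_prefix is made up here, and it assumes a sed that accepts -E (recent GNU and BSD versions do; substitute -r on an older GNU sed). It also makes visible what the pipeline prints: the file name plus the first N characters of each matching line, not of the file.

```shell
# match_prefix PATTERN N FILE...   (hypothetical helper name)
# Prints "file:" followed by the first N characters of every matching line.
match_prefix() {
    pattern=$1
    n=$2
    shift 2
    # /dev/null makes grep prefix the file name even when only one file is given.
    grep -e "$pattern" -- "$@" /dev/null |
        sed -E "s/^([^:]*:.{$n}).*/\1/"
}
```

It could be called as: match_prefix mypattern 7 *.txt. Lines shorter than N characters do not match the sed expression and are printed whole. File names that themselves contain a colon would confuse the expression.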

Diomidis Spinellis
On my Linux machine sed does not accept -E (grep does, though). It does, however, work with -r.
Evan Teran
He is looking for the first N characters of the file, not N matching characters. Also, add /dev/null in there in case *.txt evaluates to 0 or 1 files. E.g. grep mypattern *.txt /dev/null | sed ...
Mr.Ree
The extended RE pattern I gave will not match lines with fewer than N characters, and these will therefore be correctly printed according to the specification. You are right about the need for adding the /dev/null argument.
Diomidis Spinellis
+2  A: 

There are few tools that print 'n characters' rather than 'n lines'. Are you sure you really want characters and not lines? The whole thing can perhaps be best done in Perl. As specified (using grep), we can do:

pattern="$1"
shift
n="$1"
shift
grep -l "$pattern" "$@" |
while IFS= read -r file
do
    # bs=1 count=n copies exactly n bytes; dd's statistics go to stderr.
    echo "$file:" "$(dd if="$file" bs=1 count="$n" 2>/dev/null)"
done

The quotes around $file preserve multiple spaces in file names correctly. We can debate the command-line usage; currently (assuming the command name is 'ngrep') it is:

 ngrep pattern n [file ...]


I note that @litb used 'head -c $n'; that's neater than the dd command I used. There might be some systems without head (but they'd be pretty archaic). I note that the POSIX version of head only supports -n and the number of lines; the -c option is probably a GNU extension.
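
As a rough sketch of how the script might look with head -c instead of dd, here it is written as a shell function so it can be pasted into a session. It assumes a head that supports -c (GNU and BSD versions do), and it keeps the 'ngrep pattern n [file ...]' usage from above.

```shell
# ngrep PATTERN N [FILE ...]
# For each file containing PATTERN, print "file: " and its first N bytes.
ngrep() {
    pattern=$1
    n=$2
    shift 2
    grep -l -e "$pattern" -- "$@" |
    while IFS= read -r file
    do
        printf '%s: ' "$file"
        head -c "$n" "$file"
        echo    # end the line, since the first N bytes may not include a newline
    done
}
```

Because the file names are read one per line, a name containing a newline would still be mishandled.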

Jonathan Leffler
+4  A: 
grep -l pattern *.txt |
    while read line; do
        echo -n "$line: "
        head -c "$n" "$line"
        echo
    done

Change -c to -n if you want to see the first n lines instead of bytes.

Johannes Schaub - litb
A: 

Two thoughts here:

1) If efficiency were not a concern (like that would ever happen), you could check $status [csh] after running grep on each file. For example, with N = 25 characters:

foreach FILE ( file1 file2 ... fileN )
  grep targetToMatch  ${FILE} > /dev/null
  if ( $status == 0 ) then
     echo -n "${FILE}:  "
     head -c25 ${FILE}
  endif
end
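
For anyone not using csh, roughly the same loop in a Bourne-style shell might look like the sketch below. The function name first_bytes_if_match is invented for the example, and it assumes a head that supports -c. Instead of inspecting $status, it tests grep's exit status directly in the if.

```shell
# first_bytes_if_match PATTERN N FILE...   (hypothetical helper name)
first_bytes_if_match() {
    pattern=$1
    n=$2
    shift 2
    for file in "$@"
    do
        # grep -q prints nothing and exits 0 only if the file contains a match.
        if grep -q -e "$pattern" -- "$file"
        then
            printf '%s:  ' "$file"
            head -c "$n" "$file"
            echo
        fi
    done
}
```

For instance: first_bytes_if_match targetToMatch 25 file1 file2. Like the csh version, this runs one grep per file, so it trades speed for simplicity.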

2) GNU [FSF] head has a --verbose [-v] switch, which prints a header naming each file. GNU grep and xargs offer --null, to accommodate filenames with spaces. And there's '--', to handle filenames like "-c". So you could do:

grep --null -l targetToMatch -- file1 file2 ... fileN |
xargs --null head -v -c25 --
Mr.Ree