+1  Q: 

bash grep newline

Hi, I need to extract from the file:

first
second
third

using the grep command, the following line:

second
third

What should the grep command look like?

A: 

Line? Or lines?

Try

grep -E -e '(second|third)' filename

Edit: grep is line-oriented. You're going to have to use Perl, sed, or awk to match a pattern across lines.

BTW, -E tells grep that the regexp is an extended RE.
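
As a rough illustration of the awk route mentioned above, here is a minimal sketch that remembers the previous line and prints the pair only when "second" is immediately followed by "third". The temporary file and the variable names (sample, result, prev) are assumptions made for the example, and it matches whole lines exactly rather than substrings.

```shell
# Build a small sample file (a temporary file, purely for illustration).
sample=$(mktemp)
printf 'first\nsecond\nthird\n' > "$sample"

# Keep the previous line in "prev"; when the current line is exactly
# "third" and the previous one was exactly "second", print both lines.
result=$(awk 'prev == "second" && $0 == "third" { print prev; print $0 }
              { prev = $0 }' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

Because awk keeps state between lines, the two-line condition can be tested without treating the newline as part of a pattern.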

Rob Wells
Lines. But what about constructing a regular expression for egrep to use? How is the newline character represented?
Markus
A: 
grep -E '(second|third)' /path/to/file
egrep -w 'second|third' /path/to/file
Andrejs Cainikovs
+2  A: 

I don't really understand what you want to match. I would not use grep, but one of the following:

tail -n 2 file       # to get last two lines
tail -n +2 file      # to get all but first line
sed -e '2,3p;d' file # to get lines from second to third

(not sure how standard it is, it works in GNU tools for sure)

liori
I agree. grep isn't really the right way to go on this.
Jim
+4  A: 

Your question title, "bash grep newline", implies that you want to match the "second\nthird" sequence of characters, i.e. something containing a newline.

Since grep works on lines and these are two different lines, you cannot match them this way.

So, I'd split it into several tasks:

1) you match the line that contains "second" and output the line that has matched and the subsequent line:

grep -A 1 "second" testfile

2) you translate every other newline into a sequence that is guaranteed not to occur in the input. I think the simplest way to do that is with perl:

perl -npe '$x=1-$x; s/\n/##UnUsedSequence##/ if $x;'

3) you do a grep on these lines, this time searching for string "##UnUsedSequence##third":

grep "##UnUsedSequence##third"

4) you unwrap the unused sequences back into the newlines, sed might be the simplest:

sed -e 's/##UnUsedSequence##/\n/'

So the resulting pipe command to do what you want would look like:

grep -A 1 "second" testfile | perl -npe '$x=1-$x; s/\n/##UnUsedSequence##/ if $x;' | grep "##UnUsedSequence##third" | sed -e 's/##UnUsedSequence##/\n/'

Not the most elegant by far, but should work. I'm curious to know of better approaches, though - there should be some.
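
One possibly simpler approach, offered only as a sketch: sed's N command appends the next input line to the pattern space, so the pair can be tested with a single expression containing \n. The temporary file and the variable names (sample, pair) are assumptions for the example. It matches whole lines exactly, and it would miss a pair that starts on a line N has already consumed (for example "second", "second", "third").

```shell
# Build a small sample file (a temporary file, purely for illustration).
sample=$(mktemp)
printf 'first\nsecond\nthird\n' > "$sample"

# On a line that is exactly "second", append the next line to the pattern
# space with N, then print the pair only if the appended line is "third".
# -n suppresses the default printing of every line.
pair=$(sed -n '/^second$/{
N
/^second\nthird$/p
}' "$sample")

printf '%s\n' "$pair"
rm -f "$sample"
```

Inside the pattern space the two lines are joined by an embedded newline, which is why \n is usable in the second regular expression even though sed reads its input line by line.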

Andrew Y
A: 

So you just don't want the line containing "first"? -v inverts the grep results.

$ echo -e "first\nsecond\nthird\n" | grep -v first
second
third
Mark Rushakoff
+2  A: 

I don't think grep is the way to go on this.

If you just want to strip the first line from any file (to generalize your question), I would use sed instead.

sed '1d' INPUT_FILE_NAME

This will send the contents of the file to standard output with the first line deleted.

Then you can redirect the standard output to another file to capture the results.

sed '1d' INPUT_FILE_NAME > OUTPUT_FILE_NAME

That should do it.

If you have to use grep and just don't want to display the line with first on it, then try this:

grep -v first INPUT_FILE_NAME

By passing the -v switch, you are telling grep to show every line that does not match the expression you pass. In effect: show me everything but the line(s) with first in them.

However, the downside is that if first appears on several lines of the file, all of those lines are dropped as well, which may not be the behavior you expect.

To shunt the results into a new file, try this:

grep -v first INPUT_FILE_NAME > OUTPUT_FILE_NAME
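
To make the difference between the two approaches concrete, here is a small sketch on a sample file in which first appears on more than one line. The file contents and the variable names (sample, by_position, by_content) are made up for the illustration.

```shell
# Sample input where "first" appears on two different lines.
sample=$(mktemp)
printf 'first\nsecond\nfirst again\nthird\n' > "$sample"

# sed '1d' deletes by position: only line 1 is removed.
by_position=$(sed '1d' "$sample")

# grep -v first deletes by content: every line containing "first" is removed.
by_content=$(grep -v first "$sample")

printf 'sed 1d:\n%s\n\ngrep -v first:\n%s\n' "$by_position" "$by_content"
rm -f "$sample"
```

With this input, the sed output keeps the "first again" line while the grep output drops it.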

Hope this helps.

Jim
I think you may have your last two examples backwards.
Telemachus
Thanks for catching that.
Jim
+3  A: 

Instead of grep, you can use pcregrep, which supports multiline patterns:

pcregrep -M 'second\nthird' file

-M allows the pattern to match more than one line.

notnoop