Let's say we have two files.

match.txt: A file containing patterns to match:

fed ghi
tsr qpo

data.txt: A file containing lines of text:

abc fed ghi jkl
mno pqr stu vwx
zyx wvu tsr qpo

Now, I want to issue a grep command that should return the first and third line from data.txt:

abc fed ghi jkl
zyx wvu tsr qpo

... because each of these two lines matches one of the patterns in match.txt.

I have tried:

grep -F -f match.txt data.txt

but that returns no results.

grep info: GNU grep 2.6.3 (cygwin)
OS info: Windows 2008 R2

Update: The fix is to use this command: tr -d "\r" <match.txt | grep -F -f - data.txt

It seems that grep does not handle Windows line endings (CR/LF) in pattern files passed to it via the -f flag: the trailing carriage return ends up as part of each pattern.
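The behaviour described in the update can be reproduced with a small sketch like the one below. It assumes a POSIX shell and a grep that accepts `-f -` to read patterns from standard input (GNU grep does); the scratch directory and the variable names are made up for illustration, while the file contents are the ones from the question. The count before the fix depends on the platform and on how the data file's line endings are read, so only the count after the fix is meant to be relied on.

```shell
# Scratch directory (hypothetical) so no existing files are overwritten.
tmpdir=$(mktemp -d)

# Recreate both files from the question with CRLF line endings.
printf 'fed ghi\r\ntsr qpo\r\n' > "$tmpdir/match.txt"
printf 'abc fed ghi jkl\r\nmno pqr stu vwx\r\nzyx wvu tsr qpo\r\n' > "$tmpdir/data.txt"

# Patterns are used as-is, so each one carries a trailing carriage return.
before=$(grep -c -F -f "$tmpdir/match.txt" "$tmpdir/data.txt")

# Strip the carriage returns from the pattern file only and
# feed the cleaned patterns to grep on standard input.
after=$(tr -d '\r' < "$tmpdir/match.txt" | grep -c -F -f - "$tmpdir/data.txt")

echo "matching lines before: $before, after: $after"

rm -rf "$tmpdir"
```

Only the pattern file is cleaned here, which mirrors the accepted fix: the fixed strings "fed ghi" and "tsr qpo" still occur inside the data lines even when those lines end in a carriage return.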

+1  A: 

I just tried exactly the example you gave and it worked as expected.

[~] $ grep -F -f match.txt data.txt 
abc fed ghi jkl
zyx wvu tsr qpo

Can you give more information? What OS are you running? What version of grep? What line endings do your input files contain?

JohnCC
There is a windows tag attached to this question. I have updated the original question with full version info.
Michael Goldshteyn
I saw the tag. Is this Cygwin grep or SFU grep?
JohnCC
Cygwin - question updated...
Michael Goldshteyn
If your files have Windows (CRLF) line endings and Cygwin was set for Unix line endings, this will cause you problems. That is because the input strings in the match file will be read as "fed ghi^M", where ^M is a carriage return. Your data will be read as "abc fed ghi jkl^M". As you can see, the first does not occur in the second! Actually, with the test data you gave, I get one match using Windows line endings, because one of the patterns occurs at the end of a line. With Unix line endings, the grep works as you hoped.
JohnCC
Both of the input files have windows line endings (CR/LF). I get no matches...
Michael Goldshteyn
You can use "dos2unix <file>" to convert the endings. You can also go the other way with "unix2dos <file>". Cygwin can be set to expect Unix or DOS/Windows line endings at install time and every time "mount" is used. To see which you chose, run the installer again or run "mount". If a mount has the "binary" option, it will use Unix line endings. If it has "text", it will use DOS/Windows line endings.
JohnCC
If I change the match lines to eliminate the space between words, I get matches. The problem seems to be related to spaces in the match.txt file, despite the fact that I used the -F switch.
Michael Goldshteyn
I ran mount, and the drive containing both files is set to text (DOS/Win line endings).
Michael Goldshteyn
Changing the line endings on match.txt to unix line endings does fix the problem. Updating question...
Michael Goldshteyn
That's odd. By the way, you can do an on-the-fly translation with: tr -d '\r' < data.txt | grep -F -f <(tr -d '\r' < match.txt)
JohnCC
It is unnecessary to do the transformation on data.txt. Just removing the \r chars from match.txt fixes the problem. I just don't understand why grep is not respecting the windows line endings in match.txt
Michael Goldshteyn
The fix: tr -d "\r" <match.txt | grep -F -f - data.txt. I have accepted your answer, thanks for the help.
Michael Goldshteyn
Very interesting! Thanks for letting me know the final outcome.
JohnCC
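To check whether a pattern file carries the carriage returns discussed in the comments above, something like the following sketch may help. It assumes a POSIX shell with `od` and `grep` available; the scratch file is a hypothetical stand-in for a match.txt saved with Windows line endings.

```shell
# Hypothetical scratch file standing in for a match.txt with CRLF endings.
sample=$(mktemp)
printf 'fed ghi\r\ntsr qpo\r\n' > "$sample"

# od -c dumps the file byte by byte; a carriage return is shown as \r.
od -c "$sample"

# Count the lines that contain a carriage return.
# A count of 0 would indicate plain Unix (LF) line endings.
cr=$(printf '\r')
cr_lines=$(grep -c "$cr" "$sample")
echo "lines containing CR: $cr_lines"

rm -f "$sample"
```

If the count is non-zero, stripping the carriage returns with tr -d '\r' (or converting the file with dos2unix, as suggested earlier in the thread) before passing it to grep -f should avoid the problem.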