Suppose I have several strings: str1 and str2 and str3.

  • How to find lines that have all the strings?
  • How to find lines that can have any of them?
  • And how to find lines that have str1 and either str2 or str3 [but not both?]?
A: 

Personally, I do this in perl rather than trying to cobble together something with grep.

For instance, for the first one:

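# Reads the files named on the command line (or standard input).
while (my $line = <>)
{
   # Skip the line unless it matches all three patterns.
   next if $line !~ m/pattern1/;
   next if $line !~ m/pattern2/;
   next if $line !~ m/pattern3/;

   print $line;
}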
Paul Tomblin
-1: The question is not: what is the best tool to do that, but how do I do it with grep.
quosoo
And the answer is "Don't do it with grep, you'll go mad trying"
Paul Tomblin
@quosoo: among Unix programmers, "grep" can refer to either the grep program specifically, or the general problem of searching a body of text for strings or patterns. It's not clear which usage Tim intended, so I think Paul Tomblin's answer is on point.
Jim Lewis
groundhog
It's not meant to be concise, it's meant to be easy to understand, and more importantly, easy to remember.
Paul Tomblin
@paul: sorry, but "easy to understand" and "perl" are not linkable symbols.
groundhog
+2  A: 

You can't reasonably do the "all" or "this plus either of those" cases in a single expression because grep's standard (POSIX basic and extended) regular expressions don't support lookahead. Use Perl. For the "any" case, it's egrep '(str1|str2|str3)' file.

The unreasonable way to do the "all" case is:

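egrep '(str1.*str2.*str3|str1.*str3.*str2|str2.*str1.*str3|str2.*str3.*str1|str3.*str1.*str2|str3.*str2.*str1)' file

i.e. you build out all six permutations of the three strings. This is, of course, a ridiculous thing to do, and it only gets worse as the number of strings grows.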

For the "this plus either of those", similarly:

egrep '(str1.*(str2|str3)|(str2|str3).*str1)' file
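
If Perl is not at hand, the same "every pattern must match" logic can be sketched with awk, which lets you combine patterns with && and ||. This is a swapped-in tool, not something the answer above proposes, and the sample file and its contents are made up for illustration.

```shell
# Hypothetical sample data, written to a temporary file.
sample=$(mktemp)
printf '%s\n' \
  'str3 x str2 x str1' \
  'str2 str1' \
  'str1' \
  'str2 str3' > "$sample"

# "All" case: the line must match every pattern, in any order.
all=$(awk '/str1/ && /str2/ && /str3/' "$sample")

# "str1 plus either of the others" case.
either=$(awk '/str1/ && (/str2/ || /str3/)' "$sample")

printf 'all:\n%s\neither:\n%s\n' "$all" "$either"
rm -f "$sample"
```

No permutations are needed because each pattern is tested against the line independently.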
chaos
+7  A: 

This looks like three questions. The easiest way to put these sorts of expressions together is with multiple pipes. There's no shame in that, particularly because a regular expression (using egrep) would be ungainly since you seem to imply you want order independence.

So, in order,

  1. grep str1 | grep str2 | grep str3

  2. egrep '(str1|str2|str3)'

  3. grep str1 | egrep '(str2|str3)'

You can do the "and" form in an order-independent way using egrep, but I think you'll find it easier to remember to do order-independent ANDs using piped greps and order-independent ORs using regular expressions.
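
As a rough sketch, here are the three pipelines run against a small file. The file contents and variable names are invented for the example, and grep -E is used as the POSIX spelling of egrep.

```shell
# Hypothetical sample data, written to a temporary file.
sample=$(mktemp)
printf '%s\n' \
  'str1 only' \
  'str2 then str1' \
  'str3 str2 str1' \
  'nothing here' > "$sample"

# 1. All three strings, in any order: each grep narrows the previous output.
all=$(grep str1 "$sample" | grep str2 | grep str3)

# 2. Any of the three strings.
any=$(grep -E 'str1|str2|str3' "$sample")

# 3. str1 plus at least one of str2 or str3.
one_plus=$(grep str1 "$sample" | grep -E 'str2|str3')

printf 'all:\n%s\nany:\n%s\nstr1 plus either:\n%s\n' "$all" "$any" "$one_plus"
rm -f "$sample"
```

Only the first grep in each pipeline names the file; the later ones read the filtered lines from standard input.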

groundhog
Nicely done. +1
Paul Tomblin
3 doesn't meet the "but not both" requirement, but that requirement is hard to meet. You'll need each side of the alternation to have a carefully crafted prefix and suffix excluding the other string from appearing anywhere else in the line.
Michael E
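
If the "but not both" reading is what is wanted, one possible sketch stays with pipes and adds a grep -v stage instead of crafting prefixes and suffixes. It assumes str2 and str3 cannot overlap each other inside a line, and the sample data is made up.

```shell
# Hypothetical sample data, written to a temporary file.
sample=$(mktemp)
printf '%s\n' \
  'str1 str2' \
  'str3 str1' \
  'str1 str2 str3' \
  'str2 str3' \
  'str1 alone' > "$sample"

# Keep lines with str1, then lines with str2 or str3, then drop
# lines that contain both str2 and str3 (in either order).
# Assumption: str2 and str3 do not overlap within a line.
exactly_one=$(grep str1 "$sample" \
  | grep -E 'str2|str3' \
  | grep -v -E 'str2.*str3|str3.*str2')

printf '%s\n' "$exactly_one"
rm -f "$sample"
```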
@Michael E: the 'but not both' bit is in italics and has a question mark after it because when I edited the question, I wasn't sure what the asker was after. @Groundhog wrote his (good) answer before I refined/revised the question with a comment/amendment that should, perhaps, be removed.
Jonathan Leffler
OK, my comment is therefore no longer relevant. I'll leave it up though so that yours isn't out of context. I hadn't looked at the question's edit history.
Michael E
@groundhog: did you consider `fgrep` at all?
Jonathan Leffler
@jonathan - I believe fgrep only works with a fixed parameter string, not a regular expression.
groundhog
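
For what it's worth, fixed-string matching can still cover the "any" case, because grep -F (the POSIX spelling of fgrep) accepts several literal strings via repeated -e options. A minimal sketch, with made-up sample data and an extra pattern 'a.c' chosen only to show that the dot is taken literally:

```shell
# Hypothetical sample data, written to a temporary file.
sample=$(mktemp)
printf '%s\n' \
  'has str1' \
  'has a.c literally' \
  'has abc' \
  'none' > "$sample"

# -F treats every pattern as a literal string; each -e adds another one,
# and a line is printed if it contains any of them.
any_fixed=$(grep -F -e 'str1' -e 'a.c' "$sample")

printf '%s\n' "$any_fixed"
rm -f "$sample"
```

The line 'has abc' is not selected, since with -F the '.' in 'a.c' does not act as a wildcard.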