I need to find occurrences of "(+)" in my SQL scripts (i.e., Oracle outer join expressions). Realizing that "+", "(", and ")" are all special regex characters, I tried:

grep "\(\+\)" *

Now this does return occurrences of "(+)", but other lines as well. (Seemingly anything with open and close parens on the same line.) Recalling that parens are only special for extended grep, I tried:

grep "(\+)" *
grep "(\\+)" *

Both of these returned only lines that contain "()". So assuming that "+" can't be escaped, I tried an old trick:

grep "([+])" *

That works. I cross-checked the result with a non-regex tool.

Question: Can someone explain what exactly is going on with the "+" character? Is there a less kludgy way to match on "(+)"?

(I am using the cygwin grep command.)

EDIT: Thanks for the solutions. -- And now I see that, per the GNU grep manual that Bruno referenced, "\+" when used in a basic expression gives "+" its extended meaning, and therefore matches one-or-more "("s followed by a ")". And in my files that's always "()".
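On the "less kludgy" part of the question, one option that was not raised in the thread is to skip regex interpretation altogether with grep's -F (fixed strings) option. The following is only a sketch: the sample lines and the variable names sample and fixed are made up for illustration.

```shell
# Hypothetical sample input: one line with "(+)", one with "()", one with a bare "+".
sample=$(printf '%s\n' 'a.id = b.id(+)' 'select count() from t' 'x + y')

# -F treats the pattern as a literal string, so "(", "+" and ")" need no escaping.
fixed=$(printf '%s\n' "$sample" | grep -F '(+)')

echo "$fixed"
```

Against real files the same idea would be grep -F "(+)" *, which behaves the same whether grep is in basic or extended mode.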

+1  A: 

You probably need to add some backslashes because the shell is swallowing them.

ETA: Actually, I just tried on my Cygwin and grep "(+)" seems to work just fine for what you want.

KernelM
I just tried grep -E "\(\+\)" *, and that works. So it does not seem to be a shell issue. Could cygwin non-extended regex be broken??
Chris Noe
Too bad only one answer can be accepted. Bruno walked me through it, but you were correct as well. I had forgotten that "+" is also only special for extended regex.
Chris Noe
No, those characters should be fine inside a double-quoted string.
Andrew Medico
+9  A: 

GNU grep (which is included in Cygwin) supports two syntaxes for regular expressions: basic and extended. grep uses basic regular expressions, while egrep and grep -E use extended regular expressions. The basic difference, from the grep manual, is the following:

In basic regular expressions the metacharacters ?, +, {, |, (, and ) lose their special meaning; instead use the backslashed versions \?, \+, \{, \|, \(, and \).

Since you want the ordinary meaning of the characters ( + ), either of the following two forms should work for your purpose:

grep "(+)" *       # Basic
egrep "\(\+\)" *   # Extended
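To see the difference between the two syntaxes side by side, here is a small sketch run against made-up input. The sample lines and the variable names sample, basic, gnu_plus and extended are assumptions for illustration, and the \+ case relies on GNU grep's behavior as described in its manual, so other grep implementations may differ.

```shell
# Hypothetical sample input: one line with "(+)", one with "()", one with a bare "+".
sample=$(printf '%s\n' 'a.id = b.id(+)' 'select count() from t' 'x + y')

# Basic syntax: "(", "+" and ")" are all literal, so this finds the "(+)" line.
basic=$(printf '%s\n' "$sample" | grep '(+)')

# Basic syntax with a backslashed "+": GNU grep gives "+" its extended meaning,
# so the pattern reads as one or more "(" followed by ")".
gnu_plus=$(printf '%s\n' "$sample" | grep '(\+)')

# Extended syntax: all three characters are special and must be escaped.
extended=$(printf '%s\n' "$sample" | grep -E '\(\+\)')

echo "basic:    $basic"
echo "gnu_plus: $gnu_plus"
echo "extended: $extended"
```

Single quotes are used here so that the shell passes the backslashes to grep untouched.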
Bruno De Fraine
I just discovered that your second example works, not the first. Hmm. Is cygwin grep busted??
Chris Noe
What is the output of grep --version? For me, it works for "grep (GNU grep) 2.5.1".
Bruno De Fraine
Hmm, same here: grep (GNU grep) 2.5.1
Chris Noe
And the following does not work? echo -e "(+)\n()\n+" | grep "(+)"
Bruno De Fraine
Ah, that is working! (I didn't realize you dropped the \ in front of +) So duh, "+" is also an extended regex. I was stubbornly thinking it was a basic. THANKS.
Chris Noe