I have trouble using Perl's grep() with a string that may contain characters that are interpreted as regular expression quantifiers.

I get the following error when the grep pattern is "g++", because the '+' symbols are interpreted as quantifiers. Here is the output of the program that follows:

1..3
ok 1 - grep, pattern not found
ok 2 - grep, pattern found

Nested quantifiers in regex; marked by <-- HERE
in m/g++ <-- HERE / at escape_regexp_quantifier.pl line 8.

Is there a modifier I could use to tell grep to ignore the quantifiers, or is there a function that would escape them?

#! /usr/bin/perl 

sub test_grep($)
{
    my $filter = shift;
    my @output = ("-r-xr-xr-x   3 root     bin       122260 Jan 23  2005 gcc",
                  "-r-xr-xr-x   4 root     bin       124844 Jan 23  2005 g++");
    return grep (!/$filter/, @output);
}

use Test::Simple tests => 3;

ok(test_grep("foo"), "grep, pattern not found");
ok(test_grep("gcc"), "grep, pattern found");
ok(test_grep("g++"), "grep, pattern found");

PS: in addition to answers to the question above, I welcome any feedback on my Perl usage, as I'm still learning. Thanks.

+23  A: 

The standard way is to use the \Q escape indicator before your variable, to tell Perl not to parse the contents as a regular expression:

return grep (!/\Q$filter/, @output);

Altering that line in your code yields:

1..3
ok 1 - grep, pattern not found
ok 2 - grep, pattern found
ok 3 - grep, pattern found
Adam Bellaire
Note that \Q ends at \E, so you should probably ensure that any \E in the contents is escaped.
Hasturkun
You don't need to - \E in the *interpolated* pattern causes no problems.
hexten
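
The point in the comments about \E can be checked directly. The following is a minimal sketch, assuming a made-up filter value that itself contains a backslash followed by E; the variable names are only for illustration.

```perl
use strict;
use warnings;

# Hypothetical filter whose *contents* include the two characters '\' and 'E'.
my $filter = 'a\Eb+';    # single quotes: literal backslash, E, b, plus

# \Q quotes the interpolated value as a whole, so the embedded \E is escaped
# like any other character instead of ending the quoted span.
my $literal_hit = ( 'xa\Eb+y' =~ /\Q$filter/ ) ? 1 : 0;
my $regex_hit   = ( 'xabbby'  =~ /\Q$filter/ ) ? 1 : 0;

print "literal text matched: $literal_hit\n";    # 1
print "regex-style match:    $regex_hit\n";      # 0
```

Only the string containing the exact characters of the filter matches, which suggests no extra escaping of \E is needed when the value comes from a variable.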
+14  A: 

I think you are looking for quotemeta
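
A minimal sketch of how quotemeta might be used here, assuming a shortened, made-up @output list rather than the full listing from the question. quotemeta is the function form of the \Q escape: it backslash-escapes every non-word character in its argument.

```perl
use strict;
use warnings;

# Shortened stand-in for the directory listing in the question.
my @output = ( 'gcc', 'g++', 'gpp' );

# quotemeta returns its argument with non-word characters escaped.
my $escaped = quotemeta('g++');    # g\+\+

# The escaped string can now be interpolated into a pattern safely.
my @hits = grep { /$escaped/ } @output;

print "escaped: $escaped\n";
print "hits:    @hits\n";
```

The function form is handy when the escaped string has to be built once and reused, or combined with other pattern fragments.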

Leon Timmermans
Yes, that was the term I missed, thanks.
philippe
+8  A: 

in addition to the answer question above, I welcome any feedback on Perl usage in the above as I'm still learning. Thanks

I would advise you not to use prototypes (the ($) after test_grep). They have their uses, but not in most cases and definitely not in this one.
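
One way a ($) prototype can surprise you is that it forces its argument into scalar context. The sketch below uses two hypothetical subs, with_proto and without_proto, that differ only in the prototype.

```perl
use strict;
use warnings;

# With a ($) prototype the argument is evaluated in scalar context.
sub with_proto($) { my $arg = shift; return $arg }

# Without a prototype the arguments arrive as a plain list in @_.
sub without_proto { my $arg = shift; return $arg }

my @filters = ( 'gcc', 'g++', 'foo' );

my $from_proto = with_proto(@filters);       # array in scalar context: 3
my $from_plain = without_proto(@filters);    # first element: 'gcc'

print "with prototype:    $from_proto\n";
print "without prototype: $from_plain\n";
```

The prototyped version silently receives the number of elements instead of the first filter, which is rarely what the caller meant.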

Leon Timmermans
can you elaborate on this? I've never seen that advice before.
Alnitak
See http://stackoverflow.com/questions/297034/why-are-perl-function-prototypes-bad
cjm
+1  A: 

PS: in addition to the answer question above, I welcome any feedback on Perl usage in the above as I'm still learning.

The best general advice I can give for Perl coding is to install Perl::Critic and run the perlcritic command on your code. If you can't do that, you can use the on-line Perl Critic tool. It helps to have a copy of Perl Best Practices handy, since Perl::Critic has already read the book and will give you references to page numbers. Even if you don't have the book around, you can still find extended feedback in the Perl::Critic documentation, in the sections whose names start with Perl::Critic::Policy::.
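
Besides the perlcritic command, the module can be called from Perl. This is a rough sketch, assuming Perl::Critic is installed from CPAN (it is not a core module) and using a one-line stand-in for the code from the question; the script only prints a notice if the module is missing.

```perl
use strict;
use warnings;

# Perl::Critic is a CPAN module; load it at run time and note whether it worked.
my $have_critic = eval { require Perl::Critic; 1 } ? 1 : 0;

my @violations;
if ($have_critic) {
    # Stand-in for the question's sub, passed as a reference to source text.
    my $source = 'sub test_grep($) { my $filter = shift; return $filter; }' . "\n";

    my $critic = Perl::Critic->new( -severity => 1 );    # 1 reports everything
    @violations = $critic->critique( \$source );

    # Each violation object describes one finding and the policy behind it.
    for my $v (@violations) {
        printf "%s (line %d, %s)\n", $v->description, $v->line_number, $v->policy;
    }
}
else {
    print "Perl::Critic is not installed\n";
}
```

Raising the -severity value toward 5 limits the report to the most serious findings.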

pjf
+2  A: 

I'd suggest using qr to create Regexp objects rather than strings in this case anyway.

ok(test_grep(qr/foo/), "grep, pattern not found");
ok(test_grep(qr/gcc/), "grep, pattern found");
ok(test_grep(qr/g\+\+/), "grep, pattern found");

Then you don't need the \Q escape. Though you can still use it:

ok(test_grep(qr/\Qg++/), "grep, pattern found");
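
When the filter arrives as a string at run time, qr and \Q can be combined. A minimal sketch, assuming a hypothetical helper lines_without and a shortened @output list:

```perl
use strict;
use warnings;

# Accepts either a pattern string or a compiled qr// object as the filter.
sub lines_without {
    my ( $filter, @lines ) = @_;
    return grep { !/$filter/ } @lines;
}

my @output = ( 'gcc', 'g++' );
my $name   = 'g++';    # string from elsewhere, for example user input

# Quote the string literally, then add a real regex anchor after \E.
my $re   = qr/\Q$name\E$/;
my @kept = lines_without( $re, @output );

print "kept: @kept\n";
```

Everything between \Q and \E is matched literally, while the trailing $ still acts as an end-of-line anchor, so literal text and regex syntax can be mixed in one compiled pattern.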
Tanktalus