I want to use regular expressions (Perl compatible) to be able to find a pattern surrounded by two other patterns, but not include the strings matching the surrounding patterns in the match.

For example, I want to be able to find occurrences of strings like:

Foo Bar Baz

But only have the match include the middle part:

Bar

I know this is possible, but I can't remember how to do it.

+7  A: 

Parentheses define the groupings.

"Foo (Bar) Baz"

Example

~> cat test.pl
$a = "The Foo Bar Baz was lass";

$a =~ m/Foo (Bar) Baz/;

print $1,"\n";
~> perl test.pl
Bar
Vinko Vrsalovic
Wow, you actually gave an entire script. I have to vote this one up.
Cervo
+4  A: 

Use lookaround:

(?<=Foo\s)Bar(?=\sBaz)

This matches any "Bar" that is immediately preceded by "Foo" and followed by "Baz", each separated from it by a single whitespace character. "Foo" and "Baz" are not part of the final match.
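As a rough illustration of the above, here is a minimal sketch that reports what the lookaround pattern actually matched and where. The sample string and the helper name match_between are made up for this example; the offsets come from Perl's built-in @- and @+ arrays.

```perl
use strict;
use warnings;

# Hypothetical sample text: only the middle "Bar" sits between "Foo " and " Baz".
my $text = "Bar Foo Bar Baz Bar";

# Returns (matched text, start offset) on success, or an empty list on failure.
# match_between is an illustrative helper, not part of any library.
sub match_between {
    my ($str) = @_;
    if ($str =~ /(?<=Foo\s)Bar(?=\sBaz)/) {
        # $-[0] and $+[0] hold the start and end offsets of the whole match.
        return (substr($str, $-[0], $+[0] - $-[0]), $-[0]);
    }
    return;
}

my ($matched, $offset) = match_between($text);
print "matched '$matched' at offset $offset\n";   # matched 'Bar' at offset 8
```

The whole match is just "Bar"; the first and last "Bar" in the sample are skipped because the assertions around them fail.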

Tomalak
Cervo
Oops, I did not think of that. Pattern adapted.
Tomalak
Your regex returns the spaces before and after Bar. (?<=Foo\s)Bar(?=\sBaz) should do the job if there is only one space before and after Bar.
madgnome
Done already. Clicked "post" too early, you saw an intermediate version. ;-)
Tomalak
+2  A: 

$string =~ m/Foo (Bar) Baz/

$1

This may not be exactly what you want, as the overall match is still "Foo Bar Baz", but it shows how to extract just the part you are interested in via $1. Otherwise, you can use lookahead and lookbehind to get the match without consuming the surrounding characters.

Cervo
+3  A: 

In the general case, you probably can't. The simplest approach is to match everything and use a capturing group to extract the portion of interest:

Foo\s+(Bar)\s+Baz

This isn't the same as not including the surrounding text in the match though. That probably doesn't matter if all you want to do is extract "Bar" but would matter if you're matching against the same string multiple times and need to continue from where the previous match left off.
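To make that difference concrete, the following sketch compares where a /g match would resume after each style of pattern. The helper name resume_pos and the sample string are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Runs one scalar-context /g match of $re against a private copy of $str and
# returns pos(), the offset where the next /g match would start (undef if no match).
# resume_pos is an illustrative helper, not part of any library.
sub resume_pos {
    my ($str, $re) = @_;
    return $str =~ /$re/g ? pos($str) : undef;
}

my $text = "Foo Bar Baz";

# Capturing group: the whole "Foo Bar Baz" is consumed by the match.
my $after_capture = resume_pos($text, qr/Foo\s+(Bar)\s+Baz/);

# Look-around: only "Bar" is consumed; "Foo " and " Baz" are merely asserted.
my $after_lookaround = resume_pos($text, qr/(?<=Foo\s)Bar(?=\sBaz)/);

print "capture group resumes at $after_capture\n";     # 11
print "look-around resumes at $after_lookaround\n";    # 7
```

With the capturing group, a following match can no longer see " Baz"; with look-around, it still can.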

Look-around will work in some cases. Tomalak's suggestion:

(?<=Foo\s)Bar(?=\sBaz)

only works for fixed-width look-behind (at least in Perl). As of Perl 5.10, the \K assertion can be used to effectively provide variable-width look-behind:

Foo\s+\KBar(?=\s+Baz)

which should be capable of doing what you asked for in all cases, but requires Perl 5.10 or later.

While it would be convenient, there's no equivalent of \K for ending the matched text, so you have to use a look-ahead.
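One place where this shows up in practice is substitution. The sketch below, assuming Perl 5.10 or later, replaces only the middle word while leaving any amount of surrounding whitespace untouched; replace_middle and the sample strings are hypothetical names chosen for the example.

```perl
use strict;
use warnings;
use 5.010;   # \K is only available from Perl 5.10 onward

# Replaces the "Bar" that sits between "Foo" and "Baz", however much
# whitespace separates them. replace_middle is an illustrative helper.
sub replace_middle {
    my ($str, $replacement) = @_;
    # \K drops "Foo\s+" from the matched text; the look-ahead keeps "\s+Baz" out of it.
    (my $copy = $str) =~ s/Foo\s+\KBar(?=\s+Baz)/$replacement/g;
    return $copy;
}

print replace_middle("Foo   Bar  Baz", "Qux"), "\n";   # Foo   Qux  Baz
```

Because only "Bar" is part of the matched text, nothing has to be captured and written back in the replacement.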

Michael Carman
Excellent. I'd give +2 if I could :)
Vinko Vrsalovic
I'd also give it +2, because of the references to the new features in Perl 5.10.
Brad Gilbert