I'm trying to extract only certain elements of a string using regular expressions and I want to end up with only the captured groups.

For example, I'd like to run something like (is|a) on a string like "This is a test" and be able to return only "is is a". The only way I can partially do it now is if I find the entire beginning and end of the string but don't capture it:

.*?(is|a).*? replaced with $1

However, when I do this, only the characters preceding the final found/captured group are eliminated--everything after the last found group remains.

is is a test.

How can I isolate and replace only the captured strings (so that I end up with "is is a"), in both PHP and Perl?

Thanks!

Edit: I see now that it's better to use m// rather than s///, but how can I apply that to PHP's preg_match? In my real regex I have several captured groups, resulting in $1, $2, $3 etc -- preg_match only deals with one captured group, right?

A: 

Put everything into captures and then replace only the ones you want.

(.*?)(is|a)(.*?)
Jherico
That still only gets me "is is a test"...
Andrew
+5  A: 

If all you want are the matches, then there is no need for the s/// operator. You should use m//. You might want to expand on your explanation a little if the example below does not meet your needs:

#!/usr/bin/perl

use strict;
use warnings;

my $text = 'This is a test';

my @matches = ( $text =~ /(is|a)/g );

print "@matches\n";
__END__

C:\Temp> t.pl
is is a

EDIT: For PHP, you should use preg_match_all and specify an array to hold the match results as shown in the documentation.
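
For the case raised in the question's edit, where the pattern has several capture groups, here is a rough Perl sketch. The sample string, the pattern, and the variable names are made up for illustration and are not from the original question. In list context, m//g returns every capture from every match as one flat list; in scalar context it can be driven from a while loop, so that $1 and $2 stay paired for each match.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical sample input; the pattern below has two capture groups.
my $text = 'width=10, height=20';

# List context with /g: captures from all matches, flattened into one list.
my @flat = ( $text =~ /(\w+)=(\d+)/g );
print "@flat\n";

# Scalar context with /g: one match per iteration, so $1 and $2 belong together.
my @pairs;
while ( $text =~ /(\w+)=(\d+)/g ) {
    push @pairs, "$1:$2";
}
print "@pairs\n";
```

On the PHP side, preg_match_all accepts a PREG_SET_ORDER flag that groups the results per match in a comparable way, which may be more convenient than the default ordering when there are several groups.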

Sinan Ünür
That worked! preg_match_all was the key. Thanks!
Andrew
+1  A: 

You can't replace only captures. s/// always replaces everything included in the match. You need to either capture the additional items and include them in the replacement or use assertions to require things that aren't included in the match.
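
A minimal sketch of those two options, assuming the goal is to uppercase only the "is" inside "This" (that goal and the variable names are chosen here purely for illustration):

```perl
#!/usr/bin/perl

use strict;
use warnings;

my $text = 'This is a test';

# Option 1: capture the surrounding text and put it back in the replacement.
( my $with_captures = $text ) =~ s/(Th)(is)( )/$1\U$2\E$3/;

# Option 2: use lookbehind/lookahead so only "is" is part of the match.
( my $with_lookaround = $text ) =~ s/(?<=Th)is(?= )/IS/;

print "$with_captures\n";
print "$with_lookaround\n";
```

Both substitutions leave the rest of the string alone; the difference is only whether the context is consumed and re-inserted or merely asserted.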

That said, I don't think that's what you're really asking. Is Sinan's answer what you're after?

Michael Carman