If I am given a regexp in Perl, can I find out how many capturing brackets there are? So, for example:

\w       -> 0
(\w)     -> 1
(\w(\w)) -> 2
+4  A: 

Do you want to know how many matches there are or how many sets of brackets? If you want to be able to count the brackets then you might want to take a look at a module like Text::Balanced which parses delimited text.

On the other hand, if you want to know how many matches there are, you would be better off executing your regular expression in list context:

my @matches = $string_to_match_on =~ /(\w(\w))/;

The size of the list will give you the number of matches:

my $count = @matches;

(as a list or array in scalar context gives the size of the list or array).
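
A minimal sketch of the list-context approach, including the case where the pattern has no capturing brackets at all. The helper name count_list_items is hypothetical and only for illustration; the sample strings are assumptions too.

```perl
use strict;
use warnings;

# count_list_items is a hypothetical helper name, not part of any module.
sub count_list_items {
    my ($string, $re) = @_;
    my @matches = $string =~ $re;    # match in list context
    return scalar @matches;          # array in scalar context gives its size
}

print count_list_items('ab', qr/(\w(\w))/), "\n";   # 2: one item per group
print count_list_items('ab', qr/\w/),       "\n";   # 1: no groups, list is (1)
print count_list_items('!!', qr/(\w)/),     "\n";   # 0: the match failed
```

Note that a successful match without capturing brackets returns the list (1), so a size of 1 does not by itself prove that the pattern has one group.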

Nic Gibson
I am trying to find out the number of captures (possible matches). I am wary of trying to count brackets because of edge cases like "\(" or [(]
justintime
Then I think you probably need a 'proper' parser such as Text::Balanced, or you could throw caution to the wind and write one using Parse::RecDescent. You are right that counting brackets is unlikely to work much of the time. Are you dealing with regular expressions as input to your code?
Nic Gibson
+1  A: 

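There are two special arrays, @- and @+, containing the start and end positions of the groups in the last successful match. Use the array length once a match has succeeded: element 0 describes the whole match, so $#+ is the number of capture groups in the pattern.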
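
A minimal sketch of using @+ to count groups without needing a string that the pattern matches. It assumes the pattern can be safely interpolated into another regex; the helper name count_groups is hypothetical. The leading empty alternative lets the match succeed against the empty string, and $#+ then reports how many groups the whole pattern contains.

```perl
use strict;
use warnings;

# count_groups is a hypothetical helper name, not part of any module.
sub count_groups {
    my ($re) = @_;
    # The empty alternative before "|" always matches the empty string,
    # so the match succeeds whatever $re itself would require.
    '' =~ /|$re/ or die "unexpected: match failed";
    return $#+;    # last index of @+ is the number of capture groups
}

print count_groups(qr/\w/),          "\n";   # 0
print count_groups(qr/(\w)/),        "\n";   # 1
print count_groups(qr/(\w(\w))/),    "\n";   # 2
print count_groups(qr/(?:\w)(\w)/),  "\n";   # 1, (?:...) does not capture
```

Because the regex engine does the counting, escaped brackets and character classes such as "\(" or [(] need no special handling.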

Static analysis: To know all pairs you need to parse the regex string. Count all unescaped opening brackets which have a closing one.

fgm
Unescaped opening brackets WITHOUT a "?:" following them, as (?:text) doesn't capture.
Chris Lutz
Using the @- array gives me the info I want.
justintime
Rather late to accept the answer. When I posted I was new to SO and hadn't worked out accepting posts.
justintime
+1  A: 

It's not really trivial, as not all parentheses are capturing: for example (?:...), (?=...) and so on.

Generally, remember you can always:

my @catch_all = $string =~ m/......................./;

and then just check @catch_all;

depesz
If I do that, how do I tell the difference between a regexp with no captures and a match failing?
justintime
If the match fails, the empty list is always returned. If the match succeeds and there are no capturing parentheses, the list will contain the single number '1'. Depending upon what you're doing, you may also find named captures useful. There's a Perl tip on them at http://perltraining.com.au/tips/2008-02-08.html
pjf
+2  A: 

It is important to know why you need this.

Does YAPE::Regex help?

Edit: Here is a demonstration:

#!/usr/bin/perl

use strict;
use warnings;

use Data::Dumper;
use YAPE::Regex;

my $regex = qr/^(A)(B)(C)[0-9]+(\w+)$/;

my $parser = YAPE::Regex->new($regex);

my $n_captures = 0;    # stays 0 if the regex has no capture groups

while (my $node = $parser->next) {
    if ( $parser->state =~ /^capture\(([0-9]+)\)$/ ) {
        $n_captures = $1;
    }
}

print "$n_captures\n";


C:\Temp> t
4
Sinan Ünür