I have the following Perl script counting the number of Fs and Ts in a string:

my $str = "GGGFFEEIIEETTGGG";
my $ft_count = 0;
$ft_count++ while($str =~ m/[FT]/g);
print "$ft_count\n";

Is there a more concise way to get the count (in other words, to combine lines 2 and 3)?

+15  A: 
Sinan Ünür
this should be `my $ft_count = $str =~ tr/FT/FT/;`, otherwise it will remove the characters from `$str`
Hasturkun
ok, missed that. you can still remove the brackets, though.
Hasturkun
It counts the brackets. Must be `tr/FT//`
M42
Of course `tr///` isn't a regex, so *technically* it doesn't answer the specific question :-) It's much better than using a regexp, though.
ishnid
Your benchmark is letting the "m" case get away with just finding the first match, since the regex match is in scalar context. If I fix that line to "my $cnt = () = $y =~ m/[FT]/g;", "tr" ends up around 3000% better than "m" (on my Linux box). Incidentally, the original code is about twice as fast as "m".
aschepler
@Sinan +1 for suggesting `tr///`. I think your benchmark has a bug. In order to count matches with a regex, you need an intervening list context: `my $cnt = ()= $y =~ m/[FT]/g;`. When you run it that way, `tr///` is much faster than `m//`. I'm also on v5.10 under ActivePerl.
FM
@FM Geez. I could have sworn I had typed everything correctly. Thank you also @aschepler. My sanity has been restored.
Sinan Ünür
@Sinan Ünür This is why it is a good idea to have a test section before the benchmark. I normally stuff the lambdas to be tested into a hash, iterate over the hash printing each return value, and then run the benchmark. If any of the values differ, then I know I have a bad benchmark.
Chas. Owens
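For readers following the comments above, here is a minimal sketch of the "test section before the benchmark" idea, applied to the three counting approaches mentioned in this thread. It is not the benchmark that was originally posted; the hash keys and variable names are made up for illustration. It assumes the core `Benchmark` module and uses `tr/FT//`, which counts the characters without changing `$str`.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $str = "GGGFFEEIIEETTGGG";

# Candidate implementations, keyed by an arbitrary (hypothetical) label.
my %candidates = (
    # tr with an empty replacement list counts matches and leaves $str alone
    tr_count   => sub { my $n = ($str =~ tr/FT//); return $n },
    # "() =" puts the match in list context; the scalar assignment counts it
    list_match => sub { my $n = () = $str =~ m/[FT]/g; return $n },
    # the loop from the question
    while_loop => sub { my $n = 0; $n++ while $str =~ m/[FT]/g; return $n },
);

# Sanity check first: every candidate must report the same count.
my %results = map { $_ => $candidates{$_}->() } sort keys %candidates;
print "$_: $results{$_}\n" for sort keys %results;

my %distinct = map { $_ => 1 } values %results;
die "candidates disagree, benchmark would be meaningless\n"
    if keys(%distinct) != 1;

# Only now compare speed; -1 means "run each for at least 1 CPU second".
cmpthese(-1, \%candidates);
```

If one of the subs accidentally ran the match in scalar context (the bug discussed above), its printed value would be 1 instead of 4 and the script would die before any timing is reported.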
+8  A: 

Yes, you can use the CountOf secret operator:

my $ft_count = ()= $str =~ m/[FT]/g;
Chas. Owens
Also known as the goatse operator ;) `=()=`
Daenyth
A: 

You can combine lines 2, 3 and 4 into one like so:

my $str = "GGGFFEEIIEETTGGG";
print $str =~ s/[FT]//g; # prints 4, but also deletes every F and T from $str
Mike
Being a comment on another answer, this would be better as a comment than an answer :)
ysth
@ysth, thanks for the comment. I didn't realize my answer is actually a comment on another answer, is it? The OP asks "Is there a more concise way to get the count (in other words, to combine line 2 and 3)?" and here's my answer to the question. Had someone already mentioned what I suggested?
Mike
@ysth, the original post is a possible duplicate of this [http://stackoverflow.com/questions/1849329/is-there-a-perl-shortcut-to-count-the-number-of-matches-in-a-string/1850686#1850686] and I posted a similar solution to that question. I think this post can be combined with that one.
Mike
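A side note on the substitution approach: `s///g` returns the number of substitutions it made, but it modifies the string it is applied to. If the original string is still needed, one option is to run the substitution on a copy. This is only a sketch; `$copy` is a name introduced here for illustration.

```perl
use strict;
use warnings;

my $str = "GGGFFEEIIEETTGGG";

# Assign to a copy first, then substitute on the copy, so $str is untouched.
# s///g returns the number of substitutions performed.
my $ft_count = (my $copy = $str) =~ s/[FT]//g;

print "$ft_count\n";   # number of substitutions performed
print "$str\n";        # original string, unchanged
print "$copy\n";       # copy with every F and T removed
```

When nothing matches, `s///g` returns a false value rather than a printed `0`, so a script that displays the count may want to handle that case explicitly.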
+8  A: 

When the "m" operator has the /g flag AND is executed in list context, it returns a list of matching substrings. So another way to do this would be:

my @ft_matches = $str =~ m/[FT]/g;
my $ft_count = @ft_matches; # count elements of array

But that's still two lines. Another, weirder trick makes it shorter:

my $ft_count = () = $str =~ m/[FT]/g;

The "() =" forces the "m" to be in list context. Assigning a list with N elements to a list of zero variables doesn't actually do anything. But then when this assignment expression is used in a scalar context ($ft_count = ...), the right "=" operator returns the number of elements from its right-hand side - exactly what you want.

This is incredibly weird when first encountered, but the "=()=" idiom is a useful Perl trick to know, for "evaluate in list context, then get size of list".

Note: I have no data on which of these is more efficient when dealing with large strings. In fact, I suspect your original code might be best in that case.
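
To make the context rules above concrete, here is a small sketch that runs the same match in scalar context, in list context, and through a list assignment that is itself evaluated in scalar context. The variable names other than `$str` and `$ft_count` are invented for this illustration.

```perl
use strict;
use warnings;

my $str = "GGGFFEEIIEETTGGG";

# Scalar context: m//g finds one match per call and returns true or false.
my $scalar_result = $str =~ m/[FT]/g;
pos($str) = undef;   # reset the match position left behind by the scalar //g

# List context: m//g returns every match at once.
my @matches = $str =~ m/[FT]/g;

# A list assignment evaluated in scalar context yields the number of
# elements on its right-hand side, even with no variables on the left.
my $ft_count = () = $str =~ m/[FT]/g;

# The same rule with one variable on the left: $first receives the first
# match, but the count still reflects the whole right-hand side.
my $count_with_var = (my ($first) = $str =~ m/[FT]/g);

print "scalar context: $scalar_result\n";
print "matches: @matches\n";
print "count: $ft_count\n";
print "first: $first, count: $count_with_var\n";
```

The last case shows that the count comes from the right-hand side of the list assignment, not from how many variables appear on the left.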

aschepler