Let's say I have a subroutine/method that a user can call to test some data that (as an example) might look like this:

sub test_output {
    my ($self, $test) = @_;
    my $output = $self->long_process_to_get_data();
    if ($output =~ /\Q$test/) {
        $self->assert_something();
    }
    else {
        $self->do_something_else();
    }
}

Normally, $test is a string, which we look for anywhere in the output. The interface was put together this way to make calling it very easy. However, we've found that a plain string is sometimes problematic, for example when the output contains a large, possibly varying number of spaces: a pattern, if you will. So I'd like to let callers pass in a regex as an option. I could just do:

$output =~ $test

if I could assume that it's always a regex, but then there's backwards compatibility: if they pass in a string, it still needs to be tested as a raw string.

So in that case, I'll need to test to see if $test is a regex. Is there any good facility for detecting whether or not a scalar has a compiled regex in it?

+16  A: 

As hobbs points out, if you're sure that you'll be on 5.10 or later, you can use the built-in check:

 use 5.010;
 use re qw(is_regexp);
 if (is_regexp($pattern)) {
     say "It's a regex";
 } else {
     say "Not a regex";
 }

However, I don't always have that option. In general, I do this by checking against a prototype value with ref:

 if( ref $scalar eq ref qr// ) { ... }
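Applied to the question's case, the prototype comparison might look like the sketch below. `matches_output` is a hypothetical stand-alone stand-in for the `test_output` method, not code from the original post; it assumes a plain string should still be matched literally, as `\Q` does in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $output matches $test, 0 otherwise.
# $test may be a plain string (matched literally) or a qr// object.
sub matches_output {
    my ($output, $test) = @_;

    # Compare against a prototype value instead of hard-coding 'Regexp'.
    my $pattern = ref($test) eq ref(qr//)
        ? $test              # already a compiled regex: use it as is
        : qr/\Q$test\E/;     # plain string: quote any metacharacters

    return $output =~ $pattern ? 1 : 0;
}

print matches_output("a   b", qr/a\s+b/) ? "match\n" : "no match\n";
print matches_output("a.b",   "a.b")     ? "match\n" : "no match\n";
print matches_output("axb",   "a.b")     ? "match\n" : "no match\n";
```

The last call shows that existing callers keep the old behaviour: the string "a.b" is quoted, so the dot does not act as a wildcard.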

One of the reasons I started doing it this way was that I could never remember the type name for a regex reference. I can't even remember it now. It's not uppercase like the rest of them, either, because it's really one of the packages implemented in the perl source code (in regcomp.c if you care to see it).

If you have to do that a lot, you can make that prototype value a constant using your favorite constant creator:

 use constant REGEX_TYPE => ref qr//;

I talk about this at length in Effective Perl Programming as "Item 59: Compare values to prototypes".

If you want to try it both ways, you can use a version check on perl:

 if( $] < 5.010 ) { warn "upgrade now!\n"; ... do it my way ... }
 else             { ... use is_regexp ... }
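One way to fill in that version switch is to choose a checker once and call it through a code reference. This is only a sketch: `$is_regex` is a hypothetical name, and it assumes `re::is_regexp` is available whenever the running perl is 5.10 or later, falling back to the prototype comparison otherwise.

```perl
use strict;
use warnings;

# Pick a checker once at startup. On perl 5.10 and later use the
# built-in re::is_regexp; otherwise fall back to comparing ref types.
my $is_regex = ( $] >= 5.010 && defined &re::is_regexp )
    ? \&re::is_regexp
    : sub { ref( $_[0] ) eq ref( qr// ) };

for my $candidate ( qr/\d+/, '\d+' ) {
    print $is_regex->($candidate) ? "regex\n" : "plain string\n";
}
```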
brian d foy
Cute. Except for having to remember the messed up capitalization of `Regexp` (compared to all the other values `ref` returns), is there any other reason to prefer this over `ref $scalar eq 'Regexp'`?
Sinan Ünür
I hate magic constants and hard-coded strings and I try to get rid of them whenever I can. They are generally poor programming practice.
brian d foy
Anyone know why it's not uc like the other types?
Eric Strom
It's not a core Perl data type. That is, there's no symbol table slot for a regex. You also can't have named regexes like you can with scalars, arrays, hashes, subroutines, filehandles, etc.
brian d foy
@brian - I like that approach...Do you also do ref [] and ref {}? or those are too basic to rate "magic constant" in your view?
DVK
The capitalization isn't that strange. Just imagine it was blessed in a package called `Regexp`.
mobrule
@brian - re "not a core data type" - do you know whether that will change in Perl6? Thanks!
DVK
@DVK: I use this for all reference type checking, even for hashes and arrays.
brian d foy
@brian - Is there any possible performance concern (if the ref[] is in tight loop)? Or it gets optimized to a constant by perl?
DVK
You don't have to ask me all these questions. Try it yourself. :)
brian d foy
Note that since Perl doesn't have an anonymous scalar constructor you'll have to create a named variable to use this pattern for that case. e.g. `ref \ do { my $x }`. *Don't* take a reference to an existing variable, as you'll get REF instead of SCALAR if it already holds a reference to something else.
Michael Carman
@Michael: you don't need to take a reference to a scalar variable, just a reference to a scalar, like \''
brian d foy
Regex objects actually get slightly more "core" in 5.12.0, as they're now references to scalars of type REGEXP rather than references to scalars with magic. This is, however, completely invisible to user code, unless you manage to bypass overloaded stringification, in which case you'll notice that regexes now print as `Regexp=REGEXP(0x1234567)` instead of `Regexp=SCALAR(0x1234567)` :)
hobbs
+3  A: 

See the ref built-in.

eugene y
+7  A: 

As of perl 5.10.0 there's a direct, non-tricky way to do this:

use 5.010;
use re qw(is_regexp);
if (is_regexp($pattern)) {
    say "It's a regex";
} else {
    say "Not a regex";
}

is_regexp uses the same internal test that perl uses, which means that unlike ref, it won't be fooled if, for some strange reason, you decide to bless a regex object into a class other than Regexp (yes, that's possible).
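A small sketch of that edge case, assuming perl 5.10 or later; `My::Pattern` is a hypothetical class name used only for illustration:

```perl
use 5.010;
use strict;
use warnings;
use re qw(is_regexp);

my $pattern = qr/\s+/;

# Rebless the regex object into some other (hypothetical) class.
bless $pattern, 'My::Pattern';

# The ref-based check now compares 'My::Pattern' with 'Regexp' and fails.
my $by_ref = ref($pattern) eq ref(qr//) ? 1 : 0;

# is_regexp looks at the underlying value, not the package name.
my $by_is_regexp = is_regexp($pattern) ? 1 : 0;

say "ref says:       $by_ref";
say "is_regexp says: $by_is_regexp";
```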

In the future (or right now, if you can ship code with a 5.10.0 requirement) this should be considered the standard answer to the problem. Not only because it avoids a tricky edge-case, but also because it has the advantage of saying exactly what it means. Expressive code is a good thing.

hobbs
Very very sweet. Was hoping for something like this. Will probably use this once we can finally move past 5.8. >_>
Robert P