tags:

views:

95

answers:

5

I think this question is repeated but searching wasn't helpful for me.

my $pattern = "javascript:window.open\('([^']+)'\);";
$mech->content =~ m/($pattern)/;
print $1;

I want to have an external $pattern in the RegEx. How can I do this? The current one returns:

Use of uninitialized value $1 in print at main.pm line 20.
+1  A: 

You do not need the parentheses in the match pattern. It will match the whole pattern and return that as $1, which I am guess is not matching, but I am only guessing.

$mech->content =~ m/$pattern/;

or

$mech->content =~ m/(?:$pattern)/;

These are the clustering, non-capturing parentheses.

The way you are doing it is correct.

kuroutadori
I removed the brackets still it returns the same error. TO be sure that RegEx is correct I copy pasted it as `m/javascript:window.open\('([^']+)'\);/;` and it returns the required Result. Another intresting aspect is that when I try `[0-9]+` in `$pattern`. It returns a correct result.
Shubham
A: 

It reports an uninitialized value because $1 is undefined. $1 is undefined because you have created a nested matching group by wrapping a second set of parentheses around the pattern. It will also be undefined if nothing matches your pattern.

Tore A.
+2  A: 

$1 was empty, so the match did not succeed. I'll make up a constant string in my example of which I know that it will match the pattern.

Declare your regular expression with qr, not as a simple string. Also, you're capturing twice, once in $pattern for the open call's parentheses, once in the m operator for the whole thing, therefore you get two results. Instead of $1, $2 etc. I prefer to assign the results to an array.

my $pattern = qr"javascript:window.open\('([^']+)'\);";
my $content = "javascript:window.open('something');";
my @results = $content =~ m/($pattern)/;
# expression return array
# (
#     q{javascript:window.open('something');'},
#     'something'
# )
daxim
That works, but if you just want a list of the inner matches (argument to 'open'), you probably want to use this in the third line: my @results = $content =~ m/$pattern/g;Note that I removed the parentheses and added the g flag.
Tore A.
+1  A: 

The solutions have been already given, I'd like to point out that the window.open call might have multiple parameters included in "" and grouped by comma like:

javascript:window.open("http://www.javascript-coder.com","mywindow","status=1,toolbar=1");

There might be spaces between the function name and parentheses, so I'd use a slighty different regex for that:

my $pattern = qr{
    javascript:window.open\s*
    \(
    ([^)]+)
    \)
}x;

print $1 if $text =~ /$pattern/;

Now you have all parameters in $1 and can process them afterwards with split /,/, $stuff and so on.

rubber boots
That `split` will also split on the comma in `status=1,toolbar=1` which is wrong. A better tool like [`Data::Record`](http://p3rl.org/Data::Record) is neccesary.
daxim
+2  A: 

When I compile that string into a regex, like so:

my $pattern = "javascript:window.open\('([^']+)'\);";
my $regex   = qr/$pattern/;

I get just what I think I should get, following regex:

(?-xism:javascript:window.open('([^']+)');)/

Notice that it it is looking for a capture group and not an open paren at the end of 'open'. And in that capture group, the first thing it expects is a single quote. So it will match

javascript:window.open'fum';

but not

javascript:window.open('fum');

One thing you have to learn, is that in Perl, "\(" is the same thing as "(" you're just telling Perl that you want a literal '(' in the string. In order to get lasting escapes, you need to double them.

my $pattern = "javascript:window.open\\('([^']+)'\\);";
my $regex   = qr/$pattern/;

Actually preserves the literal ( and yields:

(?-xism:javascript:window.open\('([^']+)'\);)

Which is what I think you want.

As for your question, you should always test the results of a match before using it.

if ( $mech->content =~ m/($pattern)/ ) { 
     print $1;
}

makes much more sense. And if you want to see it regardless, then it's already implicit in that idea that it might not have a value. i.e., you might not have matched anything. In that case it's best to put alternatives

$mech->content =~ m/($pattern)/;
print $1 || 'UNDEF!';

However, I prefer to grab my captures in the same statement, like so:

my ( $open_arg ) = $mech->content =~ m/($pattern)/;
print $open_arg || 'UNDEF!';

The parens around $open_arg puts the match into a "list context" and returns the captures in a list. Here I'm only expecting one value, so that's all I'm providing for.

Finally, one of the root causes of your problems is that you do not need to specify your expression in a string in order for your regex to be "portable". You can get perl to pre-compile your expression. That way, you only care what instructions the characters are to a regex and not whether or not you'll save your escapes until it is compiled into an expression.

A compiled regex will interpolate itself into other regexes properly. Thus, you get a portable expression that interpolates just as well as a string--and specifically correctly handles instructions that could be lost in a string.

my $pattern = qr/javascript:window.open\('([^']+)'\);/;

Is all that you need. Then you can use it, just as you did. Although, putting parens around the whole thing, would return the whole matched expression (and not just what's between the quotes).

Axeman
Declaring a regexp can be done in one step without the literal string intermediate form (here named `$pattern`).
daxim
@daxim, I knew that. I'm just illustrating why what he wrote isn't working. I just included the `qr/` to demonstrate how I was compiling the statement, to examine what Perl took the string to represent.
Axeman
I was writing that comment for the benefit of future readers, not for you.
daxim