I have the following Perl code:

my $progName = shift ;

open(IPLAYERLIST, "iplayer-list.html") or die "Cannot open iplayer index file iplayer-list.html\n" ;
while (<IPLAYERLIST>) {

 if ( /($progName)/is ) {
 #if ( /Just A Minute/is ) { <-- This works!
  my $iplayerID = $1 ;
  print "IPlayer program id for $progName is $iplayerID\n" ;

  #  === do stuff here ===
 }
 else
 {
  print "Failed to match $progName in $_\n";
 }
}

IPLAYERLIST is a BBC IPlayer listing so it is searching for a specific program name.

If I call this with $progName = "Just A Minute", it fails to match, even though the string is in the file. If I call it with a single character, e.g. "M", then it succeeds. If I replace the $progName variable with a constant string ("Just A Minute"), then it succeeds. When it prints $progName it always prints the correct string, so I can't see how the regexp could be getting anything different.

I have cut the code and pasted it into a test script:

#!/usr/bin/perl
use strict ;

my $searchstr = "foo bar Just A Minute baz boo" ;
my $progName = $ARGV[0] ;
print "searching for [$progName] in [$searchstr]\n" ;
if ( $searchstr =~ /$progName/is ) {
    print "Well the test worked\n" ;
} else {
    print "Failed to match [$progName] in [$searchstr]\n";
}

and that works fine. So why does the first example not find "Just A Minute" in a file containing "Just A Minute"?!?!?

--- Alistair.

A: 

There doesn't seem to be anything wrong with your example. It works just fine in my tests.

Can you give us the complete error output you're seeing, as in the "Failed to match X in Y" output?

The only thing I can think of is that $progName isn't set to the right value. Seeing the complete error output would rule that out.
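One way to check the value is to print it with visible delimiters and with any non-printable characters escaped. The sketch below is only an illustration: the helper name visible and the sample value with a stray carriage return are assumptions, not something taken from the original program.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap the string in brackets and show every
# character outside printable ASCII as a \x{..} escape.
sub visible {
    my ($str) = @_;
    $str =~ s/([^\x20-\x7e])/sprintf('\\x{%02x}', ord $1)/ge;
    return "[$str]";
}

# Assumed example value: a program name with a trailing carriage return.
my $progName = "Just A Minute\r";
print visible($progName), "\n";   # prints [Just A Minute\x{0d}]
```

If the bracketed output shows anything other than the plain name, the pattern being matched is not the one you think it is.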

Adam Bellaire
A: 

Your program (the first one) works fine for me.

Note that you have to quote the argument string (because it contains spaces), otherwise you're just looking for a match with "Just". So run it like this...

perl yourprog.pl "Just A Minute"

I ran it with this input file:

Foo
Just A Minute
Bar

Which outputs...

Failed to match Just A Minute in Foo

IPlayer program id for Just A Minute is Just A Minute
Failed to match Just A Minute in Bar

Note the blank lines after the Foo and Bar lines. That's because you are not removing the newlines (with chomp) from the lines read from the file, so "Foo\n" and "Bar\n" keep their trailing "\n", which gets printed in the output. This does not affect the matching.
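As a rough sketch of the same loop with the newline removed first: the helper name report_line and the in-memory input are assumptions made so the example is self-contained, and the messages are copied from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: chomp the line, then build the same messages
# the original loop prints.
sub report_line {
    my ($progName, $line) = @_;
    chomp $line;    # remove the trailing newline, if there is one
    if ( $line =~ /($progName)/i ) {
        return "IPlayer program id for $progName is $1\n";
    }
    return "Failed to match $progName in $line\n";
}

# Assumed input, read from an in-memory filehandle instead of a real file.
my $data = "Foo\nJust A Minute\nBar\n";
open( my $fh, '<', \$data ) or die "Cannot open in-memory data: $!\n";
while ( my $line = <$fh> ) {
    print report_line( "Just A Minute", $line );
}
close $fh;
```

With chomp in place the blank lines disappear from the output, while the lines that match stay the same.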

noswonky
A: 

Check your html file.

I ran the following

my $progName = shift ;

open(IPLAYERLIST, "list.txt") or die "Cannot open iplayer index file\n" ;
while (<IPLAYERLIST>) {

        if ( /($progName)/is ) {
        #if ( /Just A Minute/is ) { <-- This works!
                my $iplayerID = $1 ;
                print "IPlayer program id for $progName is $iplayerID\n" ;

                #  === do stuff here ===
        }
        else
        {
                print "Failed to match $progName in $_\n";
        }
}

with the following file list.txt:

egg
spam
foo bar Just A Minute baz boo
egg spam Just A Minute spam egg
foo
bar

It seems to work. The output of perl prog.pl "just a minute" is:

Failed to match just a minute in egg

Failed to match just a minute in spam

IPlayer program id for just a minute is Just A Minute
IPlayer program id for just a minute is Just A Minute
Failed to match just a minute in foo

Failed to match just a minute in bar

Federico Ramponi
A: 

I will try to post a better test with results etc tomorrow. I will need to extract the function and wrap it first. Right now it is time for bed!

A: 

If your list is in HTML, what's your guarantee that the "Just A Minute" you see in a browser is actually "Just A Minute" in your source code?

It could be

Just    A    Minute (extra spaces)
Just  
A  
Minute
Just <!--comment-->A Minute
Just&nbsp;A Minute (a non-breaking space entity)

and so on and so on.

Show us the HTML.
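If the HTML does turn out to contain variations like these, one rough workaround is to normalize the text before matching. The sketch below is only an illustration and not a real HTML parser: the helper name normalize_html_text and the sample strings are assumptions, and \Q...\E is added so the program name is matched literally. A title split across lines can only be found if the file is read as a whole rather than line by line.

```perl
use strict;
use warnings;

# Hypothetical helper: flatten a chunk of HTML so that comments,
# non-breaking space entities and runs of whitespace no longer get
# in the way of a plain-text match.
sub normalize_html_text {
    my ($html) = @_;
    $html =~ s/<!--.*?-->//gs;         # drop HTML comments
    $html =~ s/&nbsp;|&#160;/ /gi;     # nbsp entities become plain spaces
    $html =~ s/\s+/ /g;                # collapse whitespace, including newlines
    return $html;
}

my $progName = "Just A Minute";

# Assumed samples mirroring the variations listed above.
my @samples = (
    "Just    A    Minute",
    "Just  \nA  \nMinute",
    "Just <!--comment-->A Minute",
    "Just&nbsp;A Minute",
);

for my $html (@samples) {
    my $text = normalize_html_text($html);
    print( ( $text =~ /\Q$progName\E/i ? "match" : "no match" ), ": $text\n" );
}
```

All four samples print "match: Just A Minute" once they have been normalized.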

AmbroseChapel
A: 

I extracted the entire function into a test program and it ran perfectly! I will have to spend some time isolating the issue before I re-post this question. At the moment it looks like I would have to post the entire 700-line program, with supporting files and instructions to allow people to test it, which is beyond the scope of Stack Overflow.

--- Alistair.