I am trying to parse the filename from paths. I have this:

my $filepath = "/Users/Eric/Documents/foldername/filename.pdf";
$filepath =~ m/^.*\\(.*[.].*)$/;
print "Linux path:";
print $1 . "\n\n";
print "-------\n";

my $filepath = "c:\\Windows\eric\filename.pdf";
$filepath =~ m/^.*\\(.*[.].*)$/;
print "Windows path:";
print $1 . "\n\n";
print "-------\n";

my $filepath = "filename.pdf";
$filepath =~ m/^.*\\(.*[.].*)$/;
print "Without path:";
print $1 . "\n\n";
print "-------\n";

But that returns:

Linux path:

-------
Windows path:Windowsic
                      ilename.pdf

-------
Without path:Windowsic
                      ilename.pdf

-------

I am expecting this:

Linux path:
filename.pdf
-------
Windows path:
filename.pdf
-------
Without path:
filename.pdf
-------

Can somebody please point out what I am doing wrong?

Thanks! :)

+5  A: 

Why not use File::Basename?

use File::Basename;
my $name = basename($filepath);
print $name;
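One caveat: basename splits on the separator of the platform the script runs on, so on Linux or macOS it will not strip a Windows-style directory. A minimal sketch, assuming you want all three paths from the question handled on any platform, is to switch File::Basename to its Windows parsing rules with fileparse_set_fstype, since those rules accept both \ and / as separators. The @paths list and the loop are only illustrative.

```perl
use strict;
use warnings;
use File::Basename qw(basename fileparse_set_fstype);

# Assumption: we want Windows-style parsing regardless of the host OS.
# Under the MSWin32 rules both '\' and '/' count as separators.
fileparse_set_fstype('MSWin32');

my @paths = (
    '/Users/Eric/Documents/foldername/filename.pdf',
    'c:\Windows\eric\filename.pdf',   # single quotes: backslashes stay literal
    'filename.pdf',
);

# Collect the last path component of each entry.
my @names = map { basename($_) } @paths;
print "$_\n" for @names;
```

If the script only ever sees paths native to the machine it runs on, plain basename without the fileparse_set_fstype call should be enough.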

The regex

m/^.*\\(.*[.].*)$/
#    ^^

assumes a \ separator, so cases 1 and 3 will never match. In case 2,

"c:\\Windows\eric\filename.pdf";

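\e and \f are both escape sequences in Perl's double-quoted strings (the escape character and form feed), so the string holds only one real backslash, right after c:. The code therefore "correctly" returns everything after it, i.e. the mangled Windows\eric\filename.pdf, as the filename. Remember to use \\!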

KennyTM
Thank you Kenny for this. I will look into File::Basename but want to get this to work first. :)
Eric
Why waste time getting it to work? Just use what works now and move on in life.
brian d foy
+4  A: 

Perl provides this capability: http://perldoc.perl.org/File/Basename.html

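You also need to be wary of string escapes: in your double-quoted Windows path string, '\\', '\f' and '\e' are all processed as escape sequences. It's been a while since I've dealt with Perl escapes, but the 'r' after \e also vanishes from the output, most likely because the terminal treats the escape character plus the following character as a control sequence. This explains the unexpected output.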
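To see exactly what the double-quoted string contains, a small sketch like the following can help. It replaces non-printable characters with their hex codes; the show helper is a made-up name for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: make control characters visible as <0xNN>.
sub show {
    my ($s) = @_;
    $s =~ s/([^\x20-\x7e])/sprintf('<0x%02X>', ord $1)/ge;
    return $s;
}

my $double = "c:\\Windows\eric\filename.pdf";  # \e and \f are interpreted
my $single = 'c:\Windows\eric\filename.pdf';   # backslashes stay literal

print show($double), "\n";  # c:\Windows<0x1B>ric<0x0C>ilename.pdf
print show($single), "\n";  # c:\Windows\eric\filename.pdf
```

Note that the r after \e is still present in the string itself.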

AllenJB
Thank you for this. Sure enough, the \f and \e were problems.
Eric
+7  A: 

In this case, as others have said, the mistake is to do it by hand.

In addition to File::Basename, you should take a look at File::Spec and Path::Class. They offer well-tested, cross-platform methods for handling files and directories. Path::Class in particular provides helper methods for dealing with file and directory names that are foreign to the system the script lives on. It looks like that might come in handy here.

#!/usr/bin/env perl
use strict;
use warnings;
use Path::Class qw/file foreign_file/;

my $nix = "/Users/Eric/Documents/foldername/filename.pdf";
my $win = 'c:\\Windows\eric\filename.pdf'; # single quote to avoid escape issues

print file($nix)->basename(), "\n";
print foreign_file('Win32', $win)->basename(), "\n";
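For comparison, a rough equivalent using only core modules might look like this. It assumes you already know which platform each path came from, and it calls splitpath on the matching File::Spec subclass directly. The variable names $nix_name and $win_name are mine.

```perl
use strict;
use warnings;
use File::Spec::Unix;
use File::Spec::Win32;

my $nix = '/Users/Eric/Documents/foldername/filename.pdf';
my $win = 'c:\Windows\eric\filename.pdf';

# splitpath returns (volume, directories, file); only the file part is needed.
my $nix_name = (File::Spec::Unix->splitpath($nix))[2];
my $win_name = (File::Spec::Win32->splitpath($win))[2];

print "$nix_name\n$win_name\n";
```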
Telemachus
+2  A: 

Well, the answer to what is happening would be: various errors.

my $filepath = "/Users/Eric/Documents/foldername/filename.pdf";
$filepath =~ m/^.*\\(.*[.].*)$/;
print "Linux path:";
print $1 . "\n\n";
print "-------\n";

$filepath doesn't have any backslashes in it, so the pattern won't match and $1 is never set. The path uses forward slashes instead. Your expression would have to be:

# regular expression matches return their captures in a list context.
my ( $path ) = $filepath =~ m|/([^/.]*\.[^/.]*)$|;
print "Linux path:$path\n\n-------\n"; # no need for . when the string interpolates

my $filepath = "c:\\Windows\eric\filename.pdf";
$filepath =~ m/^.*\\(.*[.].*)$/;
print "Windows path:";
print $1 . "\n\n";
print "-------\n";

You're using double quotes, which, taking their cue from UNIX shells, do more work than single quotes: they interpolate variables and process backslash escapes. Thus, you need to escape all your backslashes, like this:

my $filepath = "c:\\Windows\\eric\\filename.pdf";

or just use single quotes:

my $filepath = 'c:\Windows\eric\filename.pdf';

Actually, since Perl accepts '/' as a path separator on Windows, this works too (but the backslash regex won't match it):

my $filepath = "c:/Windows/eric/filename.pdf";

That holds as long as you convert the slashes back before handing the path to Windows itself.

my $filepath = "filename.pdf";
$filepath =~ m/^.*\\(.*[.].*)$/;
print "Without path:";
print $1 . "\n\n";
print "-------\n";

This didn't match, so $1 still holds the capture from the last successful match. That's why the previous output is repeated. It also shows the value of assigning the captures to your own variables instead of referring to $1.
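Capturing in list context also makes a failed match easy to detect, because the variable ends up undefined rather than holding a stale $1. A minimal sketch, assuming a single pattern that accepts either separator (the character class [^/\\] and the fourth sample path are my own additions, not from the question):

```perl
use strict;
use warnings;

my @paths = (
    '/Users/Eric/Documents/foldername/filename.pdf',
    'c:\Windows\eric\filename.pdf',
    'filename.pdf',
    'c:\Windows\eric\\',   # ends in a separator: no file name to capture
);

my @results;
for my $path (@paths) {
    # Capture the trailing run of characters that are neither / nor \.
    my ($name) = $path =~ m{([^/\\]+)$};
    push @results, defined $name ? $name : '(no match)';
}
print "$_\n" for @results;
```

Unlike the patterns above, this one does not require a dot in the file name, so it would also accept names without an extension.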

Axeman
Thank you for pointing out all the errors. This is what I was looking for. :)
Eric
Axeman, can you look at the next entry? The $name at the end doesn't print. That is the last problem to solve on this. Thanks! :)
Eric
A: 

This is what I have now:

if ($filepath =~ m|/[^/.]*\.[^/.]*$|) {
    my ( $name ) = $filepath =~ m|/([^/.]*\.[^/.]*)$|;
    print $name;
} elsif ($filepath =~ m/^.*\\.*[.].*$/) {
    my ( $name ) = $filepath =~  m/^.*\\(.*[.].*)$/;
} else {
    my ( $name ) = $filepath;
}

print $name;

But, $name doesn't print.

Eric
`$name` does not exist out where you're trying to print it. Each block of the if-elsif-else structure declares its own `$name`, which goes out of scope at that block's closing brace. If you put `use strict;` at the top of your code, Perl would tell you (although it could be clearer about it) that there is no lexical `$name` visible at that point. Without it, Perl silently uses an undefined package variable called `$name` and prints nothing. It's not "not printing" it, it's just printing an empty value.
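One way to restructure it, sketched under the assumption that the three branches stay as they are: declare `$name` once before the `if`, and assign to it inside each branch. The sample `$filepath` value is only for illustration.

```perl
use strict;
use warnings;

my $filepath = 'c:\Windows\eric\filename.pdf';  # sample input

my $name;  # declared once, so it is still visible after the if/elsif/else
if ($filepath =~ m|/([^/.]*\.[^/.]*)$|) {
    $name = $1;            # Unix-style path
} elsif ($filepath =~ m/^.*\\(.*[.].*)$/) {
    $name = $1;            # Windows-style path
} else {
    $name = $filepath;     # no directory part
}

print "$name\n";
```

Reading `$1` here is safe because each branch only runs when its own match has just succeeded.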
Axeman
Perfect. I've got it working now exactly as I need it! Thanks Axeman. :)
Eric