I have a string that looks like: /somedir/ref/some-dir/foo.word

How could I extract foo from the above string? The whole string, including foo, may vary; however, the structure is always the same. It will always be the letters between the last slash and the last dot.

+12  A: 

It looks like you're trying to find the filename (without extension) from a fully-qualified file path. If this is the case, then look into the File::Basename core module:

use File::Basename;  # core module that provides fileparse()
my $str = "/somedir/ref/some-dir/foo.word";
my( $filename, $directory, $suffix ) = fileparse($str, qr/\.[^.]*/);

The fileparse() function takes the string to be parsed, followed by one or more suffix patterns to be removed. If you don't know what the file suffix is going to be beforehand, you can supply a regular expression. In this case, the suffix pattern matches a period followed by zero or more non-period characters. Suffix patterns are matched against the end of the filename, so only the last extension is stripped.
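As a minimal, self-contained sketch of that call: the second path below, with two dots in the filename, is a made-up example added only to show what the suffix pattern strips.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;   # core module; exports fileparse() by default

# The second path is hypothetical, added to show the multi-dot case.
for my $path ('/somedir/ref/some-dir/foo.word', '/tmp/archive.tar.gz') {
    # Returns (name without suffix, directory with trailing slash, matched suffix).
    my ($filename, $directory, $suffix) = fileparse($path, qr/\.[^.]*/);
    print "filename=$filename directory=$directory suffix=$suffix\n";
}
```

For the question's string this should print filename=foo, directory=/somedir/ref/some-dir/ and suffix=.word; for the second path only the trailing .gz is removed, leaving archive.tar.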

Edit: And if you're not finding filenames, and want the letters between the last / and the last ., try this:

my $str = "/somedir/ref/some-dir/foo.word";
my @elems1 = split '/', $str;
my @elems2 = split '\.', $elems1[-1];
my $foo = $elems2[-2];

TIMTOWTDI! :-)

CanSpice
You may want to join all but the last element of `@elems2` rather than just taking the second-to-last one; otherwise a name such as foo.bar.word yields only bar.
flies
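A rough sketch of the suggestion in the comment above, assuming a hypothetical input whose filename contains more than one dot (the variable names follow the answer; the input string is made up for illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input with two dots in the last path component.
my $str = "/somedir/ref/some-dir/foo.bar.word";

my @elems1 = split m{/}, $str;            # path components
my @elems2 = split /\./, $elems1[-1];     # pieces of the last component
pop @elems2 if @elems2 > 1;               # drop only the final suffix
my $foo = join '.', @elems2;              # re-join whatever is left

print "$foo\n";   # foo.bar
```

With the question's original string the result is still foo, since there is only one piece left to join.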
A: 
my ($foo) = $str =~ m|/(\w+)[^/]+$|;

That assumes the "foo" part can consist of any "word" characters (alphanumeric plus underscore).

Sean
This is incorrect, and will not work on the test case given.
Ryan Mentley
I checked it before I posted it. This command prints "foo": perl -le 'my $str = "/somedir/ref/some-dir/foo.word"; my ($foo) = $str =~ m|/(\w+)[^/]+$|; print $foo'
Sean
The word to be matched can be anything, so \w will not match, for example, `foo-bar`.
bourbaki
bourbaki: No, the word to be matched can contain only letters, per the original poster's comment.
Sean
@Sean: You're right, I didn't see the OP's comments. Sorry.
bourbaki
If you are only matching letters, \w is still wrong.
brian d foy
@foy: \w does exactly what I said it does: alphanumeric plus underscore. I posted my answer before the OP clarified that only letters were allowed.
Sean
Why the hell are people still upvoting that "This is incorrect" comment? My answer works perfectly on the original case, as I demonstrated. If you somehow disagree, what's incorrect about it? What output do you see?
Sean
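To make the \w versus letters-only distinction from the comments concrete, here is a small sketch. The helper names `word_chars` and `letters_only`, the ASCII-only class `[A-Za-z]`, and the second test string are assumptions for illustration, not something from the thread:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# \w accepts letters, digits and underscore.
sub word_chars {
    my ($s) = @_;
    my ($w) = $s =~ m{/(\w+)\.[^/.]*\z};
    return $w;    # undef when there is no match
}

# An explicit class accepts ASCII letters only (an assumption about "letters").
sub letters_only {
    my ($s) = @_;
    my ($w) = $s =~ m{/([A-Za-z]+)\.[^/.]*\z};
    return $w;    # undef when there is no match
}

for my $str ('/somedir/ref/some-dir/foo.word', '/somedir/ref/some-dir/foo_1.word') {
    printf "%s => \\w: %s, letters: %s\n", $str,
        word_chars($str)   // '(no match)',
        letters_only($str) // '(no match)';
}
```

Both patterns give foo for the question's string; for foo_1.word the \w version captures foo_1 while the letters-only version does not match at all.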
A: 

Try this:

s/.*\/([^.]*)\..*/$1/g
Hemang
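Note that s/// modifies the string it is bound to, and leaves it unchanged when the pattern does not match. A sketch of a non-destructive variant of the substitution above, assuming Perl 5.14 or later for the /r modifier (the variable names are illustrative):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $str = '/somedir/ref/some-dir/foo.word';

# /r returns the modified copy instead of changing $str in place.
my $foo = $str =~ s{.*/([^.]*)\..*}{$1}r;

print "$foo\n";   # foo
print "$str\n";   # the original path, untouched
```

Because a failed substitution returns the input as-is, a caller that needs to detect a non-match may prefer one of the capture-based answers.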
A: 

Try

if ($str =~ /\/([^\/]+)\.[^\/]*?$/) {
    $foo = $1; # This is the word 'foo' in your test case.
} else {
    die("Error matching string");
}

Demonstration (using Ruby, but the regex is the same in both languages): http://www.rubular.com/r/7FUeFFV4QI

Edit: Fixed a bug

Ryan Mentley
If $str doesn't match that regex, you'll be assigning to $foo whatever $1 was after the last successful regex match.
Sean
While you are certainly correct, does that truly merit a downvote? Fixed in the post.
Ryan Mentley
@Sean - This is not an issue at all, especially since we do not know the nature of the OP's problem. Perhaps it might even be DESIRABLE behavior; I know I've been in a similar situation where I did in fact want the previous value when the current match failed. I have up-voted this for its wrongful down-vote. No valid answers/code should be down-voted IMHO.
gnomed
@Ryan: I didn't downvote this answer; it was just a side comment. Are you the one who just went on a downvoting spree on a bunch of my old answers?
Sean
@Sean: I was not; my apologies for suspecting you. I would never downvote legitimate answers out of revenge.
Ryan Mentley
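The stale-$1 behavior discussed in these comments can be reproduced with a short sketch. The earlier match against 'abc' and the non-matching input are made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

'abc' =~ /(b)/;                    # an earlier, unrelated match sets $1 to "b"

my $str = 'no-slash-or-dot';       # hypothetical input that cannot match

$str =~ m{/([^/]+)\.[^/]*?$};      # fails, so $1 keeps its old value
my $unchecked = $1;
print "unchecked: $unchecked\n";   # unchecked: b

# Capturing in list context gives undef on failure instead of a stale value.
my ($checked) = $str =~ m{/([^/]+)\.[^/]*?$};
print "checked: ", (defined $checked ? $checked : '(no match)'), "\n";
```

Testing the match result, as the answer's if/else does, or capturing in list context avoids picking up a value left over from an earlier match.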
A: 
$str =~ m/\/(\w+)\./

foo will be stored in Perl's special $1 variable. If you want it in a normal variable afterwards, just assign it:

$myvar = $1;

This is easily the simplest solution listed so far.

This will extract the first word string that sits between a "/" and a "." in the input. It will always be the word you want, unless there are multiple periods in the string to match against. But I am assuming "." will only be at the end (like on a file extension).
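To illustrate the multiple-periods caveat, here is a sketch comparing the pattern above with an end-anchored variant. The anchored pattern, the `$unanchored`/`$anchored` names, and the second path are assumptions added for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $unanchored = qr{/(\w+)\.};              # first "/word." anywhere in the string
my $anchored   = qr{/(\w+)\.[^/.]*\z};      # "/word." followed only by the extension

# The second path is hypothetical: it has a period in a directory name.
for my $str ('/somedir/ref/some-dir/foo.word', '/some.dir/ref/foo.word') {
    my ($first) = $str =~ $unanchored;
    my ($last)  = $str =~ $anchored;
    print "$str => unanchored: $first, anchored: $last\n";
}
```

Both patterns return foo for the question's string. For the second path the unanchored pattern returns some, while the anchored one still returns foo.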

gnomed
Style improvement: Eliminate nugatory side effect variables and leaning toothpicks altogether. `my ($result) = $str =~ m{/(\w+)\.};`
daxim
Curious why someone would down-vote perfectly valid code that answers the question precisely.
gnomed
A: 
#!/usr/bin/perl

use strict; use warnings;

my $s = '/somedir/ref/some-dir/foo.word';

if ( my ($x) = $s =~ m{/(\w+)\.\w+\z} ) {
    print "$x\n";
}
Sinan Ünür