I want to get just the filename using regex, so I've been trying simple things like

([^\.]*)

which of course works only if the filename has a single extension. But if it is adfadsfads.blah.txt, I just want adfadsfads.blah. How can I do this with regex?

Regarding David's question, 'why would you use regex' for this, the answer is, 'for fun.' In fact, the code I'm using is simple:

length_of_ext = File.extname(filename).length
filename = filename[0,(filename.length-length_of_ext)]

but I like to learn regex whenever possible because it always comes up at Geek cocktail parties.

+3  A: 

Everything, followed by a dot, followed by one or more characters that are not dots, followed by the end-of-string:

(.+?)\.[^\.]+$

The everything-before-the-last-dot is grouped for easy retrieval.

If you aren't 100% sure every file will have an extension, try:

(.+?)(\.[^\.]+$|$)
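
A quick way to compare the two patterns is to run them against a few names. This is only a sketch in Ruby (the language of the question's own snippet); the constant and method names are made up for illustration. Note that in Ruby `$` matches at the end of a line, so this assumes filenames without embedded newlines.

```ruby
# Hypothetical names; the two patterns are the ones given in this answer.
STRICT  = /(.+?)\.[^\.]+$/       # requires an extension
LENIENT = /(.+?)(\.[^\.]+$|$)/   # extension is optional

# Return capture group 1 (the part before the last dot), or nil if no match.
def base_strict(name)
  name[STRICT, 1]
end

def base_lenient(name)
  name[LENIENT, 1]
end

p base_strict("adfadsfads.blah.txt")   # => "adfadsfads.blah"
p base_strict("secret-letter")         # => nil (no extension, so no match)
p base_lenient("secret-letter")        # => "secret-letter"
```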
Rex M
It does not match a filename which has no extension
Dennis Cheung
+1  A: 

How about two captures, one for the end and one for the filename?

E.g.:

(.+?)(?:\.[^\.]*$|$)
sfossen
That's all fine, but since I'll be throwing out the filename, why bother? I would like a regex that just gets the filename.
Yar
This one also will not match a filename containing no extension.
j_random_hacker
+6  A: 

Try this:

(.+?)(\.[^.]*$|$)

This will:

  • Captures filenames that start with a dot (e.g. ".logs" is a file named ".logs", not a file extension), which are common on Unix.
  • Gets everything before the last dot: "foo.bar.jpeg" gets you "foo.bar".
  • Handles files with no dot: "secret-letter" gets you "secret-letter".


Note: as commenter j_random_hacker suggested, this performs as advertised, but you might want to precede things with an anchor for readability purposes.
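
The cases listed above can be checked with a short Ruby sketch. The names `PATTERN`, `ANCHORED` and `split_name` are hypothetical, and `\A` is used as the start-of-string anchor because `^` means start-of-line in Ruby. The `"log."` case is included only to show what the `*` quantifier does with a trailing dot.

```ruby
# Hypothetical names; PATTERN is the regex from this answer.
PATTERN  = /(.+?)(\.[^.]*$|$)/
ANCHORED = /\A(.+?)(\.[^.]*$|$)/   # same pattern with an explicit start anchor

# Return [base, extension] from the two capture groups, or nil if no match.
def split_name(name, pattern = PATTERN)
  m = pattern.match(name)
  m && [m[1], m[2]]
end

p split_name(".logs")           # => [".logs", ""]
p split_name("foo.bar.jpeg")    # => ["foo.bar", ".jpeg"]
p split_name("secret-letter")   # => ["secret-letter", ""]
p split_name("log.")            # => ["log", "."]

# The anchored form gives the same result on these inputs.
p split_name("foo.bar.jpeg", ANCHORED)   # => ["foo.bar", ".jpeg"]
```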

John Feminella
There is a good explanation of this one at http://www.movingtofreedom.org/2008/04/01/regex-match-filename-base-and-extension/
Adam Bernier
The star should be a plus, I think - though it is not clear what a file called 'log.' should return.
Jonathan Leffler
Beautiful stuff, and thanks Adam for the link.
Yar
Although this does work as advertised, could I suggest prepending a "^" anchor just for readability's sake? Without the anchor, a programmer seeing this regex for the first time needs to perform a detailed analysis to verify that returned match always starts at the start of the string.
j_random_hacker
To have only one capture: (.+?)(?:\.[^.]*$|$)
sebnow
A: 

OK, I am not sure why I would use a regular expression for this. If I know, for example, that the string is a full file path, then I would use another API to get the file name. Regular expressions are very powerful but at the same time quite complex (you have just proved that by asking how to create such a simple regex). As somebody said: you had a problem and decided to solve it using regular expressions. Now you have two problems.

Think again. If you are on the .NET platform, for example, then take a look at the System.IO.Path class.
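
For the Ruby code in the question, the closest standard-library equivalent is probably `File.basename` with a `".*"` suffix argument. A minimal sketch follows; `strip_extension` is a hypothetical helper name. Unlike the regexes in the other answers, this also drops any leading directory part.

```ruby
# Hypothetical helper: strip the last extension using the standard library.
# File.basename(path, ".*") removes the directory and any final extension.
def strip_extension(path)
  File.basename(path, ".*")
end

p strip_extension("adfadsfads.blah.txt")   # => "adfadsfads.blah"
p strip_extension("secret-letter")         # => "secret-letter"
p strip_extension("/tmp/foo.bar.jpeg")     # => "foo.bar"
```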

David Pokluda
Well, that's not much fun, is it? Anyway, adjusted the question to your answer, please see above. Thanks.
Yar
A: 

I have a similar problem using Apache.

In my .htaccess file I'd like to convert requests like this:

url/portfolio/filename.htm

to:

url?filename

Any takers?

That's relatively easy, but make it a question and then people can answer it... though do a search first to make sure no one has asked it before.
Yar