I need a regular expression that detects whether a given string is a URL to a [potential] file, i.e.:
/file.pdf
http://www.whatever.com/file.docx
../file.longfileextension
Thanks guys

+2  A: 

You might inspect the end to see if it looks like a file extension, but URLs don't actually map to files; what if the URL is rewritten?

If you wanted to determine what a given URL resolves to, you could issue a HEAD request and inspect the Content-Type and Content-Disposition headers to see if the content is of a type that implies an underlying file, but even that's not bulletproof, since images, PDFs, etc. could all be dynamically generated.
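
A rough sketch of that HEAD-request idea in Python, using only the standard library. The names `FILE_LIKE_TYPES`, `headers_suggest_file` and `url_suggests_file` are made up for illustration, and the list of "file-like" content types is an assumption you would have to tune yourself. It only works for absolute http(s) URLs, not relative ones like `/file.pdf`.

```python
import urllib.request

# Hypothetical set of media types treated as "file-like"; this is an assumption, not a standard list.
FILE_LIKE_TYPES = {
    "application/pdf",
    "application/msword",
    "application/zip",
    "application/octet-stream",
}


def headers_suggest_file(content_type, content_disposition):
    """Guess from two header values whether a response looks like a file."""
    # An "attachment" disposition is the strongest hint a server can give.
    if content_disposition and content_disposition.strip().lower().startswith("attachment"):
        return True
    if not content_type:
        return False
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in FILE_LIKE_TYPES or media_type.startswith("image/")


def url_suggests_file(url, timeout=5.0):
    """Issue a HEAD request and apply the header heuristic to the response."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return headers_suggest_file(
            response.headers.get("Content-Type"),
            response.headers.get("Content-Disposition"),
        )
```

As noted above, a `True` here still says nothing about whether a real file sits behind the URL; it only reports what the server claims to be sending.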

Rob
+1  A: 

You can't.

E.g. http://example.com/files/readme might be a text file or a folder (*nix style OSs conventionally would not add a .txt extension).

Even if there is a file extension, there may be no file, with server-side code processing the URL to create content (e.g. an ASP.NET HttpHandler).

Why are you trying to do this? If you want to detect whether the URL would return a file, you can guess from the extension (remembering that applications are free to invent their own), but the only real way is to perform an HTTP HEAD request and check the returned content type, and even then you face the same problem of deciding which MIME types count as a file.
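
If guessing from the extension is good enough, something along these lines might do in Python. This is only a sketch: `KNOWN_EXTENSIONS` and `guess_is_file` are hypothetical names, and the extension list is an assumption you would replace with whatever your system actually serves.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical allow-list of extensions; an assumption, extend as needed.
KNOWN_EXTENSIONS = {".pdf", ".doc", ".docx", ".txt", ".zip", ".png", ".jpg"}


def guess_is_file(url, known_extensions=KNOWN_EXTENSIONS):
    """Guess whether a URL points at a file, based only on its extension."""
    # urlsplit separates the path from any query string or fragment.
    path = urlsplit(url).path
    # Only the last path segment can carry a file extension.
    last_segment = posixpath.basename(path)
    _, extension = posixpath.splitext(last_segment)
    return extension.lower() in known_extensions
```

With this list, `/file.pdf` and `http://www.whatever.com/file.docx` are accepted, while `../file.longfileextension` is rejected because the extension is unknown, which is exactly the "applications are free to invent their own" problem.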

Richard
I need to detect whether a URL entered by a user is an attempt to link to a file on the system.
Adam Naylor
@Adam: expanded.
Richard
A: 

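This expression will do the job: it matches when the last path segment contains a dot, i.e. looks like a file name with an extension.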

^.*/(?<filename>[^/]+?\.[^/]+)$
    ^                 Anchor to the beginning of the string
    .*                Any character zero or more times
    /                 Slash
    (?<filename>      Named group 'filename'
       [^/]+?            Not a slash at least once and captured lazily
       \.                One file extension separator (dot)
       [^/]+             Not a slash at least once
    )                 End of named group
    $                 Anchor to the end of the string
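
For anyone trying this from Python, here is a minimal sketch. Note the one change: Python's `re` module spells a named group `(?P<filename>...)` rather than `(?<filename>...)`. The names `FILE_URL` and `extract_filename` are made up for the example.

```python
import re

# Same pattern as above, with Python's (?P<name>...) named-group syntax.
FILE_URL = re.compile(r"^.*/(?P<filename>[^/]+?\.[^/]+)$")


def extract_filename(url):
    """Return the last path segment if it contains a dot, otherwise None."""
    match = FILE_URL.match(url)
    return match.group("filename") if match else None


print(extract_filename("/file.pdf"))
print(extract_filename("http://www.whatever.com/file.docx"))
print(extract_filename("../file.longfileextension"))
print(extract_filename("http://example.com/files/readme"))
```

The three examples from the question yield their file names and the `readme` URL yields `None`. Be aware that a bare host such as `http://www.whatever.com` also matches (the "filename" is `www.whatever.com`), and a query string stays attached to the captured name, so this only covers the special case.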
Daniel Brückner
Thank you for not reading between the lines :)
Adam Naylor
I read between the lines at first. But while it does not solve the general case of the problem, it will solve special cases, and you are probably after a special case, as the modified question indicates.
Daniel Brückner