I am in a situation where I need to allow a user to download a file dynamically determined from the URL. Before the download begins, I need to do some authentication, so the download has to run through a script first. All files would be stored outside of the web root to prevent manual downloading.

For example, any of the following could be download links:

Basically, the folder depth can vary.

To prevent directory traversal, for example http://example.com/downloads/../../../../etc/passwd, I obviously need to do some checking on the URI. (Note: I do not have the option of storing this info in a database; the URI must be used.)

Would the following regexp be bulletproof in making sure that a user doesn't enter something fishy?

preg_match('/^\/([-\w]+\/)*[-\w]+\.(zip|gif|jpg|png|pdf|ppt)$/iD', $path)

What other options do I have for making sure the URI is sane? Possibly using realpath in PHP?

A: 

What characters will your filenames contain? If it's simply [a-zA-Z0-9], single dots, dashes and slashes, then feel free to strip anything else.
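
A minimal sketch of that stripping idea, assuming PHP 7 or later. The function name strip_unexpected_chars() is hypothetical, and stripping on its own is weaker than rejecting the request outright, so treat this as a first filter rather than a complete defence:

```php
<?php
// Hypothetical helper: keep only letters, digits, dots, dashes and slashes,
// then collapse any run of dots into a single dot so ".." cannot survive.
function strip_unexpected_chars(string $path): string
{
    // Drop every character outside the whitelist.
    $clean = preg_replace('/[^a-zA-Z0-9.\/-]/', '', $path);

    // "single dots" only: "..", "..." and so on become ".".
    return preg_replace('/\.{2,}/', '.', $clean);
}
```

With this, a request for /downloads/../../etc/passwd becomes /downloads/././etc/passwd, which can no longer climb out of the download folder.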

cherouvim
A: 

I think you could use htaccess for this.

Fernando
+5  A: 

I would recommend using realpath() to convert the path into an absolute path. Then you can compare the result with the path(s) to the allowed directories.
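
A rough sketch of that comparison, assuming PHP 7.1 or later. The function name resolve_download() and the $baseDir parameter are made up for illustration; the base directory would be wherever the files live outside the web root:

```php
<?php
// Hypothetical helper: returns the canonical path of the requested file,
// or null if it does not exist or lies outside $baseDir.
function resolve_download(string $baseDir, string $requested): ?string
{
    $base = realpath($baseDir);
    if ($base === false) {
        return null; // the base directory itself is missing
    }

    // realpath() resolves "..", "." and symlinks, and returns false
    // when the target does not exist.
    $full = realpath($base . DIRECTORY_SEPARATOR . ltrim($requested, '/\\'));
    if ($full === false || !is_file($full)) {
        return null;
    }

    // Compare against the base plus a trailing separator, so that a sibling
    // such as "/srv/files-private" does not pass as being inside "/srv/files".
    $prefix = rtrim($base, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    if (strncmp($full, $prefix, strlen($prefix)) !== 0) {
        return null;
    }

    return $full;
}
```

The trailing separator in the prefix is the detail that is easy to miss: a plain prefix comparison would accept a directory whose name merely starts with the base directory's name.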

Emil H
Yes, I think doing this along with a regexp check should do the trick.
+2  A: 

I'm not a PHP developer, but I can tell you that using regex-based protection for such a scenario is like wearing a T-shirt against a hurricane.

This kind of problem is known as a Canonicalization vulnerability in security parlance (whereby your application parses a given filename before the OS has had a chance to convert it to its absolute file path). Attackers will be able to come up with any number of permutations of the filename which would almost certainly fail to be matched by your Regex.

If you must use Regex, then make it as pessimistic as possible (match only valid filenames, reject everything else). I would suggest that you do some research on Canonicalization methods in PHP.

Cerebrus
+1. Also know your server: if you're running PHP on Windows, an attempt to access a device-reserved filename like ‘com1.txt’ may fail hard.
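
One way to guard against that, as a sketch only and assuming PHP 7 or later. The function name has_reserved_device_name() is hypothetical, and the list covers the commonly documented reserved names (CON, PRN, AUX, NUL, COM1 to COM9, LPT1 to LPT9) without claiming to be exhaustive:

```php
<?php
// Hypothetical helper: true if any path segment is a Windows reserved device name.
function has_reserved_device_name(string $path): bool
{
    // Split on both forward slashes and backslashes, skipping empty segments.
    foreach (preg_split('#[/\\\\]#', $path, -1, PREG_SPLIT_NO_EMPTY) as $segment) {
        // Windows ignores the extension, so "com1.txt" still names the COM1 device.
        $stem = explode('.', $segment)[0];
        if (preg_match('/^(con|prn|aux|nul|com[1-9]|lpt[1-9])$/i', $stem) === 1) {
            return true;
        }
    }
    return false;
}
```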
bobince
A: 

I think the following 3 checks together make a solid solution:

  • Make sure the requested path matches a strict regexp describing what a valid file path can look like
  • Use realpath (in PHP) to get the canonical form of the user's requested file, and compare it to make sure it is within a base directory
  • Starting with PHP v5.3, you can use ini_set to restrict open_basedir to a specific folder, so that files outside of that folder cannot be read (with fopen, include, fread, etc.)
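
The three checks could be combined roughly like this. It is a sketch assuming PHP 7 or later; is_sane_path(), open_download() and $baseDir are hypothetical names, and the pattern is the one from the question with the redundant parts removed:

```php
<?php
// Check 1: whitelist pattern for the requested path.
function is_sane_path(string $path): bool
{
    return preg_match('/^\/([-\w]+\/)*[-\w]+\.(zip|gif|jpg|png|pdf|ppt)$/iD', $path) === 1;
}

// Returns an open read handle on success, or false if any check fails.
function open_download(string $baseDir, string $path)
{
    if (!is_sane_path($path)) {
        return false;
    }

    // Check 2: canonicalize and confirm the file sits inside the base directory.
    $base = realpath($baseDir);
    if ($base === false) {
        return false;
    }
    $prefix = rtrim($base, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    $full = realpath($base . $path);
    if ($full === false || strpos($full, $prefix) !== 0) {
        return false;
    }

    // Check 3: tighten open_basedir at runtime. It can only be narrowed,
    // never widened, for the rest of the request. The trailing separator
    // keeps the value from acting as a loose name prefix.
    ini_set('open_basedir', $prefix);

    return fopen($full, 'rb');
}
```

Once open_basedir has been narrowed, every later file operation in the same request is confined to that folder too, so this step belongs after the script has loaded everything else it needs.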