I need to let a user download a file whose path is determined dynamically from the URL. Authentication has to happen before the download begins, so every download has to run through a script first. All files are stored outside the web root to prevent direct downloading.
For example, any of the following could be download links:
- http://example.com/downloads/companyxyz/overview.pdf
- http://example.com/downloads/companyxyz/images/logo.png
- http://example.com/downloads/companyxyz/present/ppt/presentation.ppt
Basically, the folder depth can vary.
To prevent directory traversal, for example http://example.com/downloads/../../../../etc/passwd, I obviously need to validate the URI. (Note: I do not have the option of storing this information in a database; the URI must be used.)
Would the following regexp be bulletproof in making sure that a user doesn't enter something fishy?
preg_match('/^\/([-_\w]+\/)*[-_\w]+\.(zip|gif|jpg|png|pdf|ppt)$/iD', $path)
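To make the pattern's behaviour concrete, here is a small sketch that wraps it in a function and runs it against a few sample paths. The function name isAllowedPath is hypothetical and exists only for illustration; $path is assumed to be the part of the URI after /downloads. Note that the pattern also rejects file names containing more than one dot, which may or may not be what is wanted.

```php
<?php
// Hypothetical wrapper around the pattern from the question.
// Returns true only if every segment consists of word characters and hyphens
// and the final segment ends in one of the whitelisted extensions.
function isAllowedPath(string $path): bool
{
    $pattern = '/^\/([-_\w]+\/)*[-_\w]+\.(zip|gif|jpg|png|pdf|ppt)$/iD';
    return preg_match($pattern, $path) === 1;
}

// Sample paths (assumed to be the URI with the /downloads prefix removed).
$samples = [
    '/companyxyz/overview.pdf',
    '/companyxyz/present/ppt/presentation.ppt',
    '/../../../../etc/passwd',      // ".." segments contain dots, so no match
    "/companyxyz/images/logo.png\n", // the D modifier stops "$" matching before a trailing newline
    '/companyxyz/my.file.pdf',      // a second dot in the file name is rejected
];

foreach ($samples as $sample) {
    printf("%-45s %s\n", json_encode($sample), isAllowedPath($sample) ? 'allowed' : 'rejected');
}
```

Because a dot is only permitted immediately before the extension, a ".." segment cannot appear anywhere in an accepted path.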
What other options do I have for making sure the URI is sane? Possibly using realpath() in PHP?
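For reference, this is roughly what I imagine a realpath()-based check would look like. It is only a sketch under assumptions: the function name resolveDownload and the base directory argument are hypothetical, and the idea is to resolve the requested path and then confirm that the result still lies inside the download directory.

```php
<?php
// Hypothetical helper: resolve $requestPath inside $baseDir.
// Returns the absolute path of the file, or null if the file is missing
// or the resolved path falls outside the base directory.
function resolveDownload(string $baseDir, string $requestPath): ?string
{
    $base = realpath($baseDir);
    if ($base === false) {
        return null; // base directory does not exist
    }

    // realpath() collapses "." and ".." and resolves symlinks;
    // it returns false when the target does not exist.
    $target = realpath($base . DIRECTORY_SEPARATOR . ltrim($requestPath, '/\\'));
    if ($target === false || !is_file($target)) {
        return null;
    }

    // Compare against the base with a trailing separator so that a sibling
    // such as "/srv/files-other" does not pass as being inside "/srv/files".
    $prefix = rtrim($base, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    if (strncmp($target, $prefix, strlen($prefix)) !== 0) {
        return null;
    }

    return $target;
}
```

The script would call something like resolveDownload('/path/outside/webroot', $path) after authentication and only stream the file when the result is not null. The extension whitelist could still be applied on top of this.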