views: 1293
answers: 12

Hi guys,

I want to allow uploads of very large files into our PHP application (hundreds of megabytes up to 8 gigabytes). There are a couple of problems with this, however.

Browser:

  • HTML uploads have crappy feedback; we either have to poll for progress (which is a bit silly) or show no feedback at all
  • The Flash uploader puts the entire file into memory before starting the upload

Server:

  • PHP forces us to set post_max_size to a huge value, which could result in an easily exploitable DoS attack. I'd rather not set this globally.
  • The server also requires some other variables to be present in the POST vars, such as a secret key. We'd like to be able to refuse the request right away, instead of only after the entire file has been uploaded.

Requirements:

  • HTTP is a must.
  • I'm flexible with client-side technology, as long as it works in a browser.
  • PHP is not a requirement; if there's some other technology that works well in a Linux environment, that's perfectly cool.
+3  A: 

How about a Java applet? That's how we had to do it at a company I previously worked for. I know applets suck, especially in this day and age with all our options available, but they really are the most versatile solution to desktop-like problems encountered in web development. Just something to consider.

Marc W
A Java applet might do the trick, but that's only really half the problem.
Evert
WordPress uses a Flash-based uploader.
Chacha102
+1  A: 

Have you looked into using APC to check the progress and total file size? Here is a good blog post about it. It might help.
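
In case it helps, here is roughly what the polling script looks like on the server. This is only a sketch: it assumes apc.rfc1867 is enabled, that the upload form puts a hidden APC_UPLOAD_PROGRESS field (set to a unique key) before the file input, and the JSON field names are just illustrative.

<?php
// progress.php - minimal sketch of APC upload-progress polling.
// Assumes apc.rfc1867 = 1 in php.ini; APC then tracks the upload under
// the key 'upload_' . <value of the hidden APC_UPLOAD_PROGRESS field>.
$key    = isset($_GET['key']) ? $_GET['key'] : '';
$status = $key !== '' ? apc_fetch('upload_' . $key) : false;

header('Content-Type: application/json');
if ($status === false) {
    echo json_encode(array('started' => false));
} else {
    echo json_encode(array(
        'started' => true,
        'current' => $status['current'], // bytes received so far
        'total'   => $status['total'],   // total bytes expected
        'done'    => $status['done'],    // true once the upload has finished
    ));
}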

Peter D
The APC trick requires polling, which I don't like because of our load-balanced scenario.
Evert
You can't poll because each poll request might be sent to a different server than the one the upload started on?
Peter D
You could store the poll key in a database
Aiden Bell
Peter D - correct. Aiden, this is not possible because I can only request information on the upload (even with the key) from the server where the upload started.
Evert
Ah, I see. What if you had an internal script that polls the right server, stores those results in a central place, and has a handler on the front end?
Aiden Bell
+1  A: 

Maybe you could use WebDAV and JavaScript in the browser.

AJAX Big file upload, with progress, to WebDAV

http://www.webdavsystem.com/ajax/programming/upload_progress

A simple library

http://debris.demon.nl/projects/davclient.js/doc/README.html

You can then get the JS to redirect the user to a success page. Secret keys and what-not can be handled in a PHP prelude before handing off from the JS client to WebDAV.

Aiden Bell
JavaScript will not allow me to read the contents of a local file. I don't know exactly how 'webdavsystem' does it, but I think they simply still use a standard upload and have a special handler for it on the server.
Evert
+2  A: 

You can set post_max_size for scripts in just one directory. Place your upload script there, and allow only that script to handle large sizes. It's still possible for that script to be attacked with large/useless files, but it avoids setting the limit globally.

Use that together with APC and you might be able to work out something good: IBM developerWorks article on APC

acrosman
APC is difficult to use in our load-balanced setup. We don't use cookie fixation, so to use this properly we'd need to poll the actual server the file is being uploaded to (which kinda sucks in our situation). Having post_max_size raised in just one directory also doesn't do the trick for me, because that one directory is still susceptible to DoS attacks, and I want to block requests that contain invalid GET data right when they start.
Evert
A: 

I would look into FTP, SSH, or SCP. These let you upload a large file and still keep access control over it. It might take a little longer to implement, but it's probably the most secure way I can think of.

Phill Pafford
We don't really want to go this route. HTTP is simple, so we don't want to overcomplicate the environment. We are open to using something other than PHP on the server side, but HTTP is a must.
Evert
A: 

I know it sucks to add another dependency, but in my experience most websites doing something like this use Flash on the client side and upload the large file in chunks.

Adobe has a howto on Flash file uploads

I also found this tutorial on codeproject:

Multiple File Upload With Progress Bar Using Flash and ASP.NET

PS - I know you're using PHP and not .NET; I figured the important part was the Flash ;)

Jiaaro
Unfortunately Flash has been problematic. It puts the entire file into memory before uploading, resulting in a complete freeze of my Mac for a couple of minutes :(
Evert
I wonder... is it possible to read in chunks as well?
Jiaaro
The FileReference class in Flash does not allow direct file access, only upload.
Evert
Damn these security layers ;P
Aiden Bell
+1  A: 

Python Handler?

Use a Python POST handler instead of PHP. Generate a unique identifier from your PHP app that the client can put in the HTTP headers, and use mod_python to reject or accept the large upload before the entire POST body is transmitted.

I think http://www.modpython.org/live/current/doc-html/dir-handlers-hph.html

allows you to check headers and decline the rest of the POST input. I haven't tried it, but it might be the right path?

Looking at the source of mod_python, the buffering of the input via read() seems to allow bit-at-a-time evaluation of the HTTP input. Headers are first.

https://svn.apache.org/repos/asf/quetzalcoatl/mod_python/trunk/src/filterobject.c
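
Something along these lines might work as the handler. This is untested and only a sketch; the X-Upload-Key header name and the key lookup are placeholders, not anything mod_python provides:

from mod_python import apache

def handler(req):
    # Inspect the identifying header and refuse the request before the
    # (potentially huge) POST body is read.
    if not req.headers_in.has_key('X-Upload-Key'):
        return apache.HTTP_FORBIDDEN
    if not is_valid_key(req.headers_in['X-Upload-Key']):
        return apache.HTTP_FORBIDDEN

    # ...otherwise stream the body to disk with repeated req.read(65536)
    # calls instead of buffering the whole upload in memory...
    return apache.OK

def is_valid_key(key):
    # Look the key up wherever the PHP app stored it (database, memcached, ...).
    return False  # placeholder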

Aiden Bell
+4  A: 

upload_max_filesize can be set on a per-directory basis; the same goes for post_max_size.

e.g.:

<Directory /uploadpath/>
  php_value upload_max_filesize 10G
  php_value post_max_size 10G
</Directory>
Frank Farmer
A: 

I've had success with uploadify, and I would recommend it. It's a jQuery/Flash script that handles large uploads, and you can pass extra parameters to it (like the secret key). To solve the server-side issues, simply use the following code. The changes take effect only for the script they're called in:

//Check to see if the key is there
if(!isset($_POST['secret_key']) || !isValid($_POST['secret_key']))
{
    exit("Invalid request");
}
function isValid($key)
{
    //Put your validation code here; it should return true for a valid key.
    return false; //placeholder
}

//This line changes the timeout.
//Give it a value in seconds (3600 = 1 hour)
set_time_limit(3600);

//Set these amounts to whatever you need.
ini_set("post_max_size","8192M");
ini_set("upload_max_filesize","8192M");

//Generally speaking, the memory_limit should be higher
//than your post size.  So make sure that's right too.
ini_set("memory_limit","8200M");

EDIT In response to your comment:

Given what you've said, I'm afraid you may not be able to meet your requirements over HTTP. All of the solutions out there bolt features onto HTTP that it was never designed for.

Like you said yourself, it's a simple protocol. Short of writing your own client software that runs outside the browser, using a Java applet, or using a different protocol (like FTP, which was designed for this), you might not get what you want.

I've done the best I could within the given constraints. Sorry I couldn't do better.

Andrew
Flash does not work well; read the comments above. Also, setting upload_max_filesize and post_max_size after the script has already started will have no effect.
Evert
A: 

Tried all of this... this is by far the best I have used yet...

http://www.uploadify.com/

Mike Curry
A: 

Take a look at jumploader.com

A good Java applet for uploading.

I've used it for uploading images and it works fine. I haven't tried files bigger than 10 MB, but it should work for really big files too.

Johan
A: 

Try JFileUpload. It supports HTTP and FTP, with PHP, JSP, or ASP.NET scripts to handle large uploads. It can split files into chunks and recompose them from the chunks on the server side.
http://www.jfileupload.com/products/jfileupload/index.html
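
The recompose step on the server can be as simple as appending the chunks in order. A rough PHP sketch of the idea (the field names below are illustrative assumptions, not JFileUpload's actual protocol):

<?php
// Sketch of recomposing chunks on the server. Assumes the client sends
// chunks sequentially and posts 'name', 'index' and 'total' fields along
// with each chunk; these names are illustrative, not JFileUpload's protocol.
$name  = basename($_POST['name']);     // never trust a client-supplied path
$index = (int) $_POST['index'];
$total = (int) $_POST['total'];
$dir   = '/var/uploads/' . md5($name); // working directory for this file's chunks

if (!is_dir($dir)) {
    mkdir($dir, 0700, true);
}

// Store this chunk under its index.
move_uploaded_file($_FILES['chunk']['tmp_name'], "$dir/$index");

// Once the last chunk has arrived, append them all in order into the final file.
if ($index === $total - 1) {
    $out = fopen("/var/uploads/$name", 'wb');
    for ($i = 0; $i < $total; $i++) {
        fwrite($out, file_get_contents("$dir/$i"));
        unlink("$dir/$i");
    }
    fclose($out);
    rmdir($dir);
}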

fileuploader