Hello all,

The task is simple: on the server side (Python), accept an HTTP POST that contains an uploaded file and additional form parameters.

I am trying to implement an upload progress indicator, so I need to be able to read the file content chunk by chunk.

All the methods I found are based on cgi.FieldStorage, which somehow only lets me obtain the file in its entirety (in memory, which is a disaster in itself). Some advise overriding the FieldStorage.make_file() method, which seems to break the cgi implementation (weird...).

I am currently able to read the entire WSGI input, chunk by chunk, to the filesystem, resulting in the following data:

-----------------------------9514143097616
Content-Disposition: form-data; name="myfile"; filename="inbound_marketing_cartoon_ebook.pdf"
Content-Type: application/pdf

... 1.5 MB of PDF data

-----------------------------9514143097616
Content-Disposition: form-data; name="tid"

194
-----------------------------9514143097616--

Does anyone know if there are any Python libraries that could reliably parse this thing? Or should I do this manually? (Python 2.5 that is)

Thanks.

+1  A: 

It seems counter-intuitive (and I feel that the module is poorly named), but the email module will likely do what you want. I've never used it myself, but a coworker has in an e-mail processing system; since these request bodies are MIME multipart messages laid out like RFC 2822 mail, email will probably parse them.

The documentation for email is quite thorough, at first glance.

My gut feeling is that you're likely to end up with the whole file in memory, however, which is something you objected to.
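For what it's worth, here is a minimal sketch of how that could look. It is written for current Python 3 (on Python 2.5 the equivalent entry point would be email.message_from_string), and the helper name parse_multipart is made up for illustration. It assumes you still have the request's Content-Type header, since that is where the boundary lives, and it does hold the whole body in memory.

```python
import email


def parse_multipart(body, content_type):
    """Parse a multipart/form-data body (bytes) with the email package.

    `content_type` is the request's Content-Type header value, which
    carries the boundary. Hypothetical helper, not part of any library.
    Returns {field name: (filename or None, payload bytes)}.
    """
    # The email parser expects headers first, so prepend the one we need.
    header = ("Content-Type: %s\r\n\r\n" % content_type).encode("ascii")
    msg = email.message_from_bytes(header + body)
    if not msg.is_multipart():
        raise ValueError("body did not parse as a multipart message")

    fields = {}
    for part in msg.get_payload():
        # The field name is a parameter of the Content-Disposition header.
        name = part.get_param("name", header="content-disposition")
        fields[name] = (part.get_filename(), part.get_payload(decode=True))
    return fields
```

With the data from the question you would call it as parse_multipart(raw_bytes, environ["CONTENT_TYPE"]) and look up fields["tid"] and fields["myfile"]. I have not checked how well it copes with large binary uploads, so treat it as a starting point only.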

Jed Smith
+2  A: 

As you suggested, I would (and have done before) override the make_file method of a FieldStorage object. Just return an object which has a write method that both accepts the data (into a file or memory or what-have-you) and tracks how much has been received for your progress indicator.

Doing it this way you also get access to the length of the file (as supplied by the client), file name, and the key that it is posted under.
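Roughly, the object returned from make_file could look like the sketch below. The class name ProgressFile and the callback argument are my own inventions for illustration, and the FieldStorage wiring is only shown in a comment because the cgi module was removed in Python 3.13; the sketch assumes FieldStorage only needs the usual file methods (write, seek, read and so on) on whatever make_file returns.

```python
import tempfile


class ProgressFile(object):
    """File-like object that counts the bytes written through it.

    Hypothetical helper: data goes to a temporary file on disk, and an
    optional callback is told the running total after every write.
    """

    def __init__(self, callback=None):
        self._file = tempfile.TemporaryFile("w+b")
        self._callback = callback
        self.bytes_received = 0

    def write(self, data):
        self._file.write(data)
        self.bytes_received += len(data)
        if self._callback is not None:
            self._callback(self.bytes_received)

    def __getattr__(self, name):
        # Anything else (seek, read, readline, close, ...) is passed
        # straight through to the underlying temporary file.
        return getattr(self._file, name)


# Assumed wiring, where the cgi module is still available:
#
#     class ProgressFieldStorage(cgi.FieldStorage):
#         def make_file(self, binary=None):
#             return ProgressFile(callback=report_progress)
#
# report_progress is a placeholder for whatever stores the byte count
# where your progress requests can read it.
```

The callback is where you would stash the running total (in a session, cache, or database row) so a separate request can report it to the browser.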

Why does this seem to break down the CGI implementation for you?

Another option is to do the progress tracking in the browser with a Flash uploader (YUI Uploader and SWFUpload come to mind) and skip tracking it on the server entirely. Then you don't need a series of AJAX requests to fetch the progress.

Mike Boers
A: 

You might want to take a look at what Django has done. They have a really nice implementation of custom file upload handlers, which allows you to subclass them to enable things like progress bars etc. See the documentation and the relevant code - even if you don't want to use Django, it's bound to give you some ideas.

Daniel Roseman