views: 829 · answers: 1
I have a Python CGI script that receives files uploaded via an HTTP POST. The files can be large (300+ MB). The trouble is that cgi.FieldStorage() is incredibly slow at retrieving the file: a 300 MB file took 6 minutes to be "received", while doing the same by just reading stdin took around 15 seconds. The problem with the latter is that I would have to parse the data myself when multiple fields are posted.
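
For context, a hedged sketch of the two approaches being compared (Python 2 shown to match the era of cgi.FieldStorage; assumes a standard CGI environment where the web server sets CONTENT_LENGTH):

    # Sketch of the two paths described above, not a benchmark harness.
    import cgi
    import os
    import sys
    import time

    def via_fieldstorage():
        # Slow path from the question: FieldStorage parses the whole
        # multipart body itself while reading it.
        start = time.time()
        form = cgi.FieldStorage()
        sys.stderr.write('FieldStorage: %.1fs\n' % (time.time() - start))
        return form

    def via_raw_stdin():
        # Fast path: copy the raw body straight off stdin. Quick, but
        # the multipart fields are still unparsed afterwards.
        start = time.time()
        length = int(os.environ.get('CONTENT_LENGTH', 0))
        body = sys.stdin.read(length)
        sys.stderr.write('raw stdin read: %.1fs\n' % (time.time() - start))
        return body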

Are there any faster alternatives to FieldStorage()?

+2  A: 

"[I] would have to parse the data myself"

Why? The cgi module has a parser you can call explicitly.

Read the uploaded stream and save it in a local disk file.

For blazing speed, use a StringIO in-memory file. Just be aware of the amount of memory the upload will take.

Use cgi.parse(mylocalfile).
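
A minimal sketch of that recipe, assuming a standard CGI environment (Python 2 shown, matching the question's era; the helper name spool_post_body and the 64 KB chunk size are illustrative):

    # Spool stdin to a local disk file in fixed-size chunks, then let
    # cgi.parse() split the fields. On Python 3, read from
    # sys.stdin.buffer instead of sys.stdin.
    import cgi
    import os
    import sys
    import tempfile

    CHUNK = 64 * 1024  # modest buffer: peak memory stays bounded

    def spool_post_body():
        remaining = int(os.environ.get('CONTENT_LENGTH', 0))
        # Swap in StringIO.StringIO() here for the in-memory variant.
        spool = tempfile.TemporaryFile()
        while remaining > 0:
            chunk = sys.stdin.read(min(CHUNK, remaining))
            if not chunk:
                break  # client disconnected early
            spool.write(chunk)
            remaining -= len(chunk)
        spool.seek(0)
        return spool

    # cgi.parse() consults REQUEST_METHOD/CONTENT_TYPE in the environment
    # and dispatches multipart/form-data bodies to cgi.parse_multipart().
    fields = cgi.parse(spool_post_body())

Because the body is copied in bounded chunks, peak memory stays near CHUNK bytes regardless of upload size; substituting an in-memory file trades that bound for speed, as noted above.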

S.Lott
The cgi docs say of cgi.parse_multipart: "This is easy to use but not much good if you are expecting megabytes to be uploaded — in that case, use the FieldStorage class instead which is much more flexible."
Ash
Typically, flexibility comes at a cost. And that cost is usually speed.
S.Lott
Have you tried it?
Seun Osewa
I don't think putting the upload into an in-memory file is a good idea; on a server this could kill the machine quickly. If multiple 300 MB uploads occur at the same time, you very quickly have 1 GB of upload data in memory. I also assume the trouble is the parser: most of the time, web inputs are just a few hundred bytes or less, so the parser isn't optimized for large payloads. The best would be a parser that is optimized to let the input stream through.
Juergen
@Juergen: "Read the uploaded stream and saving it in a local disk file" should avoid the memory consumption from a big upload. You can limit the buffer size easily.
S.Lott