views:

45

answers:

1

I have a Python application that, to be brief, receives data from a remote server, processes it, responds to the server, and occasionally saves the processed data to disk. The problem I've encountered is that there is a lot of data to write, and the save process can take upwards of half a minute. This is apparently a blocking operation, so the network IO is stalled during this time. I'd like to be able to make the save operation take place in the background, so-to-speak, so that the application can continue to communicate with the server reasonably quickly.

I know that I probably need some kind of threading module to accomplish this, but I can't tell what the differences are between thread, threading, multiprocessing, and the various other options. Does anybody know what I'm looking for?

+2  A: 

Since you're I/O bound, then use the threading module.

You should almost never need to use thread, it's a low-level interface; the threading module is a high-level interface wrapper for thread.

The multiprocessing module is different from the threading module, multiprocessing uses multiple subprocesses to execute a task; multiprocessing just happens to use the same interface as threading to reduce learning curve. multiprocessing is typically used when you have CPU bound calculation, and need to avoid the GIL (Global Interpreter Lock) in a multicore CPU.

A somewhat more esoteric alternative to multi-threading is asynchronous I/O using asyncore module. Another options includes Stackless Python and Twisted.

Lie Ryan