I want to use Python's multiprocessing module to do concurrent processing without locks (locks, to me, are the opposite of multiprocessing), because I want to build multiple reports from different resources at the same time during a web request (the request normally takes about 3 seconds, but with multiprocessing I can do it in 0.5 seconds).

My problem is that, if I expose such a feature to the web and get 10 users pulling the same report at the same time, I suddenly have 60 interpreters open at the same time (which would crash the system). Is this just the common sense result of using multiprocessing, or is there a trick to get around this potential nightmare?
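For reference, the pattern I have in mind is roughly this (a simplified sketch; the real report builders each pull from a different resource):

from multiprocessing import Process, Queue

def build_report(name, results):
    # stand-in for a report that pulls data from one resource
    results.put((name, 'data for %s' % name))

def build_all_reports(names):
    results = Queue()
    procs = [Process(target=build_report, args=(name, results)) for name in names]
    for p in procs:
        p.start()
    reports = dict(results.get() for _ in procs)  # drain the queue before joining
    for p in procs:
        p.join()
    return reports

if __name__ == '__main__':
    print(build_all_reports(['sales', 'inventory', 'traffic']))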

Thanks

+1  A: 

Locks are only ever necessary if you have multiple agents writing to a shared resource. If they are only reading, locks are not needed (and, as you said, they defeat the purpose of multiprocessing).

Are you sure that would crash the system? On a web server using CGI, each request spawns a new process, so it's not unusual to see thousands of simultaneous processes (granted, in Python one should use WSGI and avoid this), and they do not crash the system.

I suggest you test your theory -- it shouldn't be difficult to manufacture 10 simultaneous accesses -- and see if your server really does crash.

Jared Forsyth
@Jared Thanks. I might have to give it a try. I will be using uWSGI.
orokusaki
+2  A: 

You are barking up the wrong tree if you are trying to use multiprocessing to add concurrency to a network app, and up a completely wrong tree if you're creating processes for each request. multiprocessing is not what you want (at least not as a concurrency model).

There's a good chance you want an asynchronous networking framework like Twisted.

Mike Graham
@Mike, I'm not creating a new process for each request. I'm creating a new process from within my program only if multiple images need resizing on a single view in my app. This has nothing to do with network stuff at all. My only concern with the internet is regarding multiple clients using the same exact function which does this process creation.
orokusaki
@orokusaki, You have a written-in-Python, CPU-bound image resizing routine you want to run with lots of communication between it and some code that launches it?
Mike Graham
@Mike Basically, when a user goes to `/spam/upload/` and uploads an image, I want to use PIL to resize the image to 6 different sizes and offload each one to S3. Each resize takes some time, so I thought it would be nice to open 6 processes and, in each one, do one resize and send the result to S3, so that the user's request only takes roughly the time of a single resize/S3 upload.
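Roughly what I have in mind (just a sketch; `upload_to_s3` is a placeholder for whatever S3 client call I end up using, and the sizes are examples):

from multiprocessing import Process
from PIL import Image

SIZES = [(1024, 768), (800, 600), (640, 480), (320, 240), (160, 120), (80, 60)]

def upload_to_s3(img, size):
    # placeholder: stand-in for the real S3 upload call
    pass

def resize_and_upload(path, size):
    img = Image.open(path)
    img.thumbnail(size)  # resize in place, preserving aspect ratio
    upload_to_s3(img, size)

def handle_upload(path):
    procs = [Process(target=resize_and_upload, args=(path, size)) for size in SIZES]
    for p in procs:
        p.start()
    for p in procs:
        p.join()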
orokusaki
+2  A: 

If you're really worried about having too many instances, you could protect the call with a Semaphore object. If I understand what you're doing, you can use the threading version:

from threading import Semaphore
sem = Semaphore(10)
with sem:
    make_multiprocessing_call()

I'm assuming that make_multiprocessing_call() will clean up after itself.

This way only 10 "extra" instances of Python will ever be open; if another request comes along, it will just have to wait until one of the earlier ones has finished. Unfortunately the waiting requests won't be released in "queue" order ... or in any particular order.

Hope that helps

JudoWill
@JudoWill Thanks. So, if I understand correctly, `Semaphore` will protect the system from ever having more than 10 threads open from that one area of code (i.e., it will not block more interpreters from starting elsewhere if I use threading in 5 places, etc.)?
orokusaki
@orokusaki Actually, in your case you'll need to find a way to pass around the "same" `Semaphore` object between all of your functions. If you define it once "globally" (at module level), it will be the same instance through all of your calls (or at least until your main process restarts itself).
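Something like this, for example (a sketch; the module and function names are just illustrative):

# limits.py -- define the semaphore once at module level; every caller that
# imports this module shares the same instance
from threading import Semaphore

resize_sem = Semaphore(10)

# views.py (illustrative)
from limits import resize_sem

def upload_view(request):
    with resize_sem:  # at most 10 multiprocessing calls run at once
        make_multiprocessing_call()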
JudoWill