Is there a better way to serve the results of an expensive, blocking python process over HTTP?

views:

227

+5 Q:

We have a web service which serves small, arbitrary segments of a fixed inventory of larger MP3 files. The MP3 files are generated on-the-fly by a python application. The model is, make a GET request to a URL specifying which segments you want, get an audio/mpeg stream in response. This is an expensive process.

We're using Nginx as the front-end request handler. Nginx takes care of caching responses for common requests.

We initially tried using Tornado on the back-end to handle requests from Nginx. As you would expect, the blocking MP3 operation kept Tornado from doing its thing (asynchronous I/O). So, we went multithreaded, which solved the blocking problem, and performed quite well. However, it introduced a subtle race condition (under real world load) that we haven't been able to diagnose or reproduce yet. The race condition corrupts our MP3 output.

So we decided to set our application up as a simple WSGI handler behind Apache/mod_wsgi (still with Nginx up front). This eliminates the blocking issue and the race condition, but creates a cascading load (i.e. Apache creates too many processes) on the server under real world conditions. We're working on tuning Apache/mod_wsgi right now, but are still at a trial-and-error phase. (Update: we've switched back to Tornado. See below.)

Finally, the question: are we missing anything? Is there a better way to serve CPU-expensive resources over HTTP?

Update: Thanks to Graham's informed article, I'm pretty sure this is an Apache tuning problem. In the meantime, we've gone back to using Tornado and are trying to resolve the data-corruption issue.

For those who were so quick to throw more iron at the problem, Tornado and a bit of multi-threading (despite the data integrity problem introduced by threading) handles the load acceptably on a small (single core) Amazon EC2 instance.

+1 A:

You might consider a queuing system with AJAX notification methods.

Whenever there is a request for your expensive resource, and that resource needs to be generated, add that request to the queue (if it's not already there). That queuing operation should return an ID of an object that you can query to get its status.

Next you have to write a background service that spins up worker threads. These workers simply dequeue the request, generate the data, then save the data's location in the request object.

The webpage can make AJAX calls to your server to find out the progress of the generation and to give a link to the file once it's available.
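The queue, worker, and status-lookup steps above can be sketched roughly as follows. This is only an in-memory illustration, not a production design: the `JobQueue` class, its method names, and the `generate` callable (standing in for the expensive MP3 generation) are all hypothetical, and a real deployment would more likely use a persistent queue shared between processes.

```python
import queue
import threading
import uuid


class JobQueue:
    """Hypothetical in-memory job queue with de-duplication and status lookup."""

    def __init__(self, generate, workers=2):
        self._generate = generate        # the expensive, blocking function (assumed)
        self._pending = queue.Queue()    # requests waiting for a worker
        self._jobs = {}                  # job id -> {"status": ..., "location": ...}
        self._by_key = {}                # request key -> job id, for de-duplication
        self._lock = threading.Lock()
        for _ in range(workers):
            threading.Thread(target=self._work, daemon=True).start()

    def submit(self, key):
        """Enqueue a request unless it is already known; return its job id."""
        with self._lock:
            if key in self._by_key:
                return self._by_key[key]
            job_id = uuid.uuid4().hex
            self._by_key[key] = job_id
            self._jobs[job_id] = {"status": "queued", "location": None}
        self._pending.put((job_id, key))
        return job_id

    def status(self, job_id):
        """Return a copy of the job record; this is what a polling client would see."""
        with self._lock:
            return dict(self._jobs[job_id])

    def wait(self):
        """Block until every submitted job has been processed."""
        self._pending.join()

    def _work(self):
        # Worker loop: dequeue a request, generate the data, record its location.
        while True:
            job_id, key = self._pending.get()
            with self._lock:
                self._jobs[job_id]["status"] = "running"
            try:
                result = {"status": "done", "location": self._generate(key)}
            except Exception:
                result = {"status": "failed", "location": None}
            with self._lock:
                self._jobs[job_id] = result
            self._pending.task_done()
```

A request handler would call `submit()` and return the ID right away, while a separate status endpoint would call `status()` for the polling client. Note that the workers here are threads, so a CPU-bound `generate` would still compete for the interpreter; the sketch only shows the bookkeeping.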

This is how LARGE media sites work - those that have to deal with video in particular. It might be overkill for your MP3 work however.

Alternatively, look into running a couple of machines to distribute the load. Your threads on Apache will still block, but at least you won't consume resources on the web server.

Frank Krueger 2009-12-18 17:46:05

We're serving small, arbitrary segments of a fixed inventory of larger MP3 files. The model is, make a GET request to a URL specifying which segments you want, get an `audio/mpeg` stream in response, so AJAX won't work. :)

David Eyk 2009-12-18 17:58:27

Well, then we're back to my second point: distribute the load. Linux machines are cheap.

Frank Krueger 2009-12-18 18:02:21

It looks like you are doing things right, just lacking CPU power: can you determine what the CPU load is while generating these MP3s?

I think the next thing you have to do is add more hardware to render the MP3s on other machines. Either that, or find a way to deliver pre-rendered MP3s (maybe you can cache some of your media?)

BTW, scaling for the web was the theme of a keynote lecture by Jacob Kaplan-Moss at PyCon Brasil this year, and it is far from being a closed problem. The stack of technologies one needs to handle is quite impressive. (I could not find an online copy of the presentation, though; sorry for that.)

jsbueno 2009-12-18 17:50:19

We definitely didn't need more hardware when we were serving from Tornado. The problem may simply be tuning Apache.

David Eyk 2009-12-18 18:01:03

+2 A:

Have you tried Spawning? It is a WSGI server with a flexible assortment of threading modes.

joeforker 2009-12-18 18:18:12

Interesting. I'll have to look into it.

David Eyk 2009-12-21 16:53:42

+1 A:

Are you making the mistake of using embedded mode of Apache/mod_wsgi? Read:

http://blog.dscpl.com.au/2009/03/load-spikes-and-excessive-memory-usage.html

Ensure you use daemon mode if using Apache/mod_wsgi.

Graham Dumpleton 2009-12-20 04:51:15

Excellent article. Thanks. This might end up doing the trick.

David Eyk 2009-12-21 16:41:57

In fact, I'm going to accept this link as the best answer, as the problem does appear to be an Apache tuning problem.

David Eyk 2009-12-21 16:45:54

+1 A:

Please define "cascading load", as it has no common meaning.

Your most likely problem is going to be if you're running too many Apache processes.

For a load like this, make sure you're using the prefork MPM, and make sure you're limiting yourself to an appropriate number of processes (no less than one per CPU, no more than two).
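As a rough illustration of that rule of thumb, assuming "no more than two" means two processes per CPU, the bounds can be computed like this. The helper name `process_bounds` is hypothetical; the resulting numbers would still have to be entered into the Apache configuration by hand.

```python
import os


def process_bounds(cpu_count=None):
    """Return (minimum, maximum) process counts: one to two per CPU (assumed reading)."""
    # Fall back to the machine's CPU count, or 1 if it cannot be determined.
    cpus = cpu_count or os.cpu_count() or 1
    return cpus, 2 * cpus
```

On the single-core instance mentioned in the question, `process_bounds(1)` gives a range of one to two processes.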

Nicholas Knight 2009-12-20 22:48:36

By "cascading load" I mean Apache was spawning processes willy-nilly. Sorry for the obfuscation. Graham's link seems to explain the situation pretty well.

David Eyk 2009-12-21 16:44:13
