views: 783
answers: 2

I have Django running through mod_wsgi like this:

<VirtualHost *:80>
    WSGIScriptAlias / /home/ptarjan/django/django.wsgi
    WSGIDaemonProcess ptarjan processes=2 threads=15 display-name=%{GROUP}
    WSGIProcessGroup ptarjan
    Alias /media /home/ptarjan/django/mysite/media/
</VirtualHost>

But if in Python I do:

import urllib2

def handler(request):
    data = urllib2.urlopen("http://example.com/really/unresponsive/url").read()

the whole Apache server hangs and becomes unresponsive, with this backtrace:

#0  0x00007ffe3602a570 in __read_nocancel () from /lib/libpthread.so.0
#1  0x00007ffe36251d1c in apr_file_read () from /usr/lib/libapr-1.so.0
#2  0x00007ffe364778b5 in ?? () from /usr/lib/libaprutil-1.so.0
#3  0x0000000000440ec2 in ?? ()
#4  0x00000000004412ae in ap_scan_script_header_err_core ()
#5  0x00007ffe2a2fe512 in ?? () from /usr/lib/apache2/modules/mod_wsgi.so
#6  0x00007ffe2a2f9bdd in ?? () from /usr/lib/apache2/modules/mod_wsgi.so
#7  0x000000000043b623 in ap_run_handler ()
#8  0x000000000043eb4f in ap_invoke_handler ()
#9  0x000000000044bbd8 in ap_process_request ()
#10 0x0000000000448cd8 in ?? ()
#11 0x0000000000442a13 in ap_run_process_connection ()
#12 0x000000000045017d in ?? ()
#13 0x00000000004504d4 in ?? ()
#14 0x00000000004510f6 in ap_mpm_run ()
#15 0x0000000000428425 in main ()

on Debian Apache 2.2.11-7.

Similarly, can we be protected against:

def handler(request):
    while True:
        pass

In PHP, I would set time and memory limits.
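On POSIX systems, Python's standard library can impose comparable process-level limits via the resource module. This is only a sketch of the idea (the helper names and numbers here are made up for illustration; mod_wsgi does not do this for you):

```python
import resource

# Cap CPU time for the current process, roughly analogous to PHP's
# max_execution_time. The kernel sends SIGXCPU at the soft limit and
# SIGKILL at the hard limit, so a busy loop cannot spin forever.
def limit_cpu_seconds(soft, hard=None):
    if hard is None:
        hard = soft + 5  # small grace period before SIGKILL
    resource.setrlimit(resource.RLIMIT_CPU, (soft, hard))

# An address-space cap is the closest analogue of PHP's memory_limit.
def limit_memory_bytes(nbytes):
    resource.setrlimit(resource.RLIMIT_AS, (nbytes, nbytes))
```

Note that a CPU limit only catches busy loops like `while True: pass`; a thread blocked on a socket read burns no CPU, so a hung HTTP fetch needs an I/O timeout instead.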

+3  A: 

If I understand the question correctly, you want to protect Apache from locking up when running arbitrary scripts from other people. Well, if you're running untrusted code, I think you have worse things to worry about than Apache.

That said, you can use some configuration directives to set up a safer environment. These two are very useful:

  • WSGIApplicationGroup - Sets which application group a WSGI application belongs to. It lets you separate settings for each user: all WSGI applications within the same application group execute within the context of the same Python sub-interpreter of the process handling the request.

  • WSGIDaemonProcess - Configures a distinct daemon process for running applications. The daemon processes can be run as a different user than the one the Apache child processes normally run as. This directive accepts a lot of useful options; I'll list some of them:

    • user=name | user=#uid, group=name | group=#gid:

      Defines the UNIX user and group, by name (name) or numeric id (uid/gid), that the daemon processes should run as.

    • stack-size=nnn

      The amount of virtual memory, in bytes, allocated for the stack of each thread created by mod_wsgi in a daemon process.

    • deadlock-timeout=sss

      Defines the maximum number of seconds allowed to pass, after a potential deadlock on the Python GIL has been detected, before the daemon process is shut down and restarted. The default is 300 seconds.

You can read more about these directives in the mod_wsgi configuration documentation.
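As a sketch of how these options combine with the configuration in the question (the user/group names and numeric values here are illustrative, not required settings):

```apache
# Hypothetical example values; tune for your own site.
WSGIDaemonProcess ptarjan processes=2 threads=15 \
    user=ptarjan group=ptarjan \
    stack-size=524288 \
    deadlock-timeout=60
WSGIProcessGroup ptarjan
```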

nosklo
So, with the 300-second timeout, does that mean Apache should have killed my Python daemon and restarted it? Because the whole Apache server was locked up and unusable. Do I need more threads? Fewer? More processes?
Paul Tarjan
+7  A: 

It is not 'deadlock-timeout' you want, as suggested in the other answer; that option has a very specific purpose which will not help in this case.

As far as mod_wsgi features go, what you instead want is the 'inactivity-timeout' option of the WSGIDaemonProcess directive.

Even then, this is not a complete solution. The 'inactivity-timeout' option specifically detects whether all request processing by a daemon process has ceased; it is not a per-request timeout. It only equates to a per-request timeout if the daemon processes are single-threaded. As well as helping to unstick a process, the option has the side effect of restarting the daemon process if no requests arrive at all within that time.
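A hedged sketch of that configuration: single-threaded daemon processes, so the inactivity timeout approximates a per-request timeout (the process and timeout numbers are illustrative):

```apache
WSGIDaemonProcess ptarjan processes=5 threads=1 inactivity-timeout=30
WSGIProcessGroup ptarjan
```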

In short, there is no way at the mod_wsgi level to have per-request timeouts, because there is no real way of interrupting a request, or a thread, in Python.

What you really need to do is implement a timeout on the HTTP request in your application code. I am not sure how mature it is or whether it is available already, but do a Google search for 'urllib2 socket timeout'.
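One way that search leads is the timeout parameter of urlopen() (available from Python 2.6). A minimal sketch, assuming you just want to treat a slow upstream as a failure; the helper name and the 5-second value are illustrative, not from the question:

```python
import socket

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen         # Python 2, as used in the question

# Bound how long the fetch may block, instead of hanging the
# worker thread forever.
def fetch_with_timeout(url, timeout=5):
    try:
        return urlopen(url, timeout=timeout).read()
    except socket.timeout:
        return None  # treat a slow upstream as "no data"
```

In a Django view you would call a helper like this instead of a bare urlopen(), so a dead upstream costs at most `timeout` seconds per request rather than wedging a daemon thread indefinitely. (Before Python 2.6, socket.setdefaulttimeout() was the usual workaround.)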

Graham Dumpleton
So there is no way for Python to run robustly? Yikes... Sounds like I should file a mod_wsgi feature request.
Paul Tarjan
There is no point filing a mod_wsgi feature request. The inability to interrupt a thread handling a request is a Python limitation.
Graham Dumpleton
Doesn't http://docs.python.org/library/thread.html#thread.interrupt_main give the ability to interrupt a thread? Or am I missing something?
Paul Tarjan
In Apache/mod_wsgi, threads are created by Apache and not by Python, and requests don't usually run on the main thread, so the function for interrupting the main thread is of no use.
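A standalone sketch of that limitation (nothing mod_wsgi-specific here; the module was named `thread` in Python 2): interrupt_main() only raises KeyboardInterrupt in the main thread, so it can never stop work running on an Apache-created worker thread.

```python
import _thread     # named "thread" in Python 2
import threading
import time

def watchdog():
    time.sleep(0.1)
    _thread.interrupt_main()   # requests an interrupt of the MAIN thread only

threading.Thread(target=watchdog).start()

interrupted = False
try:
    # Stand-in for a long-running request. If this loop were running on a
    # worker thread instead, interrupt_main() could not touch it.
    for _ in range(100):
        time.sleep(0.05)
except KeyboardInterrupt:
    interrupted = True

print(interrupted)  # True: only the main thread receives the interrupt
```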
Graham Dumpleton