Hi,

I have a django app running on apache with fastcgi (uses Flup's WSGIServer).

This gets set up via dispatch.fcgi, shown below:

#!/usr/bin/python
import sys, os

sys.path.insert(0, os.path.realpath('/usr/local/django_src/django'))
PROJECT_PATH = os.environ['PROJECT_PATH']
sys.path.insert(0, PROJECT_PATH)
os.chdir(PROJECT_PATH)
os.environ['DJANGO_SETTINGS_MODULE'] = "settings"

from django.core.servers.fastcgi import runfastcgi
runfastcgi(method="threaded", daemonize='false')

The runfastcgi is the one that does the work, eventually running a WSGIServer on a WSGIHandler.

Sometimes an exception happens which crashes fastcgi.

EDIT: I don't know what error crashes fastcgi, or whether fastcgi even crashes. I just know that sometimes the site goes down, and stays down, until I restart Apache. The only errors that appear in error.log are the broken pipe and incomplete headers ones, listed below.

Incomplete headers:

note: I've replaced sensitive information or clutter with "..."

[Sat May 09 ...] [error] [client ...] (104)Connection reset by peer: FastCGI: comm with server ".../dispatch.fcgi" aborted: read failed
[Sat May 09 ...] [error] [client ...] FastCGI: incomplete headers (0 bytes) received from server ".../dispatch.fcgi"
[Sat May 09 ...] [error] [client ...] (32)Broken pipe: FastCGI: comm with server ".../dispatch.fcgi" aborted: write failed,

Broken pipe:

note: this happens to be for a trac site not a django app, but it looks the same.

Unhandled exception in thread started by <bound method Connection.run of <trac.web._fcgi.Connection object at 0xb53d7c0c>>
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/Trac-0.12dev_r7715-py2.4.egg/trac/web/_fcgi.py", line 654, in run
    self.process_input()
  File "/usr/lib/python2.4/site-packages/Trac-0.12dev_r7715-py2.4.egg/trac/web/_fcgi.py", line 690, in process_input
    self._do_params(rec)
  File "/usr/lib/python2.4/site-packages/Trac-0.12dev_r7715-py2.4.egg/trac/web/_fcgi.py", line 789, in _do_params
    self._start_request(req)
  File "/usr/lib/python2.4/site-packages/Trac-0.12dev_r7715-py2.4.egg/trac/web/_fcgi.py", line 773, in _start_request
    req.run()
  File "/usr/lib/python2.4/site-packages/Trac-0.12dev_r7715-py2.4.egg/trac/web/_fcgi.py", line 582, in run
    self._flush()
  File "/usr/lib/python2.4/site-packages/Trac-0.12dev_r7715-py2.4.egg/trac/web/_fcgi.py", line 589, in _flush
    self.stdout.close()
  File "/usr/lib/python2.4/site-packages/Trac-0.12dev_r7715-py2.4.egg/trac/web/_fcgi.py", line 348, in close
    self._conn.writeRecord(rec)
  File "/usr/lib/python2.4/site-packages/Trac-0.12dev_r7715-py2.4.egg/trac/web/_fcgi.py", line 705, in writeRecord
    rec.write(self._sock)
  File "/usr/lib/python2.4/site-packages/Trac-0.12dev_r7715-py2.4.egg/trac/web/_fcgi.py", line 542, in write
    self._sendall(sock, header)
  File "/usr/lib/python2.4/site-packages/Trac-0.12dev_r7715-py2.4.egg/trac/web/_fcgi.py", line 520, in _sendall
    sent = sock.send(data)
socket.error: (32, 'Broken pipe')

I've looked through /var/log/apache2/error.log, but I can't seem to find the cause of the crashing. I sometimes have memory swapping problems, but I think this is different. (Please excuse my ignorance. I am willing to learn how to implement and debug server admin stuff better.)

I'd like to wrap the runfastcgi call in a try/except. What is the best way to handle random exceptions (until I figure out the actual cause or causes)?

I believe the WSGIServer handles many requests. If I catch an exception, can I call runfastcgi again without fear of an infinite loop? Should I return an error HttpResponse for the offending request that raised the exception? I'm not even sure how to do that.

I've been looking through django/core/servers/fastcgi.py, django/core/handlers/wsgi.py and django/http/__init__.py.

I haven't been able to make progress understanding flup's side of things.

Have ideas or experiences I might learn from?

Thanks!

A: 

This is probably a Flup bug. When a flup-based server's client connection is closed before flup is done sending data, it raises a socket.error: (32, 'Broken pipe') exception.

Trying to catch the exception with a try/except around runfastcgi will not work, simply because the exception is raised in a different thread.

OK, I'll explain why wrapping your own code in a try/except won't work. If you look closely at the exception traceback, you'll see that the first statement in the trace is not runfastcgi. That's because the exception is happening in a different thread, and an exception raised in one thread never propagates to the thread that started it. If you want to catch the exception, you need to wrap one of the statements listed in the trace in a try/except, like this:

# In /usr/lib/python2.4/site-packages/Trac-0.12dev_r7715-py2.4.egg/trac/web/_fcgi.py, line 654, in run()
# (the module needs socket imported for the except clause to resolve)
try:
    self.process_input()
except socket.error:
    # ignore or log the error
    pass

The point is, you can catch the error by modifying Flup's code, but I don't see any benefit in doing so, especially because this exception seems to be harmless and there is already a patch for it.
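To make the threading point concrete, here is a minimal sketch that does not use Flup or Django at all. The names send_response, guarded_send_response and run_server_unprotected are hypothetical stand-ins for the request thread and for runfastcgi, and the syntax assumes a newer Python than the 2.4 in the traceback. It only illustrates that a try/except around the code that starts a thread does not see exceptions raised inside that thread:

```python
import errno
import socket
import threading

def send_response():
    # Hypothetical stand-in for the per-connection thread body:
    # it fails as if the client had already hung up.
    raise socket.error(errno.EPIPE, "Broken pipe")

def run_server_unprotected():
    # Hypothetical stand-in for runfastcgi(): work happens in another thread.
    worker = threading.Thread(target=send_response)
    worker.start()
    worker.join()

# Attempt 1: wrap the caller. The exception stays in the worker thread,
# so this except clause never runs (Python just prints a traceback).
caught_outside = False
try:
    run_server_unprotected()
except socket.error:
    caught_outside = True

# Attempt 2: wrap the code that runs inside the thread itself.
caught_inside = []

def guarded_send_response():
    try:
        send_response()
    except socket.error as exc:
        caught_inside.append(exc.errno)

worker = threading.Thread(target=guarded_send_response)
worker.start()
worker.join()
```

After this runs, caught_outside is still False, while caught_inside holds the EPIPE error number. That is why the try/except has to live in the code the thread executes, which here means inside Flup.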

Nadia Alramli
OK. The error occurs because the client closes the connection before flup is done. This occurs, however, when I simply go to my site's main page. It appears as though the server has crashed, or at least fastcgi has, yet these are the only errors I see. Any advice on how to debug this?
Have you tried applying the patch mentioned in the bug ticket? It might solve your issue. Is the error causing the request to fail, or is it just showing up in the logs with no notable side effects?
Nadia Alramli
I'm trying to figure out whether the broken pipes are a cause or a symptom. Hm... I guess I could apply the patch and see. Learning more about how to debug or understand the situation would be my preference, though. I don't want to chase risky red herrings.
OK, I added more details to the answer, hopefully that will make things clearer
Nadia Alramli
A: 

Broken pipe errors usually don't occur deterministically. You get a Broken pipe if a write operation on a pipe or socket fails because the other end has closed the connection. So if your FastCGI process gets a Broken pipe, it means that the web server has closed the connection too early. In some cases this is not a problem, and it can be ignored silently.
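A small sketch of how the error arises, assuming a Unix-like system and a newer Python than the 2.4 in the question. It uses socket.socketpair() in place of the real web server and FastCGI process; the names server_end and client_end are only for illustration:

```python
import errno
import socket

# Two connected sockets: one plays the FastCGI process, one the web server.
server_end, client_end = socket.socketpair()

# The "web server" side goes away before the response is written.
client_end.close()

try:
    server_end.sendall(b"Status: 200 OK\r\n\r\n")
    result = None
except socket.error as exc:
    # Writing to a socket whose peer has closed fails with EPIPE.
    result = exc.errno
finally:
    server_end.close()
```

On Linux, result ends up equal to errno.EPIPE, which is the same (32, 'Broken pipe') seen in the traceback.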

As a quick hack, try to catch and ignore the socket.error with Broken pipe. You may have to add an except: clause to many more places.
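A hedged sketch of what such a catch could look like. The helper call_ignoring_disconnect and the CLIENT_GONE tuple are hypothetical names, not part of Flup or Django, and treating "connection reset" the same as "broken pipe" is an assumption. The idea is to swallow only the disconnect errors and re-raise everything else, so that real problems are not hidden:

```python
import errno
import socket

# Error codes treated as "the other end went away" (an assumption).
CLIENT_GONE = (errno.EPIPE, errno.ECONNRESET)

def call_ignoring_disconnect(func, *args):
    """Call func(*args); return False if the peer disconnected, else True."""
    try:
        func(*args)
    except socket.error as exc:
        if exc.errno in CLIENT_GONE:
            return False  # harmless: nobody is listening any more
        raise  # anything else is a real error and should surface
    return True

# Hypothetical write functions used only to exercise the helper.
def write_ok():
    pass

def write_to_closed_client():
    raise socket.error(errno.EPIPE, "Broken pipe")

def write_with_other_error():
    raise socket.error(errno.ENOSPC, "No space left on device")
```

Such a wrapper only helps when it runs in the thread doing the write, so in practice it would have to go around the socket writes inside the FastCGI library rather than in dispatch.fcgi.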

pts
"Many more places" Thanks for the tip but I'm afraid I'm still learning. Where? I'd rather not patch django or flup. I'm happy to modify dispatch.fcgi and my own django app.