I've been tackling this for a while. I set up a completely new machine and installed a fresh copy of PostgreSQL and all my other dependencies. I still get database disconnections at random times: identical requests either work or fail, with no outwardly deterministic pattern. Watching the PostgreSQL logs, the server doesn't even record a connection. If the connection had never been made, I would expect the failure when establishing the connection and getting the cursor, but I get it later, when actually trying to use the connection. Given the traceback below, I would expect to see a connection made in the pg logs and then dropped for some reason later. I don't, so I wonder if there is some clue in that mismatch.

Traceback (most recent call last):
  File "/usr/local/lib/python2.6/dist-packages/Django-1.1-py2.6.egg/django/core/handlers/wsgi.py", line 242, in __call__
    response = self.get_response(request)
  File "/usr/local/lib/python2.6/dist-packages/Django-1.1-py2.6.egg/django/core/handlers/base.py", line 73, in get_response
    response = middleware_method(request)
  File "/usr/local/lib/python2.6/dist-packages/Django-1.1-py2.6.egg/django/middleware/locale.py", line 16, in process_request
    language = translation.get_language_from_request(request)
  File "/usr/local/lib/python2.6/dist-packages/Django-1.1-py2.6.egg/django/utils/translation/__init__.py", line 97, in get_language_from_request
    return real_get_language_from_request(request)
  File "/usr/local/lib/python2.6/dist-packages/Django-1.1-py2.6.egg/django/utils/translation/trans_real.py", line 349, in get_language_from_request
    lang_code = request.session.get('django_language', None)
  File "/usr/local/lib/python2.6/dist-packages/Django-1.1-py2.6.egg/django/contrib/sessions/backends/base.py", line 63, in get
    return self._session.get(key, default)
  File "/usr/local/lib/python2.6/dist-packages/Django-1.1-py2.6.egg/django/contrib/sessions/backends/base.py", line 172, in _get_session
    self._session_cache = self.load()
  File "/usr/local/lib/python2.6/dist-packages/Django-1.1-py2.6.egg/django/contrib/sessions/backends/db.py", line 16, in load
    expire_date__gt=datetime.datetime.now()
  File "/usr/local/lib/python2.6/dist-packages/Django-1.1-py2.6.egg/django/db/models/manager.py", line 120, in get
    return self.get_query_set().get(*args, **kwargs)
  File "/usr/local/lib/python2.6/dist-packages/Django-1.1-py2.6.egg/django/db/models/query.py", line 300, in get
    num = len(clone)
  File "/usr/local/lib/python2.6/dist-packages/Django-1.1-py2.6.egg/django/db/models/query.py", line 81, in __len__
    self._result_cache = list(self.iterator())
  File "/usr/local/lib/python2.6/dist-packages/Django-1.1-py2.6.egg/django/db/models/query.py", line 238, in iterator
    for row in self.query.results_iter():
  File "/usr/local/lib/python2.6/dist-packages/Django-1.1-py2.6.egg/django/db/models/sql/query.py", line 287, in results_iter
    for rows in self.execute_sql(MULTI):
  File "/usr/local/lib/python2.6/dist-packages/Django-1.1-py2.6.egg/django/db/models/sql/query.py", line 2369, in execute_sql
    cursor.execute(sql, params)
  File "/usr/local/lib/python2.6/dist-packages/Django-1.1-py2.6.egg/django/db/backends/util.py", line 19, in execute
    return self.cursor.execute(sql, params)
OperationalError: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
A: 

Do you fork() child processes (preforked FastCGI or something similar)? That might be the reason: a connection established in the parent process doesn't work reliably in a child. If you use the prefork method, it's easy to switch to threading to see whether the problem goes away. I saw exactly the same floating error in that situation.

Denis Otkidach
While I am using a preforked FastCGI backend, the connection is established per request, in the child processes. Also, if something like this were the case, I would expect the problem to be more predictable; in reality the requests usually succeed and the failure is outwardly nondeterministic.
ironfroggy
When a child process inherits the socket descriptor and sends data to it, any child (this one or another) can receive the response. That is what makes the error float. I suggest adding logging to ensure the connection is initialized after the fork. Because Django uses global variables extensively, an early connection can easily be hidden from view. Note that all of the code is imported before the fork.
Denis Otkidach
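One way to do the logging suggested above is to record the PID that opened the connection and compare it with the current PID each time a cursor is requested. The following is only a minimal sketch under assumptions: `ForkAwareConnection`, `inherited_across_fork`, and the logger name are hypothetical names invented for illustration, not part of Django or psycopg2, and the wrapper assumes nothing about the wrapped object beyond a `cursor()` method.

```python
import logging
import os

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("db.forkcheck")


class ForkAwareConnection(object):
    """Hypothetical wrapper that remembers which process opened a connection."""

    def __init__(self, raw_connection):
        # raw_connection is assumed to be any object with a cursor() method.
        self.raw = raw_connection
        self.owner_pid = os.getpid()
        logger.info("connection opened in pid %s", self.owner_pid)

    def inherited_across_fork(self):
        # True when the current process is not the one that opened the connection.
        return os.getpid() != self.owner_pid

    def cursor(self):
        if self.inherited_across_fork():
            logger.warning(
                "connection opened in pid %s is being used in pid %s",
                self.owner_pid,
                os.getpid(),
            )
        return self.raw.cursor()
```

Wrapping whatever object the database driver returns at connect time should be enough to try this. If the warning ever shows up in the logs, the connection was opened before the fork and is being shared between processes; if it never does, sharing across the fork is probably not the cause.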
I have already logged to determine this. The connection is only made at request time, in the child, in response to the start-request signal. The child processes are already established before being sent that request to trigger the connection.
ironfroggy
A: 

This is a very similar question to the one posted here:

http://stackoverflow.com/questions/393637/django-fastcgi-randomly-raising-operationalerror

I imagine the answer will be the same for both, if and when someone eventually figures it out. This same problem has been bothering me for about a month now and I have no idea what could be causing it.

pcardune
I finally got pointed to that earlier today and was planning to link to it from here myself. This is a real problem, and obviously a lot of us have run into it, but it can be very hard to go from the evidence to useful information. Thanks for the tip.
ironfroggy