This is related to this question: http://stackoverflow.com/questions/926579/configure-apache-to-recover-from-modpython-errors, although I've since stopped assuming that the problem has anything to do with mod_python. Essentially, I have a problem that I wasn't able to reproduce consistently, and I would like feedback on whether the proposed explanation seems likely, along with some ways to try to reproduce the problem.

The setup: a Django-powered site would begin throwing errors after a few days of use. They were always ImportErrors or ImproperlyConfigured errors, which amount to the same thing, since the message always specified trouble loading some module referenced in the settings.py file. It was not generally the same class each time. I am using preforked Apache with 8 forked children, and whenever this problem came up, one process would be broken and seven would be fine. Once broken, that process would display the same trace (with Debug On in the Apache conf) for every request it served, even if the failed load was not relevant to the particular request. An httpd restart always made the problem go away in the short run.

Noted problems: installation and updates are performed via svn with some post-update scripts. A few .pyc files were accidentally checked into the repository. Additionally, the project itself was owned by one user (not apache, although apache had permissions on the project), and there was a persistent plugin that ended up getting backgrounded as root. I call these noted problems because they would be wrong whether or not I had noticed this error, and hence I have fixed them. The project is now owned by apache and the plugin is backgrounded as apache. All .pyc files are out of the repository, and they are all force-recompiled after each checkout while the server and plugin are stopped.
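A post-update cleanup along these lines could be sketched in Python 3 as below. This is only an illustration under assumptions: the helper names `find_orphaned_pyc` and `rebuild_bytecode` are made up for the example, and the real post-update scripts are not shown in the question. It deletes any .pyc file that has no matching .py source (the stale kind that a checkout can leave behind) and then force-recompiles the tree with the standard `compileall` module.

```python
import compileall
import os


def find_orphaned_pyc(root):
    """Return paths of .pyc files under root that have no matching .py source."""
    orphans = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".pyc"):
                continue
            if os.path.basename(dirpath) == "__pycache__":
                # Python 3 layout: pkg/__pycache__/mod.cpython-XY.pyc -> pkg/mod.py
                source_dir = os.path.dirname(dirpath)
                module = name.split(".")[0]
            else:
                # Legacy layout: pkg/mod.pyc sits next to pkg/mod.py
                source_dir = dirpath
                module = name[:-len(".pyc")]
            if not os.path.exists(os.path.join(source_dir, module + ".py")):
                orphans.append(os.path.join(dirpath, name))
    return orphans


def rebuild_bytecode(root):
    """Delete orphaned .pyc files, then force-recompile every .py file under root."""
    for path in find_orphaned_pyc(root):
        os.remove(path)
    # force=True recompiles even when timestamps look current; quiet=1 prints errors only.
    return bool(compileall.compile_dir(root, force=True, quiet=1))
```

As in the question, something like this would be run after each checkout while the server and plugin are stopped, and as the same user that Apache runs as, so the resulting files end up with consistent ownership.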

What I want to know is

  1. Do these configuration disasters seem like a likely explanation for sporadic ImportErrors?
  2. If there is still a problem somewhere else in my code, how would I best reproduce it?

As for 2, my approach thus far has been to write some stress tests that repeatedly request the same page so as to execute common code paths.

Incidentally, the site has been running without incident for about 2 days since the fix, although the problem used to appear at intervals of anywhere from 1 to 10 days.

+2  A: 

"Do these configuration disasters seem like a likely explanation for sporadic ImportErrors?"

Yes. An old .pyc file is a disaster of the first magnitude.

We develop on Windows, but run production on Red Hat Linux. An accidentally moved .pyc file is an absolute mystery to debug because (1) it usually runs and (2) it has a Windows filename for the original source, making the traceback error absolutely senseless. I spent hours staring at logs -- on Linux -- wondering why the file was "C:\This\N\That".

"If there is still a problem somewhere else in my code, how would I best reproduce it?"

Before reproducing errors, you should try to prevent them.

First, create unit tests to exercise everything.

Start with Django's tests.py testing. Then expand to unittest for all non-Django components. Then write yourself a "run_tests" script that runs every test you own. Run this periodically. Daily isn't often enough.

Second, be sure you're using logging. Heavily.
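Since the question describes one broken child out of eight preforked processes, it may help to put the process id in every log line. The Python 3 sketch below shows one possible setup; the function name `configure_logging` and the exact format string are assumptions for the example, while `%(process)d` is a standard `logging` record attribute.

```python
import logging


def configure_logging(logfile=None, level=logging.INFO):
    """Attach a handler to the root logger whose format includes the process id."""
    if logfile:
        handler = logging.FileHandler(logfile)
    else:
        handler = logging.StreamHandler()  # defaults to stderr
    handler.setFormatter(logging.Formatter(
        "%(asctime)s pid=%(process)d %(name)s %(levelname)s %(message)s"
    ))
    root = logging.getLogger()
    root.addHandler(handler)
    root.setLevel(level)
    return handler
```

With the pid in each line, the log entries from a misbehaving child can be separated from those of the seven healthy ones.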

Third, wrap anything that uses external resources in generic exception-logging blocks like this.

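import logging

logger = logging.getLogger(__name__)

try:
    some_external_resource_processing()
except Exception:
    # Log the message plus the full traceback, then re-raise for the caller.
    logger.exception("external resource processing failed")
    raise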

This will help you pinpoint problems with external resources. Files and databases are often the source of bad behavior due to permission or access problems.

At this point, you have prevented a large number of errors. If you want to run cyclic load testing, that's not a bad idea either. Use unittest for this.

import unittest
import urllib.request

class SomeLoadtest(unittest.TestCase):
    def test_something(self):
        # urlopen needs a full URL, including the scheme.
        with urllib.request.urlopen("http://localhost:8000/some/path") as connection:
            results = connection.read()

This isn't the best way to do things, but it shows one approach. You might want to start using Selenium to test the web site "from the outside" as a complement to your unittests.
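For the cyclic part, one request is not enough when only one of several preforked children is broken, because only the requests that happen to land on that child will fail. A rough Python 3 sketch of a repeated-request loop follows; `fetch_status` and `hammer` are hypothetical names, and the URL and request count are placeholders, not values from the question.

```python
import urllib.error
import urllib.request
from collections import Counter


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or the exception class name on a transport error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as HTTPError; the status code is what matters here.
        return err.code
    except OSError as err:
        # Covers URLError, refused connections, and timeouts.
        return type(err).__name__


def hammer(url, count=200, fetch=fetch_status):
    """Request url count times and tally the outcomes."""
    return Counter(fetch(url) for _ in range(count))
```

Calling `hammer("http://localhost:8000/some/path")` returns a tally keyed by status code, so a mix of 200s with an occasional 500 would point to a single bad process rather than a site-wide failure. The `fetch` parameter exists so the loop can be exercised without a running server.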

S.Lott
Thanks! Good to know that it's a common problem. Also, thanks for the last chunk of sample code. I had started doing something similar with django's test client, but realized it's not as broad a test if I'm bypassing the transport layer.
David Berger