hello again!

I have some code that causes syncdb to throw an error (because it tries to access the model before the tables are created).

Is there a way to keep the code from running on syncdb? something like:

if not syncdb:
    run_some_code()

Thanks :)

edit: PS - I thought about using the post_init signal... for the code that accesses the db, is that a good idea?

More info

Here is some more info as requested :)

I've run into this a couple of times. For instance, I was hacking on django-cron and found it necessary to make sure there are no existing jobs when you load django (because it searches all the installed apps for jobs and adds them on load anyway).

So I added the following code to the top of the __init__.py file:

import sqlite3

# Assumed import: the original snippet used `models` without showing where it came from
from django_cron import models

try:
    # Delete all the old jobs from the database so they don't
    # interfere with this instance of django
    oldJobs = models.Job.objects.all()
    for oldJob in oldJobs:
        oldJob.delete()
except sqlite3.OperationalError:
    # When you do syncdb for the first time, the table isn't
    # there yet and throws a nasty error... until now
    pass

For obvious reasons this is crap. It's tied to sqlite and I'm sure there are better places to put this code (this is just how I happened upon the issue), but it works.

As you can see, the error you get is OperationalError (in sqlite), and the stack trace says something along the lines of "table django_cron_job not found".

Solution

In the end, the goal was to run some code before any pages were loaded.

This can be accomplished by executing it in the urls.py file, since it has to be imported before a page can be served (obviously).

And I was able to remove that ugly try/except block :) Thank god (and S. Lott)

+2  A: 

Code that tries to access the models before they're created can pretty much exist only at the module level; it would have to be executable code run when the module is imported, as your example indicates. This is, as you've guessed, the reason why syncdb fails. It tries to import the module, but the act of importing the module causes application-level code to execute; a "side-effect" if you will.

The desire to avoid module imports that cause side-effects is so strong in Python that the if __name__ == '__main__': convention for executable python scripts has become commonplace. When just loading a code library causes an application to start executing, headaches ensue :-)
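As a minimal sketch of that convention: the function name clear_jobs and the plain list standing in for a table are illustrative assumptions, not part of django-cron. Importing this file defines the function and does nothing else; the work only happens when the file is run as a script.

```python
def clear_jobs(jobs):
    """Empty the given list of jobs in place and return how many were removed."""
    removed = len(jobs)
    del jobs[:]
    return removed


if __name__ == '__main__':
    # Runs only when the file is executed directly as a script,
    # not when another module merely imports it.
    stale = ['job-a', 'job-b']  # placeholder data for illustration
    print('removed %d jobs' % clear_jobs(stale))
```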

For Django apps, this becomes more than a headache. Consider the effect of having oldJob.delete() executed every time the module is imported. It may seem like it's executing only once when you run with the Django development server, but in a production environment it will get executed quite often. If you use Apache, for example, Apache will frequently fire up several child processes waiting around to handle requests. As a long-running server progresses, your Django app will get bootstrapped every time a handler is forked for your web server, meaning that the module will be imported and delete() will be called several times, often unpredictably. A signal won't help, unfortunately, as the signal could be fired every time an Apache process is initialized as well.

It isn't, btw, just a webserver that could cause your code to execute inadvertently. If you use tools like epydoc, for example, they will import your code to generate API documentation. This in turn would cause your application logic to start executing, which is obviously an undesired side-effect of just running a documentation parser.

For this reason, cleanup code like this is best handled by a standalone script. It can be run by a cron job, which looks for stale jobs on a periodic basis and cleans up the DB, or it can be run manually or by any other process (for example during a deployment, or as part of your unit test setUp() function to ensure a clean test run). No matter how you do it, the important point is that code like this should always be executed explicitly, rather than implicitly as a result of opening the source file.
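A rough sketch of what explicit cleanup could look like, assuming a SQLite database and the django_cron_job table name from the stack trace in the question. The function name clear_stale_jobs and the command-line handling are hypothetical; in a real Django project you would more likely go through the ORM so the code isn't tied to one backend.

```python
import sqlite3
import sys


def clear_stale_jobs(conn, table='django_cron_job'):
    """Delete every row from the job table and return how many were deleted.

    Returns 0 if the table does not exist yet (for example, before the
    first syncdb), so no exception handling is needed at the call site.
    """
    cur = conn.cursor()
    cur.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    )
    if cur.fetchone() is None:
        return 0
    # The table name is supplied by our own code, never by user input,
    # so interpolating it into the statement is acceptable in this sketch.
    cur.execute('SELECT COUNT(*) FROM "%s"' % table)
    count = cur.fetchone()[0]
    cur.execute('DELETE FROM "%s"' % table)
    conn.commit()
    return count


if __name__ == '__main__':
    # Explicit entry point: cron, a deploy step, or a person runs this file
    # with the database path as its only argument.
    if len(sys.argv) == 2:
        connection = sqlite3.connect(sys.argv[1])
        print('deleted %d stale jobs' % clear_stale_jobs(connection))
        connection.close()
    else:
        print('usage: python clear_jobs.py path/to/database')
```

Because the cleanup lives in a function that is only called on request, importing the module (from syncdb, epydoc, or a freshly forked server process) has no effect on the database.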

I hope that helps. I know it doesn't provide a way to determine if syncdb is running, but the syncdb issue will magically vanish if you design your Django app with production deployment in mind.

Jarret Hardie
yeah... I was just trying to figure out enough to get this app to work, (I didn't write it) but maybe I'll put in the time and refactor it until things make sense :)
Jiaaro
+4  A: 

"edit: PS - I thought about using the post_init signal... for the code that accesses the db, is that a good idea?"

Never.

If you have code that's accessing the model before the tables are created, you have big, big problems. You're probably doing something seriously wrong.

Normally, you run syncdb approximately once. The database is created. And your web application uses the database.

Sometimes you make a design change, then drop and recreate the database. And then your web application uses that database for a long time.

You (generally) don't need code in an __init__.py module. You should (almost) never have executable code that does real work in an __init__.py module. It's very, very rare, and inappropriate for Django.

I'm not sure why you're messing with __init__.py when Django Cron says that you make your scheduling arrangements in urls.py.


Edit

Clearing records is one thing.

Messing around with __init__.py and Django-cron's base.py is clearly the wrong way to do this. If it's that complicated, you're doing it wrong.

It's impossible to tell what you're trying to do, but it should be trivial.

Your urls.py can only run after syncdb and after all of the ORM material has been configured and bound correctly.

Your urls.py could, for example, delete some rows and then add some rows to a table. At this point, all syncdb issues are out of the way.

Why don't you have your logic in urls.py?

S.Lott
regarding the post_init signal, I realized that it didn't do what I thought it did, so that approach wouldn't work
Jiaaro
the code snippet above got moved to the base.py file in django-cron, but as the design of the app stands, I think it is still necessary to clear all the records out of the database when the app gets started (since it doesn't remember jobs once you kill django or restart the server)
Jiaaro