views:

38

answers:

2

I am building a logging-bridge between rabbitmq messages and Django application to store background task state in the database for further investigation/review, also to make it possible to re-publish tasks via the Django admin interface. I guess it's nothing fancy, just a standard Producer-Consumer pattern.

  1. Web application publishes to message queue and inserts initial task state into the database
  2. Consumer, which is a separate python process, handles the message and updates the task state depending on task output

The problem is, some tasks are missing in the db and therefore never executed. I suspect it's because Consumer receives the message earlier than db commit is performed. So basically, returning from Model.save() doesn't mean the transaction has ended and the whole communication breaks.

Is there any way I could fix this? Maybe some kind of post_transaction signal I could use?

Thank you in advance.

A: 

This sounds brittle to me: You have a web app which posts to a queue and then inserts the initial state into the database. What happens if the consumer processes the message before the web app can commit the initial state?

What happens if the web app tries to insert the new state while the DB is locked by the consumer?

To fix this, the web app should add the initial state to the message and the consumer should be the only one ever writing to the DB.

[EDIT] And you might also have an issue with logging. Check that races between the web app and the consumer produce the appropriate errors in the log by putting a message to the queue without modifying the DB.

[EDIT2] Some ideas:

How about showing just the number of pending tasks? For this, the web app could write into table 1 and the consumer writes into table 2 and the admin if would show the difference.

Why can't the web app see the pending tasks which the consumer has in the queue? Maybe you should have two consumers. The first consumer just adds the task to the DB, commits and then sends a message to the second consumer with just the primary key of the new row. The admin iface could read the table while the second consumer writes to it.

Last idea: Commit the transaction before you enqueue the message. For this, you simply have to send "commit" to the database. It will feel odd (and I certainly don't recommend it for any case) but here, it might make sense to commit the new row manually (i.e. before you return to your framework which handles the normal transaction logic).

Aaron Digulla
Well, it acutually first inserts, then publishes (in the terms of sequential code execution) and exactly what happens is the consumer sometimes receives the message before commit. If the consumer would be the only one ever writing to the DB, I will not be able to see pending tasks via the admin interface, which is the feature I want.
A: 

Web application publishes to message queue and inserts initial task state into the database

Do not do this.

Web application publishes to the queue. Done. Present results via template and finish the web transaction.

A consumer fetches from the queue and does things. For example, it might append to a log to the database for presentation to the user. The consumer may also post additional status to the database as it executes things.

Indeed, many applications have multiple queues with multiple produce/consumer relationships. Each process might append things to a log.

The presentation must then summarize the log entries. Often, the last one is a sufficient summary, but sometimes you need a count or information from earlier entries.

S.Lott