I have a production web site with the following environment:

  • Rails 2.3.5
  • MySQL Server 5.1.33
  • Enterprise Ruby 1.8.6 (2008-08-11 patchlevel 287) [x86_64-linux]
  • mysql gem 2.7
  • An old version of the BackgrounDRb plugin running on 4 different servers for background tasks, with 5 workers each (Ruby threads, not separate processes!).

One of the BackgrounDRb workers processes the job queue using a variation of "optimistic locking":

    # Try to claim the job atomically: the UPDATE only matches
    # if no other worker has set in_process yet.
    update_sql = "update jobs
                  set updated_at = CURRENT_TIMESTAMP,
                      in_process = 1
                  where id = #{job.id} and in_process = 0"

    # affected_rows should be 1 if we claimed the job,
    # 0 if another server/worker got to it first.
    affected_rows = Job.connection.update(update_sql)
    captured_job = affected_rows > 0 ? Job.find(job.id) : nil

The code above tries to update the record with the given ID, with an extra condition on the in_process field. So if the same record has already been updated by a different server/process, the UPDATE statement simply returns 0 (zero) and the job is not processed simultaneously by 2 different servers.
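For reference, the same claim can be written with ActiveRecord's update_all (Rails 2.3 API), which avoids building the SQL string by hand. This is only a sketch of an equivalent formulation, not a fix: its return value still comes from the same connection.update path as above.

    # Equivalent claim via update_all (Rails 2.3); the return value is the
    # number of rows the adapter reports as affected, just like
    # Job.connection.update above.
    affected_rows = Job.update_all(
      "updated_at = CURRENT_TIMESTAMP, in_process = 1",
      ["id = ? AND in_process = 0", job.id]
    )
    captured_job = affected_rows > 0 ? Job.find(job.id) : nil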

The problem is: sometimes "Job.connection.update(update_sql)" returns 0 (zero) even when the record was actually updated! I was only able to find that out after heavy logging was added to the code. It only happens in production, at night, when we are under heavy load...

My guess is that the mysql gem uses some global (class-level) variable for affected_rows that is shared across all 5 threads of the BackgrounDRb process, but I'm not sure. I looked through the code of the mysql gem and ActiveRecord, but I couldn't understand how it really works.
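If I read the ActiveRecord MySQL adapter correctly, connection.update runs the statement and then returns the driver's affected_rows, which (like mysql_affected_rows in the C client) is a per-connection value describing the last statement executed on that connection. So if the 5 worker threads end up sharing one connection, a statement issued by a sibling thread between my UPDATE and the affected_rows read could clobber the count. Here is a minimal, self-contained sketch of that scenario; FakeConnection and its methods are hypothetical stand-ins that only mimic the "affected rows = last statement on this connection" behaviour, no MySQL involved:

    # Hypothetical connection object: affected_rows always reflects the
    # LAST statement executed on this connection, like the real client.
    class FakeConnection
      attr_reader :affected_rows

      def initialize
        @affected_rows = 0
      end

      def execute(rows_matched)
        sleep(rand * 0.005)            # simulate server round-trip latency
        @affected_rows = rows_matched
      end
    end

    conn = FakeConnection.new          # ONE connection shared by all threads

    wrong_reads = []
    workers = Array.new(5) do
      Thread.new do
        conn.execute(1)                # our UPDATE matched exactly one row
        sleep(rand * 0.005)            # a thread switch can happen here
        # A sibling thread may have run its own statement on the same
        # connection by now, so we can read ITS affected_rows, not ours.
        seen = conn.affected_rows
        wrong_reads << seen if seen != 1
      end
    end

    # A sibling issuing statements that match nothing (0 affected rows).
    noise = Thread.new { 20.times { conn.execute(0) } }

    workers.each { |t| t.join }
    noise.join
    puts "claims that wrongly looked unsuccessful: #{wrong_reads.size}"

The sketch only shows the sharing pattern; whether the real workers actually share a connection depends on how BackgrounDRb and ActiveRecord hand out connections to threads, which is exactly the part I couldn't follow in the source.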

Could you please help me understand how this could happen?

Update 2010-07-07: We decided not to use threads for job processing - that will solve all our problems: every job processor will be a separate process :)
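For completeness, a minimal sketch of the per-process approach we are moving to, assuming the Rails environment is already loaded (e.g. via script/runner); each forked child opens its own database connection, so its affected_rows can no longer be clobbered by a sibling:

    # Each worker is a separate OS process with its own connection.
    worker_count = 5

    pids = Array.new(worker_count) do
      Process.fork do
        # A connection must not be shared across fork: open a fresh one.
        ActiveRecord::Base.establish_connection

        # ... run the job-claiming loop from above; affected_rows now
        # belongs to this process's connection only ...
      end
    end

    pids.each { |pid| Process.wait(pid) }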