I have a site running a Rails application and Resque workers in production mode, on Ubuntu 9.10, Rails 2.3.4, ruby-ee 2010.01, PostgreSQL 8.4.2.

The workers constantly raise this error: PGError: server closed the connection unexpectedly.

My best guess is that the master Resque process establishes a connection to the database while loading the Rails app classes (e.g. authlogic does that when you use User.acts_as_authentic), and that this connection becomes corrupted in a fork()ed process (on exit?), so the next forked children get a broken global ActiveRecord::Base.connection.

I could reproduce very similar behaviour with this sample code, which imitates the fork/processing cycle in a Resque worker. (AFAIK, libpq users are advised to recreate connections in a forked process anyway, since otherwise it's not safe.)
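The suspected mechanism can be illustrated without a database. The following is only a stdlib sketch, not the linked sample code: a UNIXSocket pair stands in for the PostgreSQL connection, and the single byte "X" is a hypothetical stand-in for whatever a connection sends when it is shut down in the exiting child. The method name simulate_shared_socket is made up for the illustration.

```ruby
require "socket"

# Returns [message_seen_by_backend, parent_saw_eof].
# `client` plays the worker's database socket, `server` plays the backend.
def simulate_shared_socket
  client, server = UNIXSocket.pair

  # The child inherits `client` and, on its way out, sends a
  # hypothetical "terminate" message over the shared socket.
  pid = fork do
    client.write("X")
    client.close
    exit!(0)
  end
  Process.wait(pid)

  # The backend sees the terminate message and hangs up.
  message = server.read(1)
  server.close if message == "X"

  # The parent never asked to disconnect, yet its next read hits end-of-file.
  parent_saw_eof = client.read(1).nil?
  client.close
  [message, parent_saw_eof]
end

message, parent_saw_eof = simulate_shared_socket
puts "backend received: #{message.inspect}"
puts "parent connection closed by server: #{parent_saw_eof}"
```

If this matches what happens in the real worker, the parent's connection is closed from the server side after the first child exits, which would fit the "server closed the connection unexpectedly" error.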

But the odd thing is that when I use pgbouncer or pgpool-II instead of a direct PostgreSQL connection, these errors do not appear.

So, the question is: where and how should I dig to find out why it breaks with a plain connection but works through a connection pooler? Or is there a reasonable workaround?

+3  A: 

When I created Nestor, I had the same kind of problem. The solution was to re-establish the connection in the forked process. See the relevant code at http://github.com/francois/nestor/blob/master/lib/nestor/mappers/rails/test/unit.rb#L162

From my very limited look at Resque code, I believe a call to #establish_connection should be done right about here: http://github.com/defunkt/resque/blob/master/lib/resque/worker.rb#L123

François Beausoleil
Thanks, so I simply added a hook: Resque.after_fork = Proc.new { ActiveRecord::Base.establish_connection }
gc
+2  A: 

You cannot pass a libpq connection across a fork() (or to a new thread) unless your application takes great care never to use it in conflicting ways (for example, a mutex around every single attempt to use it, and you must never close it). This is the same for both direct connections and pgbouncer. If it worked with pgbouncer, that was pure luck in missing a race condition for some reason, and it will eventually break.

If your program uses forking, you must create the connection after the fork.
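As a rough sketch of that rule, here is a fork-per-job loop in plain Ruby. Everything in it is hypothetical and for illustration only: FakeConnection merely records which process opened it (a real connection would own a socket), and process_jobs with its after_fork: argument is a made-up helper, not Resque's API.

```ruby
# Hypothetical stand-in for a database connection: it only records
# the process that opened it.
class FakeConnection
  attr_reader :owner_pid

  def initialize
    @owner_pid = Process.pid
  end
end

# Forks one child per job, calls the after_fork hook inside the child,
# and returns each child's report ("job:owned?") read back through a pipe.
def process_jobs(jobs, after_fork:)
  jobs.map do |job|
    reader, writer = IO.pipe
    pid = fork do
      reader.close
      connection = after_fork.call # obtain the connection inside the child
      owned = connection.owner_pid == Process.pid
      writer.write("#{job}:#{owned}")
      writer.close
      exit!(0)
    end
    writer.close
    report = reader.read
    reader.close
    Process.wait(pid)
    report
  end
end

parent_connection = FakeConnection.new

# Unsafe: every child reuses the connection opened by the parent.
shared = process_jobs(%w[a b], after_fork: -> { parent_connection })

# Safe: every child opens its own connection after the fork.
fresh = process_jobs(%w[a b], after_fork: -> { FakeConnection.new })

puts shared.inspect
puts fresh.inspect
```

In the first run each child reports that it does not own its connection; in the second each child owns a connection opened after the fork, which is the arrangement an after-fork reconnect is meant to produce.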

Magnus Hagander