views: 173
answers: 5
I have a Rails application that occasionally needs to make very slow network calls (that might never return) when a web user hits a route. If the call is slower than normal (or doesn't return), Rails seems to block, not allowing ANYTHING else to happen (I can't open another browser and hit a different route, I can't hit a different route from another machine, nothing).

I tried using Phusion Passenger to handle multiple concurrent requests (assuming it was a Mongrel server issue), and while Apache CAN handle a test controller that goes into an infinite loop (while Mongrel can't), every so often (when the network call doesn't come back) it still seems to block. I can't really replicate this, so it's not all that testable.

Is RAILS itself blocking, or is Apache not the correct server? If it's Rails, is there ANYTHING I can do to make it keep serving other users while blocked?

+3  A: 

Rails itself is blocking because Ruby (as of 1.8) is effectively single-threaded for this kind of work. The way we get around the problem, which we borrowed from others, is to run multiple copies of Rails and load balance them behind nginx. However, since nginx does the same thing as Apache, if you run X copies of Rails, then once the never-returning route has been hit X times you'll be back to the problem you're seeing now.

You could likely configure nginx or Apache to keep connections alive for at most 60 seconds. That may free up the connections held by the never-returning route while still letting the underlying computation go through unimpeded (that is, without sending a connection reset to Rails).
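A minimal sketch of that timeout idea in nginx (the directive names are real nginx ones; the upstream name and values are illustrative):

```nginx
# Cap how long nginx will wait on a Rails backend before giving up.
location / {
    proxy_pass http://rails_upstream;   # hypothetical upstream of Rails copies
    proxy_connect_timeout 5s;
    proxy_read_timeout 60s;             # stop waiting on a never-returning call
}
```

Note that when the timeout fires nginx returns a 504 to the client, but the Rails worker itself stays occupied until its own call completes or times out.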

Chris Bunch
Right: I already knew that, which is why I was using Passenger with Apache (Passenger spawns multiple copies of Rails automatically). My frustration was that DESPITE this it was still occasionally locking up.
Jenny
+1  A: 

Normally, when you have long-running or questionable-reliability work to do, you'll want a separate worker process rather than trying to do it inline in the request. Try something like workling to get started, and maybe RabbitMQ when you need to get serious.
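The shape of that idea can be sketched with a plain in-process queue and thread (workling/RabbitMQ give you the same pattern across real processes; the lambda here is a hypothetical stand-in for the slow network call):

```ruby
require "thread"

# Jobs go on a queue; a worker pulls them off and runs them, so the
# slow work never blocks whoever enqueued it.
JOB_QUEUE = Queue.new

worker = Thread.new do
  while (job = JOB_QUEUE.pop)
    break if job == :shutdown
    job.call
  end
end

results = []
# The "request handler" enqueues the slow call and returns immediately.
JOB_QUEUE << -> { sleep 0.1; results << :slow_call_done }

JOB_QUEUE << :shutdown
worker.join
results  # => [:slow_call_done]
```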

thenduks
I should also note this really has nothing to do with Rails specifically. If your requests take forever to return and you use up all your app processes (in practically any language) you will end up with this problem. The recent mainstreamification of 'evented' web servers is somewhat in response to this (see nodejs.org, for example).
thenduks
+3  A: 

We moved any calls that could take longer than one or two seconds to a prioritized queue to be processed asynchronously. We use delayed_job (the collectiveidea branch) to handle all background jobs. Another one that is getting popular these days is resque; I have not used it and cannot speak to it beyond its basic function.

From there you can use AJAX to poll the status of the job while you show a progress bar or a spinner or a cat eating a baked potato - until the job finishes. This way the job processes in the background and your web front end stays snappy.

The flow would be:

  1. User submits request
  2. Request is queued
  3. Front end responds
  4. Client begins polling
  5. Network operation starts
  6. Network operation ends
  7. Background job finishes
  8. Poll gets the response and updates the UI

VS.

  1. User submits Request
  2. Network operation starts
  3. Network operation ends
  4. Front end responds

This is a bit more complicated than your standard fare of get/process/display but when you have long-running processes, this is the desired method of handling them. An example in the wild of this is Mint.com. When it updates your bank accounts you can click all over the site and when it is done, you are notified.

If you are unable to go this route, then you are just going to need to increase the number of processes on your front end to handle incoming requests. We are migrating to Unicorn, but Passenger should be effective at making sure your busy workers do not get new requests. I recommend this only as a last resort - your requests should all respond within a second or at most two, and everything else should go to a background job processor.
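A hedged sketch of what that looks like in a unicorn.rb config (the directives are real Unicorn ones; the values are illustrative, not a recommendation):

```ruby
# config/unicorn.rb
worker_processes 8   # more workers = more simultaneous requests survivable
timeout 30           # the master kills any worker stuck past 30 seconds
preload_app true     # fork workers from a loaded app to save memory
```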

Geoff Lanotte
A: 

This may or may not help, but... if the calls that users are making can return immediately (i.e., don't need to wait for that long network call to finish), then you might want to try handing off the network call to a separate process. What I've seen done is to run a separate "server" process that accepts calls from the primary (web server) process to do things. The main Rails process can then handle users and send orders to the secondary process to make the network calls.

The reason for the separate process to handle the network calls is twofold:

  1. In general, that separate process has a much smaller memory footprint, so it can fork off instances of itself much faster if you want to fork for each network call.
  2. If the separate process does lock up, it doesn't stall the main web server.
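A small sketch of that hand-off using fork and a pipe (the timeout value and messages are illustrative; real setups often use a standing daemon plus a socket or message queue instead):

```ruby
# The parent (web) process forks a child to make the "network call" and
# reads the result over a pipe, so a hung call can't stall the parent.
reader, writer = IO.pipe

pid = fork do
  reader.close
  sleep 0.05                 # stands in for the slow network call
  writer.write "network result"
  writer.close
end
writer.close

result =
  if IO.select([reader], nil, nil, 2)  # wait at most 2 seconds
    reader.read
  else
    Process.kill("KILL", pid)          # give up on a hung child
    "timed out"
  end
Process.wait(pid)
result  # => "network result"
```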
RHSeeger
+1  A: 

JRuby uses native threads and won't block the way MRI or REE can. A JRuby on Rails app is easily deployed to GlassFish or TorqueBox, and it's not much more difficult to use Warbler to create a .war file that can be deployed to any Java server (like Tomcat). Also, you don't have to spin up n copies of the Rails stack to handle n requests.

I'm doing the latter and it works great.
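An illustration of the thread-per-request idea (on JRuby these map to native threads, so a slow call in one thread doesn't stall the others; the sleep is a stand-in for a slow network call):

```ruby
require "thread"

results = Queue.new

# Four concurrent "requests", each doing a slow call.
threads = 4.times.map do |i|
  Thread.new do
    sleep 0.05          # slow network call stand-in
    results << i
  end
end
threads.each(&:join)
results.size  # => 4
```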

Mark Thomas