I'm having trouble with one of our Rails 3 apps. When a lot of requests are sent to the server (10 / second) the whole server stalls. I tried a lot of different Passenger setups and sometimes noticed a slight improvement, but none of them turned out to be a solution.

My setup:

  • Intel i7 (8 cores)
  • 8 GB RAM
  • Ubuntu 10.04 Server
  • Ruby 1.9.2
  • Rails 3
  • Apache 2.2.14
  • Passenger 2.2.15
  • MySQL 5.1.41

My current passenger.conf:

PassengerMaxPoolSize 12
PassengerUseGlobalQueue on
PassengerHighPerformance on
RailsSpawnMethod smart
PassengerMaxRequests 5000
PassengerStatThrottleRate 5
RailsAppSpawnerIdleTime 0
PassengerPoolIdleTime 600

This server is dedicated to one app. Well, one app in staging and production mode.

I tried playing with PassengerMaxPoolSize, setting it to 4, 12, 20, 40, 80, ... but the stalling remains. The strange thing is that Passenger seems to spawn more app processes than the configured PassengerMaxPoolSize. It is currently set to 12, but in htop I can find at least 34 of these:

1234 username 20 0 260M 97572 3892 S 0.0 1.2 0:00.13 Rack: /var/www/domains/domain.com/current

I can replicate this issue easily by just opening 30 tabs with the root page of our app. The first 10 or so load instantly; the rest take at least a minute to show anything.
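
The 30-tab experiment can also be approximated from a script, which makes it easier to repeat after each config change. The following is only a minimal sketch: the helper names `timed_burst` and `fetch_status` are made up for illustration, the URL on the command line is a placeholder, and a plain GET of the root page is assumed to be a fair stand-in for a browser tab (it does not fetch assets).

```ruby
require "net/http"
require "uri"

# Run `count` copies of the given block concurrently and return one
# [result, elapsed_seconds] pair per call, in the order they were started.
# (Hypothetical helper, not part of Passenger or Rails.)
def timed_burst(count)
  threads = Array.new(count) do |i|
    Thread.new do
      started = Time.now
      result = begin
        yield(i)
      rescue StandardError => e
        e.class.name # record the failure instead of aborting the burst
      end
      [result, Time.now - started]
    end
  end
  threads.map(&:value)
end

# Fetch a URL and return the HTTP status code as a string, e.g. "200".
def fetch_status(url)
  Net::HTTP.get_response(URI.parse(url)).code
end

# Only runs when a URL is passed on the command line, for example:
#   ruby burst.rb http://staging.example.com/ 30
if __FILE__ == $PROGRAM_NAME && ARGV[0]
  url = ARGV[0]
  count = (ARGV[1] || 30).to_i
  timed_burst(count) { fetch_status(url) }.each_with_index do |(status, secs), i|
    puts format("request %2d: %s in %.2fs", i + 1, status, secs)
  end
end
```

If the stall is reproduced, the first handful of requests should report short times and the rest much longer ones; the status column also shows whether any request ends in an error rather than a slow success.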

I'm out of ideas. Does anyone have an idea how to fix this?

A: 

Check your Rails log to make sure the app isn't serving any static requests (images, CSS, JS files, etc.). If it is, then every page load is triggering a lot more requests through Passenger.

If that's the case, you can configure Apache to serve the static files itself if they exist, instead of forwarding all requests on to Passenger.
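
The log check suggested above can be scripted. This is a rough sketch, assuming the default Rails 3 log format where each request begins with a line like `Started GET "/path" for ...`; the `static_requests` helper and the list of extensions are illustrative choices, not anything Rails provides.

```ruby
# Extensions treated as static assets (an assumption; adjust for your app).
STATIC_EXTENSIONS = %w[.png .jpg .jpeg .gif .ico .css .js].freeze

# Count how often each static-looking path shows up in Rails 3
# 'Started GET "/path" ...' log lines. Query strings are ignored.
def static_requests(log_lines)
  counts = Hash.new(0)
  log_lines.each do |line|
    match = line.match(/Started (?:GET|HEAD) "([^"?]+)/)
    next unless match
    path = match[1]
    counts[path] += 1 if STATIC_EXTENSIONS.include?(File.extname(path).downcase)
  end
  counts
end

# Only runs when a log file is passed on the command line, for example:
#   ruby static_check.rb log/production.log
if __FILE__ == $PROGRAM_NAME && ARGV[0]
  static_requests(File.foreach(ARGV[0])).sort_by { |_, n| -n }.each do |path, n|
    puts "#{n}\t#{path}"
  end
end
```

An empty result suggests static files are not reaching the Rails app; any paths listed are being handled by the app processes.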

Jeremy
No, that is not the case. I checked the logs.
wout
The app itself is really responsive: around 100 ms per action according to New Relic. But the moment too many requests come in at the same time, the server stops responding entirely. It can take minutes before it becomes responsive again. Restarting Apache helps. When I check memory and processor usage while it is stalling, about 60% of the memory is unused and usually one core is running (hanging) at 100% while the other 7 are at nearly 0%. So it's not that resources are exhausted. It looks like one operation is blocking Apache or Passenger.
wout
Go with Hongli's answer then, he'd know.
Jeremy
+1  A: 

Phusion Passenger is probably trying to spawn more processes, but during spawning it cannot respond to requests. Try Phusion Passenger 3 which implements asynchronous spawning.

Hongli
OK, great! That fixed the hanging. But now Apache sometimes throws a 500. If I open ten tabs really quickly, one or two are a 500 (an Apache 500, not a Rails one). Hitting refresh then loads the page after all. Any idea what this could be? Thanks for helping out!
wout
This also happens when, for example, the production app has been running for a while and I restart the staging app. The errors then appear in both production and staging. It also happens when it has been quiet and we start a Stumble campaign, so that a lot of visitors suddenly arrive at the same time.
wout
I'm back on 2.2.15. There, at least, the hanging requests come alive eventually, whereas on Passenger 3 I get a lot of 500 errors.
wout
You should look in the error log. That'd probably tell you why it gives 500s.
Hongli
wout
There's probably a /tmp cleaner daemon on your system that deletes essential Phusion Passenger runtime files in /tmp after a while. Disable that daemon.
Hongli
I've set the tmp cleaner daemon to only remove files that are older than seven days, and only after a reboot. That didn't really help. I've been testing a lot more and I think it's something else. The 500 errors are thrown every time a new instance is started: the visitor whose request fires up the instance gets a 500, and once it is started everything is fine. This is only on Passenger 3.0.0, by the way.
wout