One of the interesting things about Twitter is its "over capacity" page, the fail whale. My question is: programmatically, how can they detect when their system is over capacity? Is there a special type of exception that gets thrown in this case?

A: 

I believe it's the routers/load balancers that detect this for Twitter. If a machine or group of machines throws a large number of exceptions or returns HTTP 5xx errors, the load balancers fail over to a "fail whale" server.
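
As a rough illustration of that idea (not Twitter's actual setup), here is a minimal Python sketch of a balancer-side check that watches the recent share of HTTP 5xx responses per backend and falls back to a static page when no backend looks healthy. The class name `BackendHealth`, the function `choose_target`, the window size, and the 50% threshold are all made up for the example.

```python
from collections import deque


class BackendHealth:
    """Tracks the most recent response statuses for one backend (hypothetical helper)."""

    def __init__(self, window=100, max_error_rate=0.5):
        # Keep only the last `window` status codes; older ones fall off.
        self.statuses = deque(maxlen=window)
        self.max_error_rate = max_error_rate

    def record(self, status_code):
        self.statuses.append(status_code)

    def is_healthy(self):
        # With no traffic seen yet, assume the backend is fine.
        if not self.statuses:
            return True
        errors = sum(1 for s in self.statuses if 500 <= s <= 599)
        return errors / len(self.statuses) <= self.max_error_rate


def choose_target(backends, fallback="fail-whale"):
    """Return the first healthy backend name, or the static fallback page."""
    for name, health in backends.items():
        if health.is_healthy():
            return name
    return fallback
```

Real load balancers usually express this as configuration (health checks plus a backup server) rather than application code, but the decision being made is roughly the one above.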

Daniel A. White
+2  A: 

There are any number of things that could be used to determine this - it'll depend on the system and what metrics the devs decide to use. A few examples:

  • vBulletin, a PHP-based forum system, can shut itself down if the Unix load average hits a certain (admin-selected) value.
  • Some systems that involve queuing (as Twitter does) can monitor the size of the queue and shut out users if the queue grows too large.
  • Some systems put the servers doing the actual processing behind a proxy or load balancer. If those servers go offline, the proxy or load balancer can redirect traffic to an error page like the failwhale.
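
The first two checks can be sketched in a few lines of Python. This is only an illustration under assumed thresholds: the function name `over_capacity` and the limits `max_load` and `max_queue_depth` are invented for the example, and `os.getloadavg()` is available on Unix-like systems only.

```python
import os


def over_capacity(queue_depth, max_queue_depth=10_000, max_load=8.0, load=None):
    """Return True if either the load average or the queue size is past its limit.

    `load` can be passed in explicitly (handy for testing); otherwise the
    1-minute Unix load average is read from the operating system.
    """
    if load is None:
        load = os.getloadavg()[0]  # (1 min, 5 min, 15 min) tuple; Unix only
    return load >= max_load or queue_depth >= max_queue_depth
```

A request handler could call this before doing any real work and serve the static error page whenever it returns True.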
ceejayoz
A: 

Presumably, they've done basic load-testing so they have a solid idea of how much they can process before slowing down unacceptably or even crashing.

Steven Sudit
A: 

Diskeeper tells you that your system is stressed if you've used over a certain portion of your virtual memory, which I thought was interesting at the time.

quillbreaker