How do large server farms handle gracefully shutting down all or part of the farm? I'm thinking of planned and unplanned cases like:
- "We need to shut down Rack 42"
- "We need to do work on the power feeds to the whole block"
- "Blackout! UPSs running out of juice! Aahh!"
- "AC is down, air temp is 125°F and climbing"
The issues I'm interested in are how people handle sequencing and how they kick the whole thing off. It also occurs to me that this could easily get mixed up with bringing services up and down and with the software upgrade system.
(At this point I'm more asking out of curiosity than anything.)