I work on a small-scale application (about 5,000 users), but we do maintain some important user preference data. Whenever we release an upgrade, we check whether there are users online (we do it after hours and usually there are none), then put up an outage page and apply the new build (both UI and DB changes). It all takes about half an hour and is usually pain-free.
But I always wonder how sites like Amazon, eBay or Google push upgrades to production. I know it's phased over time and across servers, but thousands of users are logged in at any moment and are continuously updating data. I know there is load balancing so that if one server is taken down, the user's session is seamlessly transferred to another machine, and that similar options exist on the database side, but it still seems overwhelming to keep everything running (albeit on fewer servers) while upgrading the UI, DB and functionality smoothly.
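For what it's worth, my rough mental model of the rolling approach is something like the sketch below. This is purely my own guess, and every hostname and helper function in it is made up; I have no idea whether the big sites actually do it this way, and it says nothing about how the DB schema changes are coordinated, which is the part that really puzzles me.

```python
#!/usr/bin/env python3
"""Rough sketch of the rolling-deploy loop I imagine: drain one server
from the load balancer, upgrade it, health-check it, put it back, and
move on to the next. All hostnames and helper functions here are made
up -- in reality they would call whatever LB / deployment tooling is in use."""

import time

SERVERS = ["web01.example.com", "web02.example.com", "web03.example.com"]

def drain(server: str) -> None:
    # Hypothetical: tell the load balancer to stop sending new sessions
    # to this server and let existing sessions finish or migrate.
    print(f"draining {server} from the load balancer")

def deploy(server: str, version: str) -> None:
    # Hypothetical: push the new build (UI + app code) to this server.
    print(f"deploying {version} to {server}")

def healthy(server: str) -> bool:
    # Hypothetical: hit a health-check endpoint and run smoke tests.
    print(f"health-checking {server}")
    return True

def enable(server: str) -> None:
    # Hypothetical: put the server back into the load balancer pool.
    print(f"re-enabling {server}")

def rolling_deploy(version: str) -> None:
    for server in SERVERS:
        drain(server)
        deploy(server, version)
        if not healthy(server):
            raise RuntimeError(f"{server} failed health check, aborting roll-out")
        enable(server)
        time.sleep(1)  # pause between servers to watch for errors

if __name__ == "__main__":
    rolling_deploy("v2.4.1")
```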
Are there specific guidelines and strategies documented somewhere for deploying large websites? Any whitepapers? What are the best practices in this area?
EDIT: I said half an hour, but that covers the time from when we take the app down to when we get it back up. It includes UI and functionality smoke tests, DB consistency checks and a small load test. The time to actually 'deploy' is in fact less than two minutes.