Does anyone have, or know of, a good template or plan for automated server upgrades? In this case I am upgrading a Python/Django server, but I will have to apply the update to many machines, and I want to be sure the operation is fully testable and recoverable should anything go wrong.
I am picturing something along these lines:
- remotely fetch new code
- verify code download (e.g. hash of files)
- take down server, display an "upgrade in progress" page
- backup database(s)
- backup code directory
- apply new code updates
- verify code update (e.g. hash of files)
- apply database update (if necessary)
- run tests
- if success
  - startup server
  - verify server update
- else
  - restore old database
  - restore old code
  - report error
  - startup server
  - verify server restore
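
For the two "verify ... (e.g. hash of files)" steps, this is roughly what I have in mind. It is only a minimal sketch: the manifest format (a mapping of relative path to SHA-256 hex digest) and the function names `sha256_of` and `verify_tree` are my own assumptions, not an existing tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_tree(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or whose hash differs.

    `manifest` is a hypothetical format: {relative_path: expected_sha256_hex}.
    An empty result means every listed file matched.
    """
    bad = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            bad.append(rel_path)
    return bad
```

The same check could run once on the downloaded package and again on the code directory after the update is applied, with the upgrade aborting if the returned list is non-empty.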
I'm sure this isn't exhaustive, and there are many other error conditions to consider, but I am wondering whether something like this already exists as a formalized process or best-practices checklist to follow. Ideally the whole thing should, of course, be done by a single script call.
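
To make the "single script call" idea concrete, here is a rough sketch of the backup / apply / test / restore core for the code directory only. The function name `upgrade_with_rollback` and the `apply_update` and `run_tests` callables are placeholders I made up; the database backup and restore and the server stop and start are left out because they depend on the database and process manager in use, but I assume they would be wrapped the same way.

```python
import shutil
from pathlib import Path
from typing import Callable


def upgrade_with_rollback(
    code_dir: Path,
    backup_dir: Path,
    apply_update: Callable[[], None],
    run_tests: Callable[[], bool],
) -> bool:
    """Back up code_dir, apply the update, and restore the backup on failure.

    `apply_update` and `run_tests` are hypothetical hooks supplied by the caller.
    Returns True if the upgrade was kept, False if it was rolled back.
    """
    # Start from a clean backup location so stale files cannot leak into a restore.
    if backup_dir.exists():
        shutil.rmtree(backup_dir)
    shutil.copytree(code_dir, backup_dir)

    try:
        apply_update()
        if not run_tests():
            raise RuntimeError("post-upgrade tests failed")
    except Exception as exc:
        # Report the error, then put the old code back exactly as it was.
        print(f"upgrade failed, rolling back: {exc}")
        shutil.rmtree(code_dir)
        shutil.copytree(backup_dir, code_dir)
        return False
    return True
```

A caller would pass in whatever actually unpacks the new code and runs the test suite, and would start the server again regardless of the return value, since both branches of the plan end with a startup and a verification.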