What's the difference between failover and disaster recovery?
Failover: When one machine fails, another machine (usually in the same location) takes over and resumes service.
Disaster recovery: When Godzilla destroys your data center, you have alternative locations from which to keep providing your service, plus the protocols and means for those locations to take over delivering it.
Depending on the particular needs of each service, disaster recovery might just be a backup tape in a safe in a different location. In other words, it's just having a defined protocol to recover from disaster. Likewise, failover might just be a spare machine that you have to go to the data center and swap in for the failed one; that is, having a defined protocol for what to do in case of machine failure.
Summing up, failover answers the question 'what do I do in case a single machine fails?', disaster recovery answers 'what do I do in case a disaster happens (fire, floods, war, ISP goes bankrupt, whatever)?'
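To make the distinction concrete, here is a minimal Python sketch of the two questions as a routing decision. Everything in it is made up for illustration: the Node class, the site labels "dc1"/"dc2", and the healthy flag (which assumes some health check already exists). Real setups usually do this with load balancers, cluster managers, or DNS rather than a hand-written loop.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    site: str      # hypothetical site label, e.g. "dc1"
    healthy: bool  # result of some health check, assumed to exist


def pick_node(nodes: List[Node], primary_site: str) -> Optional[Node]:
    """Return the node that should serve traffic, or None if nothing is up."""
    # Failover: a single machine failed, so use another healthy
    # machine in the same (primary) location.
    for node in nodes:
        if node.site == primary_site and node.healthy:
            return node
    # Disaster recovery: nothing in the primary location is usable,
    # so fall back to a healthy machine in a different location.
    for node in nodes:
        if node.site != primary_site and node.healthy:
            return node
    return None


nodes = [
    Node("web1", "dc1", healthy=False),  # the failed machine
    Node("web2", "dc1", healthy=True),   # local standby
    Node("web3", "dc2", healthy=True),   # remote site
]
print(pick_node(nodes, "dc1").name)  # web2: failover within dc1
```

If web2 were also marked unhealthy, the same call would return web3, which is the disaster recovery case: the service moves to another location.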
Failover is more closely tied to backup procedures.
The main difference between the two, from the end client's point of view, is the downtime.
- Failover is expected to have low downtime (1 to 2 hours at most)
- DR can take anywhere from 6 hours to a day or two.
The other difference is the nature of environments available after a failover or a DR.
- Failover means the end clients see nothing and can continue their activity (development or production management)
- DR should mean that only the production environment is back up. All development environments are down or seriously degraded.
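As a rough sketch, the expectations above could be written down as data and checked against an actual outage. The RecoveryProfile class and the within_expectation helper are hypothetical names, and the numbers are just the ballpark figures from this answer, not any kind of standard.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RecoveryProfile:
    max_downtime_hours: float
    environments_restored: Tuple[str, ...]


# Ballpark figures from the answer above, not industry standards.
FAILOVER = RecoveryProfile(2, ("production", "development"))
DISASTER_RECOVERY = RecoveryProfile(48, ("production",))


def within_expectation(profile: RecoveryProfile, observed_hours: float) -> bool:
    """True if the observed downtime fits the profile's upper bound."""
    return observed_hours <= profile.max_downtime_hours
```

For example, a 6-hour outage would not count as a successful failover under these numbers, but it would be well within what DR is expected to deliver.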
Since a disaster (like 9/11) can completely destroy a datacenter, does that mean DR is the process of rebuilding everything for that datacenter?