What's your disaster recovery plan?

+1 Q:

What's your disaster recovery plan?

And what would you recommend for an ASP.NET web application with a fairly small SQL Server database (around 10 GB)?

I was just wondering: is it a good idea to have an Amazon EC2 instance configured and ready to host your app in an emergency?

In this scenario, what would be the best approach to keep the database updated (log shipping? manual backup and restore?), and what is the easiest and fastest way to change the DNS settings?

Edit: the acceptable downtime would be somewhere between 4 and 6 hours, which is why I considered the Amazon EC2 option: it costs less than renting a secondary server.

+2 A:

It all depends on what your downtime requirements are. If you have to be back up in seconds to avoid losing your multi-billion dollar business, you'll do things very differently than if you have a site that makes maybe $1000/month and whose revenue won't be noticeably affected if it's down for a day.

I know that's not a particularly helpful answer, but this is a big area, with a lot of variables, and without more information it's almost impossible to recommend something that's actually going to work for your situation (since we don't really know what your situation is).

womble 2009-02-01 03:24:56

+2 A:

Update: I just saw your comment. Amazon EC2 with log shipping is definitely the way to go. Don't use mirroring, because that normally assumes the standby database is always available. Changing your DNS should not take more than half an hour if you set your TTL to that, which also gives you time to apply any pending logs. You might turn on the server once a week or so just to apply the logs that are pending (or less often, to avoid racking up hourly costs).
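To check whether a setup like this fits the 4 to 6 hour window from the question, a rough back-of-the-envelope sketch can help. Everything here is an assumption for illustration: the `FailoverPlan` fields, the 15-minute log backup interval, the half minute per log restore, and the boot and smoke-test times are made-up numbers, not measurements.

```python
from dataclasses import dataclass


@dataclass
class FailoverPlan:
    """Rough inputs for a warm-standby failover. All values are assumptions."""
    dns_ttl_minutes: float          # TTL on the DNS records being repointed
    instance_boot_minutes: float    # time to start the stopped standby instance
    pending_log_backups: int        # log backups not yet applied on the standby
    restore_minutes_per_log: float  # average time to apply one log backup
    app_checks_minutes: float       # smoke tests before repointing DNS


def estimated_downtime_minutes(plan: FailoverPlan) -> float:
    """Add up the steps as if they ran one after another (a pessimistic view)."""
    restore = plan.pending_log_backups * plan.restore_minutes_per_log
    return (plan.instance_boot_minutes + restore
            + plan.app_checks_minutes + plan.dns_ttl_minutes)


def fits_target(plan: FailoverPlan, target_hours: float) -> bool:
    """True if the estimate is within the acceptable downtime."""
    return estimated_downtime_minutes(plan) <= target_hours * 60


# Standby switched on weekly, log backups assumed every 15 minutes:
# up to 7 * 24 * 4 = 672 log backups may be waiting.
weekly = FailoverPlan(dns_ttl_minutes=30, instance_boot_minutes=10,
                      pending_log_backups=672, restore_minutes_per_log=0.5,
                      app_checks_minutes=15)

# Standby switched on daily: up to 24 * 4 = 96 log backups waiting.
daily = FailoverPlan(dns_ttl_minutes=30, instance_boot_minutes=10,
                     pending_log_backups=96, restore_minutes_per_log=0.5,
                     app_checks_minutes=15)

print(estimated_downtime_minutes(weekly), fits_target(weekly, 4))  # 391.0 False
print(estimated_downtime_minutes(daily), fits_target(daily, 4))    # 103.0 True
```

With these invented numbers the weekly schedule misses even a 6 hour target while the daily one fits easily, so the trade-off between hourly cost and restore backlog is worth measuring on your own system before settling on a schedule.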

Your primary hosting location should have redundancy at all levels:

Multiple internet connections,
Multiple firewalls set to failover,
Multiple clustered web servers,
Multiple clustered database servers,
If you store files, use a SAN or Amazon S3,
Every server should have some form of RAID depending on the server's purpose,
Every server can have multiple PSUs connected to separate power sources/breakers,
External and internal server monitoring software,
Power generator that automatically turns on when the power goes out, and a backup generator for good measure.

That'll keep you running at your primary location in the event of most failure scenarios.

Then have a single server set up at a remote location, kept updated using log shipping, and include it in your deployment script (after your normal production servers are updated). A colocated server on the other side of the country does nicely for this purpose. To minimize downtime when you have to switch to the secondary location, keep the TTL on your DNS records as low as you are comfortable with.

Of course, this much hardware is going to be expensive, so you'll need to determine what it is worth to avoid being down for 1 second, 1 minute, 10 minutes, etc., and adjust accordingly.
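One crude way to put numbers on that trade-off is to compare the expected yearly loss from outages with the yearly cost of the extra hardware. The sketch below assumes revenue is spread evenly over a 30-day month and ignores reputation, SLA penalties, and recovery labor; the function names and all figures are hypothetical.

```python
MINUTES_PER_MONTH = 30 * 24 * 60  # assumes a 30-day month


def downtime_cost(revenue_per_month: float, outage_minutes: float) -> float:
    """Lost revenue for one outage, assuming revenue arrives evenly over time."""
    return revenue_per_month * outage_minutes / MINUTES_PER_MONTH


def worth_paying(revenue_per_month: float, outage_minutes: float,
                 outages_per_year: float, extra_cost_per_year: float) -> bool:
    """True if the expected yearly loss exceeds the yearly cost of avoiding it."""
    expected_loss = downtime_cost(revenue_per_month, outage_minutes) * outages_per_year
    return expected_loss > extra_cost_per_year


# A site making $1000/month that is down for a full day (1440 minutes):
print(round(downtime_cost(1000, 1440), 2))  # 33.33

# Two such outages a year versus $1200/year of extra hardware:
print(worth_paying(1000, 1440, 2, 1200))  # False
```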

DavGarcia 2009-02-01 03:44:45

Hi,

The starting point for a rock-solid DR strategy is to work out the true cost to the business of server/platform downtime.

The following article will get you started along the right lines.

http://articles.techrepublic.com.com/5100-10878_11-1038783.html

If you require further guidelines good old Google can provide plenty more reading.

A project of this nature requires you to collaborate with your key business decision makers and you will need to communicate to them what the associated costs of downtime are and what the business impact would be. You will likely need to collaborate with several business units in order to gather the required information. Collectively you then need to come to a decision as to what is considered acceptable downtime for your business. Only then can you devise a DR strategy to accommodate these requirements.

You will also find that conducting this exercise may highlight shortcomings in your platform's current configuration with regard to high availability, and these may need to be reviewed as a side project.

The key point to take away from all of this is that the decision about what counts as an acceptable period of downtime is not for the DBA alone to make. Rather, the DBA provides the information and expert knowledge necessary for a realistic decision to be reached. Your task is then to implement a strategy that meets the business requirements.

Don’t forget to test your DR strategy by running a test scenario, both to validate your recovery times and to practice the process. Should the time come when you need to implement your DR strategy, you will likely be under pressure, your phone will be ringing frequently, and people will be hovering around you like mosquitoes. Having already honed and practiced your DR response, you can take control of the situation with confidence, and implementing the recovery will be a smooth process.
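For validating recovery times during a drill, even a tiny timing helper is enough to see where the hours go. This is only a sketch: `DrillLog` and the step names are hypothetical, and the actual recovery work (by hand or scripted) would go inside each `with` block.

```python
import time
from contextlib import contextmanager


class DrillLog:
    """Records how long each step of a DR drill takes."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock  # injectable so the log can be tested without waiting
        self.steps = []     # list of (step name, duration in seconds)

    @contextmanager
    def step(self, name):
        start = self.clock()
        try:
            yield
        finally:
            # Record the duration even if the step raised an error.
            self.steps.append((name, self.clock() - start))

    def total_seconds(self):
        return sum(seconds for _, seconds in self.steps)

    def within(self, target_hours):
        """True if the whole drill fit inside the agreed downtime."""
        return self.total_seconds() <= target_hours * 3600


drill = DrillLog()
with drill.step("start standby server"):
    pass  # do the real step here
with drill.step("apply pending logs"):
    pass  # do the real step here

for name, seconds in drill.steps:
    print(f"{name}: {seconds:.1f}s")
print("within 4 hours:", drill.within(4))
```

Keeping the per-step timings from each drill makes it easier to show the business which step dominates the recovery time.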

Good luck with your project.

John Sansom 2009-02-01 13:49:58
