Recently, the data center I use experienced an outage. During the outage, customers using my service reported that their websites became very slow. Customers integrate my service in two places: 1. A script tag in their web pages that points to my server. 2. API calls against my server from PHP: fputs($fp, "POST $path HTTP/1.1\r\n"); ... stream_set_timeout($fp, 10); $result = fread($fp, 2000); ...

How can I protect my customers' websites from being affected when the data center or my server is down? And how can I simulate a data center outage so that I can implement a solution?

Thanks

A: 

We use Amazon Web Services to achieve good protection from an outage.

They offer multiple availability zones (physically separate data centers that are unlikely to all be affected by the same failure). Our application servers are set up in multiple zones, with a load balancer forwarding requests to all of the servers.

If some of the servers become unresponsive, the load balancer will no longer route traffic to them (it also periodically checks if they come back up).
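To make the health-check idea concrete, here is a minimal sketch in PHP. It is not how the AWS load balancer is implemented; the function names (`isHealthy`, `healthyBackends`, `pickBackend`) and the backend array layout are assumptions made up for illustration. It only shows the principle: probe each server, drop the ones that do not respond, and spread requests over the rest.

```php
<?php
// Hypothetical sketch of health-checked routing; all names are illustrative.

/** Returns true if a TCP connection to the backend opens within $timeout seconds. */
function isHealthy(string $host, int $port, float $timeout = 1.0): bool
{
    $fp = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($fp === false) {
        return false;
    }
    fclose($fp);
    return true;
}

/** Keeps only the backends that currently pass the given health check. */
function healthyBackends(array $backends, callable $check): array
{
    return array_values(array_filter(
        $backends,
        fn(array $b): bool => $check($b['host'], $b['port'])
    ));
}

/** Picks a backend round-robin from the healthy set, or null if none respond. */
function pickBackend(array $healthy, int $requestNumber): ?array
{
    if (count($healthy) === 0) {
        return null;
    }
    return $healthy[$requestNumber % count($healthy)];
}
```

Running `healthyBackends($backends, 'isHealthy')` on a timer would give the "periodically checks if they come back up" behaviour: a server that recovers simply reappears in the healthy set on the next pass.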

If the load balancer itself goes down (our monitoring software alerts us to that), we switch its elastic IP address over to another load balancer.

Eric J.
I will use Amazon when I need more servers. Currently I'm using a VPS, but I want to guarantee that my clients won't be affected when I'm down.
pablo
He answered your question regardless. You would need a setup similar to how he described. Load balancer -> server nodes
Kevin Peno
I don't mind if my server/service is down. I want to make sure my customers' servers won't be slowed down in that case. I'm trying to understand how to simulate a data center outage so I can choose the correct timeouts for remote calls and minimize the impact. Will using a non-existent IP or domain name have the same effect?
pablo
To add: to "fake" a power outage, you would take a server node down and see how the balancer responds.
Kevin Peno
Or if you just have the one server, stop your web server to simulate an outage.
Eric J.
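One caveat when simulating: stopping the web server usually makes connections fail immediately with "connection refused", whereas an unreachable data center tends to make connections or reads hang until a timeout expires. The sketch below, in PHP like the client code in the question, simulates the hanging case with a local socket that listens but never replies, and shows a client that gives up quickly. The helper name `callApi`, its parameters, the `/api` path, the request headers and the one-second timeouts are all assumptions chosen for illustration, not the original integration code.

```php
<?php
// Hypothetical helper; the name callApi and its parameters are illustrative.

/**
 * Sends a POST and returns the raw response, or null if the server cannot be
 * reached or does not answer within the given timeouts.
 */
function callApi(string $host, int $port, string $path, string $body,
                 float $connectTimeout = 2.0, int $readTimeout = 2): ?string
{
    // Connect timeout: bounds the wait when the connection cannot be established.
    $fp = @fsockopen($host, $port, $errno, $errstr, $connectTimeout);
    if ($fp === false) {
        return null;
    }

    // Read timeout: bounds the wait when the server accepts but never answers.
    stream_set_timeout($fp, $readTimeout);

    fwrite($fp, "POST $path HTTP/1.1\r\n");
    fwrite($fp, "Host: $host\r\n");
    fwrite($fp, "Content-Type: application/x-www-form-urlencoded\r\n");
    fwrite($fp, "Content-Length: " . strlen($body) . "\r\n");
    fwrite($fp, "Connection: close\r\n\r\n");
    fwrite($fp, $body);

    $result = fread($fp, 2000);
    $timedOut = stream_get_meta_data($fp)['timed_out'];
    fclose($fp);

    if ($timedOut || $result === false || $result === '') {
        return null;
    }
    return $result;
}

// Simulate a stalled server: a local socket that listens but never replies.
$server = stream_socket_server('tcp://127.0.0.1:0', $errno, $errstr);
$address = stream_socket_get_name($server, false);
$port = (int) substr($address, strrpos($address, ':') + 1);

$start = microtime(true);
$response = callApi('127.0.0.1', $port, '/api', 'x=1', 1.0, 1);
$elapsed = microtime(true) - $start;

var_dump($response);
printf("gave up after %.1f seconds\n", $elapsed);
fclose($server);
```

The point of the two separate timeouts is that the page render on the customer's side is delayed by at most their sum, so they should be set as low as the normal response time of the service allows. Whatever the caller does with a `null` result (skip the feature, serve a cached value) is up to the integration.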