I operate an OLTP system that accepts SSL connections over the internet at multiple sites. I would like to find an effective way to transparently and automatically reroute transaction connections when one site is down. Bonus points for a solution that also treats a site as down when it is still reachable and accepting connections but is delayed, overloaded, or sending back bad results.
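To make the "bonus points" criterion concrete, here is a rough sketch of the kind of check I have in mind. It is only an illustration, not something I have running: `site_is_up`, `probe`, `expected` and `max_seconds` are names I made up, and `probe` stands for whatever test transaction a monitor would send to a site.

```python
import time
from typing import Callable


def site_is_up(
    probe: Callable[[], str],
    expected: str,
    max_seconds: float = 2.0,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Treat a site as up only if a test transaction succeeds,
    finishes quickly enough, and returns the expected result.

    All names here are hypothetical; `probe` is assumed to run one
    test transaction against a site and return its response.
    """
    start = clock()
    try:
        result = probe()
    except Exception:
        # Unreachable, refused, SSL failure, and so on.
        return False
    elapsed = clock() - start
    if elapsed > max_seconds:
        # Reachable but delayed or overloaded.
        return False
    # Reachable and fast, but a wrong answer still counts as down.
    return result == expected
```

Note that this only measures how long the probe took after it returned. A real monitor would presumably also need a hard timeout on the connection itself so that a hung site cannot stall the check.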
For example, the user's system would connect to www.abcdef.com or 123.234.56.7 and actually be redirected to one.abcdef.com/two.abcdef.com or 99.5.2.1/68.96.79.1, depending on which site is working. This sounds a lot like load balancing, but the goal is primarily to use the network to avoid a single point of failure rather than to spread the work between servers.
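The difference from load balancing can be sketched as a selection rule: always prefer the first site in a fixed order and move on only when it is considered down. This is just an illustration of the behaviour I am after, under the assumption that some health check exists; `pick_site` and `is_up` are hypothetical names, and where this logic would actually live (DNS, a router, a proxy) is exactly what I am asking about.

```python
from typing import Callable, Optional, Sequence


def pick_site(
    sites: Sequence[str],
    is_up: Callable[[str], bool],
) -> Optional[str]:
    """Return the first site, in priority order, that passes the
    health check, or None if every site is considered down.

    `is_up` is a hypothetical callable that reports whether a site
    is currently healthy.
    """
    for site in sites:
        if is_up(site):
            return site
    return None


# Example: one.abcdef.com is down, so traffic goes to two.abcdef.com.
down = {"one.abcdef.com"}
target = pick_site(
    ["one.abcdef.com", "two.abcdef.com"],
    lambda site: site not in down,
)
print(target)  # two.abcdef.com
```

A load balancer would instead rotate or weight across all healthy sites; here the second site is used only when the first one fails the check.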
The advantages to the user are that (1) they only have to know one URL or one IP address to connect to and (2) their transactions keep working in several different failure scenarios: for example, if the public network near one of the sites fails or is misrouted, if the local loop for that site's ISP fails, or if in-house routers or servers fail. Of course, the transactions still fail if the problem is close to the user.