Here is the situation. A small company I'm working with wants redundant internet access. They run a bunch of services from their office (a website and a POP+SMTP server) and use a VPN for accessing network shares from home. They have 2 independent internet connections from 2 ISPs (one is a local cable provider, the other is DSL). If their default connection goes down, they switch to the backup.

Unfortunately, the way they are set up right now, they have to change their DNS records every time this happens, which means every switch results in at best a few hours of downtime. They want to be completely redundant and be able to switch between networks without the downtime. How can that be accomplished?

Would it be possible to set up a server in a remote data center, point the DNS at it and have it forward the network traffic to the correct IP?

If this was just a web application, I'd set up a basic server and use a 301 redirect or something like that. I don't really care if it has to be switched manually each time because they have a dedicated IT person or two on the staff who could do it. But they also need their POP, SMTP and VPN traffic redirected this way.

So it's almost like load balancing, but not really. Are there existing solutions that would provide this functionality? How would you provide ISP redundancy like that?

+3  A: 

Last time I did it, I had my own ASN and spoke BGP to both ISPs, announcing my own /24 (which you can get from either ISP, or maybe ARIN). You could go down this route, but it's a fair bit of setup, and a random cable/DSL ISP probably won't set this up with you. It does eliminate all the points of failure and makes the switchover completely transparent.

You can also lower the TTLs on the DNS records to 5 minutes or so. This won't give an instant switchover, but 5 minutes may be fast enough.

Otherwise, you certainly can use the remote server in a colo, but then of course that becomes the single point of failure. You have a couple of choices of how to redirect the traffic:

  • GRE/etc. tunnel: You run two tunnels from the colo box, one to each connection at your office. Tunnel all relevant traffic both ways and you wind up with a few IPs from the colo at your office. You can then run a routing protocol (even something simple, like RIP) to make this fail over automatically, or even use both ISPs simultaneously for additional bandwidth. This can be implemented fairly easily on Linux boxes or Cisco routers. I assume Juniper can too, but I've never used them. Failover is transparent (e.g., will not break VPN connections). Beware of MTU issues. If your office connections do not do reverse path filtering (or can make an exception for you), you do not have to tunnel outgoing traffic back to the colo.
  • NAT. May or may not work with whatever protocols you're running, but you can set up 1:1 static NAT at the colo to redirect the traffic. Easily done for common TCP protocols, and maybe your VPN too. Doable on pretty much anything. Failover is not transparent; existing connections will time out. Can also use both connections for additional bandwidth on a per-connection basis. Traffic must be tunneled back to colo.
  • Obvious third answer: Move the services to the colo. Has the advantage of protecting from power outages, too.
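
To make the MTU warning in the GRE option concrete, here is a small worked calculation in Python. It is only a sketch under stated assumptions: plain GRE over IPv4 with no key, checksum, or IP options (24 bytes of overhead per packet), a 1500-byte MTU on the cable link, and a DSL link running PPPoE with a 1492-byte MTU. The link MTUs are assumptions for illustration; check the real values on your own connections.

```python
# Header sizes in bytes. Assumes IPv4 without options and basic GRE
# (no key, sequence number, or checksum fields).
OUTER_IPV4_HEADER = 20
GRE_HEADER = 4
INNER_IPV4_HEADER = 20
TCP_HEADER = 20


def tunnel_mtu(link_mtu: int) -> int:
    """Largest inner packet that fits in one outer packet on the given link."""
    return link_mtu - OUTER_IPV4_HEADER - GRE_HEADER


def tcp_mss(mtu: int) -> int:
    """Largest TCP payload per segment for a given MTU (no IP/TCP options)."""
    return mtu - INNER_IPV4_HEADER - TCP_HEADER


# Link MTUs below are assumed values, not measurements.
for name, link_mtu in [("cable (Ethernet)", 1500), ("DSL (PPPoE)", 1492)]:
    mtu = tunnel_mtu(link_mtu)
    print(f"{name}: tunnel MTU {mtu}, TCP MSS {tcp_mss(mtu)}")
```

Under these assumptions the tunnel over the cable link carries at most 1476-byte packets and the one over DSL 1468, so if both tunnels should behave the same you would size for the smaller one.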
derobert
+2  A: 

I think the DynDNS "Custom DNS" service may help in your situation: the IP behind your domain name can be updated dynamically (many routers have built-in support for it).

Updated: To reduce the downtime, you can create a simple script that pings the primary ISP every so often and, in case of failure, updates DynDNS to point at the secondary one (and switches back the same way once the primary recovers).
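
A minimal sketch of such a script in Python might look like the following. Everything named here is hypothetical: `FailoverMonitor`, `ping_ok`, and `run_forever` are illustrative names, and `update_dns` is a callback you would have to write yourself against whatever update mechanism your DNS provider offers (no DynDNS API is shown or assumed). The `ping` flags assume Linux iputils, and the probe only means something if it actually leaves via the primary link. The threshold of consecutive failures is an added precaution so that a single lost ping does not flip the records.

```python
import subprocess
import time
from typing import Callable


def ping_ok(host: str, timeout_s: int = 2) -> bool:
    """Return True if one ICMP echo succeeds (flags assume Linux iputils ping)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


class FailoverMonitor:
    """Switch the published IP only after `threshold` consecutive probes agree."""

    def __init__(self, primary_ip: str, backup_ip: str,
                 update_dns: Callable[[str], None], threshold: int = 3):
        self.primary_ip = primary_ip
        self.backup_ip = backup_ip
        self.update_dns = update_dns  # hypothetical: wraps your provider's update
        self.threshold = threshold
        self.active_ip = primary_ip
        self._streak = 0  # consecutive probes that disagree with active_ip

    def observe(self, primary_up: bool) -> str:
        """Record one probe result and return the IP currently published."""
        wanted = self.primary_ip if primary_up else self.backup_ip
        if wanted == self.active_ip:
            self._streak = 0
            return self.active_ip
        self._streak += 1
        if self._streak >= self.threshold:
            self.update_dns(wanted)
            self.active_ip = wanted
            self._streak = 0
        return self.active_ip


def run_forever(monitor: FailoverMonitor, probe_host: str,
                interval_s: int = 30) -> None:
    """Probe the primary link periodically; not called automatically."""
    while True:
        monitor.observe(ping_ok(probe_host))
        time.sleep(interval_s)
```

You would construct a `FailoverMonitor` with the two public IPs and your own update function, then call `run_forever` with a host reachable only through the primary link. Clients still hold the old record until its TTL expires, so this shortens downtime rather than removing it.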

Dmitry Khalatov
Came here to say just this. It's not going to eliminate downtime, though. In your situation, nothing but a router somewhere outside your building, with a VIP forwarding all traffic to the right underlying IP, will be truly fault tolerant.
TheSoftwareJedi
A: 

You can set up a route-map with default-nexthop.

This takes care of outgoing traffic only. Redundancy for incoming traffic would require an AS number and BGP announcements of a block that is at least a /24, since ISPs will not announce anything smaller than a /24 block.
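
As a small illustration of the /24 rule, the check below uses Python's standard `ipaddress` module to test whether an IPv4 block is large enough to announce. The function name `is_announceable` and the example blocks (documentation and benchmarking ranges) are made up for this sketch, and the /24 limit is simply the one stated above.

```python
import ipaddress

# Longest IPv4 prefix assumed to be accepted, per the /24 rule above.
MAX_ANNOUNCED_PREFIXLEN = 24


def is_announceable(block: str) -> bool:
    """Return True if the IPv4 block is a /24 or larger (shorter prefix)."""
    network = ipaddress.ip_network(block, strict=True)
    return network.version == 4 and network.prefixlen <= MAX_ANNOUNCED_PREFIXLEN


print(is_announceable("203.0.113.0/24"))  # a /24 is just large enough
print(is_announceable("203.0.113.0/25"))  # half a /24 is too small
```

Note that a larger block has a numerically smaller prefix length, which is why the comparison is `<=`.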