ghjm

Redundant WAN links

I am setting up a short-haul WAN IP connection between two buildings. I have a 100Mbps wireless infrared laser system as my primary link, and a leased T1 as my backup link. I plan to use Linux boxes as routers on both ends. Both the T1 and laser connections will be back-to-back between the two Linux boxes.

We expect that the laser system will go down on a regular basis, for example during heavy rain storms. I want to be able to revert to the T1 with no interruption to client machines and with the absolute minimum possible number of lost packets. The T1 is always up.

I know that this should be doable with routed, gated, etc., but I don't really know how they work. I know the basics of IP routing; the piece I'm missing is how to set up routing for two links so that packets travel over whichever one is running.

Can someone tell me how this is all supposed to work?
zhongbing

I suggest you use static routes between the two Linux boxes.

If you only use static routes, just use the "route add" command to add a new route and "route delete" to remove the original route. You will also need to change the static routing tables of the machines on the two networks.
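As a rough illustration of that route swap, here is a minimal Python sketch that only builds the `route` commands (net-tools syntax) and prints them. All addresses are hypothetical and not taken from this thread: 192.168.2.0/24 stands for the far building's LAN, 10.0.1.2 for the far router's laser address, and 10.0.2.2 for its T1 address. The delete-then-add order is a choice of this sketch.

```python
# Hypothetical addressing, not taken from the thread.
REMOTE_NET = "192.168.2.0"
REMOTE_MASK = "255.255.255.0"
LASER_GW = "10.0.1.2"  # far end of the laser link (assumed)
T1_GW = "10.0.2.2"     # far end of the T1 (assumed)


def route_cmd(action, net, mask, gateway):
    """Return a net-tools `route` command as an argument list."""
    if action not in ("add", "del"):
        raise ValueError("action must be 'add' or 'del'")
    return ["route", action, "-net", net, "netmask", mask, "gw", gateway]


def failover_cmds(net, mask, old_gw, new_gw):
    """Commands that move the route for `net` from old_gw to new_gw."""
    return [
        route_cmd("del", net, mask, old_gw),  # remove the original route
        route_cmd("add", net, mask, new_gw),  # add the replacement route
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in failover_cmds(REMOTE_NET, REMOTE_MASK, LASER_GW, T1_GW):
        print(" ".join(cmd))
```

To apply the change for real, each argument list could be passed to `subprocess.run(cmd, check=True)` as root. Swapping the two gateway arguments gives the commands for moving back to the laser once it recovers.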

If you use RIP on the two networks, you need to write a gated.conf that broadcasts the RIP information to all the other boxes on the two networks. /etc/gated.conf.sample is a good starting point.
ghjm

ASKER

Thanks for your comment. Of course, I already know I could use static routes and change them manually when the connection goes down. What I am looking for is someone with practical experience in dynamic routing to help me set expectations and plan what kind of configuration to use.
As much as it breaks my heart to admit it, Linux is _not_ the answer here.  In the latest and greatest kernel release there is experimental support for the kinds of things you will need to make this work.  If you are supporting a mission critical environment (why else use redundant links?) you need something a little more robust than experimental Linux support.

I strongly suggest that you consider a pair of commercial routers that support OSPF.  Both Bay and Cisco make such routers.

If you really want redundancy, purchase 4 Cisco 2500 series routers and configure a pair at each end of the link with HSRP (Hot-Standby Router Protocol).  HSRP boasts a failover time of as little as three seconds.  Best of all, the clients never know that one or other piece of hardware failed.
ghjm

ASKER

Thanks for the suggestion. Yes, I have a mission critical environment, but there are other factors at work. The setup is for a temporary connection while we transfer labs from one facility to another. I don't want to invest $10,000 or more in Cisco routers, particularly since I will have no use for them when the move is complete.

I have actually got something to work with a failover time of about 20 seconds, which is adequate. It is a bit of a hack job, though, so if someone who knows gated in detail happens to show up, I'd still like to know how to configure what I've described. I am baffled by gated.
ASKER CERTIFIED SOLUTION
MichaelKrastev

ghjm

ASKER

I don't see any way to set a preference on a route. The man page for route says that "metric" is not used by recent kernels. How would you go about doing this?
The kernel I have is possibly out of date (2.0.31), and I see something like this in the ifconfig man page, but not a word in the route man page. I'll take your word for it, and here is what I would do:

1) If the interface metric is not taken into account, this means you are not able to do load balancing between those two Linux boxes.

2) If the metric of each entry in the routing table is not taken into account while routing IP packets, then Linux wouldn't be able to perform useful routing functions. I don't believe this is the case.

There is one more thing to consider: you are not interested in load balancing between the Linux boxes for traffic that originates and terminates at these two machines. You only want to direct the transit traffic through the better link. You may still want to try 'route add ...' with different metrics to see the result.
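For that experiment, here is a small Python sketch that prints two `route add` commands for the same destination, one per link, with different metrics. The addresses and the metric values 1 and 2 are assumptions made up for illustration. Whether the kernel in use actually honors the metric is exactly the open question here, and even where it does, something would presumably still have to remove the laser route when that link fails.

```python
def route_add_with_metric(net, mask, gateway, metric):
    """Build `route add ... metric N` (net-tools syntax) as an argument list."""
    return ["route", "add", "-net", net, "netmask", mask,
            "gw", gateway, "metric", str(metric)]


# Hypothetical addressing: 192.168.2.0/24 is the far building's LAN,
# 10.0.1.2 and 10.0.2.2 are the far router's laser and T1 addresses.
experiment = [
    route_add_with_metric("192.168.2.0", "255.255.255.0", "10.0.1.2", 1),  # laser
    route_add_with_metric("192.168.2.0", "255.255.255.0", "10.0.2.2", 2),  # T1
]

# Dry run: print the commands so they can be reviewed before being run as root.
for cmd in experiment:
    print(" ".join(cmd))
```

After running the printed commands, `route -n` shows which entries were accepted, and a traceroute to a host in the far building shows which link the transit traffic really takes.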

If this is not workable, then a routing protocol will do the job. Here I'll repeat that RIP should do the job as well as OSPF, but it will be easier to configure. The trick with RIP (you can read about it in the man page) is that it keeps only the best route for each destination it is aware of. Going back to the earlier point: even if the interface metric is not used for load balancing, it may be distributed in the routing updates, or you can use /etc/gateways to assign a metric to those two destinations, the laser and T1 interfaces of the opposite Linux box. This way, the updates that come through the laser will have higher preference than those received via the T1. No matter which one is received first, after 30 seconds only the laser interface should be in the routing table. If the laser goes down, after no more than 30 seconds the neighbouring Linux box will send a RIP update message through the T1, and you will then have a route. When the laser is back in service, the neighbour will send an update announcing that a new route is available, and the routing table will be updated accordingly.
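To make the "keeps only the best route" behaviour concrete, here is a toy Python model of a RIP-style table. It is a sketch under assumptions, not how routed or gated is implemented. The class name, the link labels, and the metrics 1 (laser) and 2 (T1) are hypothetical. The 30-second update interval and 180-second route timeout are the standard RIP defaults. How quickly a real daemon drops the stale laser route depends on its timers and on whether it notices the interface going down, so the timeout is a parameter here.

```python
class RipTable:
    """Toy model: at most one (best) route per destination."""

    def __init__(self, timeout=180):
        self.timeout = timeout  # seconds before an unrefreshed route expires
        self.routes = {}        # dest -> (link, metric, last_heard)

    def expire(self, now):
        """Drop routes that have not been refreshed within the timeout."""
        stale = [dest for dest, (_, _, heard) in self.routes.items()
                 if now - heard > self.timeout]
        for dest in stale:
            del self.routes[dest]

    def update(self, dest, link, metric, now):
        """Process one advertisement for `dest` heard on `link` at time `now`."""
        self.expire(now)
        current = self.routes.get(dest)
        # Accept if there is no route yet, if the update refreshes the link
        # already in use, or if it offers a strictly better (lower) metric.
        if current is None or current[0] == link or metric < current[1]:
            self.routes[dest] = (link, metric, now)

    def via(self, dest, now):
        """Return the link currently used for `dest`, or None."""
        self.expire(now)
        route = self.routes.get(dest)
        return route[0] if route else None


if __name__ == "__main__":
    table = RipTable(timeout=180)
    # Both links advertise at t=0 and t=30; the laser then stays silent.
    for now in (0, 30):
        table.update("far-lan", "t1", 2, now)
        table.update("far-lan", "laser", 1, now)
    print("t=30 :", table.via("far-lan", 30))
    for now in range(60, 241, 30):
        table.update("far-lan", "t1", 2, now)  # only the T1 is still heard
    print("t=240:", table.via("far-lan", 240))
    table.update("far-lan", "laser", 1, 300)   # laser back in service
    print("t=300:", table.via("far-lan", 300))
```

In this model the T1 route is installed only once the laser route has aged out, and the laser route replaces it immediately when a better-metric update arrives. Passing a smaller `timeout` shows how the failover delay tracks that timer.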