Redundant WAN links

Posted on 1998-10-14
Last Modified: 2012-06-21
I am setting up a short-haul WAN IP connection between two buildings. I have a 100Mbps wireless infrared laser system as my primary link, and a leased T1 as my backup link. I plan to use Linux boxes as routers on both ends. Both the T1 and laser connections will be back-to-back between the two Linux boxes.

We expect that the laser system will go down on a regular basis, for example during heavy rain storms. I want to be able to revert to the T1 with no interruption to client machines and with the absolute minimum possible number of lost packets. The T1 is always up.

I know that this should be doable with routed, gated, etc., but I don't really know how they work. I know the basics of IP routing; the piece I'm missing is how to set up routing for two links so that packets travel over whichever one is running.

Can someone tell me how this is all supposed to work?
Question by:ghjm

Expert Comment

ID: 1587183
I suggest you use static routes between the two Linux boxes.

If you only use static routes, just use the "route add" command to add a new route and "route del" to remove the original route. You will also need to change the static routing tables of the machines on the two networks.
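As a rough sketch of the manual swap described above, the following Python builds the two net-tools `route` invocations without running them. The network address, netmask, and interface names (`eth1` for the laser side, `ppp0` for the T1) are made up for illustration, and the delete-then-add order is just one possible choice.

```python
from typing import List


def route_cmd(action: str, net: str, netmask: str, iface: str) -> List[str]:
    """Build one net-tools `route` invocation as an argv list (not executed)."""
    if action not in ("add", "del"):
        raise ValueError("action must be 'add' or 'del'")
    return ["route", action, "-net", net, "netmask", netmask, "dev", iface]


def failover_cmds(net: str, netmask: str, old_iface: str, new_iface: str) -> List[List[str]]:
    """Commands to move the route for `net` from old_iface to new_iface."""
    return [
        route_cmd("del", net, netmask, old_iface),  # drop the route over the failed link
        route_cmd("add", net, netmask, new_iface),  # install the route over the backup link
    ]


if __name__ == "__main__":
    # Hypothetical remote network and interface names.
    for argv in failover_cmds("192.168.2.0", "255.255.255.0", "eth1", "ppp0"):
        print(" ".join(argv))
```

The same two commands, with the interfaces reversed, would move traffic back once the primary link returns.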

If you use RIP in the two networks, you need to write a gated.conf that broadcasts the RIP information to all other boxes in the two networks. The /etc/gated.conf.sample file is a good starting point.

Author Comment

ID: 1587184
Thanks for your comment. Of course, I already know I could use static routes and change them manually when the connection goes down. What I am looking for is someone with practical experience in dynamic routing to help me set expectations and plan what kind of configuration to use.

Expert Comment

ID: 1587185
As much as it breaks my heart to admit it, Linux is _not_ the answer here.  In the latest and greatest kernel release there is experimental support for the kinds of things you will need to make this work.  If you are supporting a mission critical environment (why else use redundant links?) you need something a little more robust than experimental Linux support.

I strongly suggest that you consider a pair of commercial routers that support OSPF.  Both Bay and Cisco make such routers.

If you really want redundancy, purchase 4 Cisco 2500 series routers and configure a pair at each end of the link with HSRP (Hot-Standby Router Protocol).  HSRP boasts a failover time of as little as three seconds.  Best of all, the clients never know that one or other piece of hardware failed.

Author Comment

ID: 1587186
Thanks for the suggestion. Yes, I have a mission critical environment, but there are other factors at work. The setup is for a temporary connection while we transfer labs from one facility to another. I don't want to invest $10,000 or more in Cisco routers, particularly since I will have no use for them when the move is complete.

I have actually got something to work with a failover time of about 20 seconds, which is adequate. It is a bit of a hack job, though, so if someone who knows gated in detail happens to show up, I'd still like to know how to configure what I've described. I am baffled by gated.

Accepted Solution

MichaelKrastev earned 100 total points
ID: 1587187
I wouldn't hesitate to recommend Linux boxes for this routing task. And you will not be the first in this area ...
I am not sure about the T1 interfaces; I presume you have high speed serial cards.
I assume both interfaces, the 100 Mbps laser link and the T1, are connected to one Linux box, and that another Linux box at the opposite side terminates both lines.
If this is the situation, why use a routing protocol? Just add static routes, giving higher preference to your faster line and lower preference to the T1. As these are directly attached interfaces, you should get immediate failover at the moment the kernel detects the laser is down, and the recovery should be instantaneous.
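To make the idea of preference concrete, here is a small Python model of the selection rule: among routes to the same destination, use the lowest-metric one whose interface is up. This is only a sketch of the logic, not how the kernel implements it, and the interface names, metrics, and network are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Route:
    dest: str    # destination network
    iface: str   # outgoing interface (hypothetical names)
    metric: int  # lower value means more preferred


def pick_route(routes: List[Route], dest: str, link_up: Dict[str, bool]) -> Optional[Route]:
    """Return the lowest-metric route to dest whose interface is up, or None."""
    candidates = [r for r in routes if r.dest == dest and link_up.get(r.iface, False)]
    return min(candidates, key=lambda r: r.metric, default=None)


# Two static routes to the same remote network: laser preferred, T1 as backup.
ROUTES = [
    Route("10.2.0.0/16", "laser0", 1),
    Route("10.2.0.0/16", "t1", 5),
]

if __name__ == "__main__":
    print(pick_route(ROUTES, "10.2.0.0/16", {"laser0": True, "t1": True}))
    print(pick_route(ROUTES, "10.2.0.0/16", {"laser0": False, "t1": True}))
```

With both links up the laser route is chosen; once the laser interface is marked down, the T1 route is chosen without any change to the route list.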
If you insist on routing protocols, make sure you understand the difference between link-state and distance-vector routing protocols and know what to expect from them.

OSPF, as was suggested, provides near-instantaneous failover. It consumes more memory and CPU power, but this should not be an issue with your two Linux boxes and two lines. On UNIX machines, gated supports OSPF.

RIP (and its updated variant RIP2) is a distance-vector protocol, and by definition (RFC 1058) you may have to wait up to 30 sec before updates are propagated, unless the router (or the routing process on a UNIX box) supports the split horizon and poisoned reverse features. Also, if a router suddenly goes down, the timeout is 6 times 30 sec, i.e. 180 sec. There is routed on every UNIX (and Linux) system, and it is much easier to configure than gated.
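The timer arithmetic above can be written out in a few lines of Python. The constants are the 30 sec update interval and the factor of 6 mentioned in the text; the function names are made up for this illustration.

```python
RIP_UPDATE_INTERVAL_S = 30  # periodic update interval cited above
RIP_TIMEOUT_UPDATES = 6     # update intervals that may pass before a route expires


def rip_route_timeout_s(interval_s: int = RIP_UPDATE_INTERVAL_S,
                        missed: int = RIP_TIMEOUT_UPDATES) -> int:
    """Seconds a route survives without being refreshed by an update."""
    return interval_s * missed


def seconds_until_expiry(age_of_last_update_s: int) -> int:
    """Remaining lifetime of a route last refreshed age_of_last_update_s ago."""
    return max(0, rip_route_timeout_s() - age_of_last_update_s)


if __name__ == "__main__":
    print(rip_route_timeout_s())     # 6 * 30
    print(seconds_until_expiry(30))  # one update interval already elapsed
```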

But again, in your simple configuration, RIP's timeouts should not worry you. Your Linux boxes are directly attached and the routing process will be immediately notified (and should switch to the operating link).

Setting up routed shouldn't take you more than 15 minutes (including reading the man page).

Author Comment

ID: 1587188
I don't see any way to set a preference on a route. The man page for route says that "metric" is not used by recent kernels. How would you go about doing this?

Expert Comment

ID: 1587189
The kernel I have, 2.0.31, is possibly out of date. I see something like this in the ifconfig man page, but not a word in the route man page. I'll take your word for it, and here is what I would do:

1) If the interface metric is not taken into account, this means you are not able to do load balancing between those two Linux boxes.

2) If the metric of each entry in the routing table is not taken into account while routing IP packets, then Linux wouldn't be able to perform useful routing functions. I don't believe this is the case.

There is one more thing to consider: you are not interested in load balancing between the Linux boxes for traffic that originates and terminates at these two machines. You only want to direct transit traffic through the better link. You may still want to try 'route add ...' with different metrics to see the result.

If this is not workable, then a routing protocol will do the job. Here I'll repeat that RIP should do the job as well as OSPF, but it will be easier to configure. The trick with RIP is (you can read this in the man page) that it keeps only the best route for each destination it is aware of. Going back to the metric question: even if the interface metric is not used for load balancing, it may be distributed in the routing updates, or you can use /etc/gateways to assign a metric to those two destinations, the laser and T1 interfaces of the opposite Linux box. This way, the updates that come through the laser will have higher preference than those received via the T1. No matter which one is received first, after 30 seconds only the laser interface should be in the routing table. If the laser goes down, after no more than 30 sec the neighbouring Linux box will send a RIP update message through the T1, and you will have a route again. When the laser is back in service, the neighbour will send an update announcing that a new route is available, and the routing table will be updated accordingly.
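The behaviour described above can be modelled with a toy table that keeps only the best route per destination. This is a simplified sketch and not routed's actual code: "higher preference" is represented as a lower metric, the interface names, metrics, and `lab-net` destination are invented, and `link_down` assumes the routing process is told immediately when a directly attached link fails.

```python
from dataclasses import dataclass
from typing import Dict

ROUTE_TIMEOUT_S = 180  # 6 update intervals of 30 sec


@dataclass
class Entry:
    via: str           # interface the update arrived on
    metric: int        # lower is better
    last_heard: float  # time of the last update that refreshed this entry


class RipTable:
    """Toy distance-vector table: one (best) route per destination."""

    def __init__(self) -> None:
        self.routes: Dict[str, Entry] = {}

    def receive(self, dest: str, via: str, metric: int, now: float) -> None:
        cur = self.routes.get(dest)
        # Accept if there is no route yet, if the update comes from the
        # current next hop (refresh), or if it is strictly better.
        if cur is None or via == cur.via or metric < cur.metric:
            self.routes[dest] = Entry(via, metric, now)

    def link_down(self, via: str) -> None:
        # Assumption: a directly attached link failure is noticed at once.
        self.routes = {d: e for d, e in self.routes.items() if e.via != via}

    def expire(self, now: float) -> None:
        # Drop routes that have not been refreshed within the timeout.
        self.routes = {d: e for d, e in self.routes.items()
                       if now - e.last_heard < ROUTE_TIMEOUT_S}


if __name__ == "__main__":
    t = RipTable()
    t.receive("lab-net", "t1", 2, now=0)      # T1 update happens to arrive first
    t.receive("lab-net", "laser", 1, now=5)   # better metric replaces it
    t.receive("lab-net", "t1", 2, now=30)     # worse metric, ignored
    t.link_down("laser")                      # laser fails
    t.receive("lab-net", "t1", 2, now=60)     # next T1 update installs the backup
    print(t.routes["lab-net"].via)
```

If the link failure were not reported directly, the laser entry would instead linger until `expire` removed it after the 180 sec timeout, which is the slower case mentioned earlier in the thread.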
