petranator2011
asked on
Reliable Backup Static Routing using Object Tracking
Hi All,
I've been tasked with setting up proper failover on a router running Cisco IOS 12.4.
The connectivity consists of a T1 (serial) primary connection, and another firewall hosting a VPN over an internet connection for the backup connection.
I've built it out, and it all works great... except....
This T1 is a little schizophrenic. Every now and then it drops a packet or two (about 3 times a minute). This has no effect whatsoever on our primary use of this connection, which is telnet traffic for an AS/400.
The problem is, the tracked SLA changes from up to down and back three times a minute. Meaning it changes the routing three times a minute. This kind of behaviour is VERY disruptive to the AS/400 traffic.
Here's my wish - I want the SLA to ONLY switch state if it loses, say, 10 consecutive pings. I thought the answer was the "threshold" quantity on the SLA, but not only does it seem to have no effect (the state still changes), much of my reading says it's connected to a "hysteresis" function, which I don't really understand. Even if I crank the "threshold" up to ridiculous quantities (30000, say) it still logs the tracked object as changing state just as frequently.
The "frequency" is just how often the SLA pings. I've increased this quantity too, but really it's like Russian roulette as to whether it gets a good ping or a bad ping when it goes off.
Can anyone tell me what I'm missing here?
Thanks,
Nate
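For readers following along, the behaviour Nate is asking for (only change state after N consecutive lost pings) can be sketched in a few lines of Python. This is a minimal model of the wish, not of how IOS implements tracking; the names `debounce`, `down_after` and `up_after` are made up for illustration. On IOS, damping like this is normally configured on the tracked object itself (the `delay` command under `track`) rather than on the SLA probe, but that is not shown in this thread and should be checked against your release.

```python
def debounce(results, down_after=10, up_after=1):
    """Yield the tracked state ("up"/"down") after each probe result.

    results: iterable of booleans, True means the ping was answered.
    down_after / up_after are hypothetical knobs, not IOS parameters.
    """
    state = "up"
    fails = 0  # consecutive lost pings
    oks = 0    # consecutive answered pings
    for ok in results:
        if ok:
            oks += 1
            fails = 0
        else:
            fails += 1
            oks = 0
        # Only flip state once the consecutive count reaches the limit.
        if state == "up" and fails >= down_after:
            state = "down"
        elif state == "down" and oks >= up_after:
            state = "up"
        yield state


if __name__ == "__main__":
    flaky = [True, False, False, True, True]   # a couple of dropped pings
    outage = [False] * 10 + [True]             # a real outage, then recovery
    print(list(debounce(flaky)))
    print(list(debounce(outage)))
```

With a flaky line that drops one or two pings at a time, the state never leaves "up"; only a run of ten lost pings in a row flips it to "down".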
ASKER
ip sla monitor 1
type echo protocol ipIcmpEcho 10.25.2.1 source-ipaddr 10.25.22.3
timeout 1000
threshold 30000
frequency 15
ip sla monitor schedule 1 life forever start-time now
************
As I understand it, that should set the timeout on each ping to 1000 ms and repeat the ping every 15 seconds. As I mentioned before, changing the threshold seems to have no effect on how the tracked object behaves at all. Currently I have it set to 30000, whatever unit that is in.
Nate
either increase the frequency of the ping or the amount of the threshold
the threshold is in milliseconds
I'd try a frequency of 1, or a threshold of 150,000
for clarity...
take the number of failed pings you want to trigger failover, multiply it by the ping frequency in seconds, then multiply it by 1000
so for 10 failed pings at 15 second intervals, 10 * 15 * 1000 = 150,000
This means you could be down for 150 seconds before failover
for your particular requirement, I would have a more frequent ping (1 per second) and have it lose no more than 30
so for 30 failed pings at 1 second intervals, 30 * 1 * 1000 = 30,000. This should be more appropriate for your T1
I would have called out a fault on your T1 a long time ago...
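The arithmetic in this answer can be written out as a small Python helper. It only reproduces the rule of thumb as stated above (failed pings × frequency in seconds × 1000); the function names are hypothetical, and whether the SLA `threshold` setting really behaves as a failure window on a given IOS release is an assumption to verify, since the asker reports that raising it had no effect.

```python
def threshold_ms(failed_pings, frequency_s):
    """Rule of thumb from the answer: pings to lose * ping interval (s) * 1000."""
    return failed_pings * frequency_s * 1000


def seconds_before_failover(failed_pings, frequency_s):
    """How long the link could be down before failover under that rule."""
    return threshold_ms(failed_pings, frequency_s) // 1000


if __name__ == "__main__":
    # The two cases worked through in the answer.
    print(threshold_ms(10, 15), seconds_before_failover(10, 15))
    print(threshold_ms(30, 1), seconds_before_failover(30, 1))
```

The first case gives 150,000 (150 seconds of possible downtime) and the second gives 30,000 (30 seconds), matching the figures quoted in the answer.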
ASKER
Thank you all.
Billy