Ian Arakel (ASKER)

Link issue

Hi Team,

We have two sites, Site A and Site B, connected via two physical links that belong to the same ISP.
The routing protocol that we use is OSPF.


Site A=========================Site B

Issue:
Yesterday multiple teams reported session disconnects between servers at Site A and Site B.
The OSPF neighborship stayed intact, so we did not suspect an issue with the physical media.
As a precaution, we raised a ticket with the ISP to confirm that their media was clean.
They told us that they had observed issues on a link in their cloud connecting their POP devices, which led to the outage and was later rectified.

Query:
Why were no OSPF flap logs received on the L3 core where the links are terminated?

giltjr

Are the two links defined as two independent L3 connections? Or is the ISP bonding the links so they appear as a single logical link, like an EtherChannel?

Ian Arakel (ASKER)

They are two independent L3 links.
No bonding exists at the ISP end.

giltjr

What are your timeout values?

Can you find out from your ISP whether there was a hard break or whether the issue was intermittent? When we see something like this, the problem is intermittent, so it causes performance issues but does not cause issues with OSPF.

Ian Arakel (ASKER)

The ISP has given the statement below:

“Network Engineering have identified the issue on a link between our xxx POP and our xxx POP. We have isolated the traffic and you should see your circuits up.”

Kindly elaborate on the timeout values point that you raised.

Below are the timers configured:

 Timer intervals configured, Hello 10, Dead 40, Wait 40, Retransmit 5
 oob-resync timeout 40

Just to add, we have two SVIs defined at the Site A and Site B locations, for vlan x and vlan y, and OSPF is running on them.

giltjr

Based on their statement, I would assume that the issue was not a hard break but something that would cause performance issues, such as increased latency and/or random dropped packets. That could cause session disconnects but still allow OSPF messages to get through.

Ian Arakel (ASKER)

Hi there,

Kindly confirm if there are cases where OSPF messages get through even during link degradation.

Please note: a point-to-point source-based ping was working fine throughout the outage, which was really strange.

giltjr

If a link is hard down, that is, NO traffic makes it through for longer than your OSPF dead interval, then OSPF will declare the neighbor down and remove the routes learned over that link.

However, if the link is just experiencing degradation for some reason (link flapping, packet drops, or high link utilization for more than just a few seconds), then odds are some OSPF messages will get through and keep the dead timer from expiring.
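
A rough way to see why the adjacency survives: with Hello 10 and Dead 40 as posted, the dead timer only expires if about four consecutive hellos are lost. The sketch below assumes each hello is dropped independently with the same probability, which is a simplification (real degradation is often bursty), and the function names are illustrative only.

```python
# Rough model, not a protocol implementation: each OSPF hello is assumed
# to be lost independently with probability `loss`.
HELLO_INTERVAL = 10  # seconds, from the posted interface timers
DEAD_INTERVAL = 40   # seconds, from the posted interface timers


def hellos_per_dead_interval(hello=HELLO_INTERVAL, dead=DEAD_INTERVAL):
    """Approximate number of consecutive hellos that must be lost
    before the dead timer expires."""
    return dead // hello


def p_dead_timer_expires(loss, hello=HELLO_INTERVAL, dead=DEAD_INTERVAL):
    """Probability that one given run of consecutive hellos is entirely lost."""
    if not 0.0 <= loss <= 1.0:
        raise ValueError("loss must be between 0 and 1")
    return loss ** hellos_per_dead_interval(hello, dead)


if __name__ == "__main__":
    for loss in (0.01, 0.10, 0.30, 0.50):
        print(f"{loss:.0%} packet loss -> {p_dead_timer_expires(loss):.8f}")
```

Under this model, even 10% packet loss gives only about a 1-in-10,000 chance that a given run of four hellos is lost, while application sessions would likely already be suffering from retransmissions and timeouts.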

If your pings were working, then OSPF messages would most likely also make it through. Also, if pings were working during the "outage", then the link was not hard down, but degraded. Technically that is a big difference, but a hard down is actually better (when you have a backup/alternate link) than a degraded link.
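
As a side note on why a short ping test can look clean on a degraded link, here is a minimal sketch under the same assumption of independent, uniform packet loss (an assumption, not something the ISP confirmed). It estimates how likely a batch of pings is to show no loss at all and how many probes are needed to see at least one drop. The function names are hypothetical.

```python
import math


def p_all_pings_succeed(loss, count):
    """Probability that `count` pings all get through when each one
    is lost independently with probability `loss`."""
    return (1.0 - loss) ** count


def pings_needed(loss, confidence=0.95):
    """Smallest number of pings for which at least one drop is seen
    with probability >= `confidence`."""
    if not 0.0 < loss < 1.0 or not 0.0 < confidence < 1.0:
        raise ValueError("loss and confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - loss))


if __name__ == "__main__":
    print(p_all_pings_succeed(0.01, 5))  # chance that a 5-packet ping looks clean at 1% loss
    print(pings_needed(0.01))            # probes needed to catch 1% loss with 95% confidence
```

With 1% loss, a default handful of pings will look clean most of the time, so a longer run with a high packet count is a better indicator than a quick spot check.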

Ian Arakel (ASKER)

Hi Expert,

Just one last query.
Link degradation should ideally have shown up as packet drops in the end-to-end source-based ping.
That did not happen here.
We had raised a ticket with the vendor purely as a precaution, but coincidentally there was an outage at their end, and fixing it resolved the situation.

My Query:
As a network admin, how can we detect an issue in the ISP cloud when the end-to-end ping works fine?
Also, has anyone in this forum experienced such a weird scenario in practice? This is definitely a first for me.

ASKER CERTIFIED SOLUTION
giltjr

(This solution is only available to members of Experts Exchange.)

Ian Arakel (ASKER)

Thanks a lot. Appreciate the same.
Thank you.