MightyMikey asked:

Cisco P2P Over T1 Circuit dropping connection randomly

Salutations Experts,

I have a predicament on my hands after spending weeks setting up the network for a few offsite locations.

I had asked for help on the subject before, and with the solutions given I was able to deploy the network with ease. However, we are now having an issue with the connection dropping out from time to time. I believe it is a hardware issue: either the Cisco 1721 is defective or the WIC is defective. My supervisor has asked me to post the configuration for both sites and have the experts take a look. I may have overlooked a configuration command, which could be causing the problem.

The network is set up as follows. Branch A uses a Cisco 2821 with a dual serial port WIC. It acts as the central hub through which the two other branches (Branch B and Branch C, each using a Cisco 1721) connect with each other, and it provides their internet connection as well. The issue is only at Branch B.

Here are the two config files:

Branch A

Using 1691 out of 245752 bytes
!
version 12.4
service timestamps debug datetime msec
service timestamps log datetime msec
service password-encryption
!
hostname Branch A
!
boot-start-marker
boot-end-marker
!
card type t1 0 0
! card type command needed for slot/vwic-slot 0/1
enable secret 5 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
!
no aaa new-model
!
resource policy
!
no network-clock-participate wic 0
ip subnet-zero
!
!
ip cef
!
!
login on-failure log
login on-success log
!
!
!
controller T1 0/0/0
 framing esf
 clock source internal
 linecode b8zs
 channel-group 0 timeslots 1-24
!
controller T1 0/0/1
 framing esf
 clock source internal
 linecode b8zs
 channel-group 0 timeslots 1-24
!
!
interface GigabitEthernet0/0
 description Branch A
 ip address x.x.0.250 x.x.255.0
 duplex auto
 speed auto
 no mop enabled
!
interface GigabitEthernet0/1
 no ip address
 shutdown
 duplex auto
 speed auto
!
interface Serial0/0/0:0
 description T1 Circuit to Branch B
 ip address x.x.101.251 x.x.255.0
 encapsulation ppp
!
interface Serial0/0/1:0
 description T1 Circuit to Branch C
 ip address x.x.101.250 x.x.255.0
 encapsulation ppp
!
router rip
 version 2
 network x.x.0.0
 network x.x.1.0
 network x.x.4.0
 network x.x.101.0
 no auto-summary
!
ip default-gateway x.x.0.2
ip classless
ip route 0.0.0.0 0.0.0.0 x.x.0.2
ip route x.x.1.0 x.x.255.0 x.x.101.253
ip route x.x.4.0 x.x.255.0 x.x.101.252
!
no ip http server
!
!
control-plane
!
!
line con 0
 password 7 xxxxxxxxxxxxxxxxxxxxxxx
 login
line aux 0
line vty 0
 password 7 xxxxxxxxxxxxxxxxxxxxxxx
 login
line vty 1 4
 no login
!
scheduler allocate 20000 1000
!
end


Branch B

Using 1239 out of 29688 bytes
!
version 12.3
service timestamps debug uptime
service timestamps log uptime
service password-encryption
!
hostname Branch B
!
boot-start-marker
boot-end-marker
!
enable secret 5 xxxxxxxxxxxxxxxxxxxxxx
enable password 7 xxxxxxxxxxxxxxxxxxxxxxx
!
no aaa new-model
ip subnet-zero
!
!
!
ip dhcp pool Branch B
   network x.x.4.0 x.x.255.0
   default-router x.x.4.250
   dns-server x.x.0.10
   domain-name DOMAIN.local
!
ip dhcp pool Branch B
!
!
no ip domain lookup
ip cef
no scripting tcl init
no scripting tcl encdir
!
!
!
!
interface FastEthernet0
 description Branch B LAN
 ip address x.x.4.250 x.x.255.0
 speed auto
 full-duplex
!
interface Serial0
 description T1 Circuit to Branch A
 ip address x.x.101.252 x.x.255.0
 encapsulation ppp
 service-module t1 timeslots 1-24
!
router rip
 version 2
 network x.x.0.0
 network x.x.1.0
 network x.x.4.0
 network x.x.101.0
 no auto-summary
!
ip default-gateway x.x.101.251
ip classless
ip route 0.0.0.0 0.0.0.0 x.x.101.251
no ip http server
!
!
!
control-plane
!
!
line con 0
 password 7 xxxxxxxxxxxxxxxxxxxxxx
 login
line aux 0
line vty 0 4
 password 7 xxxxxxxxxxxxxxxxxxxxxx
 login
!
!
end



Is there a diagnostic tool, either built into the router or third-party software, that can run an analysis of our routers to determine whether this is a hardware issue and just a matter of replacing the router or WIC?

Thank you very much.
Soulja

Are you seeing any type of errors on the serial between A and B?
MightyMikey (ASKER)

None that I am aware of. Does the router keep a log of the errors, or are you referring to errors it displays in real time when it goes down?

Let me add that it goes down maybe 1 to 2 times a day. A previous Cisco 1721 we had on there kept going down every 10-25 mins.

So far today it has not gone down at all. It went down twice yesterday. Once in the morning and I believe one other time in the afternoon. We don't have a tech in those locations and I travel there rarely unless needed. We have just told the staff to power down and power back up the router to establish the connection once again.
Update: I have just been notified that the router has been going down today. They reset it about every 45 minutes. The branch was not notifying us here at corporate office. We just found out because we asked out of curiosity.
When you look at "sh interface serialx/x" on the routers, do they show any errors?
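
A minimal sketch of the built-in checks, assuming the interface names from the posted configs (Serial0/0/0:0 on the 2821, Serial0 on the 1721 with its integrated CSU/DSU WIC); adjust them to your hardware. These are standard IOS show commands. Note that the buffered log lives in RAM, so power cycling the router erases it unless messages are also sent to a syslog server.

```
! Branch A (2821)
show interfaces Serial0/0/0:0
show logging

! Branch B (1721)
show interfaces Serial0
show service-module serial 0
show logging
```

Capturing this output before the branch staff power cycle the router preserves the counters and log entries from the outage.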
From Site B
Serial0 is up, line protocol is up
  Hardware is PQUICC with Fractional T1 CSU/DSU
  Description: T1 Circuit to Branch A
  Internet address is x.x.101.252/24
  MTU 1500 bytes, BW 1536 Kbit, DLY 20000 usec,
     reliability 255/255, txload 5/255, rxload 6/255
  Encapsulation PPP, LCP Open
  Open: CDPCP, IPCP, loopback not set
  Last input 00:00:00, output 00:00:00, output hang never
  Last clearing of "show interface" counters 00:01:44
  Input queue: 1/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: weighted fair
  Output queue: 0/1000/64/0 (size/max total/threshold/drops)
     Conversations  0/2/256 (active/max active/max total)
     Reserved Conversations 0/0 (allocated/max allocated)
     Available Bandwidth 1152 kilobits/sec
  5 minute input rate 42000 bits/sec, 20 packets/sec
  5 minute output rate 32000 bits/sec, 20 packets/sec
     5041 packets input, 1471422 bytes, 0 no buffer
     Received 0 broadcasts, 0 runts, 0 giants, 0 throttles
     1 input errors, 1 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
     4856 packets output, 1166135 bytes, 0 underruns
     0 output errors, 0 collisions, 1 interface resets



From Site A
Serial0/0/0:0 is up, line protocol is up
  Hardware is GT96K Serial
  Description: T1 Circuit to Branch B
  Internet address is x.x.101.251/24
  MTU 1500 bytes, BW 1536 Kbit, DLY 20000 usec,
     reliability 255/255, txload 3/255, rxload 1/255
  Encapsulation PPP, LCP Open
  Open: CDPCP, IPCP, loopback not set
  Keepalive set (10 sec)
  Last input 00:00:05, output 00:00:01, output hang never
  Last clearing of "show interface" counters 3w6d
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 24973
  Queueing strategy: weighted fair
  Output queue: 0/1000/64/24675 (size/max total/threshold/drops)
     Conversations  0/42/256 (active/max active/max total)
     Reserved Conversations 0/0 (allocated/max allocated)
     Available Bandwidth 1152 kilobits/sec
  5 minute input rate 0 bits/sec, 2 packets/sec
  5 minute output rate 20000 bits/sec, 2 packets/sec
     3540393 packets input, 581177844 bytes, 0 no buffer
     Received 0 broadcasts, 0 runts, 31 giants, 0 throttles
     272 input errors, 272 CRC, 58 frame, 22 overrun, 0 ignored, 96 abort
     4891201 packets output, 142536288 bytes, 0 underruns
     0 output errors, 0 collisions, 72 interface resets
     0 output buffer failures, 0 output buffers swapped out
     77 carrier transitions
  Timeslot(s) Used:1-24, SCC: 0, Transmitter delay is 0 flags



They just had to reset it again right now at Site B at 2:40 pm CST
The cables were tested and all showed green. The only cables I don't recall testing were the drops made from the DMARK to the server room. Two cables were run. Site A to Site C has not been reported to have any issues, but I will double check tomorrow.

I have a feeling that it is the WIC. I need to double check if we have a spare available to use.

I will also have the ISP run tests on the circuit.

I will post my findings as soon as possible.

Thanks for your help. I appreciate it. Been in hot water because the deployment was not successful.
pergr

Hi there,

On the Branch A router, change under the 'controller T1 0/0/0' the setting

clock source internal
to
clock source line

Particularly, you will want to do this towards the problematic branch.

I can see in Branch B that there is only a Serial interface, which means there is an external CSU/DSU. It is possible this has been set up as 'master' for clocking, and then the T1 interface on Branch A has to take clock from the line, and not also try to be master (with 'internal' clock).
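
As a sketch of where the clock setting lives on each end, assuming the interface names from the posted configs: on the 2821 it sits under the T1 controller, while on a 1721 with an integrated CSU/DSU WIC it is a service-module command under the serial interface. The service-module default is 'line', so it normally does not appear in the running config. On a circuit where the carrier does not supply timing, the usual arrangement is one end on 'internal' and the other on 'line', so check both ends before changing either.

```
! Branch A (2821): take timing from the line instead of the internal oscillator
configure terminal
controller T1 0/0/0
 clock source line
end

! Branch B (1721, integrated CSU/DSU): set and verify the CSU/DSU clock source
configure terminal
interface Serial0
 service-module t1 clock source line
end
show service-module serial 0
```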
I have the Branch A Cisco 2821 as clock source internal and the Branch B/C Cisco 1721s as clock source line. I will double check just to be sure.

Thanks
@pergr I haven't made the changes just yet. I got in contact with our ISP and they will be running tests on the circuit. Looking at the serial interface at each branch, Branch A is the one with the highest input errors, and its input error count matches its CRC error count. The other branches, however, show very low input errors and few to no CRC errors.

For example at Branch B, it currently shows:

Serial0 is up, line protocol is up
  Hardware is PQUICC with Fractional T1 CSU/DSU
  Description: T1 Circuit to Branch A
  Internet address is x.x.101.252/24
  MTU 1500 bytes, BW 1536 Kbit, DLY 20000 usec,
     reliability 255/255, txload 1/255, rxload 40/255
  Encapsulation PPP, LCP Open
  Open: CDPCP, IPCP, loopback not set
  Last input 00:00:00, output 00:00:00, output hang never
  Last clearing of "show interface" counters 00:34:42
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: weighted fair
  Output queue: 0/1000/64/0 (size/max total/threshold/drops)
     Conversations  0/28/256 (active/max active/max total)
     Reserved Conversations 0/0 (allocated/max allocated)
     Available Bandwidth 1152 kilobits/sec
  5 minute input rate 243000 bits/sec, 18 packets/sec
  5 minute output rate 12000 bits/sec, 6 packets/sec
     52266 packets input, 57918828 bytes, 0 no buffer
     Received 0 broadcasts, 0 runts, 0 giants, 0 throttles
     29 input errors, 0 CRC, 29 frame, 0 overrun, 0 ignored, 0 abort
     38988 packets output, 7156189 bytes, 0 underruns
     0 output errors, 0 collisions, 1 interface resets
     0 output buffer failures, 0 output buffers swapped out
     1 carrier transitions
     DCD=up  DSR=up  DTR=up  RTS=up  CTS=up


Branch A (Includes Serials to both Branches B and C)
Serial0/0/0:0 is up, line protocol is up
  Hardware is GT96K Serial
  Description: T1 Circuit to Branch C
  Internet address is x.x.101.251/24
  MTU 1500 bytes, BW 1536 Kbit, DLY 20000 usec,
     reliability 255/255, txload 33/255, rxload 1/255
  Encapsulation PPP, LCP Open
  Open: CDPCP, IPCP, loopback not set
  Keepalive set (10 sec)
  Last input 00:00:01, output 00:00:01, output hang never
  Last clearing of "show interface" counters 3w6d
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 28582
  Queueing strategy: weighted fair
  Output queue: 0/1000/64/28023 (size/max total/threshold/drops)
     Conversations  0/42/256 (active/max active/max total)
     Reserved Conversations 0/0 (allocated/max allocated)
     Available Bandwidth 1152 kilobits/sec
  5 minute input rate 3000 bits/sec, 3 packets/sec
  5 minute output rate 203000 bits/sec, 8 packets/sec
     4154931 packets input, 650050314 bytes, 0 no buffer
     Received 0 broadcasts, 0 runts, 35 giants, 0 throttles
     1470267 input errors, 1470267 CRC, 628024 frame, 351366 overrun, 0 ignored, 1105061 abort
     5852983 packets output, 719141126 bytes, 0 underruns
     0 output errors, 0 collisions, 107 interface resets
     0 output buffer failures, 0 output buffers swapped out
     99 carrier transitions
  Timeslot(s) Used:1-24, SCC: 0, Transmitter delay is 0 flags
Serial0/0/1:0 is up, line protocol is up
  Hardware is GT96K Serial
  Description: T1 Circuit to Branch C
  Internet address is x.x.101.250/24
  MTU 1500 bytes, BW 1536 Kbit, DLY 20000 usec,
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation PPP, LCP Open
  Open: CDPCP, IPCP, loopback not set
  Keepalive set (10 sec)
  Last input 00:00:16, output 00:00:00, output hang never
  Last clearing of "show interface" counters 3w6d
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 7216
  Queueing strategy: weighted fair
  Output queue: 0/1000/64/7216 (size/max total/threshold/drops)
     Conversations  0/43/256 (active/max active/max total)
     Reserved Conversations 0/0 (allocated/max allocated)
     Available Bandwidth 1152 kilobits/sec
  5 minute input rate 0 bits/sec, 1 packets/sec
  5 minute output rate 0 bits/sec, 1 packets/sec
     6960773 packets input, 552242992 bytes, 0 no buffer
     Received 0 broadcasts, 0 runts, 1 giants, 0 throttles
     64391 input errors, 64391 CRC, 29099 frame, 14569 overrun, 0 ignored, 47431 abort
     8006586 packets output, 123261153 bytes, 0 underruns
     0 output errors, 0 collisions, 5 interface resets
     0 output buffer failures, 0 output buffers swapped out
     19 carrier transitions
  Timeslot(s) Used:1-24, SCC: 1, Transmitter delay is 0 flags


Since the connection is never dropped at Branch C, I suspect the port on the WIC at Branch A that connects to Branch B is the problem. The Cisco 2821 at Branch A has two WICs, each with two ports. We are only using one WIC. (If it sounds confusing I apologize.)
Okay, yes, replace the wic on site A.
@Soulja Fortunately we have another WIC already installed on the router. It is just a matter of connecting to it remotely, making the changes to the config, and watching for any problems, which I hope we do not get. I will update as soon as I find out.

Thanks
The ISP reported that the status of the circuit is green. Branch B had to reset the router again this morning. I changed the clock source at Branch A from internal to line. If the connection drops again, I will switch it over from T1 Controller 0/0 to T1 Controller 0/1.

I will report back soon.
You should clear the counters, and see if they are growing again.

That way you do not need to wait for it to go down.
@pergr

I don't know how to do that, but I did reload all the routers, which reset everything back to zero. So far Branch B has reported 34 input errors, 33 frame errors and 1 abort with 1 interface reset. The rest are at zero.

At Branch A zero errors and just one interface reset.
Without reloading, the Cisco command is 'clear counters'.
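
A short sketch of that workflow, assuming the Branch A interface name from the posted config. 'clear counters' asks for confirmation, and the 'include' filter simply trims the output to the lines carrying the error, reset and carrier-transition counters.

```
! Zero the counters for the suspect interface only
clear counters Serial0/0/0:0

! Re-check a few minutes later to see whether the counters are climbing
show interfaces Serial0/0/0:0 | include errors|resets|transitions
```

On the 1721 the same commands apply with Serial0 as the interface name.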
@Pergr

So far the network has remained up and operational with no downtime, nearing the 24 hour mark. I will be monitoring it throughout the week. I will accept your solution and award points in a day or two, once I can confirm that changing the clock source is what stabilized the network.

However, I do have one question. How is it that the serial link from Branch A to Branch C can remain on clock source internal without that connection going down, but the serial link from Branch A to Branch B needs to be clock source line for it to work? I would expect they would both have to be the same, unless the ISP provides the clock timing on one circuit and not the other.
The T1 service from the transmission supplier is "asynchronous" meaning they only transmit what you send - effectively they are transporting the T1 signal within a larger SONET signal that can fit the T1.

The T1 signal itself needs to be synchronized between the end points only. The end points can be either the routers themselves, if they have T1 ports, or there can be external CSU/DSU and a serial port connection to the router.

Possibly Branch C has some other equipment involved, which is configured as master (like an external CSU/DSU). Alternatively, the two routers in A and C may just happen to have internal clocks that run almost exactly with the same speed...
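
One way to check whether the two ends are actually in sync, sketched here with the controller and interface names from the posted configs: the T1 controller statistics on the 2821 include a slip-seconds counter, which tends to climb when the two ends run from clocks that disagree, and the integrated CSU/DSU on the 1721 keeps its own performance statistics.

```
! Branch A (2821): look at the Slip Secs and error-second counters
show controllers t1 0/0/0

! Branch B (1721, integrated CSU/DSU)
show service-module serial 0 performance-statistics
```

Comparing the output for the controller facing Branch B with the one facing Branch C (T1 0/0/1) would show whether only one circuit is slipping.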
It just went down again.

This is showing on Serial 0/0/0:0 at Branch A.

26 input errors, 26 CRC, 6 frame, 3 overrun, 0 ignored, 8 abort

Going to switch over from T1 0/0 to T1 0/1 when we can afford a planned downtime. Probably this afternoon.
I notice a setting "no network-clock-participate" on wic 0 and wic 1. Any idea if this is what is giving us the problems?
You will want to remove that, for the 'clock source line' to actually work.
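
A sketch of what that change could look like, assuming the 2821 and the controller numbering from the posted config. The priority value 1 and the choice of T1 0/0/0 as the reference are illustrative assumptions, not something confirmed in this thread.

```
configure terminal
! Let the VWIC in slot 0 take part in the router's network clocking
network-clock-participate wic 0
! Assumption: use the T1 toward Branch B as the first-priority clock reference
network-clock-select 1 T1 0/0/0
end

! Verify which source the router is currently using
show network-clocks
```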
I did but it came down again some time after hours. I checked remotely. I suspect that only the port is bad and not the whole T1/E1 VWIC. So I plan to only move Branch B to the other card that is installed so the entire network isn't down. Hopefully that will fix the issue.

I'll post findings as soon as I can.

Thanks to everyone for your help so far.
Ok, our ISP did not detect any errors or signs of trouble on their T1 circuit. Since our routers are showing the errors past their equipment, I made changes at Branch A. I moved the Branch B connection from Serial 0/0/0:0 on VWIC 0 to Serial 0/1/0:0 on VWIC 1. It immediately established the connection, which is a good sign. I will be monitoring the connection for the next few days.

Also, it is a P2P connection, so the ISP does not supply the clock timing. I therefore reverted Branch A back to clock source internal. I also enabled network-clock-participate on both WICs.
Well the connection remained in good active status for what I will guess is 36+ hours but went down sometime around 1 a.m. on May 13th.

Here is the "sh int Serial" report from Branch A, it uses a VWIC2-2MFT-T1/E1

Serial0/1/0:0 is up, line protocol is up
  Hardware is GT96K Serial
  Description: T1 Circuit to Branch B
  Internet address is x.x.x.251/24
  MTU 1500 bytes, BW 1536 Kbit, DLY 20000 usec,
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation PPP, LCP Open
  Open: CDPCP, IPCP, loopback not set
  Keepalive set (10 sec)
  Last input 00:00:03, output 00:00:02, output hang never
  Last clearing of "show interface" counters 1d21h
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 11781
  Queueing strategy: weighted fair
  Output queue: 0/1000/64/11769 (size/max total/threshold/drops)
     Conversations  0/26/256 (active/max active/max total)
     Reserved Conversations 0/0 (allocated/max allocated)
     Available Bandwidth 1152 kilobits/sec
  5 minute input rate 0 bits/sec, 1 packets/sec
  5 minute output rate 0 bits/sec, 0 packets/sec
     2179446 packets input, 307112165 bytes, 0 no buffer
     Received 0 broadcasts, 0 runts, 4 giants, 0 throttles
     455 input errors, 455 CRC, 153 frame, 61 overrun, 0 ignored, 225 abort
     2983131 packets output, 3103844299 bytes, 0 underruns
     0 output errors, 0 collisions, 31 interface resets
     0 output buffer failures, 0 output buffers swapped out
     10 carrier transitions
  Timeslot(s) Used:1-24, SCC: 0, Transmitter delay is 0 flags

It had very few errors throughout the 36+ hours, but there looks to have been a huge jump after it went down.

Here is the "sh int Serial" report from Branch B. (it shows no errors but I believe that is because the unit was powered down.) It uses a WIC 1DSU-T1 V2

Serial0 is up, line protocol is up
  Hardware is PQUICC with Fractional T1 CSU/DSU
  Description: T1 Circuit to Branch A
  Internet address is x.x.x.252/24
  MTU 1500 bytes, BW 1536 Kbit, DLY 20000 usec,
     reliability 255/255, txload 1/255, rxload 2/255
  Encapsulation PPP, LCP Open
  Open: CDPCP, IPCP, loopback not set
  Last input 00:00:00, output 00:00:00, output hang never
  Last clearing of "show interface" counters 00:33:47
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: weighted fair
  Output queue: 0/1000/64/0 (size/max total/threshold/drops)
     Conversations  0/3/256 (active/max active/max total)
     Reserved Conversations 0/0 (allocated/max allocated)
     Available Bandwidth 1152 kilobits/sec
  5 minute input rate 16000 bits/sec, 2 packets/sec
  5 minute output rate 0 bits/sec, 2 packets/sec
     12332 packets input, 7211603 bytes, 0 no buffer
     Received 0 broadcasts, 0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
     11335 packets output, 1255655 bytes, 0 underruns
     0 output errors, 0 collisions, 1 interface resets
     0 output buffer failures, 0 output buffers swapped out
     1 carrier transitions
     DCD=up  DSR=up  DTR=up  RTS=up  CTS=up


For now, my boss is determined to replace the WIC at Branch B and not A. That is the direction we are going in. I will post results as soon as I can. If anyone has anything to add, please do. I appreciate all the help I have received so far. You are saving my job. Seriously. It's tough here.
Did you try to change the cables?

Make sure none of the cable is closer than 10cm from any power cable.

It could also be a problem with the physical port in the providers equipment.
Can you clarify how the provider delivers the T1; is it SHDSL, fiber, or over IP?
The way it is set up.

DMARK BOX ------------> TWO-ETHERNET JACK SURFACE MOUNT IN DMARK ROOM-------------------->TWO-ETHERNET JACK SURFACE MOUNT IN SERVER ROOM---------------------->CISCO2821

The DMARK box has two ports. One for each branch. The two cables that connect the DMARK box to the SURFACE MOUNT have been tested and are good. The same goes for the two cables that connect SURFACE MOUNT to the Cisco 2821. I believe the problem lies on the cable run that connects the surface mounts.

Branch A is actually a sister company in an entirely different building from the one I work in. So the next time I travel out there I plan to swap the cables that connect the DMARK to the SURFACE MOUNT and watch for Branch C to go down instead of Branch B. I can also test the run with a cable tester, but I've learned that even if a cable tests good there could still be an interference issue.

As far as how the T1 is delivered, it is by copper from what my boss told me. I called our sales/tech rep to get the correct term telecom companies use so I can give you a clear answer.

Thanks.
After clearing the counters on the routers following each drop, it is Branch A that keeps getting errors, though not as many as it did on the other port/WIC. I have since learned that we have had the router at Branch B for a long time and it has always had the same problems. I did not know this before I came aboard. I will be at Branch A on Thursday to test my theory about the cable runs; however, I managed to get approval to replace the Cisco 1721 router with a Cisco 1921, as well as a new WIC for it.

Once this has all taken place I will be awarding points for those who helped track down the issue.

Thank you for your patience.