sunray_2003 asked:

This one Beats me ..

Here is the setup

Cisco router 3660 (My location)  <------T1 line ------> Smart Jack (Remote location) <------T1 cable -----> Adtran 3250
router  <-------Ethernet cable -----> SMC Switch  <------- Ethernet cable ----> Polycom (video conferencing equipment)

The SMC switch is there because we have another Adtran dual-port router connected to it.
The basic problem is that the entire network keeps going down, so I have disconnected the dual-port router from the SMC switch to isolate the problem.

At the local location, I have the Netcrunch management tool, and I have set it up so that the tool pings each IP address in the above network and checks whether there is connectivity.
For the past week the network has been going down, in the sense that I see RED icons in my management tool because the tool cannot ping those IP addresses.

Friday morning, I disconnected the dual-port router from the switch so that I could track down the actual problem. From Friday morning until Monday morning there was no problem: everything pinged correctly. I tried to connect to the Polycom unit at the remote location and all was good. Now, about an hour ago, the network went down.

Present situation:
------------------
a) Cannot ping any of the routers shown in the above configuration

b)
Show interface on the CISCO router shows this for that T1 card

**************
Serial1/0:0 is up, line protocol is down
  Hardware is DSX1
  Internet address is 172.16.90.10/24
  MTU 1500 bytes, BW 1536 Kbit, DLY 20000 usec,
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation PPP, loopback not set
  Keepalive set (10 sec)
  LCP ACKsent
  Closed: IPCP, CDPCP
  Last input 00:00:01, output 00:00:01, output hang never
  Last clearing of "show interface" counters 5d19h
  Input queue: 0/75/868/0 (size/max/drops/flushes); Total output drops: 45692
  Queueing strategy: weighted fair
  Output queue: 0/1000/64/0 (size/max total/threshold/drops)
     Conversations  0/8/256 (active/max active/max total)
     Reserved Conversations 0/0 (allocated/max allocated)
     Available Bandwidth 1152 kilobits/sec
  5 minute input rate 0 bits/sec, 0 packets/sec
  5 minute output rate 0 bits/sec, 0 packets/sec
     725797 packets input, 131736485 bytes, 0 no buffer
     Received 0 broadcasts, 61 runts, 0 giants, 0 throttles
     868 input errors, 118 CRC, 740 frame, 0 overrun, 0 ignored, 612 abort
     884055 packets output, 334823229 bytes, 0 underruns
     0 output errors, 0 collisions, 11404 interface resets
     0 output buffer failures, 0 output buffers swapped out
     19 carrier transitions
  Timeslot(s) Used:1-24, Transmitter delay is 0 flags
*************

Telnetting to router and giving "sh controller t1"

********
T1 1/0 is up.
  Applique type is Channelized T1
  Cablelength is long gain36 0db
  Description: D1 T1
  No alarms detected.
  alarm-trigger is not set
  Framing is ESF, Line Code is B8ZS, Clock Source is Line.
  Data in current interval (333 seconds elapsed):
     0 Line Code Violations, 0 Path Code Violations
     0 Slip Secs, 0 Fr Loss Secs, 0 Line Err Secs, 0 Degraded Mins
     0 Errored Secs, 0 Bursty Err Secs, 0 Severely Err Secs, 0 Unavail Secs
  Total Data (last 24 hours)
     0 Line Code Violations, 0 Path Code Violations,
     0 Slip Secs, 0 Fr Loss Secs, 0 Line Err Secs, 0 Degraded Mins,
     0 Errored Secs, 0 Bursty Err Secs, 0 Severely Err Secs, 0 Unavail Secs
********

c) There are no alarms at the back of the CISCO router. A circuit test run from the AT&T website shows no problem.

d) I am almost positive that the system never went down from Friday morning until Monday morning, because Netcrunch always shows alerts and it did not raise any alert for those IP addresses during that time.

What I think:
--------------

a) I read on the CISCO website about possible reasons why "line protocol could be down", and one of the reasons listed is failing hardware. If that is the case, I am not sure why it was working fine from Friday morning until Monday morning. I remember there are other reasons for line protocol to go down.

b) The Adtran router has OS version 5.0 and I see they have version 7 now. Adtran support once told me to upgrade the OS only if there is a problem with the router or if I need additional features. Maybe it is time for me to look into that, but again I am not sure how it was fine from Friday morning until Monday morning.

c) Possible cable problems or human intervention, but no one uses this system except me and my supervisor, and neither of us made any change.

d) Can repeated pinging by Netcrunch be an issue? BTW, this situation never happened before installing this software. I have yet to stop the management tool and check, though I am ruling that option out because I have another T1 line on my Cisco router that is set up in this management tool and it never fails.

Any suggestion that I can try would be appreciated.
I am collecting all the details first, since I have to travel to that location to make any changes or try any suggestion.

Thanks

SR
ASKER CERTIFIED SOLUTION
Dr-IP

Dr-IP

Typo: it should have been "pin 1 with pin 4, and pin 2 with pin 5."

SOLUTION
Les Moore
Just a word of experience, 95% of the time when I run into issues like this, it’s Telco related. I test my gear with loop backs first, test the cables with my cable tester, and if all of that checks out I get on the phone and make them send a tech out to test on site. Which frequently they are reluctant to do even though it almost always turns out they have some kind of problem with the line.  
Agree with Dr-IP. The telco will most assuredly "remind" you that you could incur the cost of the dispatch if they don't find anything wrong...that's why you test your own equipment first. Then open a trouble ticket and don't let them off the hook until you run clean with no errors for 48 hours. The longer they have a ticket open, the more "incentive" they get to close it, and only you can let them close it.

sunray_2003
ASKER

Sorry guys, I have yet to look at your suggestions. When I came to my office this morning, all the connections were back up and I am able to ping and connect to the Polycom system.
Can this happen if a telco line is having errors, I mean can it go down and come back up?

******** Latest config details ******

Serial1/0:0 is up, line protocol is up
  Hardware is DSX1
  Internet address is 172.16.90.10/24
  MTU 1500 bytes, BW 1536 Kbit, DLY 20000 usec,
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation PPP, loopback not set
  Keepalive set (10 sec)
  LCP Open
  Listen: CDPCP
  Open: IPCP
  Last input 00:13:14, output 00:00:08, output hang never
  Last clearing of "show interface" counters 6d16h
  Input queue: 0/75/1156/0 (size/max/drops/flushes); Total output drops: 68008
  Queueing strategy: weighted fair
  Output queue: 0/1000/64/0 (size/max total/threshold/drops)
     Conversations  0/8/256 (active/max active/max total)
     Reserved Conversations 0/0 (allocated/max allocated)
     Available Bandwidth 1152 kilobits/sec
  5 minute input rate 0 bits/sec, 0 packets/sec
  5 minute output rate 0 bits/sec, 0 packets/sec
     791338 packets input, 133182265 bytes, 0 no buffer
     Received 0 broadcasts, 68 runts, 0 giants, 0 throttles
     1156 input errors, 142 CRC, 1003 frame, 0 overrun, 0 ignored, 888 abort
     950712 packets output, 336302043 bytes, 0 underruns
     0 output errors, 0 collisions, 16981 interface resets
     0 output buffer failures, 0 output buffers swapped out
     21 carrier transitions
  Timeslot(s) Used:1-24, Transmitter delay is 0 flags

***********************
O/P for  "sh controller t1"

T1 1/0 is up.
  Applique type is Channelized T1
  Cablelength is long gain36 0db
  Description: D1 T1
  No alarms detected.
  alarm-trigger is not set
  Framing is ESF, Line Code is B8ZS, Clock Source is Line.
  Data in current interval (741 seconds elapsed):
     0 Line Code Violations, 0 Path Code Violations
     0 Slip Secs, 0 Fr Loss Secs, 0 Line Err Secs, 0 Degraded Mins
     0 Errored Secs, 0 Bursty Err Secs, 0 Severely Err Secs, 0 Unavail Secs
  Total Data (last 24 hours)
     0 Line Code Violations, 0 Path Code Violations,
     0 Slip Secs, 14 Fr Loss Secs, 0 Line Err Secs, 0 Degraded Mins,
     3 Errored Secs, 0 Bursty Err Secs, 1 Severely Err Secs, 15 Unavail Secs
T1 1/1 is up.


*********************

SR
Yes. Look at the "carrier transitions"
Before:
>  19 carrier transitions
Today:
>   21 carrier transitions

It appears to be getting worse, or the carrier was conducting tests that caused these errors, which were not there in the first post:
   Total Data (last 24 hours)
       0 Line Code Violations, 0 Path Code Violations,
 >    0 Slip Secs, 14 Fr Loss Secs, 0 Line Err Secs, 0 Degraded Mins,               <<
 >    3 Errored Secs, 0 Bursty Err Secs, 1 Severely Err Secs, 15 Unavail Secs    <<
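
The before/after comparison above can also be done mechanically. The following is a minimal Python sketch, assuming you have saved two "show interface" captures as text. The function names (parse_counters, counter_deltas) and the list of counters are illustrative choices, not part of any Cisco tool. The sample text is abridged from the two outputs posted in this thread.

```python
import re

# Counter names as they appear in Cisco "show interface" output.
# This is an illustrative subset, not an exhaustive list.
COUNTERS = ["input errors", "CRC", "frame", "abort",
            "interface resets", "carrier transitions"]

def parse_counters(text):
    """Return {counter_name: int} for each known counter found in text."""
    found = {}
    for name in COUNTERS:
        match = re.search(r"(\d+) " + re.escape(name) + r"\b", text)
        if match:
            found[name] = int(match.group(1))
    return found

def counter_deltas(before, after):
    """Return how much each counter grew between two snapshots."""
    old, new = parse_counters(before), parse_counters(after)
    return {name: new[name] - old[name] for name in new if name in old}

# Abridged from the first capture in this thread.
BEFORE = """
     868 input errors, 118 CRC, 740 frame, 0 overrun, 0 ignored, 612 abort
     0 output errors, 0 collisions, 11404 interface resets
     19 carrier transitions
"""

# Abridged from the capture taken the next morning.
AFTER = """
     1156 input errors, 142 CRC, 1003 frame, 0 overrun, 0 ignored, 888 abort
     0 output errors, 0 collisions, 16981 interface resets
     21 carrier transitions
"""

if __name__ == "__main__":
    for name, delta in counter_deltas(BEFORE, AFTER).items():
        print(f"{name}: +{delta}")
```

On these two captures the carrier transitions grow by 2 and the interface resets by 5577. The counters were last cleared days earlier, so only the growth between snapshots says anything about the last day.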
So if I understand correctly, because of line errors the line can come up sometimes, and I can make connections during that time, and then it can go down, during which I cannot even ping. Is that correct?

So if the above understanding is correct, my first step is to contact telco and report about those problems ??
>if the above understanding is correct
Correct.
> my first step is to contact telco and report about those problems ??
Absolutely! Refer back to my 4 recommendations above. Step 1 was to report these specific errors to the telco, open a trouble ticket with them, and do not allow them to close it until you are satisfied that everything works well for at least 48 hours (2 business days, not a weekend).
OK.. I will do

Thanks and will post back regarding the status..
Also, since the Adtran router is UP at the remote location, I telnetted to it and got this information:

router#sh interface t1 1/1                              
t1 1/1 is UP            
  T1 coding is B8ZS, framin                          
  Clock source is line, FDL type is ANSI                                        
  Line build-out is 0dB                      
  No remote loopbacks, No network loopbacks                                          
  Acceptance of remote loopback requests enabled                                                

  DS0 Status: 123456789012345678901234
              NNNNNNNNNNNNNNNNNNNNNNNN
  Status Legend: '-' = DS0 is unallocated
                 'N' = DS0 is dedicated (nailed)

  Line Status: -- No Alarms --

  Current Performance Statistics:
    0 Errored Seconds, 0 Bursty Errored Seconds
    0 Severely Errored Seconds, 0 Severely Errored Frame Seconds
    0 Unavailable Seconds, 0 Path Code Violations
    0 Line Code Violations, 0 Controlled Slip Seconds
    0 Line Errored Seconds, 0 Degraded Minutes

  TDM group 1, line protocol is UP
    Encapsulation PPP (ppp 1)
    26497 packets input, 7353951 bytes, 0 no buffer
    0 runts, 0 giants, 0 throttles
    1 input errors, 0 CRC, 1 frame
    0 abort, 0 ignored, 0 overruns
    25792 packets output, 7033278 bytes, 0 underruns
    0 input clock glitches, 0 output clock glitches
    0 carrier lost, 0 cts lost

*********

I am giving more information in case you can deduce anything more from it. Is that
1 input error something to be concerned about?

Lrmoore,

>> 4) timing issues - which end of this P2P T1 is providing clocking? We went through this exercise in your lab, didn't we?

I never worked on the clock for this particular T1 card on the CISCO end, or on the Adtran. As you can see, the clock source has been line all the time.
Dr.IP or Lrmoore,

I am going to do the loopback test as outlined by Dr.IP in his comment.

I am in the process of making the loopback plug and I guess the only other thing to do is this

>>change the encapsulation for the serial port to HDLC if that is not what it is set to, hint don’t save it so you can revert back to the original configuration by rebooting the router,

Can anyone give instruction for this ?

>    0 carrier lost, 0 cts lost
This lack of lost carrier at the remote location points even more toward an issue with your local loop at the 3660 end..

I would not suggest changing away from ppp encapsulation because the remote Adtran end is still PPP. I'm not sure it will even do HDLC as I believe that is Cisco proprietary..
As you guys say, It will be better if I do checking on our end first , right ?

I thought of doing 2 tests

a) Loopback plug suggestion
b) Change the T1 cable between smart jack and the CISCO router

Lrmoore,

I am going to change the encapsulation to HDLC just to do the loopback test at the CISCO router end. So it does not matter that the Adtran is set to PPP?
If you are doing a loopback test on the Cisco to make sure it is working properly, no problem. It would be even better if you can get someone to put a loopback plug into the remote smart jack where the T1 is, so you can test the line too, all the way back to the Cisco router. I.e., Cisco----patch cable----local smart jack----{T1}---remote smart jack----loop back plug.

The best time to do this by the way is when the line is down, that way if it tests OK and you put every thing  back and it’s still down you can feel pretty sure where the issue isn’t.  
Ah, I think I misunderstood your first comment. When you mentioned the loopback plug, I assumed I should take an RJ-45, connect pin 1 to pin 4 and pin 2 to pin 5, and put the plug into the T1 card on the CISCO to see whether the line comes up or not.

So what I have said above is different from your
>>  Cisco----patch cable----local smart jack----{T1}---remote smart jack----loop back plug.

Right ? or I am getting confused ?
Yes, that's the first step, but once you know the router is OK you can also test the line by connecting the router back up and putting a loopback plug in the far side's smart jack. If the line is good, the protocol should come up; if not, the line has probably got a problem.
So you are saying the first step is to put the plug in the Cisco router and see what happens, is that it?
If yes, then back to my question from a couple of comments ago: how do I change the encapsulation to HDLC?

If the above statement should/can be done, not sure why lrmoore says
>> I would not suggest changing away from ppp encapsulation because the remote Adtran end is still PPP. I'm not sure it will even do HDLC as I believe that is Cisco proprietary..


Is it because he  assumes that I will be doing your second step of remote smart jack ?

To change the encapsulation to HDLC do the following commands.

configure terminal
interface Serial1/0:0
 encapsulation hdlc
end

Then do a "show interface Serial1/0:0" and see if it's up. If it is, and someone on the other side can put a loopback into the smart jack, hook the router back up to the T1, get them to plug the loopback in, and see if the interface comes up. If it doesn't, and the loopback is properly made, the line is the issue; if it does come up, it's probably the Adtran.

The reason to change to HDLC encapsulation is that, unlike PPP, it will come up with a loopback, and you can ping the interface, which sends the signal through the loop and verifies that it functions. By the way, you should be able to verify the loopback plug on the Adtran by checking whether the T1 controller shows the line up, but you probably won't see the serial come up.

PS: when you are done, turn the Cisco off and on to return it to its original configuration.
Dr-IP,

So if I understand this correctly there are 2 methods

one is to use loopback plug at the CISCO end
second is to use the plug at the remote end ..

So according to first, I do these

Config t
Interface Serial1/0:0
Encapsulation HDLC
End

Then do a “show interface Serial1/0:0” and see if it’s up.

If it comes up fine, then there is no problem at the CISCO end, correct?

So if my understanding above is correct, do I have to remove the T1 cable, put in the loopback plug, and then do the above commands to change the encapsulation, or does the order not matter?
"one is to use loopback plug at the CISCO end" That is for testing the Cisco router.
"second is to use the plug at the remote end .." Once you know the router is good, you can use it and a loopback on the far end of the T1 to test the T1: connect the router back up, put a loopback in the far-end smart jack, and see if the protocol comes up.

"If it comes up fine, then there is no problem at the CISCO end, correct?" Yes.

"So if my understanding above is correct, do I have to remove the T1 cable, put in the loopback plug, and then do the above commands to change the encapsulation, or does the order not matter?" Yes, take the T1 out, put in the loopback, and issue the command to change to HDLC.



Glad I am able to understand some router lingos..
will get back to you guys.. thanks for hanging with me
Hey guys

I put the loopback plug at back of CISCO router.

I gave the commands to change encapsulation and then issued show interface serial1/1:1 command and i get this

*******
Serial1/1:1 is up, line protocol is up (looped)
  Hardware is DSX1
  Internet address is 172.16.96.10/24
  MTU 1500 bytes, BW 1536 Kbit, DLY 20000 usec,
     reliability 254/255, txload 1/255, rxload 1/255
  Encapsulation HDLC, loopback not set
  Keepalive set (10 sec)

*******

Does it prove there is no problem with CISCO ?

SR
Yes, this looks like the Cisco CSU/DSU module is OK.
Now, plug the cable back into the DSU, find the other end of that cable, and place a female loopback plug on it.
You should get the same (looped) condition. If not, replace the cable.
If yes, then plug that cable back in where it was, find the next end (perhaps it goes into a patch panel, then to another closet, then to the smart jack), and put the female plug on the end of the cable. Do this all the way to the T1 smart jack. This will prove your cabling is OK between the smart jack and the Cisco router.
If you have someone at the other site with a male loopback plug, they can plug it into the smart jack.
At each step along the way, if you get (looped) condition, then all is good. If not, you have nailed down the problem.
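
The step-by-step procedure above amounts to walking outward from the router and noting where the (looped) indication first disappears. A minimal Python sketch of that bookkeeping follows. The function name and the example results are hypothetical and only illustrate the logic; they are not measurements from this circuit.

```python
def first_failing_point(results):
    """Find where the (looped) indication first disappears.

    results: ordered list of (test_point, looped_seen) tuples, starting at
    the router and moving outward toward the remote smart jack.
    Returns (last_good_point, first_bad_point). last_good_point is None when
    even the first test failed. Returns None when every test showed (looped).
    """
    last_good = None
    for point, looped_seen in results:
        if not looped_seen:
            return (last_good, point)
        last_good = point
    return None

# Hypothetical results, for illustration only.
EXAMPLE = [
    ("plug in router T1 port", True),
    ("far end of patch cable", True),
    ("local smart jack", True),
    ("remote smart jack", False),
]

if __name__ == "__main__":
    print(first_failing_point(EXAMPLE))
```

With these made-up results the fault would sit between the local and the remote smart jack, which is the telco's part of the path.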
Now that i had tested the loopback at the cisco end,
I tried to do the same on the other end .. I had plugged the loopback plug at the remote site smartjack..

and then I gave these again

Config t
Interface Serial1/0:0
Encapsulation HDLC
End

Then did a “show interface Serial1/1:1”

and it says


******
Serial1/1:1 is up, line protocol is down.
  Hardware is DSX1
  Internet address is 172.16.96.10/24
  MTU 1500 bytes, BW 1536 Kbit, DLY 20000 usec,
     reliability 254/255, txload 1/255, rxload 1/255
  Encapsulation HDLC, loopback not set
  Keepalive set (10 sec)
**********

Am i doing the correct procedure ?

The reason I ask is that when I plugged the loopback plug in at the remote end (to start with), I did NOT change the encapsulation to HDLC, and tried
show interface serial1/1:1

it first said

Serial1/1:1 is up, line protocol is down (looped)

but when I changed the encapsulation to HDLC it says

Serial1/1:1 is up, line protocol is down

>Interface Serial1/0:0
>Serial1/1:1 is up, line protocol is down

Do you see the difference in these two? They are two different interfaces.

If you see the loop, then don't see the loop with the hard loop plug at the other end, then this makes it obvious that the only thing in between is the Telco, and the line keeps going up and down.  

Call the telco now and open a trouble ticket. Keep the loopback plugs handy and offer to test with a "hard loop" at the smartjack.

Sorry for the confusion.

I am doing the tests ONLY on serial1/1:1..

So here is my question.

I put the loopback plug at cisco end, changed the encapsulation to hdlc and line protocol showed as
line protocol is up (looped)

So should I follow the same procedure at the remote smart jack end, i.e. put the loopback plug in at the remote smart jack, then change the encapsulation for that serial interface on the Cisco to HDLC, and then check show interface?

Or do I just put in the loopback plug, leave the encapsulation alone, and look at the interface?
You don't have to change the encapsulation for the interface to see the (looped) condition. With the existing PPP, you should see
      >Serial1/1:1 is up, line protocol is down (looped)

The only difference is that with HDLC, the line protocol also comes up:
      >Serial1/1:1 is up, line protocol is up (looped)

All you are looking for at this point is the (looped). The fewer configuration changes you make, the better.
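
If you are collecting these status lines from several tests, the distinction above can be read off programmatically. This is a minimal Python sketch; the regular expression and the read_status name are my own assumptions, written only against the status-line wording shown in this thread.

```python
import re

# Matches the first line of "show interface" output as it appears in this thread,
# e.g. "Serial1/1:1 is up, line protocol is down (looped)".
STATUS_RE = re.compile(
    r"^(?P<name>\S+) is (?P<line>administratively down|up|down), "
    r"line protocol is (?P<proto>up|down)\.?\s*(?P<looped>\(looped\))?"
)

def read_status(line):
    """Split an interface status line into its three independent facts."""
    match = STATUS_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not an interface status line: {line!r}")
    return {
        "interface": match.group("name"),
        "line_up": match.group("line") == "up",
        "protocol_up": match.group("proto") == "up",
        "looped": match.group("looped") is not None,
    }

if __name__ == "__main__":
    for text in (
        "Serial1/1:1 is up, line protocol is up (looped)",    # HDLC with a loop
        "Serial1/1:1 is up, line protocol is down (looped)",  # PPP with a loop
        "Serial1/1:1 is up, line protocol is down",           # no loop seen
    ):
        print(read_status(text))
```

Only the "looped" field matters for the loopback test; "protocol_up" depends on the encapsulation, as explained above.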

So since I saw looped with the encapsulation ppp after plugging the loopback plug at the remote end , it says there should be nothing wrong in the smart jack end and T1 line should be at fault.

My supervisor is down there at the remote site testing the T1 cable from smart jack to the router and other cables so I should get news from him shortly

SR
Thanks guys. We are planning to open ticket and call the telco guys.. Thanks for ur inputs and patience. Expect one more easy Q from me in this TA in couple of days.

SR
Lrmoore, Dr-IP.

As usual, the AT&T guys did their initial tests and closed the ticket. We opened it again and specifically mentioned the input errors, and hopefully they will do some more testing.

What is puzzling is this. My supervisor went to the remote site where the Adtran router is. He rebooted the entire system and now the T1 line is back up and everything looks normal. The same happened last week, but it went down the next day. I suspect the same will happen tomorrow as well.

Just curious to know why this might happen. Can you guys think of anything? Why does the router get back to business after a reboot and then go down again? Does it have anything to do with clock synchronization on the T1 line?
It could have everything to do with the clocking of the T1. Reboot the router, and it may take 24 hours for the clocking to be so out of sync that you start losing the connection..
Ah...

Just out of curiosity: is a clock issue the main reason most of the time when a T1 line is bad, or could there be others? Could you shed some light on that from your experience?
Most of the time (in my humble experience) the telco will swear they did nothing, but everything magically clears up after you press them to conduct extensive testing.  Sometimes they will admit to "facilities" issues (telco-speak for the actual copper under the street near your building, up to the smart jack) or they have a cross-connect (DAX) configured wrong somewhere.
Clocking is normally only an issue on dedicated point-point leased lines which are generally falling out in favor of Frame-relay or other newer technologies. Some providers provide the T1 clock source to both ends, some providers require you to provide your own clocking on one end. You really need to verify this with the telco.
The next most common issue is wiring between the smartjack and the router/DSU. You've pretty much ruled this out with your hard loop testing procedure.
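
On the clocking point: a rough way to get a feel for how fast two unsynchronized T1 clocks drift apart is to work out how long it takes them to differ by one full frame (193 bits, which is 125 microseconds at 1.544 Mbit/s). The Python sketch below is plain arithmetic under the assumption that one slip corresponds to one frame of accumulated drift. The function name and the example offsets are illustrative and were not measured on this circuit. Drift of this kind would normally show up in the "Slip Secs" counter of "sh controller t1".

```python
T1_LINE_RATE_BPS = 1_544_000  # nominal T1 line rate in bits per second
T1_FRAME_BITS = 193           # 24 channels x 8 bits + 1 framing bit

def seconds_per_frame_slip(offset_ppm):
    """Seconds for two clocks differing by offset_ppm to drift one T1 frame apart."""
    if offset_ppm <= 0:
        raise ValueError("offset_ppm must be positive")
    frame_time = T1_FRAME_BITS / T1_LINE_RATE_BPS  # 125 microseconds
    return frame_time / (offset_ppm * 1e-6)

if __name__ == "__main__":
    # Example offsets are arbitrary illustration values.
    for ppm in (0.01, 1, 32):
        print(f"{ppm} ppm offset: one frame of drift every "
              f"{seconds_per_frame_slip(ppm):.1f} s")
```

At a 1 ppm offset this works out to one frame of drift roughly every 125 seconds, so the slip counters are a quick place to look before and after any change to the clock source.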
Thanks so much for really shedding sunlight with your experience..

SR
Lrmoore,

My supervisor just got a voicemail from AT&T. From what he understands, it looks like "we have to set the clocking".
As far as I know, we never changed anything with respect to the clock on the CISCO end or the Adtran end.
This T1 line has been working superbly for the past year, and the line going up and down only started recently.

Except for testing the Adtran router in my lab (during which I changed the clock source to "internal"), the clock source has always been "line".

Not sure if I have to work on some more.. If yes, will post a Q (another easy hotcake for u)

SR
If you have to set the clocking, then you know from the lab exercise exactly what you need to do..
I have no answer as to how it was working for 1 year without problems.. unless AT&T was providing the clocking, then some tech auditing their circuit configs turned it off. Now that it is the way it was supposed to be, of course there is no problem on their end. I've seen stranger things happen...
just a guess --- <8-}
Thanks again Lrmoore.

Will check tomorrow what they say in the ticket and see how it goes. I am trying to understand the actual situation from what I see and from your experience.
lrmoore,

We got hold of an AT&T guy over the phone. We asked him to do additional testing and kept the tickets open. He has escalated the issue and will hopefully send a LEC tech to our location for further testing.
I also heard the voicemail that AT&T left us yesterday, and from what I understood it looks like AT&T is not providing the clock. If I get confirmation of that, I will let the CISCO provide the clock by changing its source to "internal" and leave the Adtran (remote end) on "line".

Just waiting for LEC to stop by..

SR