  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1450

Cisco 2821 dropping out for about 60 seconds.

We have a Cisco 2821 router that drops out for about 60 seconds at random times.  We can go for over a week without the issue, or have it multiple times in a single day.  It always comes back after around 60 to 120 seconds (it seems like a reboot, but the logs look different from when we manually reboot it).

I have turned on logging (trap level 7) and debugging for gigabitethernet0/0 and 0/1.  I have only received the following log entries when it goes down:
From Syslog logs:
04-21-2011      14:50:38      Local7.Notice      67.90.229.225      24: Apr 21 22:50:37.419: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to up

From Cisco Network Assistant Log:
Description: Interface GigabitEthernet0/0, changed state to up

Explanation:  The interface hardware went either up or down.  

Recommended Action:  If the state change was unexpected, confirm the configuration settings
  for the interface.  

Type: LINK-3-UPDOWN

.....
Gigabitethernet0/0 is connected to our ISP's switch
Gigabitethernet0/1 is connected to our internal HP switch
....

This is our running-config:

Building configuration...

Current configuration : 6397 bytes
!
! Last configuration change at 14:58:11 PCTime Thu Apr 21 2011 by root
!
version 12.4
service timestamps debug datetime msec
service timestamps log datetime msec
no service password-encryption
service sequence-numbers
!
hostname Router
!
boot-start-marker
boot-end-marker
!
logging buffered 52000 debugging
enable secret 5 $----------
enable password --------
!
no aaa new-model
!
resource policy
!
clock timezone PCTime -8
clock summer-time PCTime date Apr 6 2003 2:00 Oct 26 2003 2:00
!
!
ip cef
!
!
ip domain name yourdomain.com
ip name-server 65.106.1.196
ip name-server 65.106.7.196
!
!
!
voice-card 0
 no dspfarm
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
crypto pki trustpoint TP-self-signed-728726024
 enrollment selfsigned
 subject-name cn=IOS-Self-Signed-Certificate-728726024
 revocation-check none
 rsakeypair TP-self-signed-728726024
!
!
crypto pki certificate chain TP-self-signed-728726024
 certificate self-signed 02
  3082024B 308201B4 A0030201 02020102 300D0609 2A864886 F70D0101 04050030
  30312E30 2C060355 04031325 494F532D 53656C66 2D536967 6E65642D 43657274
  69666963 6174652D 37323837 32363032 34301E17 0D303831 31313231 36303034
  385A170D 32303031 30313030 30303030 5A303031 2E302C06 03550403 1325494F
  532D5365 6C662D53 69676E65 642D4365 72746966 69636174 652D3732 38373236
  30323430 819F300D 06092A86 4886F70D 01010105 0003818D 00308189 02818100
  C39E6FB2 0A6D3799 C819D8B0 80498444 D39D8C47 C4F8F92C 402F2463 3FF328D3
  2AD8504E 1DA90353 82913F5F 7FB498AB 7201804B 8A153AD1 9B27692B 86EAE98C
  45B0D8D6 36F33A1E 961F82D5 201DA670 49A4F868 3FBFE71C A3672497 AAF0FD84
  FFCA37C8 71F2509B 23C5186A B389AA42 AE86F8A1 7B50ABBC 5BE78957 46C1FFA3
  02030100 01A37530 73300F06 03551D13 0101FF04 05300301 01FF3020 0603551D
  11041930 17821552 6F757465 722E796F 7572646F 6D61696E 2E636F6D 301F0603
  551D2304 18301680 14F7E776 B76C36C3 900BA32C FC804CEF 2DEC399F 3E301D06
  03551D0E 04160414 F7E776B7 6C36C390 0BA32CFC 804CEF2D EC399F3E 300D0609
  2A864886 F70D0101 04050003 81810067 0A53614D B9A56AFB 67646E3E A8D2CE46
  275DA699 6F938485 55F8DBF1 9FD37AED 082E6452 53390DB6 3C58D3ED 9C3C6D04
  C3EED752 12FFA2E3 6E65DE36 61C5E48C B488D929 34A531E3 2A9692D5 85519C4F
  AC925F8A 1269BEEB C2E5F863 A115A189 481E5599 88822E2B 898EF40A 39410AAF
  03269BB8 387E89F4 4DA7212F 48D496
  quit
username root privilege 15 password 0 --------
!
!
class-map match-all HRDP
 match access-group 104
class-map match-all TAIS
 match access-group 103
class-map match-all Win4Net
 match access-group 102
!
!
policy-map SharedBW
 class Win4Net
   police 3000000
 class TAIS
   police 10000000
 class HRDP
   police 262000 conform-action transmit  exceed-action drop
policy-map RateLimit
 class class-default
   police 20000000
  service-policy SharedBW
!
!
!
!
!
!
!
interface GigabitEthernet0/0
 description $ETH-SW-LAUNCH$$INTF-INFO-GE 0/0$$ETH-LAN$
 ip address 65.47.20.166 255.255.255.252
 ip access-group 101 out
 ip nbar protocol-discovery
 ip flow ingress
 ip flow egress
 duplex full
 speed 100
 no mop enabled
 service-policy input RateLimit
 service-policy output RateLimit
!
interface GigabitEthernet0/1
 description $ETH-WAN$
 ip address 64.55.69.193 255.255.255.224 secondary
 ip address 67.90.229.225 255.255.255.224
 ip access-group 101 out
 ip nbar protocol-discovery
 ip flow ingress
 ip flow egress
 rate-limit input access-group 103 10480000 10485760 10485760 conform-action transmit exceed-action drop
 rate-limit input access-group 102 3144000 3145728 3145728 conform-action transmit exceed-action drop
 rate-limit output access-group 103 10480000 10485760 10485760 conform-action transmit exceed-action drop
 rate-limit output access-group 102 3144000 3145728 3145728 conform-action transmit exceed-action drop
 ip route-cache flow
 duplex auto
 speed auto
 no mop enabled
!
ip route 0.0.0.0 0.0.0.0 65.47.20.165
!
ip flow-cache timeout active 1
ip flow-export source GigabitEthernet0/1
ip flow-export version 5
ip flow-export destination 67.90.229.227 2055
ip flow-top-talkers
 top 100
 sort-by bytes
 cache-timeout 2000
!
ip http server
ip http access-class 23
ip http authentication local
ip http secure-server
ip http timeout-policy idle 60 life 86400 requests 10000
!
logging trap debugging
logging 67.90.229.227
access-list 101 permit ip 67.90.229.224 0.0.0.31 any
access-list 101 permit ip any any
access-list 102 permit ip host 67.90.229.230 any
access-list 102 permit icmp host 67.90.229.230 any
access-list 102 permit ip any host 67.90.229.230
access-list 102 permit icmp any host 67.90.229.230
access-list 103 permit ip host 67.90.229.238 any
access-list 103 permit ip host 67.90.229.239 any
access-list 103 permit ip host 67.90.229.240 any
access-list 103 permit ip host 67.90.229.241 any
access-list 103 permit ip host 67.90.229.242 any
access-list 103 permit ip host 67.90.229.243 any
access-list 103 permit ip host 67.90.229.244 any
access-list 103 permit ip host 67.90.229.245 any
access-list 103 permit ip host 67.90.229.246 any
access-list 103 permit ip host 67.90.229.247 any
access-list 103 permit ip host 67.90.229.236 any
access-list 103 permit ip host 64.55.69.197 any
access-list 104 permit ip host 67.90.229.235 any
snmp-server community public RO
!
!
!
!
!
!
control-plane
!
!
!
!
!
!
!
!
!
!
banner login ^C
-----------------------------------------------------------------------
Cisco Router and Security Device Manager (SDM) is installed on this device.
This feature requires the one-time use of the username "cisco"
with the password "cisco". The default username and password have a privilege level of 15.

Please change these publicly known initial credentials using SDM or the IOS CLI.
Here are the Cisco IOS commands.

username <myuser>  privilege 15 secret 0 <mypassword>
no username cisco

Replace <myuser> and <mypassword> with the username and password you want to use.

For more information about SDM please follow the instructions in the QUICK START
GUIDE for your router or go to http://www.cisco.com/go/sdm 
-----------------------------------------------------------------------
^C
!
line con 0
 login local
line aux 0
line vty 0 4
 access-class 23 in
 privilege level 15
 password --------
 login local
 transport input telnet ssh
line vty 5 15
 access-class 23 in
 privilege level 15
 password --------
 login local
 transport input telnet ssh
!
scheduler allocate 20000 1000
!
end
----------------------------------------------------------

I seem to get VERY few logs from this, even with debug turned on for the interfaces.  If I turn debug all on, it is too much data.  What else can I do to diagnose this problem?  What other data could I provide that would help?

Thanks,
John




0
jokert
Asked:
jokert
 
lrmooreCommented:
What kind of device is physically connected to Gig 0/0?
Is it a cable modem, ISP-provided router, DSL modem, other?
Looks like a physical issue between Gig 0/0 and the external device.
Can you provide output to "show interface gig 0/0"
0
 
jokertAuthor Commented:
Cisco Catalyst 2950 provided by our ISP.  We have a fiber line that runs to the 2950, which turns it into an electrical (Ethernet) handoff to the 2821.

sh int:
GigabitEthernet0/0 is up, line protocol is up
  Hardware is MV96340 Ethernet, address is 0018.b946.6c60 (bia 0018.b946.6c60
  Description: $ETH-SW-LAUNCH$$INTF-INFO-GE 0/0$$ETH-LAN$
  Internet address is 65.47.20.166/30
  MTU 1500 bytes, BW 100000 Kbit, DLY 100 usec,
     reliability 255/255, txload 6/255, rxload 5/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full-duplex, 100Mb/s, media type is T
  output flow-control is XON, input flow-control is XON
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:00:27, output 00:00:00, output hang never
  Last clearing of "show interface" counters never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 2270000 bits/sec, 431 packets/sec
  5 minute output rate 2626000 bits/sec, 551 packets/sec
     1515456 packets input, 1176385971 bytes, 0 no buffer
     Received 53 broadcasts, 0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     0 watchdog, 0 multicast, 0 pause input
     0 input packets with dribble condition detected
     1761177 packets output, 929176344 bytes, 0 underruns
     0 output errors, 0 collisions, 2 interface resets
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 0 pause output
     0 output buffer failures, 0 output buffers swapped out
GigabitEthernet0/1 is up, line protocol is up
  Hardware is MV96340 Ethernet, address is 0018.b946.6c61 (bia 0018.b946.6c61
  Description: $ETH-WAN$
  Internet address is 67.90.229.225/27
  MTU 1500 bytes, BW 100000 Kbit, DLY 100 usec,
     reliability 255/255, txload 111/255, rxload 119/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full-duplex, 100Mb/s, media type is T
  output flow-control is XON, input flow-control is XON
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:00:00, output 00:00:00, output hang never
  Last clearing of "show interface" counters never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 46917000 bits/sec, 4718 packets/sec
  5 minute output rate 43634000 bits/sec, 4336 packets/sec
     14524635 packets input, 287134358 bytes, 0 no buffer
     Received 3191 broadcasts, 0 runts, 0 giants, 0 throttles
     2 input errors, 0 CRC, 0 frame, 0 overrun, 2 ignored
     0 watchdog, 0 multicast, 0 pause input
     0 input packets with dribble condition detected
     13467917 packets output, 3649821057 bytes, 0 underruns
     0 output errors, 0 collisions, 1 interface resets
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 3 pause output
     0 output buffer failures, 0 output buffers swapped out
----

Is there any additional logging I should turn on?  Could you please explain how a physical issue would cause the short drops?  I will most likely send my admin up there right now to replace the Ethernet cable.

Thanks!  all help is appreciated...
0
 
jokertAuthor Commented:
Something else to note: we cannot connect (telnet) to gigabitethernet0/1 when the device is down.  I had my admin unplug the Ethernet link between gig 0/0 and the Cisco 2950, and the result was different from what I see during our problem.

I am running a constant ping to different devices.  When we have our problem I cannot ping the router on the IP 67.90.229.225.  If I unplug the cable from gig0/0 I can still ping our router.

In short I am not able to telnet or HTTP (SDM) to the router during the 60 to 120 seconds it is down.  I do both telnet and HTTP to the 67.90.229.225 IP

Hope this makes sense... If not I can try explaining myself again
0
 
jon1966Commented:
I suspect the 2950 is reloading and the two algx.net circuits traverse through that 2950 physically which is why both interfaces are unreachable when the 2950 is reloading.

All your outbound traffic is destined to 65.47.20.165 instead of being load balanced via the command "!ip route 0.0.0.0 0.0.0.0 65.47.20.165".

You need to change your flow export to version 9 if you are going to attempt egress flows with the command "ip flow egress" as egress is not supported on version 5.  On g0/1 you have ip flow ingress (v9 and now v5 also on very recent IOS) and ip route-cache flow (v5) which are ingress flow commands.
0
 
jon1966Commented:
Hi, just wanted to rewrite this a bit more clearly...

I suspect the 2950 is reloading and the two algx.net circuits traverse through that 2950 physically which is why both interfaces are unreachable when the 2950 is reloading.

All your outbound traffic is destined to 65.47.20.165 instead of being load balanced due to the command "!ip route 0.0.0.0 0.0.0.0 65.47.20.165".

I noticed that you are using the commands "ip flow egress" which is a version 9 netflow command but within your configuration you specify version 5.  You need to change your flow export to version 9 if you are going to attempt egress flows.  
0
 
jokertAuthor Commented:
I changed flow-export to version 9.

We reset the 2950 (power) to see if it caused the same issue.  The result was somewhat different and gave the syslog output below.

This is what it looks like when we reset the 2950 (I was able to ping 67.90.229.225 during this):
04-22-2011      08:42:12      Local7.Notice      67.90.229.225      28: 000036: Apr 22 16:42:10.931: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to up
04-22-2011      08:42:07      Local7.Notice      67.90.229.225      27: 000035: Apr 22 16:42:06.063: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to down

This is the actual problem (at this point I cannot ping 67.90.229.225):
04-21-2011      14:50:38      Local7.Notice      67.90.229.225      24: Apr 21 22:50:37.419: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to up

What confuses me is that when we have this problem, I cannot ping or connect to the gig0/1 IP (67.90.229.225), yet the log says gig0/0 is the one coming back up.  I never seem to get log entries for gig0/1 going up or down.  I also have PRTG network monitor watching the 2821, and it shows a drop in memory and CPU for the 60 seconds that I cannot connect.  It honestly seems like some sort of internal reset.  Are there logs for this?  If we manually reset the 2821 I get different log entries than during the actual problem, but it still seems like a reset of the device.

 network layout
This is the layout simplified.  We do not have a setup for load balancing that I know of and only have the gateway of 65.47.20.165 available to us.

thanks!
0
 
jon1966Commented:
Hi, would you send over the output from "sh cdp neighbors" for each device involved?
0
 
jokertAuthor Commented:
Router#sh cdp neighbors gigabitethernet0/0
Capability Codes: R - Router, T - Trans Bridge, B - Source Route Bridge
                  S - Switch, H - Host, I - IGMP, r - Repeater

Device ID        Local Intrfce     Holdtme    Capability  Platform  Port ID
LBLKWACXH00.CL.us.xo.net
                 Gig 0/0            174          S I      WS-C2950G Fas 0/1


Router#sh cdp neighbors gigabitethernet0/1
Capability Codes: R - Router, T - Trans Bridge, B - Source Route Bridge
                  S - Switch, H - Host, I - IGMP, r - Repeater

Device ID        Local Intrfce     Holdtme    Capability  Platform  Port ID
HP ProCurve Switch 2626(001321-a31200)
                 Gig 0/1            141           S       HP 2626   11
0
 
jokertAuthor Commented:
Is there any other logging I can or should turn on?  Is there a way to find out if the device is resetting?  I get so few log entries from syslog that it is hard to troubleshoot.  The only debug logging turned on is for gig0/0 and gig0/1, and I still have not seen any debug info come over.  I want to know more about what the router is doing right before the issue.  Here are the syslog entries I've gotten (some are already above); this is the complete log, and it seems thin to me.

04-22-2011      09:14:15      Local7.Notice      67.90.229.225      25: Apr 22 17:14:16.899: %SYS-5-CONFIG_I: Configured from console by root on vty0 (67.90.229.227)
04-22-2011      08:53:41      Local7.Notice      67.90.229.225      24: Apr 22 16:53:43.407: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to up
04-22-2011      08:42:12      Local7.Notice      67.90.229.225      28: 000036: Apr 22 16:42:10.931: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to up
04-22-2011      08:42:07      Local7.Notice      67.90.229.225      27: 000035: Apr 22 16:42:06.063: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to down
04-21-2011      14:50:38      Local7.Notice      67.90.229.225      24: Apr 21 22:50:37.419: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to up
04-21-2011      14:23:23      Local7.Notice      67.90.229.225      24: Apr 21 22:23:23.395: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to up
04-21-2011      13:54:19      Local7.Info      67.90.229.225      27: Apr 21 21:54:20.000: %SYS-6-CLOCKUPDATE: System clock has been updated from 13:55:48 PCTime Thu Apr 21 2011 to 13:54:20 PCTime Thu Apr 21 2011, configured from console by root on console.
04-21-2011      13:54:18      Local7.Info      67.90.229.225      26: Apr 21 21:55:47.787: %SYS-6-CLOCKUPDATE: System clock has been updated from 13:55:47 PCTime Thu Apr 21 2011 to 13:55:47 PCTime Thu Apr 21 2011, configured from console by root on console.
04-21-2011      13:54:18      Local7.Info      67.90.229.225      25: Apr 21 21:55:47.275: %SYS-6-CLOCKUPDATE: System clock has been updated from 21:55:47 UTC Thu Apr 21 2011 to 13:55:47 PCTime Thu Apr 21 2011, configured from console by root on console.
04-21-2011      13:44:39      Local7.Notice      67.90.229.225      24: Apr 21 21:46:08.895: %SYS-5-CONFIG_I: Configured from console by root on vty0 (67.90.229.227)
04-21-2011      13:38:22      Local7.Notice      67.90.229.225      23: Apr 21 21:39:52.099: %SYS-5-CONFIG_I: Configured from console by root on vty0 (67.90.229.227)
04-21-2011      13:35:37      Local7.Notice      67.90.229.225      22: Apr 21 21:37:07.399: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to up
04-21-2011      13:22:01      Local7.Notice      67.90.229.225      24: Apr 21 21:23:29.919: %SYS-5-CONFIG_I: Configured from console by root on vty0 (67.90.229.227)
04-21-2011      11:56:35      Local7.Notice      67.90.229.225      23: Apr 21 19:58:03.979: %SYS-5-CONFIG_I: Configured from console by root on vty0 (67.90.229.227)
04-21-2011      11:45:38      Local7.Notice      67.90.229.225      22: Apr 21 19:47:07.419: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to up
04-21-2011      10:45:37      Local7.Notice      67.90.229.225      22: Apr 21 18:47:06.391: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to up
04-21-2011      07:05:32      Local7.Notice      67.90.229.225      22: Apr 21 15:07:00.251: %SYS-5-CONFIG_I: Configured from console by root on vty0 (67.90.229.227)
04-20-2011      16:15:37      Local7.Notice      67.90.229.225      22: Apr 21 00:17:03.399: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to up
04-20-2011      10:24:42      Local7.Notice      67.90.229.225      22: Apr 20 18:26:06.403: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to up
04-20-2011      07:25:41      Local7.Notice      67.90.229.225      30: Apr 20 15:26:32.648: %SYS-5-CONFIG_I: Configured from console by root on vty0 (67.90.229.227)
04-18-2011      08:32:27      Local7.Notice      67.90.229.225      29: Apr 18 16:33:19.003: %SYS-5-CONFIG_I: Configured from console by root on vty0 (67.90.229.227)
04-16-2011      13:40:29      Local7.Notice      67.90.229.225      25: Apr 16 21:41:24.442: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to down
04-15-2011      07:17:18      Local7.Notice      67.90.229.225      24: Apr 15 15:18:14.282: %SYS-5-CONFIG_I: Configured from console by root on vty0 (67.90.229.227)
04-14-2011      14:19:26      Local7.Notice      67.90.229.225      23: Apr 14 22:20:23.430: %SYS-5-CONFIG_I: Configured from console by root on vty0 (67.90.229.227)
04-14-2011      09:01:20      Local7.Notice      67.90.229.225      22: Apr 14 17:02:16.491: %SYS-5-CONFIG_I: Configured from console by root on vty0 (67.90.229.227)
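
One pattern in the syslog list above is worth pulling out: most of the "changed state to up" entries for GigabitEthernet0/0 have no "down" entry shortly before them, which is consistent with the router itself restarting rather than the link flapping. Here is a minimal Python sketch that flags those unpaired "up" events. It assumes the syslog server's line layout shown in this thread; the function name, the regex, and the 5-minute pairing window are illustrative choices, not part of any tool mentioned here.

```python
import re
from datetime import datetime, timedelta

# Matches the syslog server's layout used in this thread:
# <date> <time> <facility.level> <source IP> <counter>: ... %LINEPROTO-5-UPDOWN ...
LINE_RE = re.compile(
    r"^(?P<recv>\d{2}-\d{2}-\d{4}\s+\d{2}:\d{2}:\d{2})\s+\S+\s+\S+\s+\d+:"
    r".*%LINEPROTO-5-UPDOWN: Line protocol on Interface "
    r"(?P<intf>\S+), changed state to (?P<state>up|down)"
)


def unpaired_up_events(lines, max_gap=timedelta(minutes=5)):
    """Return (received_time, interface) for every 'up' event that has no
    'down' event on the same interface within max_gap before it."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a line-protocol message
        recv = datetime.strptime(" ".join(m["recv"].split()), "%m-%d-%Y %H:%M:%S")
        events.append((recv, m["intf"], m["state"]))

    events.sort(key=lambda e: e[0])  # the thread lists newest first

    last_down = {}
    suspects = []
    for recv, intf, state in events:
        if state == "down":
            last_down[intf] = recv
            continue
        down_at = last_down.pop(intf, None)
        if down_at is None or recv - down_at > max_gap:
            suspects.append((recv, intf))
    return suspects


# Three lines copied from the log above (whitespace normalised).
SAMPLE = [
    "04-22-2011 08:53:41 Local7.Notice 67.90.229.225 24: Apr 22 16:53:43.407: "
    "%LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to up",
    "04-22-2011 08:42:12 Local7.Notice 67.90.229.225 28: 000036: Apr 22 16:42:10.931: "
    "%LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to up",
    "04-22-2011 08:42:07 Local7.Notice 67.90.229.225 27: 000035: Apr 22 16:42:06.063: "
    "%LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to down",
]

for when, intf in unpaired_up_events(SAMPLE):
    print(when, intf)
```

With these three lines, the 08:42 down/up pair (the deliberate power cycle of the 2950) is treated as a normal flap, and only the 08:53:41 "up" is flagged. An unpaired "up" is only a hint, not proof of a restart, since a "down" message can also be lost on its way to the syslog server.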
0
 
jon1966Commented:
According to your description and the diagram, if G0/0 is unplugged there is no way you can still ping the router, unless there is an alternate path to it that is not described or displayed.

I suspect that your admin is unplugging the wrong cable during the test.

I really feel that the 2950 is reloading.  The next time it happens, do a "sh ver" on the 2950 to display its uptime, that should correspond with the outage.
0
 
jon1966Commented:
Just to be clear, when the admin unplugs G0/0, is the ping coming from the LAN or WAN side?
0
 
jokertAuthor Commented:
I have no access to configure the 2950; it was provided by our ISP.  It is physically in our building, so I can remove the cables or power, but not log on to it.

When I have my guy pull the plug on gig0/0 I am still able to ping 67.90.229.225, because I'm the workstation in the above diagram.  To answer the question, I am pinging from the LAN side.  I cannot ping outside internet IPs though.  Right after the 2950 is the outside world.

I don't think the 2950 is the cause of the problem (although I wish it was).  I can turn it off completely and still access our 2821 without issues (just not internet access).  But when the problem happens, I cannot access the 2821 at all.

Tomorrow (Saturday) I am going to install another switch after the 2950 so I can ping from the outside to gig0/0.  I can't just plug things into the 2950 because the ISP only has the one port active.
0
 
jokertAuthor Commented:
Ok.. so I don't have access to the 2950, but I did try the sh ver command on our 2821.  It shows this:

Cisco IOS Software, 2800 Software (C2800NM-ADVIPSERVICESK9-M), Version 12.4(9)T,
 RELEASE SOFTWARE (fc1)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2006 by Cisco Systems, Inc.
Compiled Fri 16-Jun-06 22:22 by prod_rel_team

ROM: System Bootstrap, Version 12.4(1r) [hqluong 1r], RELEASE SOFTWARE (fc1)

Router uptime is 4 hours, 27 minutes
System returned to ROM by bus error at PC 0x4253B1D4, address 0x23 at 08:52:02 PCTime Fri Apr 22 2011
System restarted at 08:53:13 PCTime Fri Apr 22 2011


System image file is "flash:c2800nm-advipservicesk9-mz.124-9.T.bin"

There may be something in that.  8:53 is the last time we had the problem.  Does it mean it rebooted due to a ROM bus error?  Could that be it?
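
To check whether the restart recorded in sh ver lines up with the outage length, the two timestamps can be subtracted. Here is a minimal Python sketch, assuming the two "System returned to ROM" and "System restarted" lines have been pasted without the terminal's line wrap. The function name and regexes are illustrative, and other IOS versions may word these lines differently.

```python
import re
from datetime import datetime

# "System returned to ROM by <reason> at HH:MM:SS <zone> <Day Mon DD YYYY>"
RETURNED_RE = re.compile(
    r"System returned to ROM by (?P<reason>.+?) at "
    r"(?P<time>\d{2}:\d{2}:\d{2}) \S+ (?P<date>\w{3} \w{3} \d{1,2} \d{4})"
)
# "System restarted at HH:MM:SS <zone> <Day Mon DD YYYY>"
RESTARTED_RE = re.compile(
    r"System restarted at "
    r"(?P<time>\d{2}:\d{2}:\d{2}) \S+ (?P<date>\w{3} \w{3} \d{1,2} \d{4})"
)


def reload_summary(show_version_text):
    """Return the reload reason and the gap between crash and restart,
    or None if either line is missing from the pasted output."""
    returned = RETURNED_RE.search(show_version_text)
    restarted = RESTARTED_RE.search(show_version_text)
    if not returned or not restarted:
        return None
    fmt = "%H:%M:%S %a %b %d %Y"
    went_down = datetime.strptime(f"{returned['time']} {returned['date']}", fmt)
    came_back = datetime.strptime(f"{restarted['time']} {restarted['date']}", fmt)
    return {
        "reason": returned["reason"],
        "went_down": went_down,
        "came_back": came_back,
        "outage_seconds": (came_back - went_down).total_seconds(),
    }


# The two relevant lines from the output above, unwrapped.
SHOW_VERSION = """\
Router uptime is 4 hours, 27 minutes
System returned to ROM by bus error at PC 0x4253B1D4, address 0x23 at 08:52:02 PCTime Fri Apr 22 2011
System restarted at 08:53:13 PCTime Fri Apr 22 2011
"""

summary = reload_summary(SHOW_VERSION)
print(summary["reason"], summary["outage_seconds"])
```

For the output above this gives a gap of 71 seconds between the crash and the restart, which falls inside the 60 to 120 second window described in the original question.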
0
 
jokertAuthor Commented:
More info:

Router#show context

System was restarted by bus error at PC 0x4253B1D4, address 0x23 at 08:52:02 PCTime Fri Apr 22 2011
2800 Software (C2800NM-ADVIPSERVICESK9-M), Version 12.4(9)T, RELEASE SOFTWARE (fc1)
Technical Support: http://www.cisco.com/techsupport
Compiled Fri 16-Jun-06 22:22 by prod_rel_team
Image text-base: 0x400B100C, data-base: 0x43480000


Stack trace from system failure:
FP: 0x46C29CE0, RA: 0x4253B1D4
FP: 0x46C29D20, RA: 0x4253A0A4
FP: 0x46C29D38, RA: 0x4250DE38
FP: 0x46C29D98, RA: 0x425168D8
FP: 0x46C29DB0, RA: 0x41A3A8AC
FP: 0x46C29E00, RA: 0x41A3BC9C
FP: 0x46C29E20, RA: 0x41A3BD58
FP: 0x46C29E50, RA: 0x41A3BEFC

Fault History Buffer:
2800 Software (C2800NM-ADVIPSERVICESK9-M), Version 12.4(9)T, RELEASE SOFTWARE (fc1)
Technical Support: http://www.cisco.com/techsupport
Compiled Fri 16-Jun-06 22:22 by prod_rel_team
Signal = 10, Code = 0x45F20000, Uptime 17:31:57
$0 : 00000000, AT : 45C40000, v0 : 00000000, v1 : 00000000
a0 : 00000000, a1 : 3F646C0F, a2 : 3F646C0E, a3 : 470DFA58
t0 : 00000001, t1 : 46C29CF8, t2 : 47F6F2A0, t3 : 3F646BEE
t4 : 40096588, t5 : 46238C60, t6 : 46238C5C, t7 : 46238C58
s0 : 00000020, s1 : 483E4960, s2 : 470DFB68, s3 : 46238C00
s4 : 45AE0000, s5 : 00000001, s6 : 00002E56, s7 : 481F2D30
t8 : 470DFA68, t9 : 00000000, k0 : 3040A801, k1 : A000F000
gp : 45C4CDEC, sp : 46C29CE0, s8 : 00000001, ra : 4253B1A4
EPC : 4253B1D4, SREG : 3400FF03, Cause : 0000080C
Error EPC : BFC00E8C, BadVaddr : 00000023
CacheErr : 006B7BF2, DErrAddr0 : 05F1AEB0,
                            DErrAddr1 : 06848C60
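
The "Cause" value in the register dump can be decoded by hand. Here is a minimal Python sketch, assuming the register follows the standard MIPS32 layout, where the exception code sits in bits 6..2 of Cause. That the router's CPU uses this layout is an inference from the MIPS-style register names in the dump, and the function names are illustrative.

```python
# Standard MIPS32 exception codes (ExcCode field, bits 6..2 of Cause).
EXC_CODES = {
    0: "Interrupt",
    1: "TLB modification",
    2: "TLB exception (load or instruction fetch)",
    3: "TLB exception (store)",
    4: "Address error (load or instruction fetch)",
    5: "Address error (store)",
    6: "Bus error (instruction fetch)",
    7: "Bus error (data load or store)",
    8: "Syscall",
    9: "Breakpoint",
    10: "Reserved instruction",
    11: "Coprocessor unusable",
    12: "Arithmetic overflow",
}


def exc_code(cause):
    """Extract the 5-bit exception code from a Cause register value."""
    return (cause >> 2) & 0x1F


def describe_cause(cause):
    """Return a human-readable name for the exception code, if known."""
    code = exc_code(cause)
    return EXC_CODES.get(code, f"unknown code {code:#x}")


cause = 0x0000080C  # "Cause : 0000080C" from the dump above
print(hex(exc_code(cause)), describe_cause(cause))
```

For 0x0000080C this yields code 0x3, a TLB store exception. Together with BadVaddr 00000023, that reads as an attempted write to a very low, unmapped address, which usually points at a software fault rather than a cabling problem.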
0
 
jokertAuthor Commented:
Partial info from sh stacks:

Router#show stacks
Minimum process stacks:
Free/Size   Name
4872/6000   USB Startup
5336/6000   Inspect Init Msg
5324/6000   SPAN Subsystem
5336/6000   FLEX DSPRM boot download main
5148/6000   DIB error message
5332/6000   SASL MAIN
2304/3000   allegro libretto init
3212/12000  Init
59124/60000  script background loader
5336/6000   vidb clone Process
5140/6000   RADIUS INITCONFIG
5296/6000   MOP Protocols
2092/3000   Rom Random Update Process
7764/12000  HTTP CP
33684/36000  TCP Command
8412/12000  Virtual Exec
9660/12000  SSH Process

Interrupt level stacks:
Level    Called Unused/Size  Name
  1   144953595   6052/9000  Network interfaces
  2   181443858   8564/9000  DMA/Timer Interrupt
  3           1   8324/9000  PA Management Int Handler
  4         157   8620/9000  Console Uart
  5           0   9000/9000  External Interrupt
  7     4529064   8572/9000  NMI Interrupt Handler

Spurious interrupts: 5924

System was restarted by bus error at PC 0x4253B1D4, address 0x23 at 08:52:02 PCTime Fri Apr 22 2011
2800 Software (C2800NM-ADVIPSERVICESK9-M), Version 12.4(9)T, RELEASE SOFTWARE (fc1)
Technical Support: http://www.cisco.com/techsupport
Compiled Fri 16-Jun-06 22:22 by prod_rel_team
Image text-base: 0x400B100C, data-base: 0x43480000


Stack trace from system failure:
FP: 0x46C29CE0, RA: 0x4253B1D4
FP: 0x46C29D20, RA: 0x4253A0A4
FP: 0x46C29D38, RA: 0x4250DE38
FP: 0x46C29D98, RA: 0x425168D8
FP: 0x46C29DB0, RA: 0x41A3A8AC
FP: 0x46C29E00, RA: 0x41A3BC9C
FP: 0x46C29E20, RA: 0x41A3BD58
FP: 0x46C29E50, RA: 0x41A3BEFC


***************************************************
******* Information of Last System Crash **********
***************************************************



%ALIGN-1-FATAL: Illegal access to a low address 08:52:02 PCTime Fri Apr 22 2011
addr=0x23, pc=0x4253B1D4 , ra=0x4253B1A4 , sp=0x46C29CE0

%ALIGN-1-FATAL: Illegal access to a low address 08:52:02 PCTime Fri Apr 22 2011
addr=0x23, pc=0x4253B1D4 , ra=0x4253B1A4 , sp=0x46C29CE0


08:52:02 PCTime Fri Apr 22 2011: TLB (store) exception, CPU signal 10, PC = 0x4253B1D4



--------------------------------------------------------------------
   Possible software fault. Upon reccurence,  please collect
   crashinfo, "show tech" and contact Cisco Technical Support.
--------------------------------------------------------------------


-Traceback= 0x4253B1D4 0x4253A0A4 0x4250DE38 0x425168D8 0x41A3A8AC 0x41A3BC9C 0x41A3BD58 0x41A3BEFC
$0 : 00000000, AT : 45C40000, v0 : 00000000, v1 : 00000000
a0 : 00000000, a1 : 3F646C0F, a2 : 3F646C0E, a3 : 470DFA58
t0 : 00000001, t1 : 46C29CF8, t2 : 47F6F2A0, t3 : 3F646BEE
t4 : 40096588, t5 : 46238C60, t6 : 46238C5C, t7 : 46238C58
s0 : 00000020, s1 : 483E4960, s2 : 470DFB68, s3 : 46238C00
s4 : 45AE0000, s5 : 00000001, s6 : 00002E56, s7 : 481F2D30
t8 : 470DFA68, t9 : 00000000, k0 : 3040A801, k1 : A000F000
gp : 45C4CDEC, sp : 46C29CE0, s8 : 00000001, ra : 4253B1A4
EPC  : 4253B1D4, ErrorEPC : BFC00E8C, SREG     : 3400FF03
MDLO : 00000000, MDHI     : 00000000, BadVaddr : 00000023
CacheErr : 006B7BF2, DErrAddr0 : 05F1AEB0, DErrAddr1 : 06848C60
Cause 0000080C (Code 0x3): TLB (store) exception


=== Start of Crashinfo Collection (08:52:02 PCTime Fri Apr 22 2011) ===

For image:
Cisco IOS Software, 2800 Software (C2800NM-ADVIPSERVICESK9-M), Version 12.4(9)T,
RELEASE SOFTWARE (fc1)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2006 by Cisco Systems, Inc.
Compiled Fri 16-Jun-06 22:22 by prod_rel_team



========= Show Alignment =============================
Alignment data for:
2800 Software (C2800NM-ADVIPSERVICESK9-M), Version 12.4(9)T, RELEASE SOFTWARE (fc1)
Technical Support: http://www.cisco.com/techsupport
Compiled Fri 16-Jun-06 22:22 by prod_rel_team

No alignment data has been recorded.

Total Spurious Accesses 2, Recorded 2

Address  Count  Traceback
      23      1  0x4253B1C0  0x4253A0A4  0x4250DE38  0x425168D8
                 0x41A3A8AC  0x41A3BC9C  0x41A3BD58  0x41A3BEFC
       8      1  0x4253B1D0  0x4253A0A4  0x4250DE38  0x425168D8
                 0x41A3A8AC  0x41A3BC9C  0x41A3BD58  0x41A3BEFC

0
 
jon1966Commented:
OK, wow, I was wrong, it is the 2821 reloading due to a bus error.
0
 
jokertAuthor Commented:
It says that if the address is not in the range shown in sh regions, the crash is due to IOS accessing an invalid memory region.  My address was 0x23.

Any suggestions on what to do now to fix a bus error?

Your suggestion of running sh ver was what unleashed all the bus error info.  You would think a bus error would be sent over with the syslog logs.
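
Since the crash itself never reached syslog, one independent way to catch a restart is to watch the router's uptime from the monitoring side: if a later poll reports a smaller uptime than the previous one, the device rebooted in between. Here is a minimal Python sketch of that check, assuming uptime samples in seconds have already been collected (for example from SNMP sysUpTime, which the "snmp-server community public RO" line in the config would likely allow). The function name and sample data are hypothetical, and no SNMP call is made here.

```python
def find_restarts(samples):
    """samples: list of (poll_time, uptime) pairs in seconds, in poll order.

    Returns an estimated boot time (poll_time - uptime) for every poll
    where the uptime went backwards compared with the previous poll.
    """
    restarts = []
    for (_, prev_uptime), (poll_time, uptime) in zip(samples, samples[1:]):
        if uptime < prev_uptime:
            # Uptime dropped, so the device restarted between the two polls.
            restarts.append(poll_time - uptime)
    return restarts


# Hypothetical polls: steady for two samples, then uptime falls to 45 s.
polls = [(0, 63000), (60, 63060), (180, 45)]
print(find_restarts(polls))
```

One caveat: a 32-bit sysUpTime counter in hundredths of a second wraps after roughly 497 days, so a drop after very long uptime can be a wrap rather than a restart.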
0
 
jon1966Commented:
Do you have a Cisco TAC profile?  The easiest next step is to update the IOS.
0
 
jokertAuthor Commented:
I was looking for any additional information as to why the router was crashing.  The "show version" command suggested in the post revealed a bus error that no other logs were showing.  Additional information pointed to IOS accessing a bad memory range, for which the next step is updating the IOS.
0
 
jokertAuthor Commented:
Thanks jon1966, this was VERY helpful!
0
 
jon1966Commented:
Just for further ammo, I ran the error on cisco.com and found the following:


4-23-2011-8-47-20-AM.jpg
0
