securenetworks
Cisco Router Reboots Randomly - Need Help ASAP
I currently have a Cisco 2821 at a remote site that seems to reboot every 10 hours or so. The problem either lies with the hardware or the software. Here is the pertinent information (sho context) and (sho region):
--------------------------
cb_iup#sho context
System was restarted by bus error at PC 0x410B01AC, address 0x15A3C78B at 07:45:29 UTC Tue Oct 24 2006
2800 Software (C2800NM-ADVSECURITYK9-M), Version 12.4(5), RELEASE SOFTWARE (fc3)
Technical Support: http://www.cisco.com/techsupport
Compiled Tue 01-Nov-05 00:52 by alnguyen
Image text-base: 0x400A1B78, data-base: 0x41C20000
Stack trace from system failure:
FP: 0x44039238, RA: 0x410B01AC
FP: 0x44039258, RA: 0x410ED72C
FP: 0x440393C8, RA: 0x410BBE44
FP: 0x44039488, RA: 0x410BD99C
FP: 0x440394B8, RA: 0x410C083C
FP: 0x440396A8, RA: 0x40EF50DC
FP: 0x440396F8, RA: 0x40EF3880
FP: 0x44039748, RA: 0x40EF3B50
Fault History Buffer:
2800 Software (C2800NM-ADVSECURITYK9-M), Version 12.4(5), RELEASE SOFTWARE (fc3)
Technical Support: http://www.cisco.com/techsupport
Compiled Tue 01-Nov-05 00:52 by alnguyen
Signal = 10, Code = 0x43470000, Uptime 03:32:02
$0 : 00000000, AT : 00000000, v0 : 00020000, v1 : 00120000
a0 : 15A3C78B, a1 : 450C72CC, a2 : 00000006, a3 : 3F695EAC
t0 : 00000000, t1 : 00000001, t2 : 00000006, t3 : 440395D0
t4 : 441B2B4C, t5 : 44039620, t6 : 4403961C, t7 : 44039618
s0 : 000005C4, s1 : 450C72CC, s2 : 441B2B4C, s3 : 3F695E98
s4 : 00000000, s5 : 4402C4CC, s6 : 000005C4, s7 : 00000000
t8 : 44039490, t9 : 00000000, k0 : 3040A801, k1 : A000F000
gp : 432D6460, sp : 44039238, s8 : 40081951, ra : 410ED72C
EPC : 410B01AC, SREG : 3400FF03, Cause : 00000010
Error EPC : BFC00BBC, BadVaddr : 15A3C78B
CacheErr : E0C6A6E3, DErrAddr0 : 050C7300,
DErrAddr1 : 04793960
cb_iup#sho region
Region Manager:
Start       End         Size(b)    Class  Media  Name
0x0F400000  0x0FFFFFFF  12582912   Iomem  R/W    iomem:(uncached_iomem_region)
0x3F400000  0x3FFFFFFF  12582912   Iomem  R/W    iomem
0x40000000  0x4F3FFFFF  255852544  Local  R/W    main
0x4000F000  0x41C1FFFF  29429760   IText  R/O    main:text
0x41C20000  0x432D005F  23789664   IData  R/W    main:data
0x432D0060  0x436F035F  4326144    IBss   R/W    main:bss
0x436F0360  0x4F3FFFFF  198245536  Local  R/W    main:heap
0x80000000  0x8F3FFFFF  255852544  Local  R/W    main:(main_k0)
0xA0000000  0xAF3FFFFF  255852544  Local  R/W    main:(main_k1)
Free Region Manager:
Start       End         Size(b)    Class  Media  Name
--------------------------
It looks as though the router is trying to access address 0x15A3C78B, which is well outside every memory region the router reports. So it is possible this is an IOS issue. I am fairly new at this and would like a second opinion. If we cannot identify the problem within the next two hours, we will order a new Cisco router to ensure uptime at this location. Thanks in advance.
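The out-of-range observation can be checked mechanically. The following is a minimal sketch, not a diagnostic tool: the region bounds are copied by hand from the 'sho region' output, and the helper name `regions_containing` is made up for illustration.

```python
# Regions copied from the 'sho region' output: (start, end, name).
REGIONS = [
    (0x0F400000, 0x0FFFFFFF, "iomem:(uncached_iomem_region)"),
    (0x3F400000, 0x3FFFFFFF, "iomem"),
    (0x40000000, 0x4F3FFFFF, "main"),
    (0x4000F000, 0x41C1FFFF, "main:text"),
    (0x41C20000, 0x432D005F, "main:data"),
    (0x432D0060, 0x436F035F, "main:bss"),
    (0x436F0360, 0x4F3FFFFF, "main:heap"),
    (0x80000000, 0x8F3FFFFF, "main:(main_k0)"),
    (0xA0000000, 0xAF3FFFFF, "main:(main_k1)"),
]


def regions_containing(addr):
    """Return the names of all listed regions whose range includes addr."""
    return [name for start, end, name in REGIONS if start <= addr <= end]


bad_vaddr = 0x15A3C78B  # "address" / BadVaddr from 'sho context'
fault_pc = 0x410B01AC   # "PC" / EPC from 'sho context'

print(hex(bad_vaddr), regions_containing(bad_vaddr))
print(hex(fault_pc), regions_containing(fault_pc))
```

With the values above, the faulting address matches no region, while the program counter falls inside both main and main:text. In other words, the code that crashed was running from the normal IOS image area but was handed an address that does not exist on this box.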
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Given the deferral notice, it looks like a serious bug in both versions. If you have Smartnet, you can send a crashinfo file to Cisco for analysis, but the fix will still be an updated IOS. I'd go ahead with the fix while waiting for the analysis.
ASKER CERTIFIED SOLUTION
ASKER
Yes, I had seen both of those when I logged into our Cisco account and did some research. On re-reading, I don't believe it is a hardware error: for a hardware fault, I would expect the faulting address in the error message to fall within one of the valid ranges shown by the 'sho region' command, and this address is outside those bounds. We have a couple of older versions of IOS software, but unfortunately we don't currently have channel partner status and are unable to download a newer version. It just bothers me that there were no errors with the original 'ipbase' IOS, and then it started throwing errors without even a config change. Could faulty T1s affect a router like this? The T1s at the site are bouncing quite a bit, which we have been treating as a separate issue from the IOS problem.
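One more hint can be pulled from the 'sho context' dump. This sketch assumes the "Cause" value is the raw MIPS coprocessor 0 Cause register. The register names in the dump follow MIPS conventions, but the output itself does not confirm that, so treat the result as a hint only. The exception-code table is a subset of the standard MIPS32 codes, and the function names are made up for illustration.

```python
# Subset of the standard MIPS32 exception codes (Cause register bits 6..2).
MIPS_EXC_CODES = {
    0: "Int (interrupt)",
    4: "AdEL (address error on load or instruction fetch)",
    5: "AdES (address error on store)",
    6: "IBE (bus error on instruction fetch)",
    7: "DBE (bus error on data load/store)",
}


def exc_code(cause):
    """Extract the 5-bit ExcCode field from a MIPS Cause register value."""
    return (cause >> 2) & 0x1F


def describe(cause):
    """Map a Cause value to a readable exception name, if it is in the table."""
    code = exc_code(cause)
    return MIPS_EXC_CODES.get(code, f"unlisted code {code}")


cause = 0x00000010      # "Cause" from 'sho context'
bad_vaddr = 0x15A3C78B  # "BadVaddr" from 'sho context'

print(describe(cause))
print("word-aligned:", bad_vaddr % 4 == 0)
```

If that reading of the Cause field is right, the exception is an address error on a load from an address that is not even word-aligned. That pattern looks more like a bad pointer in software than a failing memory cell, which fits the reasoning above, but it does not prove it.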
I can't imagine a bouncing T1 causing this type of issue. If anything, the software bugs could be what causes the T1s to bounce before the router crashes.
SOLUTION
ASKER
cb_iup#sho flash
-#- --length-- -----date/time------ path
1 14892052 Dec 15 2005 18:48:22 +00:00 c2800nm-ipbase-mz.124-3a.b
2 1649 Dec 15 2005 18:56:14 +00:00 sdmconfig-28xx.cfg
3 4052480 Dec 15 2005 18:56:34 +00:00 sdm.tar
4 812032 Dec 15 2005 18:56:50 +00:00 es.tar
5 1007616 Dec 15 2005 18:57:06 +00:00 common.tar
6 1038 Dec 15 2005 18:57:18 +00:00 home.shtml
7 113152 Dec 15 2005 18:57:30 +00:00 home.tar
8 511939 Dec 15 2005 18:57:42 +00:00 128MB.sdf
9 19312988 Oct 23 2006 15:21:58 +00:00 c2800nm-advsecurityk9-mz.1
10 217839 Oct 24 2006 04:11:48 +00:00 crashinfo_20061024-041149
11 195959 Oct 24 2006 07:45:28 +00:00 crashinfo_20061024-074529
12 207203 Oct 24 2006 14:43:10 +00:00 crashinfo_20061024-144311
22667264 bytes available (41349120 bytes used)
I had been running the 'ipbase' IOS. I updated to the 'advsecurity' IOS last night, then today changed the boot file back to 'ipbase'. A large number of crashdump files (150+) had accumulated under 'ipbase'. I deleted those and updated to 12.4(5), but was still receiving crashdumps, so I recently reverted to 'ipbase'. The crashinfo files in the flash listing are therefore most likely from 12.4(5).
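To see how often the router is actually going down, the gaps between the three crashinfo files can be worked out from their names. This is a rough sketch that assumes the suffix encodes the crash time as YYYYMMDD-HHMMSS, which is consistent with the date/time column in the 'sho flash' output. The function names are made up for illustration.

```python
from datetime import datetime

# File names copied from the 'sho flash' listing.
CRASHINFO_FILES = [
    "crashinfo_20061024-041149",
    "crashinfo_20061024-074529",
    "crashinfo_20061024-144311",
]


def crash_time(name):
    """Parse the timestamp embedded in a crashinfo file name (assumed format)."""
    return datetime.strptime(name, "crashinfo_%Y%m%d-%H%M%S")


def crash_intervals(names):
    """Return the time elapsed between consecutive crashes, oldest first."""
    times = sorted(crash_time(n) for n in names)
    return [later - earlier for earlier, later in zip(times, times[1:])]


for gap in crash_intervals(CRASHINFO_FILES):
    print(gap)
```

For these three files the gaps come out to 3:33:40 and 6:57:42. The first gap lines up with the "Uptime 03:32:02" recorded in the fault history buffer, allowing a minute or two for the reboot, so the crashes on 12.4(5) look more frequent than the "every 10 hours or so" estimate.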