acameron

Netra X1 rebooting all the time

I have a Netra X1 here that was dropped on my desk as "broken". I hooked it up through the console port and logged in; about every 10 minutes or so the thing just reboots itself! No reason, no warning, just blam, it reboots. I don't see any errors in the logs, and the warranty has run out. Anyone have any ideas?
shivsa

Try to get the system to the ok prompt
with <STOP-A>.
If that does not work, try
~#
(the break escape if you are connected through tip).
Once it comes to the ok prompt, run reset-all, and then try boot kadb.
Check for any messages or errors and post them here.
acameron

ASKER

None of those commands work. I'm a little new to Sun, so please be specific. Half the Linux commands I know don't even seem to work.
If you are accessing the Netra remotely over telnet, you can get to the ok prompt like this:
 Escape to the "telnet" program by pressing Ctrl-] (you should get the "telnet> " prompt).
 Enter the command
      send break
 This will drop you to the ok prompt.
I am in via a console port.
fujbwcraw console login: root
Password:
Last login: Thu Nov  6 14:14:07 on console
Nov  6 14:21:49 fujbwcraw login: ROOT LOGIN /dev/console
Sun Microsystems Inc.   SunOS 5.8       Generic February 2000
# Type  'go' to resume
ok reset-all
Res
LOM event: +0h2m16s host reset
etting ...


Did the reset-all, now what?
Now grab the messages when it tries to boot.
At the ok prompt, type boot kadb.
Once it drops to the kadb prompt,
type
$<msgbuf

Here are the results.  Nothing looks overly bad.

kadb[0]: $<msgbuf
SunOS Release 5.8 Version Generic_108528-15 64-bit
3000007d823:    Copyright 1983-2001 Sun Microsystems, Inc.  All rights reserved.

3000007d403:    Ethernet address = 0:3:ba:5:e3:de
3000007cfe0:    mem = 130200K (0x7f26000)
3000007cbc0:    avail mem = 119955456
3000007c7a3:    root nexus = Sun Netra X1 (UltraSPARC-IIe 500MHz)
3000007c383:    pcipsy0 at root: UPA 0x1f 0x0
30000223ee3:    pcipsy0 is /pci@1f,0
30000223ac2:    PCI-device: ide@d, uata0
300002236a3:    uata0 is /pci@1f,0/ide@d
30000223280:    dad0 at pci10b9,52290
30000222e60:     target 0 lun 0
30000222a43:    dad0 is /pci@1f,0/ide@d/dad@0,0
30000222620:            <ST340824A cyl 19156 alt 2 hd 16 sec 255>
30000222207:    root on /pci@1f,0/ide@d/disk@0,0:a fstype ufs
300001f7da2:    PCI-device: isa@7, ebus0
300001f7980:    su0 at ebus0: offset 0,3f8
300001f7563:    su0 is /pci@1f,0/isa@7/serial@0,3f8
300001f7140:    su1 at ebus0: offset 0,2e8
300001f6d23:    su1 is /pci@1f,0/isa@7/serial@0,2e8
300001f6900:    cpu0: SUNW,UltraSPARC-IIe (upaid 0 impl 0x13 ver 0x14 clock 500 MHz)
300001f64df:    dmfe0: Davicom DM9102 (v1.12): type "ether" mac address 00:03:ba:05:e3:de
300001f60c2:    PCI-device: ethernet@c, dmfe0
300005a7c23:    dmfe0 is /pci@1f,0/ethernet@c
300005a77ff:    dmfe1: Davicom DM9102 (v1.12): type "ether" mac address 00:03:ba:05:e3:df
300005a73e2:    PCI-device: ethernet@5, dmfe1
300005a6fc3:    dmfe1 is /pci@1f,0/ethernet@5
300005a68e3:    dump on /dev/dsk/c0t0d0s1 size 1025 MB
300005a64c0:    NOTICE: dmfe0: PHY 1 link down
300005a60a0:    NOTICE: dmfe1: PHY 1 link down
300007dfc40:    lom0 at ebus0: offset 0,8010
300007df823:    lom0 is /pci@1f,0/isa@7/SUNW,lomh@0,8010
300007df3ff:    11/6/2003 20:7:39 GMT LOM time reference
300007dfda2:    pseudo-device: tod0
300005a6623:    tod0 is /pseudo/tod@0
300007df982:    pseudo-device: pm0
300005a7123:    pm0 is /pseudo/pm@0
300005a6200:    power0 at ebus0: offset 0,2000
300005a7963:    power0 is /pci@1f,0/isa@7/power@0,2000
300005a6a42:    pseudo-device: vol0
300001f6223:    vol0 is /pseudo/vol@0
Now type :c at the kadb prompt.
It will continue to boot.
Capture any failure, warning, or error this time and post it here; then we can get an idea of why it is stuck in a reboot loop.
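If the console session is being logged to a file on the workstation, a small filter can pull out the lines worth posting. This is only a sketch in Python, not something used in this thread: the function name `notable_lines` and the keyword list are my own assumptions, and the list is certainly not exhaustive.

```python
import re

# Keywords that usually deserve a second look in a Solaris console capture.
# This list is an assumption for illustration; extend it for your own system.
NOTABLE = re.compile(
    r"(WARNING|NOTICE|panic|failed|INIT: New run level|LOM event)",
    re.IGNORECASE,
)


def notable_lines(console_text):
    """Return the stripped console lines that match one of the keywords."""
    return [
        line.strip()
        for line in console_text.splitlines()
        if NOTABLE.search(line)
    ]


if __name__ == "__main__":
    # Hypothetical capture file name; point it at your own terminal log.
    with open("console.log") as handle:
        for hit in notable_lines(handle.read()):
            print(hit)
```

A line such as "INIT: New run level: 6" is the one to look for: it means something asked init for an orderly reboot, which points at software (a script, cron job, or daemon) rather than a sudden hardware reset.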

kadb[0]: :C
bad modifier
kadb[0]: :c

#
INIT: New run level: 6
The system is coming down.  Please wait.
System services are now being stopped.
Print services stopped.
Nov  7 06:09:00 fujbwcraw syslogd: going down on signal 15
The system is down.
syncing file systems... done
rebooting...
Res
LOM event: +15h2m11s host reset
etting ...


Sun Netra X1 (UltraSPARC-IIe 500MHz), No Keyboard
OpenBoot 4.0, 128 MB memory installed, Serial #50717662.
Ethernet address 0:3:ba:5:e3:de, Host ID: 8305e3de.



Executing last command: boot                                          
Boot device: disk  File and args:
SunOS Release 5.8 Version Generic_108528-15 64-bit
Copyright 1983-2001 Sun Microsystems, Inc.  All rights reserved.
configuring IPv4 interfaces: dmfe0.
Hostname: fujbwcraw
NOTICE: dmfe0: PHY 1 link down
NOTICE: dmfe1: PHY 1 link down
The system is coming up.  Please wait.
checking ufs filesystems
/dev/rdsk/c0t0d0s7: is clean.
11/7/2003 11:11:11 GMT LOM time reference
starting rpc services: rpcbind done.
Setting netmask of dmfe0 to 255.255.255.248
Setting default IPv4 interface for multicast: add net 224.0/4: gateway fujbwcraw
syslog service starting.
Print services started.
Starting the NetIQ Endpoint.
/dev/bd.off: not a serial device.
volume management starting.
fujbwcraw: bad value
Wnn6: Key License Server started....

Nihongo Multi Client Server (Wnn6 R2.34)
Can't get host information from hostname
Nov  7 06:11:58 fujbwcraw /usr/lib/snmp/snmpdx: unable to get my IP address: gethostbyname(fujbwcraw) failed [h_errno: try again(2)]
The system is ready.
SWAG: I cannot say this is the reason, BUT in a previous life as a Unix administrator I MIGHT have heard of someone playing a joke on a co-worker he was not too fond of. They modified root's crontab so that every 30 minutes it would run "init 6". I know this was years ago, but you might want to check your crontab... just in case.
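For anyone who wants to automate that crontab check, here is a minimal Python sketch. It is not from this thread, and it assumes you have copied the crontab text to a machine with Python (on Solaris the per-user crontabs normally live under /var/spool/cron/crontabs/). The function name `find_reboot_entries` and the list of suspicious commands are my own assumptions.

```python
import re

# Commands that restart or halt the host. The list is an assumption; extend as needed.
REBOOT_PATTERN = re.compile(r"\b(init\s+[056]|reboot|shutdown|halt|uadmin)\b")


def find_reboot_entries(crontab_text):
    """Return (line_number, line) pairs for active entries that look like a reboot."""
    hits = []
    for number, line in enumerate(crontab_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        if REBOOT_PATTERN.search(stripped):
            hits.append((number, stripped))
    return hits


# Made-up crontab used only to show the call; it is not taken from the Netra.
SAMPLE = """\
# root crontab
10 3 * * 0,4 /etc/cron.d/logchecker
0,30 * * * * /sbin/init 6
"""

if __name__ == "__main__":
    for number, entry in find_reboot_entries(SAMPLE):
        print(number, entry)
```

An empty result only rules out the obvious cases; a cron job that calls a script which in turn reboots the box would not be caught by this.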
Checked em all, nothing.  Which is a bummer as an easy fix would have been nice ;)
Everything looks fine here.

I do not understand why it is rebooting all the time.
Run level 6 is reboot, and I think there must be a reference somewhere to whatever is triggering the reboot.

Was this system used for some kind of testing? Someone may have set up a test that keeps rebooting the system every 10 minutes.
Look at /etc/inittab. Near the start there should be one line like this:

id:3:initdefault:
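The same inittab check can be done on a copy of the file with a few lines of Python. This is a sketch with assumed helper names (`initdefault_level`, `is_risky_default`); it relies only on the id:rstate:action:process layout of inittab entries and treats a default run level of 0, 5, or 6 as suspicious, since those halt, power off, or reboot the machine.

```python
def initdefault_level(inittab_text):
    """Return the run level from the initdefault entry, or None if there is none."""
    for line in inittab_text.splitlines():
        entry = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not entry:
            continue
        fields = entry.split(":")  # id:rstate:action:process
        if len(fields) >= 3 and fields[2] == "initdefault":
            return fields[1]
    return None


def is_risky_default(level):
    """True if the default run level would halt, power off, or reboot the host."""
    return level is not None and any(ch in "056" for ch in level)


if __name__ == "__main__":
    level = initdefault_level("id:3:initdefault:\n")
    print(level, is_risky_default(level))
```

A default of 3 is the normal multi-user state; a default of 6 would send the box straight back into a reboot every time it comes up.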







Check your watchdog reset setting so you can capture the error. Most X1 systems have this problem, and you may need to replace the board or the CPU (the X1 is essentially the same machine as the V100).
ASKER CERTIFIED SOLUTION
modulo
