Red Hat server hangs with a black screen

I have a 4-node Oracle cluster running "Red Hat Enterprise Linux AS release 4 (Nahant Update 8)". Last Sunday night, two of the four nodes died and I had to reboot them manually.

The messages log file is below.

May 17 04:02:04 palmdale syslogd 1.4.1: restart.
May 17 04:02:05 palmdale snmpd[22041]: NET-SNMP version 5.1.2 
May 17 08:34:32 palmdale shutdown: shutting down for system reboot
May 17 08:34:32 palmdale init: Switching to runlevel: 6
May 17 08:34:33 palmdale udevd[2258]: udev done!
May 17 08:34:33 palmdale udevd[2258]: udev done!
May 17 08:34:33 palmdale gpm[7302]: *** info [mice.c(1766)]: 
May 17 08:34:33 palmdale gpm[7302]: imps2: Auto-detected intellimouse PS/2
May 17 08:34:34 palmdale haldaemon: haldaemon -TERM succeeded
May 17 08:34:34 palmdale udevd[2258]: udev done!
May 17 08:34:34 palmdale messagebus: messagebus -TERM succeeded
May 17 08:34:34 palmdale atd: atd shutdown succeeded
May 17 08:34:34 palmdale xfs[7332]: terminating 
May 17 08:34:34 palmdale xfs: xfs shutdown succeeded
May 17 08:34:34 palmdale gpm: gpm shutdown succeeded
May 17 08:34:34 palmdale init.crs: Shutting down Oracle Cluster Ready Services (CRS):
May 17 08:34:34 palmdale logger: Oracle CRSD 9696 set to stop
May 17 08:34:34 palmdale logger: Oracle CRSD 9696 shutdown completed
May 17 08:34:34 palmdale logger: Oracle EVMD set to stop
May 17 08:34:34 palmdale logger: Oracle CSSD being stopped
May 17 08:34:34 palmdale init.crs: May 17 08:34:34.818 | ERR | failed to connect to daemon, errno(111)
May 17 08:34:42 palmdale udevd[2258]: udev done!
May 17 08:34:46 palmdale snmpd[22041]: [smux_accept] accepted fd 17 from 10.20.7.21:3661 
May 17 08:34:46 palmdale snmpd[22041]: refused smux peer: oid SNMPv2-SMI::zeroDotZero, password , descr NSGS 
May 17 08:34:55 palmdale init.crs: Stopping resources. This could take several minutes.
May 17 08:34:55 palmdale init.crs: Successfully stopped CRS resources.
May 17 08:34:55 palmdale init.crs: Stopping CSSD.
May 17 08:34:55 palmdale init.crs: Shutting down CSS daemon.
May 17 08:34:55 palmdale init.crs: Shutdown request successfully issued.
May 17 08:34:55 palmdale init.crs: Shutdown has begun. The daemons should exit soon.
May 17 08:34:55 palmdale rc: Stopping init.crs:  succeeded
May 17 08:34:55 palmdale ocfs2: Stopping Oracle Cluster File System (OCFS2) 
May 17 08:34:55 palmdale kernel: ocfs2_dlm: Node 1 leaves domain 5DF64C3C5E76486A9289B88B660059A2
May 17 08:34:55 palmdale kernel: ocfs2_dlm: Nodes in domain ("5DF64C3C5E76486A9289B88B660059A2"): 0 2 3 
May 17 08:35:01 palmdale kernel: ocfs2_dlm: Node 1 leaves domain 416F9689CEAB4D239862CACAF04935A6
May 17 08:35:01 palmdale kernel: ocfs2_dlm: Nodes in domain ("416F9689CEAB4D239862CACAF04935A6"): 0 2 3 
May 17 08:35:02 palmdale kernel: ocfs2: Unmounting device (65,33) on (node 0)
May 17 08:35:08 palmdale kernel: ocfs2: Unmounting device (8,113) on (node 0)
May 17 08:35:13 palmdale kernel: ocfs2: Unmounting device (8,97) on (node 0)
May 17 08:35:18 palmdale kernel: ocfs2: Unmounting device (8,209) on (node 0)
May 17 08:35:24 palmdale kernel: ocfs2: Unmounting device (8,193) on (node 0)
May 17 08:35:30 palmdale kernel: ocfs2: Unmounting device (8,177) on (node 0)
May 17 08:35:36 palmdale kernel: ocfs2: Unmounting device (8,161) on (node 0)
May 17 08:35:37 palmdale ocfs2: Unable failed
May 17 08:35:37 palmdale ocfs2: [
May 17 08:35:37 palmdale ocfs2: 
May 17 08:35:42 palmdale ocfs2: Retry stopping Oracle Cluster File System (OCFS2) 
May 17 08:35:43 palmdale ocfs2: Unable failed
May 17 08:35:43 palmdale ocfs2: [
May 17 08:35:43 palmdale ocfs2: 
May 17 08:35:48 palmdale ocfs2: Retry stopping Oracle Cluster File System (OCFS2) 
May 17 08:35:49 palmdale ocfs2: Unable failed
May 17 08:35:49 palmdale ocfs2: [
May 17 08:35:49 palmdale ocfs2: 
May 17 08:35:54 palmdale rc: Stopping ocfs2:  failed
May 17 08:35:54 palmdale mountd[7239]: Caught signal 15, un-registering and exiting.
May 17 08:35:54 palmdale nfs: rpc.mountd shutdown succeeded
May 17 08:35:54 palmdale nfs: nfsd -2 succeeded
May 17 08:35:54 palmdale nfs: rpc.rquotad shutdown succeeded
May 17 08:35:55 palmdale nfs: Shutting down NFS services:  succeeded
May 17 08:35:55 palmdale sshd: sshd shutdown succeeded
May 17 08:35:55 palmdale sendmail: sm-client shutdown succeeded
May 17 08:35:55 palmdale sendmail: sendmail shutdown succeeded
May 17 08:35:55 palmdale smartd: smartd shutdown failed
May 17 08:35:55 palmdale snmpd[22041]: Received TERM or STOP signal...  shutting down... 
May 17 08:35:55 palmdale snmpd: snmpd shutdown succeeded
May 17 08:35:55 palmdale snmptrapd: snmptrapd shutdown succeeded
May 17 08:35:55 palmdale vsftpd: vsftpd shutdown succeeded
May 17 08:35:55 palmdale xinetd[7192]: Exiting...
May 17 08:35:55 palmdale xinetd: xinetd shutdown succeeded
May 17 08:35:56 palmdale crond: crond shutdown succeeded
May 17 08:35:56 palmdale autofs: Stopping automount:
May 17 08:35:56 palmdale autofs: automount shutdown succeeded
May 17 08:35:56 palmdale autofs: [  
May 17 08:35:56 palmdale autofs: 
May 17 08:35:56 palmdale rc: Stopping autofs:  succeeded
May 17 08:35:56 palmdale umount: umount: /u07: device is busy
May 17 08:35:56 palmdale netfs: Unmounting network block filesystems:  failed
May 17 08:36:03 palmdale umount: umount: /u07: device is busy
May 17 08:36:03 palmdale netfs: Unmounting network block filesystems (retry):  failed
May 17 08:36:10 palmdale umount: umount: /u07: device is busy
May 17 08:36:10 palmdale netfs: Unmounting network block filesystems (retry):  failed
May 17 08:36:17 palmdale netfs: Unmounting NFS filesystems:  succeeded
May 17 08:36:19 palmdale nfslock: lockd -KILL succeeded
May 17 08:36:19 palmdale rpc.statd[5511]: Caught signal 15, un-registering and exiting.
May 17 08:36:20 palmdale nfslock: rpc.statd shutdown succeeded
May 17 08:36:20 palmdale irqbalance: irqbalance shutdown succeeded
May 17 08:36:20 palmdale portmap: portmap shutdown succeeded
May 17 08:36:20 palmdale kernel: Kernel logging (proc) stopped.
May 17 08:36:20 palmdale kernel: Kernel log daemon terminating.
May 17 08:36:21 palmdale syslog: klogd shutdown succeeded
May 17 08:36:21 palmdale exiting on signal 15
May 17 08:42:13 palmdale syslogd 1.4.1: restart.
May 17 08:42:13 palmdale syslog: syslogd startup succeeded
May 17 08:42:13 palmdale kernel: klogd 1.4.1, log source = /proc/kmsg started.
May 17 08:42:13 palmdale kernel: Bootdata ok (command line is ro root=/dev/vg0/lv00 rhgb quiet numa=off)
May 17 08:42:13 palmdale kernel: Linux version 2.6.9-89.ELsmp (mockbuild@hs20-bc2-5.build.redhat.com) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-11)) #1 SMP Mon Apr 20 10:33:05 EDT 2009
May 17 08:42:13 palmdale kernel: BIOS-provided physical RAM map:
May 17 08:42:13 palmdale kernel:  BIOS-e820: 0000000000000000 - 000000000009f400 (usable)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 000000000009f400 - 00000000000a0000 (reserved)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 0000000000100000 - 00000000f57f6800 (usable)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 00000000f57f6800 - 00000000f5800000 (ACPI data)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 00000000fdc00000 - 00000000fdc01000 (reserved)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 00000000fdc10000 - 00000000fdc11000 (reserved)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 00000000fdc20000 - 00000000fdc21000 (reserved)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 00000000fdc30000 - 00000000fdc31000 (reserved)
May 17 08:42:13 palmdale syslog: klogd startup succeeded
May 17 08:42:13 palmdale kernel:  BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 00000000fec10000 - 00000000fec11000 (reserved)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 00000000fec20000 - 00000000fec21000 (reserved)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 00000000ff800000 - 0000000100000000 (reserved)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 0000000100000000 - 00000007fffff000 (usable)
May 17 08:42:13 palmdale kernel: NUMA turned off
May 17 08:42:13 palmdale kernel: Faking a node at 0000000000000000-00000007fffff000
May 17 08:42:13 palmdale kernel: Bootmem setup node 0 0000000000000000-00000007fffff000
May 17 08:42:13 palmdale kernel: DMI 2.3 present.
May 17 08:42:13 palmdale kernel: ACPI: PM-Timer IO Port: 0x908
May 17 08:42:13 palmdale kernel: ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
May 17 08:42:13 palmdale kernel: Processor #0 15:1 APIC version 16
May 17 08:42:13 palmdale kernel: ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
May 17 08:42:13 palmdale kernel: Processor #2 15:1 APIC version 16
May 17 08:42:13 palmdale kernel: ACPI: LAPIC (acpi_id[0x04] lapic_id[0x04] disabled)
May 17 08:42:13 palmdale kernel: ACPI: LAPIC (acpi_id[0x06] lapic_id[0x06] disabled)
May 17 08:42:13 palmdale kernel: ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
May 17 08:42:13 palmdale kernel: Processor #1 15:1 APIC version 16
May 17 08:42:13 palmdale kernel: ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] enabled)
May 17 08:42:13 palmdale kernel: Processor #3 15:1 APIC version 16
May 17 08:42:13 palmdale kernel: ACPI: LAPIC (acpi_id[0x05] lapic_id[0x05] disabled)
May 17 08:42:13 palmdale kernel: ACPI: LAPIC (acpi_id[0x07] lapic_id[0x07] disabled)
May 17 08:42:13 palmdale kernel: ACPI: LAPIC_NMI (acpi_id[0xff] high edge lint[0x1])
May 17 08:42:13 palmdale kernel: Setting APIC routing to flat
May 17 08:42:13 palmdale kernel: ACPI: IOAPIC (id[0x04] address[0xfec00000] gsi_base[0])
May 17 08:42:13 palmdale kernel: IOAPIC[0]: apic_id 4, version 17, address 0xfec00000, GSI 0-23
May 17 08:42:13 palmdale kernel: ACPI: IOAPIC (id[0x05] address[0xfec10000] gsi_base[24])
May 17 08:42:13 palmdale kernel: IOAPIC[1]: apic_id 5, version 17, address 0xfec10000, GSI 24-27
May 17 08:42:13 palmdale kernel: ACPI: IOAPIC (id[0x06] address[0xfec20000] gsi_base[28])
May 17 08:42:13 palmdale irqbalance: irqbalance startup succeeded
May 17 08:42:13 palmdale kernel: IOAPIC[2]: apic_id 6, version 17, address 0xfec20000, GSI 28-31
May 17 08:42:13 palmdale kernel: ACPI: IOAPIC (id[0x07] address[0xfdc00000] gsi_base[32])
May 17 08:42:13 palmdale kernel: IOAPIC[3]: apic_id 7, version 17, address 0xfdc00000, GSI 32-35
May 17 08:42:13 palmdale kernel: ACPI: IOAPIC (id[0x08] address[0xfdc10000] gsi_base[36])
May 17 08:42:13 palmdale kernel: IOAPIC[4]: apic_id 8, version 17, address 0xfdc10000, GSI 36-39
May 17 08:42:13 palmdale kernel: ACPI: IOAPIC (id[0x09] address[0xfdc20000] gsi_base[40])
May 17 08:42:13 palmdale kernel: IOAPIC[5]: apic_id 9, version 17, address 0xfdc20000, GSI 40-43
May 17 08:42:13 palmdale kernel: ACPI: IOAPIC (id[0x0a] address[0xfdc30000] gsi_base[44])
May 17 08:42:13 palmdale kernel: IOAPIC[6]: apic_id 10, version 17, address 0xfdc30000, GSI 44-47
May 17 08:42:13 palmdale kernel: ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge)
May 17 08:42:13 palmdale kernel: Using ACPI (MADT) for SMP configuration information
May 17 08:42:13 palmdale kernel: Allocating PCI resources starting at f6000000 (gap: f5800000:8400000)
May 17 08:42:13 palmdale kernel: Checking aperture...
May 17 08:42:13 palmdale kernel: CPU 0: aperture @ 4000000 size 32 MB
May 17 08:42:13 palmdale kernel: Aperture from northbridge cpu 0 too small (32 MB)
May 17 08:42:13 palmdale kernel: No AGP bridge found
May 17 08:42:13 palmdale kernel: Mapping aperture over 65536 KB of RAM @ 4000000
May 17 08:42:13 palmdale kernel: Built 1 zonelists
May 17 08:42:13 palmdale kernel: Kernel command line: ro root=/dev/vg0/lv00 rhgb quiet numa=off console=tty0
May 17 08:42:13 palmdale kernel: Initializing CPU#0
May 17 08:42:13 palmdale kernel: PID hash table entries: 4096 (order: 12, 131072 bytes)
May 17 08:42:13 palmdale kernel: time.c: Using 3.579545 MHz PM timer.
May 17 08:42:13 palmdale kernel: time.c: Detected 2599.941 MHz processor.
May 17 08:42:13 palmdale kernel: Console: colour VGA+ 80x25
May 17 08:42:13 palmdale kernel: Dentry cache hash table entries: 4194304 (order: 13, 33554432 bytes)
May 17 08:42:13 palmdale kernel: Inode-cache hash table entries: 2097152 (order: 12, 16777216 bytes)
May 17 08:42:13 palmdale kernel: Memory: 32800388k/33554428k available (2146k kernel code, 0k reserved, 1348k data, 208k init)
May 17 08:42:13 palmdale kernel: Calibrating delay using timer specific routine.. 5207.47 BogoMIPS (lpj=2603736)
May 17 08:42:13 palmdale kernel: Security Scaffold v1.0.0 initialized
May 17 08:42:13 palmdale kernel: SELinux:  Initializing.
May 17 08:42:13 palmdale kernel: selinux_register_security:  Registering secondary module capability
May 17 08:42:13 palmdale portmap: portmap startup succeeded
May 17 08:42:13 palmdale kernel: Capability LSM initialized as secondary


Jason Yu Asked:

Jason Yu (Author) Commented:
The kernel log file is as below:

May 17 08:34:55 palmdale kernel: ocfs2_dlm: Node 1 leaves domain 5DF64C3C5E76486A9289B88B660059A2
May 17 08:34:55 palmdale kernel: ocfs2_dlm: Nodes in domain ("5DF64C3C5E76486A9289B88B660059A2"): 0 2 3 
May 17 08:35:01 palmdale kernel: ocfs2_dlm: Node 1 leaves domain 416F9689CEAB4D239862CACAF04935A6
May 17 08:35:01 palmdale kernel: ocfs2_dlm: Nodes in domain ("416F9689CEAB4D239862CACAF04935A6"): 0 2 3 
May 17 08:35:02 palmdale kernel: ocfs2: Unmounting device (65,33) on (node 0)
May 17 08:35:08 palmdale kernel: ocfs2: Unmounting device (8,113) on (node 0)
May 17 08:35:13 palmdale kernel: ocfs2: Unmounting device (8,97) on (node 0)
May 17 08:35:18 palmdale kernel: ocfs2: Unmounting device (8,209) on (node 0)
May 17 08:35:24 palmdale kernel: ocfs2: Unmounting device (8,193) on (node 0)
May 17 08:35:30 palmdale kernel: ocfs2: Unmounting device (8,177) on (node 0)
May 17 08:35:36 palmdale kernel: ocfs2: Unmounting device (8,161) on (node 0)
May 17 08:36:20 palmdale kernel: Kernel logging (proc) stopped.
May 17 08:36:20 palmdale kernel: Kernel log daemon terminating.
May 17 08:42:13 palmdale kernel: klogd 1.4.1, log source = /proc/kmsg started.
May 17 08:42:13 palmdale kernel: Bootdata ok (command line is ro root=/dev/vg0/lv00 rhgb quiet numa=off)
May 17 08:42:13 palmdale kernel: Linux version 2.6.9-89.ELsmp (mockbuild@hs20-bc2-5.build.redhat.com) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-11)) #1 SMP Mon Apr 20 10:33:05 EDT 2009
May 17 08:42:13 palmdale kernel: BIOS-provided physical RAM map:
May 17 08:42:13 palmdale kernel:  BIOS-e820: 0000000000000000 - 000000000009f400 (usable)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 000000000009f400 - 00000000000a0000 (reserved)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 0000000000100000 - 00000000f57f6800 (usable)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 00000000f57f6800 - 00000000f5800000 (ACPI data)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 00000000fdc00000 - 00000000fdc01000 (reserved)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 00000000fdc10000 - 00000000fdc11000 (reserved)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 00000000fdc20000 - 00000000fdc21000 (reserved)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 00000000fdc30000 - 00000000fdc31000 (reserved)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 00000000fec10000 - 00000000fec11000 (reserved)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 00000000fec20000 - 00000000fec21000 (reserved)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 00000000ff800000 - 0000000100000000 (reserved)
May 17 08:42:13 palmdale kernel:  BIOS-e820: 0000000100000000 - 00000007fffff000 (usable)
May 17 08:42:13 palmdale kernel: ACPI: RSDP (v002 HP                                    ) @ 0x00000000000f4f20
May 17 08:42:13 palmdale kernel: ACPI: XSDT (v001 HP     A01      0x00000002 Ò 0x0000162e) @ 0x00000000f57f6c00
May 17 08:42:13 palmdale kernel: ACPI: FADT (v003 HP     A01      0x00000002 Ò 0x0000162e) @ 0x00000000f57f6c80
May 17 08:42:13 palmdale kernel: ACPI: MADT (v001 HP     00000083 0x00000002  0x00000000) @ 0x00000000f57f6900
May 17 08:42:13 palmdale kernel: ACPI: SPCR (v001 HP     SPCRRBSU 0x00000001 Ò 0x0000162e) @ 0x00000000f57f6a00
May 17 08:42:13 palmdale kernel: ACPI: SRAT (v001 HP     A01      0x00000001  0x00000000) @ 0x00000000f57f6a80


Jason Yu (Author) Commented:
My Oracle DBA says this is not an Oracle database error but a shared-disk error, because we have a shared disk mounted for RMAN backups. Could anybody help me find out why both servers died?

Thanks.
Zephyr ICT (Cloud Architect) Commented:
According to the logs, there was a lock on the NFS-mounted disks: "nfslock: lockd -KILL succeeded".

Maybe there was an interruption to the NFS servers on Sunday night that caused the "death" of your servers. That seems like the most obvious culprit for now, though the logs don't tell us much.


Jason Yu (Author) Commented:
How could we avoid this?
Zephyr ICT (Cloud Architect) Commented:
That's a good question. First we need to establish whether this was really the issue. Are there any log entries on the NFS server from the time the issue occurred? They would show whether that server/appliance had a problem at the same time.

Maybe there was too much load because of backups to this storage?
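
To pull the entries around the incident out of a large log (on the NFS server or on the cluster nodes), a small filter along these lines might help. It is only a sketch: it assumes the classic syslog timestamp format seen in this thread ("May 17 08:34:32"), and because syslog omits the year, the `year` default is a placeholder. The function names `parse_ts` and `lines_in_window` are made up for illustration.

```python
from datetime import datetime


def parse_ts(line, year=2000):
    """Parse the leading 'Mon DD HH:MM:SS' stamp; return None if there is none.

    Syslog lines carry no year, so `year` is an assumed placeholder.
    """
    try:
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def lines_in_window(lines, start, end, year=2000):
    """Return the log lines whose timestamp falls within [start, end]."""
    hits = []
    for line in lines:
        ts = parse_ts(line, year)
        if ts is not None and start <= ts <= end:
            hits.append(line)
    return hits


if __name__ == "__main__":
    # Hypothetical usage: show everything logged between 08:35 and 09:00.
    sample = [
        "May 17 08:34:32 palmdale shutdown: shutting down for system reboot",
        "May 17 08:35:56 palmdale umount: umount: /u07: device is busy",
        "May 17 11:55:51 palmdale kernel: warning: many lost ticks.",
    ]
    window = (datetime(2000, 5, 17, 8, 35), datetime(2000, 5, 17, 9, 0))
    for hit in lines_in_window(sample, *window):
        print(hit)
```

Running the same window against the NFS server's log and the node's log makes it easier to compare what each side recorded at the same moment.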
Jason Yu (Author) Commented:
I checked for that error in the log; it appears several hours before the server died. The server died at about 10:40. Was that the cause of the failure?
Zephyr ICT (Cloud Architect) Commented:
I'm sorry, I didn't quite understand that...

Do you mean you found an error in the NFS server log a few hours before the issue happened on the server that ended up dead?

It could be, yes... but I'd really need more info to say for sure.
Jason Yu (Author) Commented:
The nfslock message showed up many times before the failure:

May 17 08:36:20 palmdale nfslock: rpc.statd shutdown succeeded
May 17 08:36:20 palmdale irqbalance: irqbalance shutdown succeeded
May 17 08:36:20 palmdale portmap: portmap shutdown succeeded
May 17 08:36:20 palmdale kernel: Kernel logging (proc) stopped.
May 17 08:36:20 palmdale kernel: Kernel log daemon terminating.
May 17 08:36:21 palmdale syslog: klogd shutdown succeeded
May 17 08:36:21 palmdale exiting on signal 15

May 17 08:42:13 palmdale kernel: SCSI device sdc: 536870912 512-byte hdwr sectors (274878 MB)
May 17 08:42:14 palmdale kernel: SCSI device sdc: drive cache: write through
May 17 08:42:14 palmdale nfslock: rpc.statd startup succeeded
May 17 08:42:14 palmdale kernel: SCSI device sdc: 536870912 512-byte hdwr sectors (274878 MB)
May 17 08:42:14 palmdale kernel: SCSI device sdc: drive cache: write through
May 17 08:42:14 palmdale kernel:  sdc: sdc1
May 17 08:42:14 palmdale kernel: Attached scsi disk sdc at scsi1, channel 0, id 0, lun 27
May 17 08:42:14 palmdale kernel:   Vendor: NETAPP    Model: LUN               Rev: 7350

The server failed at about 10:30 PM, and the messages log file shows no records from that time.

May 17 08:43:13 palmdale last message repeated 2 times
May 17 08:43:14 palmdale logger: Running CRSD with TZ =
May 17 08:43:14 palmdale logger: Oracle CSS Family monitor starting.
May 17 08:43:15 palmdale logger: Oracle CSS restart. 0, 1
May 17 08:43:21 palmdale udevd[2252]: udev done!
May 17 08:51:23 palmdale snmpd[7146]: [smux_accept] accepted fd 17 from 10.20.7.21:32836
May 17 08:51:23 palmdale snmpd[7146]: refused smux peer: oid SNMPv2-SMI::zeroDotZero, password , descr NSGS
May 17 11:55:51 palmdale kernel: warning: many lost ticks.
May 17 11:55:51 palmdale kernel: Your time source seems to be instable or some driver is hogging interupts
May 17 11:55:51 palmdale kernel: rip __do_softirq+0x4d/0xd0
May 18 05:17:20 palmdale syslogd 1.4.1: restart.
May 18 05:17:20 palmdale syslog: syslogd startup succeeded
May 18 05:17:20 palmdale kernel: klogd 1.4.1, log source = /proc/kmsg started.
May 18 05:17:20 palmdale kernel: Bootdata ok (command line is ro root=/dev/vg0/lv00 rhgb quiet numa=off)
May 18 05:17:20 palmdale kernel: Linux version 2.6.9-89.ELsmp (mockbuild@hs20-bc2-5.build.redhat.com) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-11)) #1 SMP Mon Apr 20 10:33:05 EDT 2009
May 18 05:17:20 palmdale kernel: BIOS-provided physical RAM map:
May 18 05:17:20 palmdale kernel:  BIOS-e820: 0000000000000000 - 000000000009f400 (usable)
May 18 05:17:20 palmdale kernel:  BIOS-e820: 000000000009f400 - 00000000000a0000 (reserved)
May 18 05:17:20 palmdale kernel:  BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
May 18 05:17:20 palmdale kernel:  BIOS-e820: 0000000000100000 - 00000000f57f6800 (usable)
May 18 05:17:20 palmdale kernel:  BIOS-e820: 00000000f57f6800 - 00000000f5800000 (ACPI data)
May 18 05:17:20 palmdale kernel:  BIOS-e820: 00000000fdc00000 - 00000000fdc01000 (reserved)
May 18 05:17:20 palmdale kernel:  BIOS-e820: 00000000fdc10000 - 00000000fdc11000 (reserved)
May 18 05:17:20 palmdale kernel:  BIOS-e820: 00000000fdc20000 - 00000000fdc21000 (reserved)
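
One way to tell an orderly reboot from a hang or crash in a messages file like the one above is to look at what was logged just before each boot. The sketch below assumes the marker strings seen in this thread: "syslogd 1.4.1: restart." when syslogd starts (which also happens at log rotation), "kernel: Linux version" at a real boot, and "exiting on signal 15" as syslogd's last line on an orderly shutdown. The function names and the `year` placeholder are illustrative, not part of any tool.

```python
from datetime import datetime

RESTART = "syslogd 1.4.1: restart."   # syslogd start: boot or log rotation
BOOT = "kernel: Linux version"        # banner printed only when the kernel boots
CLEAN_EXIT = "exiting on signal 15"   # syslogd's last line on an orderly shutdown


def parse_ts(line, year=2000):
    """Parse the leading 'Mon DD HH:MM:SS' stamp; return None if there is none.

    Syslog lines carry no year, so `year` is an assumed placeholder.
    """
    try:
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def find_boots(lines, year=2000):
    """Return one record per boot: last timestamp before it, boot time, clean flag."""
    boots = []
    last = None      # most recent timestamped line seen so far
    pending = None   # (line before the restart marker, the restart line itself)
    for line in lines:
        if parse_ts(line, year) is None:
            continue  # skip blank or non-syslog lines
        if RESTART in line:
            pending = (last, line)
        elif BOOT in line and pending is not None:
            before, restart = pending
            boots.append({
                "last_before": parse_ts(before, year) if before else None,
                "boot": parse_ts(restart, year),
                "clean": bool(before) and CLEAN_EXIT in before,
            })
            pending = None
        last = line
    return boots


if __name__ == "__main__":
    # Hypothetical usage: pass the path of a messages file as the first argument.
    import sys
    with open(sys.argv[1], errors="replace") as fh:
        for b in find_boots(fh.read().splitlines()):
            state = "clean shutdown" if b["clean"] else "NO shutdown logged"
            print(f"boot at {b['boot']}: last entry {b['last_before']} ({state})")
```

On the excerpts posted in this thread, the 08:42:13 boot follows an orderly shutdown, while the May 18 05:17:20 boot has no shutdown messages before it; the last line recorded before it is the 11:55:51 "many lost ticks" warning. The excerpts may not be contiguous, so the full file is worth checking the same way.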