• Status: Solved

What happened on this server: hard disk failure or just some bad blocks?

My EDI server, which runs Red Hat Linux, stopped working all of a sudden. When I rebooted it, it found some errors on the file system and corrected them successfully. Although it now boots into the OS without problems, I would still like to know what precautions I can take to prevent this kind of error from happening again. Could anyone give me some advice on how to check the health of this server? Here is the system log file.

thanks.



May  5 04:02:15 luna syslogd 1.4.1: restart.
May  5 05:02:20 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 05:20:13 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 05:36:29 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 06:27:55 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 06:44:47 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 07:18:59 luna ntpd[3353]: time reset +1.373396 s
May  5 07:22:54 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 07:23:57 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 07:31:29 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 08:04:53 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 08:38:33 luna ntpd[3353]: time reset +0.352645 s
May  5 08:42:28 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 09:04:23 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 10:04:12 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 10:21:09 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 10:38:12 luna ntpd[3353]: time reset +0.468421 s
May  5 10:42:19 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 10:43:22 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 10:51:54 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 11:45:57 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 12:03:51 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 12:20:04 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 13:11:26 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 13:45:28 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 14:20:18 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 14:36:41 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 15:28:13 luna ntpd[3353]: time reset +0.262163 s
May  5 15:32:29 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 15:33:35 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 15:48:16 luna ntpd[3353]: time reset +0.130489 s
May  5 15:51:58 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 15:53:04 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 16:54:08 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 19:11:32 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 19:28:41 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 19:44:44 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 20:18:55 luna ntpd[3353]: time reset +1.716473 s
May  5 20:22:23 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 20:24:31 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 20:40:46 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 22:38:32 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 23:12:42 luna ntpd[3353]: time reset +1.191759 s
May  5 23:16:26 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 23:17:31 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 23:28:09 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 00:31:23 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 00:48:54 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 01:22:34 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 01:39:38 luna ntpd[3353]: time reset +1.063791 s
May  6 01:43:33 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 01:44:03 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 01:59:43 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 03:35:29 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 04:09:44 luna ntpd[3353]: time reset +0.970253 s
May  6 04:13:10 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 04:33:22 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 05:30:08 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 05:47:34 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 06:04:19 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 06:38:30 luna ntpd[3353]: time reset +0.652795 s
May  6 06:42:01 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 06:43:04 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 07:10:59 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 07:29:37 luna avahi-daemon[3903]: Invalid query packet.
May  6 07:30:17 luna last message repeated 7 times
May  6 07:49:39 luna avahi-daemon[3903]: Invalid query packet.
May  6 08:09:19 luna last message repeated 14 times
May  6 08:32:16 luna last message repeated 9 times
May  6 08:36:47 luna last message repeated 24 times
May  6 08:37:27 luna last message repeated 9 times
May  6 09:21:01 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 09:42:02 luna avahi-daemon[3903]: Invalid query packet.
May  6 09:42:42 luna last message repeated 7 times
May  6 09:55:10 luna ntpd[3353]: time reset +1.077468 s
May  6 09:58:29 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 10:00:37 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 10:14:37 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 12:42:41 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 12:54:05 luna avahi-daemon[3903]: Invalid query packet.
May  6 12:54:09 luna last message repeated 6 times
May  6 12:59:57 luna ntpd[3353]: time reset +1.406363 s
May  6 13:04:13 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 13:05:17 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 13:12:50 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 13:23:44 luna avahi-daemon[3903]: Invalid query packet.
May  6 13:24:25 luna last message repeated 7 times
May  6 14:13:42 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 14:48:26 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 15:04:56 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 15:39:10 luna ntpd[3353]: time reset +1.119836 s
May  6 15:42:40 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 15:44:47 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 15:54:29 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 16:38:23 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 16:55:38 luna ntpd[3353]: time reset +0.429475 s
May  6 16:59:59 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 17:01:05 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 17:12:59 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 17:33:50 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 18:16:32 luna ntpd[3353]: time reset +0.655949 s
May  6 18:20:43 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 18:21:16 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 18:22:52 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 18:43:23 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 19:42:42 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 20:16:30 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 20:22:56 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452128 in dir #78447464
May  6 20:22:56 luna kernel: Aborting journal on device dm-4.
May  6 20:22:56 luna kernel: ext3_abort called.
May  6 20:22:56 luna kernel: EXT3-fs error (device dm-4): ext3_journal_start_sb: Detected aborted journal
May  6 20:22:56 luna kernel: Remounting filesystem read-only
May  6 20:22:58 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452124 in dir #78448586
May  6 20:22:58 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452126 in dir #78448586
May  6 20:22:58 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452125 in dir #78448586
May  6 20:22:59 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452105 in dir #78447723
May  6 20:23:02 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452127 in dir #78447518
May  6 20:23:03 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452097 in dir #78448625
May  6 20:23:03 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452099 in dir #78447250
May  6 20:23:03 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452123 in dir #78447250
May  6 20:23:03 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452100 in dir #78447250
May  6 20:23:03 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452098 in dir #78447250
May  6 20:50:39 luna ntpd[3353]: time reset +0.714664 s
May  6 20:54:53 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 21:23:54 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 21:41:08 luna ntpd[3353]: time reset +0.301325 s
May  6 21:44:57 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 22:06:55 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 22:51:30 luna kernel: printk: 9 messages suppressed.
May  6 22:51:30 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452128 in dir #78447464
May  6 22:51:32 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452124 in dir #78448586
May  6 22:51:32 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452126 in dir #78448586
May  6 22:51:32 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452125 in dir #78448586
May  6 22:51:33 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452105 in dir #78447723
May  6 22:51:34 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452127 in dir #78447518
May  6 22:51:40 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452097 in dir #78448625
May  6 22:51:40 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452099 in dir #78447250
May  6 22:51:40 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452123 in dir #78447250
May  6 22:51:40 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452100 in dir #78447250
May  6 22:51:40 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452098 in dir #78447250
May  6 22:51:41 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452106 in dir #78448565
May  6 23:22:01 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 23:39:09 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 23:55:29 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  7 00:29:38 luna ntpd[3353]: time reset +0.419783 s
May  7 00:33:16 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  7 01:07:07 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  7 01:59:05 luna ntpd[3353]: time reset -0.226514 s
May  7 02:03:21 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  7 02:03:58 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  7 02:05:30 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  7 03:14:29 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  7 03:48:55 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  7 04:05:43 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  7 04:39:53 luna ntpd[3353]: time reset +0.513939 s
May  7 04:43:15 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  7 04:45:22 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  7 05:14:25 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  7 06:26:24 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  7 06:43:31 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  7 06:59:31 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  7 07:36:27 luna avahi-daemon[3903]: Invalid query packet.
May  7 07:37:07 luna last message repeated 9 times
May  7 07:49:44 luna avahi-daemon[3903]: Invalid query packet.
May  7 07:50:42 luna last message repeated 15 times
May  7 07:50:45 luna ntpd[3353]: time reset +0.743814 s
May  7 07:54:57 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  7 08:12:29 luna avahi-daemon[3903]: Invalid query packet.
May  7 08:13:09 luna last message repeated 8 times
May  7 08:14:38 luna avahi-daemon[3903]: Invalid query packet.
May  7 08:15:18 luna last message repeated 7 times
May  7 08:30:06 luna gconfd (root-20075): starting (version 2.14.0), pid 20075 user 'root'
May  7 08:30:06 luna gconfd (root-20075): Resolved address "xml:readonly:/etc/gconf/gconf.xml.mandatory" to a read-only configuration source at position 0
May  7 08:30:06 luna gconfd (root-20075): Resolved address "xml:readwrite:/root/.gconf" to a writable configuration source at position 1
May  7 08:30:06 luna gconfd (root-20075): Resolved address "xml:readonly:/etc/gconf/gconf.xml.defaults" to a read-only configuration source at position 2
May  7 08:30:07 luna gconfd (root-20075): Resolved address "xml:readwrite:/root/.gconf" to a writable configuration source at position 0
May  7 08:30:08 luna nm-system-settings: Loaded plugin ifcfg-rh: (c) 2007 - 2008 Red Hat, Inc.  To report bugs please use the NetworkManager mailing list.
May  7 08:30:08 luna nm-system-settings:    ifcfg-rh: parsing /etc/sysconfig/network-scripts/ifcfg-lo ...
May  7 08:30:08 luna nm-system-settings:    ifcfg-rh: parsing /etc/sysconfig/network-scripts/ifcfg-eth1 ...
May  7 08:30:08 luna nm-system-settings:    ifcfg-rh:     read connection 'System eth1'
May  7 08:30:08 luna nm-system-settings:    ifcfg-rh: parsing /etc/sysconfig/network-scripts/ifcfg-eth0 ...
May  7 08:30:08 luna nm-system-settings:    ifcfg-rh:     read connection 'System eth0'
May  7 08:38:35 luna avahi-daemon[3903]: Invalid query packet.
May  7 08:39:06 luna last message repeated 15 times
May  7 08:39:15 luna last message repeated 6 times
May  7 08:52:04 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
Asked:
Jason Yu
9 Solutions
 
Jason Yu (Author) Commented:
Here is a screenshot taken while it was rebooting. It says it found file system errors.
Luna-FS-error.JPG
 
atechnicnate Commented:
It could be a hard disk failure, a folder mounted read-only, or possibly something else. It's hard to say without more data, but I'd probably start by running fsck on it.
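One way to gather that extra data is to pull the kernel's file system errors out of the syslog and count them per device. The following is a minimal sketch: the function name `ext3_error_summary` is made up for illustration, and the log path in the usage comment is an assumption (it is the usual location on Red Hat systems, but it may differ).

```shell
# Count "EXT3-fs error" lines per device in a syslog-style file.
# ext3_error_summary is a hypothetical helper name, not an existing tool.
ext3_error_summary() {
    grep 'EXT3-fs error' "$1" |
        # Keep only the device name from "EXT3-fs error (device dm-4): ..."
        sed -n 's/.*EXT3-fs error (device \([^)]*\)).*/\1/p' |
        sort | uniq -c |
        awk '{ print $2 ": " $1 " errors" }'
}

# Usage (path is an assumption):
#   ext3_error_summary /var/log/messages
```

In the log posted above, every such line names dm-4, so the output would contain a single line for that device.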
 
rindi Commented:
Most servers have utilities you can install in your OS that report the state of the disks in the array as well as other hardware status. Also, Dell servers have an iDRAC, and HP servers often have Lights-Out installed, and those can also give you the hardware status. Just check your hardware's manuals for more details.
 
Mazdajai Commented:
What type of disk is dm-4?
 
Jason Yu (Author) Commented:
dm-4 is a local partition. It's a logical volume.
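To find out which logical volume dm-4 actually is, the device-mapper minor number (the 4 in dm-4) can be matched against the output of `dmsetup ls`. This is only a sketch: `dm_name_for_minor` is a hypothetical helper, and it assumes `dmsetup ls` prints one `name (major, minor)` or `name (major:minor)` pair per line, which varies between versions.

```shell
# Print the device-mapper name whose minor number matches $1.
# Reads `dmsetup ls` output on stdin. Hypothetical helper; the input
# format (name followed by "(major, minor)" or "(major:minor)") is assumed.
dm_name_for_minor() {
    awk -v minor="$1" '{
        line = $0
        gsub(/[(),:]/, " ", line)      # turn "(253, 4)" into " 253  4 "
        n = split(line, f, " ")        # f[1] = name, f[n] = minor
        if (f[n] == minor) print f[1]
    }'
}

# Usage (run as root):
#   dmsetup ls | dm_name_for_minor 4
```

The name that comes back is the one to pass to fsck or to look up in the mount table.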
 
atechnicnate Commented:
If your drive is SMART-capable, you could use some of the smartmontools utilities.
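As a rough illustration of the smartmontools suggestion, `smartctl -H` prints an overall health verdict that a small wrapper can check. The helper name `smart_health_ok` and the device path in the usage comment are assumptions; disks behind a hardware RAID controller may not be visible this way, or may need a controller-specific `-d` option.

```shell
# Return success if `smartctl -H` output (read from stdin) reports a
# healthy drive. smart_health_ok is a hypothetical helper name.
# ATA drives report "test result: PASSED"; SCSI/SAS drives report
# "SMART Health Status: OK".
smart_health_ok() {
    grep -Eq 'test result: PASSED|SMART Health Status: OK'
}

# Usage (device name is an assumption; run as root):
#   smartctl -H /dev/sda | smart_health_ok && echo "healthy" || echo "check this disk"
```

A passing verdict does not rule out file system corruption; it only says the drive itself has not flagged a failure.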
 
Mazdajai Commented:
Have you run a full disk diagnostic test?
 
Jason Yu (Author) Commented:
Not yet. How do I run a disk diagnostic? I remember that last time I pressed F2 while the server was booting. Is that the correct way?

Thanks.
 
Vijay Pratap Singh Commented:
Try running e2fsck; this will fix the inode table.
 
rindi Commented:
As I said, most servers include tools to diagnose your disks. They are usually part of the RAID controller and its management software. It depends entirely on your server, so you must check its manuals for details.

For normal PCs you would boot using the UBCD and then run the HD manufacturer's diagnostic utility, but the RAID controllers of servers mask the disks from those utilities, so you would first have to remove each disk from the RAID controller and attach it to a non-RAID controller, then run the diagnostics.

http://ultimatebootcd.com
http://pharry.org/data/ubcd523.iso
 
Jason Yu (Author) Commented:
Thanks a lot. I will open a case with Dell and run a diagnostic test on it.
 
Mazdajai Commented:
It should be F12; I'm not sure whether it has changed.
 
Jason Yu (Author) Commented:
Can I run e2fsck online, from inside the OS?

I remember that when the issue was happening, I tried to umount this partition and run fsck. However, it wouldn't let me umount the partition. If I run into this again, what should I do? Thanks.
 
serialband Commented:
Boot into single-user mode with init 1, and you can run fsck on the boot/kernel partition,

or boot into single-user mode from GRUB:

https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/3/html/System_Administration_Guide/s1-rescuemode-booting-single.html
 
Mazdajai Commented:
The quickest way is to force an fsck on reboot:

shutdown -rF now
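If a manual check is preferred over a forced check at reboot, a guard like the one below avoids running a repairing e2fsck on a file system that is still mounted. It is a sketch under assumptions: `is_mounted` and `safe_fsck` are hypothetical helper names, and the device path must be written exactly as it appears in the mount table (for LVM this is often the `/dev/mapper/...` form).

```shell
# Succeed if device $1 appears in the mounts table $2
# (defaults to /proc/mounts). Hypothetical helper name.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Run a forced e2fsck only when the device is not mounted.
# Hypothetical wrapper; e2fsck -f forces a check even if the
# file system is marked clean.
safe_fsck() {
    if is_mounted "$1"; then
        echo "refusing: $1 is mounted; unmount it or use single-user mode" >&2
        return 1
    fi
    e2fsck -f "$1"
}

# Usage (device path is an assumption; run as root):
#   safe_fsck /dev/mapper/VolGroup00-LogVol04
```

When the unmount is refused because the volume is busy, `fuser -vm <mountpoint>` lists the processes holding it open. For a look without changing anything, `e2fsck -n <device>` opens the file system read-only, although its results on a mounted file system are not reliable.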
 
Jason Yu (Author) Commented:
Thanks to the experts here. I appreciate your valuable posts and advice.
