Solved

What happened on this server: hard disk failure or just some bad blocks?

Posted on 2013-05-14
Last Modified: 2013-05-31
My EDI server, which runs Red Hat Linux, stopped working all of a sudden. When I rebooted it, it found some errors on the filesystem and corrected them successfully. Although it now boots into the OS without problems, I would still like to know what precautions I can take to prevent this kind of error from happening again. Could anyone give me some advice on how to check the health of this server? Here is the system log file.

Thanks.



May  5 04:02:15 luna syslogd 1.4.1: restart.
May  5 05:02:20 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 05:20:13 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 05:36:29 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 06:27:55 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 06:44:47 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 07:18:59 luna ntpd[3353]: time reset +1.373396 s
May  5 07:22:54 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 07:23:57 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 07:31:29 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 08:04:53 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 08:38:33 luna ntpd[3353]: time reset +0.352645 s
May  5 08:42:28 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 09:04:23 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 10:04:12 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 10:21:09 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 10:38:12 luna ntpd[3353]: time reset +0.468421 s
May  5 10:42:19 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 10:43:22 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 10:51:54 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 11:45:57 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 12:03:51 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 12:20:04 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 13:11:26 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 13:45:28 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 14:20:18 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 14:36:41 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 15:28:13 luna ntpd[3353]: time reset +0.262163 s
May  5 15:32:29 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 15:33:35 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 15:48:16 luna ntpd[3353]: time reset +0.130489 s
May  5 15:51:58 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 15:53:04 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 16:54:08 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 19:11:32 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 19:28:41 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 19:44:44 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 20:18:55 luna ntpd[3353]: time reset +1.716473 s
May  5 20:22:23 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 20:24:31 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 20:40:46 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 22:38:32 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 23:12:42 luna ntpd[3353]: time reset +1.191759 s
May  5 23:16:26 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  5 23:17:31 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  5 23:28:09 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 00:31:23 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 00:48:54 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 01:22:34 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 01:39:38 luna ntpd[3353]: time reset +1.063791 s
May  6 01:43:33 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 01:44:03 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 01:59:43 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 03:35:29 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 04:09:44 luna ntpd[3353]: time reset +0.970253 s
May  6 04:13:10 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 04:33:22 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 05:30:08 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 05:47:34 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 06:04:19 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 06:38:30 luna ntpd[3353]: time reset +0.652795 s
May  6 06:42:01 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 06:43:04 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 07:10:59 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 07:29:37 luna avahi-daemon[3903]: Invalid query packet.
May  6 07:30:17 luna last message repeated 7 times
May  6 07:49:39 luna avahi-daemon[3903]: Invalid query packet.
May  6 08:09:19 luna last message repeated 14 times
May  6 08:32:16 luna last message repeated 9 times
May  6 08:36:47 luna last message repeated 24 times
May  6 08:37:27 luna last message repeated 9 times
May  6 09:21:01 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 09:42:02 luna avahi-daemon[3903]: Invalid query packet.
May  6 09:42:42 luna last message repeated 7 times
May  6 09:55:10 luna ntpd[3353]: time reset +1.077468 s
May  6 09:58:29 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 10:00:37 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 10:14:37 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 12:42:41 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 12:54:05 luna avahi-daemon[3903]: Invalid query packet.
May  6 12:54:09 luna last message repeated 6 times
May  6 12:59:57 luna ntpd[3353]: time reset +1.406363 s
May  6 13:04:13 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 13:05:17 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 13:12:50 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 13:23:44 luna avahi-daemon[3903]: Invalid query packet.
May  6 13:24:25 luna last message repeated 7 times
May  6 14:13:42 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 14:48:26 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 15:04:56 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 15:39:10 luna ntpd[3353]: time reset +1.119836 s
May  6 15:42:40 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 15:44:47 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 15:54:29 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 16:38:23 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 16:55:38 luna ntpd[3353]: time reset +0.429475 s
May  6 16:59:59 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 17:01:05 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 17:12:59 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 17:33:50 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 18:16:32 luna ntpd[3353]: time reset +0.655949 s
May  6 18:20:43 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 18:21:16 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 18:22:52 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 18:43:23 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 19:42:42 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 20:16:30 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 20:22:56 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452128 in dir #78447464
May  6 20:22:56 luna kernel: Aborting journal on device dm-4.
May  6 20:22:56 luna kernel: ext3_abort called.
May  6 20:22:56 luna kernel: EXT3-fs error (device dm-4): ext3_journal_start_sb: Detected aborted journal
May  6 20:22:56 luna kernel: Remounting filesystem read-only
May  6 20:22:58 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452124 in dir #78448586
May  6 20:22:58 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452126 in dir #78448586
May  6 20:22:58 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452125 in dir #78448586
May  6 20:22:59 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452105 in dir #78447723
May  6 20:23:02 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452127 in dir #78447518
May  6 20:23:03 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452097 in dir #78448625
May  6 20:23:03 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452099 in dir #78447250
May  6 20:23:03 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452123 in dir #78447250
May  6 20:23:03 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452100 in dir #78447250
May  6 20:23:03 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452098 in dir #78447250
May  6 20:50:39 luna ntpd[3353]: time reset +0.714664 s
May  6 20:54:53 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 21:23:54 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 21:41:08 luna ntpd[3353]: time reset +0.301325 s
May  6 21:44:57 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 22:06:55 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 22:51:30 luna kernel: printk: 9 messages suppressed.
May  6 22:51:30 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452128 in dir #78447464
May  6 22:51:32 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452124 in dir #78448586
May  6 22:51:32 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452126 in dir #78448586
May  6 22:51:32 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452125 in dir #78448586
May  6 22:51:33 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452105 in dir #78447723
May  6 22:51:34 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452127 in dir #78447518
May  6 22:51:40 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452097 in dir #78448625
May  6 22:51:40 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452099 in dir #78447250
May  6 22:51:40 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452123 in dir #78447250
May  6 22:51:40 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452100 in dir #78447250
May  6 22:51:40 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452098 in dir #78447250
May  6 22:51:41 luna kernel: EXT3-fs error (device dm-4): ext3_lookup: unlinked inode 78452106 in dir #78448565
May  6 23:22:01 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  6 23:39:09 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  6 23:55:29 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  7 00:29:38 luna ntpd[3353]: time reset +0.419783 s
May  7 00:33:16 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  7 01:07:07 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  7 01:59:05 luna ntpd[3353]: time reset -0.226514 s
May  7 02:03:21 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  7 02:03:58 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  7 02:05:30 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  7 03:14:29 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  7 03:48:55 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  7 04:05:43 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  7 04:39:53 luna ntpd[3353]: time reset +0.513939 s
May  7 04:43:15 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  7 04:45:22 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  7 05:14:25 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  7 06:26:24 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  7 06:43:31 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  7 06:59:31 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
May  7 07:36:27 luna avahi-daemon[3903]: Invalid query packet.
May  7 07:37:07 luna last message repeated 9 times
May  7 07:49:44 luna avahi-daemon[3903]: Invalid query packet.
May  7 07:50:42 luna last message repeated 15 times
May  7 07:50:45 luna ntpd[3353]: time reset +0.743814 s
May  7 07:54:57 luna ntpd[3353]: synchronized to LOCAL(0), stratum 10
May  7 08:12:29 luna avahi-daemon[3903]: Invalid query packet.
May  7 08:13:09 luna last message repeated 8 times
May  7 08:14:38 luna avahi-daemon[3903]: Invalid query packet.
May  7 08:15:18 luna last message repeated 7 times
May  7 08:30:06 luna gconfd (root-20075): starting (version 2.14.0), pid 20075 user 'root'
May  7 08:30:06 luna gconfd (root-20075): Resolved address "xml:readonly:/etc/gconf/gconf.xml.mandatory" to a read-only configuration source at position 0
May  7 08:30:06 luna gconfd (root-20075): Resolved address "xml:readwrite:/root/.gconf" to a writable configuration source at position 1
May  7 08:30:06 luna gconfd (root-20075): Resolved address "xml:readonly:/etc/gconf/gconf.xml.defaults" to a read-only configuration source at position 2
May  7 08:30:07 luna gconfd (root-20075): Resolved address "xml:readwrite:/root/.gconf" to a writable configuration source at position 0
May  7 08:30:08 luna nm-system-settings: Loaded plugin ifcfg-rh: (c) 2007 - 2008 Red Hat, Inc.  To report bugs please use the NetworkManager mailing list.
May  7 08:30:08 luna nm-system-settings:    ifcfg-rh: parsing /etc/sysconfig/network-scripts/ifcfg-lo ...
May  7 08:30:08 luna nm-system-settings:    ifcfg-rh: parsing /etc/sysconfig/network-scripts/ifcfg-eth1 ...
May  7 08:30:08 luna nm-system-settings:    ifcfg-rh:     read connection 'System eth1'
May  7 08:30:08 luna nm-system-settings:    ifcfg-rh: parsing /etc/sysconfig/network-scripts/ifcfg-eth0 ...
May  7 08:30:08 luna nm-system-settings:    ifcfg-rh:     read connection 'System eth0'
May  7 08:38:35 luna avahi-daemon[3903]: Invalid query packet.
May  7 08:39:06 luna last message repeated 15 times
May  7 08:39:15 luna last message repeated 6 times
May  7 08:52:04 luna ntpd[3353]: synchronized to 10.10.4.10, stratum 3
Question by:Jason Yu
16 Comments
 

Author Comment

by:Jason Yu
ID: 39166232
Here is a screenshot from when it was rebooting. It says it found filesystem errors.
Luna-FS-error.JPG
 
LVL 5

Assisted Solution

by:atechnicnate
atechnicnate earned 112 total points
ID: 39166246
It could be a hard disk failure, a folder mounted read-only, or possibly something else. It's hard to say without more data, but I'd probably start by running fsck on it.
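
To get more data before running fsck, the kernel's filesystem messages can be summarized per device. This is only a minimal sketch: it assumes the log is at /var/log/messages (the usual Red Hat location) and that the messages look like the ones posted in the question. The function name summarize_fs_errors is made up for illustration.

```shell
# Count "EXT3-fs error" lines per device and read-only remounts in a syslog file.
# Assumption: default log path is /var/log/messages; pass another path as $1.
summarize_fs_errors() {
    log="${1:-/var/log/messages}"
    awk '
        /EXT3-fs error \(device / {
            dev = $0
            sub(/.*EXT3-fs error \(device /, "", dev)   # drop everything before the device name
            sub(/\).*/, "", dev)                        # drop everything after it
            errors[dev]++
        }
        /Remounting filesystem read-only/ { remounts++ }
        END {
            for (dev in errors) printf "%s: %d EXT3-fs error lines\n", dev, errors[dev]
            printf "read-only remounts: %d\n", remounts + 0
        }
    ' "$log"
}

# Example use (reading /var/log/messages normally needs root):
#   summarize_fs_errors /var/log/messages
```

If every error points at the same device, as it does for dm-4 in the posted log, that narrows down which filesystem to check first.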
 
LVL 88

Assisted Solution

by:rindi
rindi earned 112 total points
ID: 39166275
Most servers have utilities you can install in your OS that report the state of the disks in the array and other hardware status. Also, Dell servers have an iDRAC, and HP servers often have iLO (Integrated Lights-Out) installed, and those can also give you the hardware status. Just check the manuals for your hardware for more details.
 
LVL 21

Assisted Solution

by:Mazdajai
Mazdajai earned 166 total points
ID: 39166858
What type of disk is dm-4?
 

Author Comment

by:Jason Yu
ID: 39166866
dm-4 is a local partition. It's a logical volume.
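
Since dm-4 is the kernel's name for device-mapper minor number 4, it can help to look up which logical volume that actually is. A rough sketch follows, assuming `dmsetup ls` prints one `name (major, minor)` or `name (major:minor)` entry per line (the format differs between versions). The helper name dm_name_for_minor is hypothetical.

```shell
# Read `dmsetup ls` output on stdin and print the mapping whose minor number is $1.
# Assumption: each line looks like "name<TAB>(253, 4)" or "name<TAB>(253:4)".
dm_name_for_minor() {
    awk -v minor="$1" '
        {
            last = $NF
            sub(/.*[:,]/, "", last)   # strip "(253:" when major and minor share one field
            sub(/\)/, "", last)       # strip the closing parenthesis
            if (last == minor) print $1
        }'
}

# Example use (needs root):
#   dmsetup ls | dm_name_for_minor 4
```

The printed name is normally in the form VolumeGroup-LogicalVolume, which tells you which mount point the EXT3-fs errors belong to.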
 
LVL 5

Assisted Solution

by:atechnicnate
atechnicnate earned 112 total points
ID: 39166871
If your drive is SMART-capable, you could use some of the smartmontools utilities.
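
A minimal sketch of a SMART health check, assuming smartmontools is installed and the disk is visible as /dev/sda (a placeholder device name). Behind a hardware RAID controller the physical disks are usually hidden. For LSI/Dell PERC controllers smartctl accepts `-d megaraid,N`, where N is the disk's number on the controller. The helper name smart_health_ok is made up.

```shell
# Read `smartctl -H` output on stdin; succeed only if the drive reports healthy.
# Assumption: ATA drives print "test result: PASSED" and SCSI/SAS drives print
# "SMART Health Status: OK"; any other wording is treated as not healthy.
smart_health_ok() {
    grep -Eq 'test result: PASSED|SMART Health Status: OK'
}

# Example use (needs root; /dev/sda and disk number 0 are placeholders):
#   smartctl -H /dev/sda | smart_health_ok || echo "check /dev/sda"
#   smartctl -H -d megaraid,0 /dev/sda | smart_health_ok || echo "check disk 0"
```

A passing health status does not prove the disk is fine, so `smartctl -a` is still worth reading for reallocated or pending sector counts.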
 
LVL 21

Assisted Solution

by:Mazdajai
Mazdajai earned 166 total points
ID: 39166878
Have you run a full disk diagnostic test?
 

Author Comment

by:Jason Yu
ID: 39167056
Not yet. How do I run a disk diagnostic? I remember that last time I pressed F2 when the server booted up. Is this the correct way?

Thanks.
 
LVL 6

Assisted Solution

by:Vijay Pratap Singh
Vijay Pratap Singh earned 55 total points
ID: 39167083
Try running e2fsck; it should repair the inode table.
 
LVL 88

Accepted Solution

by:rindi
rindi earned 112 total points
ID: 39167187
As I said, most servers include tools to diagnose your disks. They are usually part of the RAID controller and management software. It depends entirely on your server, so you must check its manuals for details.

For normal PCs you would boot using the UBCD and then run the HD manufacturer's diagnostic utility, but the RAID controllers of servers mask those disks from the utilities, so you would first have to remove each disk from the RAID controller and attach it to a non-RAID controller, then run the diagnostics.

http://ultimatebootcd.com
http://pharry.org/data/ubcd523.iso
 

Author Comment

by:Jason Yu
ID: 39169483
Thanks a lot. I will open a case with Dell and run a diagnostic test on it.
 
LVL 21

Expert Comment

by:Mazdajai
ID: 39169498
It should be F12, unless that has changed.
 

Author Comment

by:Jason Yu
ID: 39169564
Can I run e2fsck online from inside the OS?

I remember that when the issue was happening, I tried to umount this partition and run fsck. However, it wouldn't let me umount the partition. If I run into this again, what should I do? Thanks.
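
Running e2fsck on a filesystem that is mounted read-write is not safe, so the usual approach is to find what is holding the mount point, stop it, unmount, and only then check. The following is a sketch under stated assumptions: /data and /dev/VolGroup00/data are placeholder names, the helper names run and offline_fsck are made up, and by default it only prints the commands instead of running them.

```shell
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands (default), 0 = really run them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

offline_fsck() {
    mnt="$1"
    dev="$2"
    run fuser -vm "$mnt" || true    # show the processes keeping the mount point busy
    run umount "$mnt" || return 1   # stop here if it is still busy
    rc=0
    run e2fsck -f "$dev" || rc=$?   # force a full check of the unmounted filesystem
    # e2fsck exits 0 when clean and 1 when it corrected errors; remount only then.
    if [ "$rc" -le 1 ]; then
        run mount "$mnt"            # assumes the mount point is listed in /etc/fstab
    fi
    return "$rc"
}

# Placeholder mount point and device; prints four "+ ..." lines in dry-run mode.
offline_fsck /data /dev/VolGroup00/data
```

If umount still reports the device as busy after the processes listed by fuser are stopped, checking from single-user or rescue mode is the fallback.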
 
LVL 28

Assisted Solution

by:serialband
serialband earned 55 total points
ID: 39169611
Switch to single-user mode with init 1, or boot into it from GRUB, and you can run fsck on the boot/kernel partition:

https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/3/html/System_Administration_Guide/s1-rescuemode-booting-single.html
 
LVL 21

Assisted Solution

by:Mazdajai
Mazdajai earned 166 total points
ID: 39169903
The quickest way is to force an fsck on reboot:

shutdown -rF now

 

Author Closing Comment

by:Jason Yu
ID: 39212077
Thanks to the experts here. I appreciate your valuable posts and advice.