oldcar53
asked on
Damage from UPS failure?
The UPS associated with our server (at our hosting provider) developed a problem and failed during an electrical storm. We are running CentOS 5.x.
Subsequently, after coming back online, our server began crashing randomly but not infrequently. There seemed to be a problem with interrupt 169, which pertains to the RAID controller. The controller was replaced, but another crash occurred. The kernel was then rebooted with irqpoll. There has been one additional 'soft lockup', but no crash, since that point.
Question:
Could a UPS failure cause this sort of problem?
(I'm new at this posting-questions thing, so have probably left a lot out.)
Roger Ide
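For anyone following along, one way to see which drivers are registered on a given interrupt line is to read /proc/interrupts. The snippet below is only a rough sketch: the function name devices_on_irq and the sample text are made up for illustration (the counters are not from this server), and it assumes the older single-token controller column (e.g. IO-APIC-level) seen on 2.6.18-era kernels.

```python
# Illustrative sample in the layout of /proc/interrupts on an older kernel.
# The numbers and device names here are placeholders, not real data.
SAMPLE_INTERRUPTS = """\
           CPU0       CPU1
  0:     123456          0    IO-APIC-edge  timer
169:      98765       4321   IO-APIC-level  ehci_hcd:usb1, aacraid
"""


def devices_on_irq(interrupts_text, irq):
    """Return the device names listed for one IRQ line (empty list if absent)."""
    for line in interrupts_text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep or label.strip() != str(irq):
            continue
        fields = rest.split()
        # Skip the per-CPU counters, which are plain integers.
        i = 0
        while i < len(fields) and fields[i].isdigit():
            i += 1
        # Assumption: the next field is a single-token controller type;
        # everything after it is the comma-separated device list.
        names = " ".join(fields[i + 1:])
        return [n.strip() for n in names.split(",") if n.strip()]
    return []


if __name__ == "__main__":
    # On a live system, pass open("/proc/interrupts").read() instead.
    print(devices_on_irq(SAMPLE_INTERRUPTS, 169))
```

If more than one device shows up on the line, the interrupt is shared, which is worth knowing before blaming a single card.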
ASKER
Hi-
There was no check of the filesystems, and the last event was Monday at 3 am.
The nature of the problem prevented any log-writing. I was able to run dmesg on one crash, as I caught it on the way down:
irq 169: nobody cared (try booting with the "irqpoll" option)
[<c044ea52>] __report_bad_irq+0x2b/0x69
[<c044ec49>] note_interrupt+0x1b9/0x1f0
[<c044e215>] handle_IRQ_event+0x45/0x8c
[<c044e339>] __do_IRQ+0xdd/0x118
[<c044e25c>] __do_IRQ+0x0/0x118
[<c04074c4>] do_IRQ+0x9b/0xc3
[<c040597a>] common_interrupt+0x1a/0x20
[<c05339f3>] acpi_processor_idle_simple+0x174/0x297
[<c040597a>] common_interrupt+0x1a/0x20
[<c053387f>] acpi_processor_idle_simple+0x0/0x297
[<c0403d14>] cpu_idle+0x9f/0xb9
=======================
handlers:
[<c058e26d>] (usb_hcd_irq+0x0/0x50)
[<f88db346>] (aac_rx_intr_message+0x0/0x55 [aacraid])
Disabling IRQ #169
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter reset request. SCSI hang ?
INFO: task kjournald:490 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kjournald D 00007CBB 2788 490 19 515 469 (L-TLB)
dff96ed4 00000046 701fe859 00007cbb 00000005 00000000 1834632e 0000000a
dff69550 701ff33a 00007cbb 00000ae1 00000002 dff6965c c37f0788 c39ac040
105d6eda dff4a4c4 c37f1128 c37f75cc 00000020 00000001 dff4a4bc 105d6eda
Call Trace:
[<c0621468>] io_schedule+0x36/0x59
[<c04790db>] sync_buffer+0x30/0x33
[<c062163f>] __wait_on_bit+0x33/0x58
[<c04790ab>] sync_buffer+0x0/0x33
[<c04790ab>] sync_buffer+0x0/0x33
[<c06216c6>] out_of_line_wait_on_bit+0x62/0x6a
[<c043737c>] wake_bit_function+0x0/0x3c
[<c0479058>] __wait_on_buffer+0x1c/0x1f
[<f88684b3>] journal_commit_transaction+0x4cf/0xf3c [jbd]
[<c042e621>] lock_timer_base+0x15/0x2f
[<c042e6a0>] try_to_del_timer_sync+0x65/0x6c
[<f886bd08>] kjournald+0xa1/0x1c2 [jbd]
[<c043734f>] autoremove_wake_function+0x0/0x2d
[<f886bc67>] kjournald+0x0/0x1c2 [jbd]
[<c043728a>] kthread+0xc0/0xee
[<c04371ca>] kthread+0x0/0xee
[<c0405c87>] kernel_thread_helper+0x7/0x10
=======================
INFO: task syslogd:2386 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syslogd D 00007CBA 2340 2386 1 2389 2242 (NOTLB)
f5c0ced0 00000086 3e0e3044 00007cba 00000070 00000080 030a9588 00000007
f5c03550 3e0e3868 00007cba 00000824 00000001 f5c0365c c37e9944 f62ea200
e02e8e68 c37ea2e4 00000001 f5c0cecc c041f0c8 00000000 00000000 ffffffff
Call Trace:
[<c041f0c8>] __wake_up+0x2a/0x3d
[<f886b2c1>] log_wait_commit+0x80/0xc7 [jbd]
[<c043734f>] autoremove_wake_function+0x0/0x2d
[<f8866679>] journal_stop+0x196/0x1bb [jbd]
[<c0495846>] __writeback_single_inode+0x199/0x2a5
[<c045d334>] do_writepages+0x2b/0x32
[<c0458e37>] __filemap_fdatawrite_range+0x66/0x72
[<c0495ee4>] sync_inode+0x19/0x24
[<f889e019>] ext3_sync_file+0xb1/0xdc [ext3]
[<c0478c15>] do_fsync+0x41/0x83
[<c0478c74>] __do_fsync+0x1d/0x2b
[<c0404f4b>] syscall_call+0x7/0xb
=======================
INFO: task miva:2927 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
miva D 00007CBA 2572 2927 2757 (NOTLB)
e3d1cf44 00000082 a989794c 00007cba f88ad1e0 e3d1cefc 00000000 00000001
f20be000 aa52abe7 00007cba 00c9329b 00000003 f20be10c c37f75cc f60e2040
c044e25c c37f7f6c e749e380 e3d1cf30 00000000 e3d1c000 c048af22 ffffffff
Call Trace:
[<c044e25c>] __do_IRQ+0x0/0x118
[<c048af22>] locks_remove_posix+0x7d/0x97
[<c062183f>] __mutex_lock_slowpath+0x4d/0x7c
[<c062187d>] .text.lock.mutex+0xf/0x14
[<c0476edc>] generic_file_llseek+0x2a/0xd2
[<c0476eb2>] generic_file_llseek+0x0/0xd2
[<c04761f5>] vfs_llseek+0x30/0x34
[<c0477077>] sys_lseek+0x38/0x63
[<c0404f4b>] syscall_call+0x7/0xb
=======================
INFO: task miva:2928 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
miva D 00007CBA 2524 2928 2895 (NOTLB)
d9ee9b2c 00000086 98dd3236 00007cba c1351cc0 00000000 dcdc0990 00000008
f63d1aa0 99c34660 00007cba 00e6142a 00000000 f63d1bac c37e2b00 f61463c0
00001000 c37e34a0 f65043e0 00000bef 105d52ba c042d7c7 e030160c ffffffff
Call Trace:
[<c042d7c7>] getnstimeofday+0x30/0xb6
[<c0621468>] io_schedule+0x36/0x59
[<c04790ab>] sync_buffer+0x0/0x33
[<c04790db>] sync_buffer+0x30/0x33
[<c062157a>] __wait_on_bit_lock+0x2a/0x52
[<c04790ab>] sync_buffer+0x0/0x33
[<c0621604>] out_of_line_wait_on_bit_lock+0x62/0x6a
[<c043737c>] wake_bit_function+0x0/0x3c
[<c0479205>] __lock_buffer+0x21/0x24
[<f88666eb>] do_get_write_access+0x4d/0x462 [jbd]
[<f8866b18>] journal_get_write_access+0x18/0x26 [jbd]
[<f88a01f3>] ext3_get_blocks_handle+0x688/0x8d3 [ext3]
[<f88a0711>] ext3_get_block+0xa2/0xd6 [ext3]
[<c0479436>] __block_prepare_write+0x19b/0x37e
[<c045c636>] get_page_from_freelist+0x96/0x378
[<c04796c4>] block_write_begin+0x88/0xe6
[<f88a066f>] ext3_get_block+0x0/0xd6 [ext3]
[<f88a1ad8>] ext3_write_begin+0xc2/0x1a0 [ext3]
[<f88a066f>] ext3_get_block+0x0/0xd6 [ext3]
[<c04595af>] generic_file_buffered_write+0x101/0x58b
[<c042a626>] current_fs_time+0x4a/0x54
[<c0459edf>] __generic_file_aio_write_nolock+0x4a6/0x52a
[<c0459431>] __generic_file_aio_read+0x16a/0x1a3
[<c0457ef3>] file_read_actor+0x0/0xd5
[<c0459fbc>] generic_file_aio_write+0x59/0xac
[<f889dea1>] ext3_file_write+0x19/0x83 [ext3]
[<c0476312>] do_sync_write+0xb6/0xf1
[<c043734f>] autoremove_wake_function+0x0/0x2d
[<c044ae8f>] audit_syscall_entry+0x193/0x1bd
[<c0476f78>] generic_file_llseek+0xc6/0xd2
[<c047625c>] do_sync_write+0x0/0xf1
[<c0476b9b>] vfs_write+0xa1/0x143
[<c04771c5>] sys_write+0x3c/0x63
[<c0404f4b>] syscall_call+0x7/0xb
=======================
aacraid: SCSI bus appears hung
INFO: task pdflush:235 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
pdflush D 00007CCA 2664 235 19 236 234 (L-TLB)
dff3ff34 00000046 e00b2607 00007cca 00000000 00000100 00000000 0000000a
dffa4550 e00b3239 00007cca 00000c32 00000003 dffa465c c37f75cc c39ac200
00000000 c37f7f6c 00000000 dffa4550 c38eec50 c37f44cc c39ac200 ffffffff
Call Trace:
[<c062183f>] __mutex_lock_slowpath+0x4d/0x7c
[<c062187d>] .text.lock.mutex+0xf/0x14
[<c0439d75>] down_read+0x8/0x11
[<c047cc52>] sync_supers+0x47/0xb8
[<c045d7c1>] wb_kupdate+0x36/0x130
[<c045dc77>] pdflush+0x0/0x1a1
[<c045dd82>] pdflush+0x10b/0x1a1
[<c045d78b>] wb_kupdate+0x0/0x130
[<c043728a>] kthread+0xc0/0xee
[<c04371ca>] kthread+0x0/0xee
[<c0405c87>] kernel_thread_helper+0x7/0x10
=======================
INFO: task kjournald:490 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kjournald D 00007CBB 2788 490 19 515 469 (L-TLB)
dff96ed4 00000046 701fe859 00007cbb 00000005 00000000 1834632e 0000000a
dff69550 701ff33a 00007cbb 00000ae1 00000002 dff6965c c37f0788 c39ac040
105d6eda dff4a4c4 c37f1128 c37f75cc 00000020 00000001 dff4a4bc 105d6eda
Call Trace:
[<c0621468>] io_schedule+0x36/0x59
[<c04790db>] sync_buffer+0x30/0x33
[<c062163f>] __wait_on_bit+0x33/0x58
[<c04790ab>] sync_buffer+0x0/0x33
[<c04790ab>] sync_buffer+0x0/0x33
[<c06216c6>] out_of_line_wait_on_bit+0x62/0x6a
[<c043737c>] wake_bit_function+0x0/0x3c
[<c0479058>] __wait_on_buffer+0x1c/0x1f
[<f88684b3>] journal_commit_transaction+0x4cf/0xf3c [jbd]
[<c042e621>] lock_timer_base+0x15/0x2f
[<c042e6a0>] try_to_del_timer_sync+0x65/0x6c
[<f886bd08>] kjournald+0xa1/0x1c2 [jbd]
[<c043734f>] autoremove_wake_function+0x0/0x2d
[<f886bc67>] kjournald+0x0/0x1c2 [jbd]
[<c043728a>] kthread+0xc0/0xee
[<c04371ca>] kthread+0x0/0xee
[<c0405c87>] kernel_thread_helper+0x7/0x10
=======================
INFO: task syslogd:2386 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syslogd D 00007CBA 2340 2386 1 2389 2242 (NOTLB)
f5c0ced0 00000086 3e0e3044 00007cba 00000070 00000080 030a9588 00000007
f5c03550 3e0e3868 00007cba 00000824 00000001 f5c0365c c37e9944 f62ea200
e02e8e68 c37ea2e4 00000001 f5c0cecc c041f0c8 00000000 00000000 ffffffff
Call Trace:
[<c041f0c8>] __wake_up+0x2a/0x3d
[<f886b2c1>] log_wait_commit+0x80/0xc7 [jbd]
[<c043734f>] autoremove_wake_function+0x0/0x2d
[<f8866679>] journal_stop+0x196/0x1bb [jbd]
[<c0495846>] __writeback_single_inode+0x199/0x2a5
[<c045d334>] do_writepages+0x2b/0x32
[<c0458e37>] __filemap_fdatawrite_range+0x66/0x72
[<c0495ee4>] sync_inode+0x19/0x24
[<f889e019>] ext3_sync_file+0xb1/0xdc [ext3]
[<c0478c15>] do_fsync+0x41/0x83
[<c0478c74>] __do_fsync+0x1d/0x2b
[<c0404f4b>] syscall_call+0x7/0xb
=======================
INFO: task hald-addon-stor:2624 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
hald-addon-st D 00007CC8 2744 2624 2606 2615 (NOTLB)
f6137e98 00000082 5e52e576 00007cc8 c048c272 e02265bc c3944c8c 0000000a
f634f000 5e538047 00007cc8 00009ad1 00000002 f634f10c c37f0788 c39e5040
00000800 c37f1128 0028bcfd 00000003 dca86005 dfc0fe40 e0235888 ffffffff
Call Trace:
[<c048c272>] dput+0x22/0xed
[<f8879d7b>] scsi_block_when_processing_errors+0x7a/0xbf [scsi_mod]
[<c043734f>] autoremove_wake_function+0x0/0x2d
[<f8854dfc>] sd_open+0x69/0x10f [sd_mod]
[<c047dce0>] do_open+0x1de/0x2ce
[<c047df3c>] blkdev_open+0x0/0x44
[<c047df58>] blkdev_open+0x1c/0x44
[<c0474f91>] __dentry_open+0xc7/0x1ab
[<c04750d9>] nameidata_to_filp+0x19/0x28
[<c0475113>] do_filp_open+0x2b/0x31
[<c0475157>] do_sys_open+0x3e/0xae
[<c04751f4>] sys_open+0x16/0x18
[<c0404f4b>] syscall_call+0x7/0xb
=======================
INFO: task cpanellogd:4172 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
cpanellogd D 00007CCC 2572 4172 1 4234 4137 (NOTLB)
f7c5bd6c 00000082 fe647b4e 00007ccc 00000000 00000001 f7c5bd34 0000000a
f6379550 fe65d004 00007ccc 000154b6 00000001 f637965c c37e9944 f6135740
005e000a c37ea2e4 e02071b0 00000000 105fbc80 c042d7c7 dfd4972c ffffffff
Call Trace:
[<c042d7c7>] getnstimeofday+0x30/0xb6
[<c0621468>] io_schedule+0x36/0x59
[<c04790ab>] sync_buffer+0x0/0x33
[<c04790db>] sync_buffer+0x30/0x33
[<c062157a>] __wait_on_bit_lock+0x2a/0x52
[<c04790ab>] sync_buffer+0x0/0x33
[<c0621604>] out_of_line_wait_on_bit_lock+0x62/0x6a
[<c043737c>] wake_bit_function+0x0/0x3c
[<c0479205>] __lock_buffer+0x21/0x24
[<f88666eb>] do_get_write_access+0x4d/0x462 [jbd]
[<f886627c>] __journal_file_buffer+0x116/0x1ed [jbd]
[<f8866b18>] journal_get_write_access+0x18/0x26 [jbd]
[<f889e80b>] ext3_new_inode+0x591/0x971 [ext3]
[<f88acb40>] ext3_permission+0x0/0xa [ext3]
[<c0482dba>] permission+0xa2/0xb5
[<c0484dff>] __link_path_walk+0xcd4/0xdc3
[<f8866ee5>] journal_start+0xae/0xdd [jbd]
[<f88a4c0a>] ext3_create+0x75/0xdc [ext3]
[<c04833cc>] vfs_create+0xca/0x131
[<c0485e3a>] open_namei+0x16a/0x631
[<c04397a9>] lock_hrtimer_base+0x19/0x35
[<c0475104>] do_filp_open+0x1c/0x31
[<c0475157>] do_sys_open+0x3e/0xae
[<c04751f4>] sys_open+0x16/0x18
[<c0404f4b>] syscall_call+0x7/0xb
=======================
INFO: task tailwatchd:21557 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
tailwatchd D 00007CCA 2396 21557 1 3073 5334 21526 (NOTLB)
f7968ecc 00200082 c5a70071 00007cca 80000001 00000000 00000001 0000000a
f6139aa0 c5ace8f9 00007cca 0005e888 00000001 f6139bac c37e9944 c3b19e40
f7968f3c c37ea2e4 e7bff000 00000310 f7968f3c ffffffe9 f7968f3c ffffffff
Call Trace:
[<c062183f>] __mutex_lock_slowpath+0x4d/0x7c
[<c062187d>] .text.lock.mutex+0xf/0x14
[<c0485dad>] open_namei+0xdd/0x631
[<c0475104>] do_filp_open+0x1c/0x31
[<c0475157>] do_sys_open+0x3e/0xae
[<c04751f4>] sys_open+0x16/0x18
[<c0404f4b>] syscall_call+0x7/0xb
=======================
aacraid: aac_fib_send: first asynchronous command timed out.
Usually a result of a PCI interrupt routing problem;
update mother board BIOS or consider utilizing one of
the SAFE mode kernel options (acpi, apic etc)
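Since the dump above is long, here is a rough Python sketch for boiling saved dmesg text down to the essentials: which IRQ went unclaimed, which handlers were registered on it, how many aacraid abort requests followed, and which tasks were reported blocked. The function name summarize and the shortened SAMPLE text are just for illustration, and the regular expressions assume the message layout seen in the output above.

```python
import re
from collections import Counter

# Shortened excerpt in the same layout as the dmesg output above.
SAMPLE = """\
irq 169: nobody cared (try booting with the "irqpoll" option)
handlers:
[<c058e26d>] (usb_hcd_irq+0x0/0x50)
[<f88db346>] (aac_rx_intr_message+0x0/0x55 [aacraid])
Disabling IRQ #169
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter reset request. SCSI hang ?
INFO: task kjournald:490 blocked for more than 120 seconds.
INFO: task syslogd:2386 blocked for more than 120 seconds.
INFO: task kjournald:490 blocked for more than 120 seconds.
"""


def summarize(text):
    """Collect the headline facts from a saved dmesg dump."""
    irq = re.search(r"^irq (\d+): nobody cared", text, re.M)
    # Handler lines wrap the symbol in parentheses; stack frames do not.
    handlers = re.findall(r"^\[<[0-9a-f]+>\] \((\w+)\+0x", text, re.M)
    aborts = len(re.findall(r"^aacraid: Host adapter abort request", text, re.M))
    # Key each hung-task report by (task name, pid) and count repeats.
    blocked = Counter(re.findall(r"^INFO: task (\S+):(\d+) blocked", text, re.M))
    return {
        "irq": int(irq.group(1)) if irq else None,
        "handlers": handlers,
        "abort_requests": aborts,
        "blocked_tasks": blocked,
    }


if __name__ == "__main__":
    summary = summarize(SAMPLE)
    print("unclaimed IRQ:", summary["irq"])
    print("handlers on that IRQ:", ", ".join(summary["handlers"]))
    print("abort requests:", summary["abort_requests"])
    for (name, pid), count in summary["blocked_tasks"].items():
        print("blocked: %s (pid %s) x%d" % (name, pid, count))
```

Run against the full output, this kind of summary makes it easier to see that the USB host controller and the aacraid driver were both registered on IRQ 169, and that the blocked tasks are all waiting on disk I/O.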
ASKER
This is good. Thank you for adding to my perspective.
Do you have screenshots or log files of your kernel crash?
Have you already forced a check of the filesystems?