CentOS software RAID problem

luis_chiang
Hi all,

I have a problem with my software RAID.

Every command I try gives this error:

-bash: /usr/bin/top: Input/output error

I don't know how to detect which hard drive has failed so that I can disable and replace it.

Please help.


Commented:
Download the CentOS CD 1 image. You can boot from it in LiveCD mode (no installation, just the LiveCD mode).

Then you can try to check your drives.
You could start with these:

cat /proc/mdstat
mdadm --detail /dev/md0         (for example)
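
As a rough sketch of what to look for in that output: in /proc/mdstat a healthy two-disk mirror shows [2/2] [UU], and an underscore in the second bracket marks a missing member. The helper below prints the name of every array whose member map contains an underscore. The function name degraded_arrays is made up for illustration, and the script assumes the usual layout where the status line follows the "mdN :" line.

```shell
#!/bin/sh
# degraded_arrays [FILE]
# Print the name of each md array whose member map (e.g. [U_]) contains "_".
# FILE defaults to /proc/mdstat; pass a saved copy to try it offline.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }              # remember the current array
        /blocks/ && /\[[U_]*_[U_]*\]/ {          # status line with a missing member
            if (name != "") print name
        }
    ' "${1:-/proc/mdstat}"
}
```

Calling degraded_arrays with no argument reads the live /proc/mdstat. Any array it prints is running without one of its mirrors.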
Top Expert 2011

Commented:
> Input/output error
Sounds like a hardware issue.
1. Please run
dmesg
  to check whether there are any error messages.

2. Have there been any recent changes to the hardware, disks, memory, BIOS settings, BIOS version, or disk controller firmware?
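
For step 1, SATA link problems are the usual thing to look for in dmesg on a system like this. Below is a minimal sketch that pulls out the ports reporting trouble. The function name failing_ata_ports is hypothetical, and the message wording it matches ("SRST failed", "reset failed", "device not ready") is an assumption about this kernel generation; other kernels may phrase these errors differently.

```shell
#!/bin/sh
# failing_ata_ports
# Read kernel log text on stdin and print each ataN port that logged a
# reset or readiness failure, once per port.
failing_ata_ports() {
    grep -E '(SRST failed|reset failed|device not ready)' |
        grep -oE 'ata[0-9]+' |
        sort -u
}
```

Typical use would be: dmesg | failing_ata_ports. No output means none of those particular messages were logged, which does not by itself prove the disks are healthy.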

Author

Commented:
Here is the dmesg output.

I haven't updated or changed anything on the server recently.
Linux version 2.6.18-194.32.1.el5 (mockbuild@builder10.centos.org) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)) #1 SMP Wed Jan 5 17:53:09 EST 2011
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000010000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000bfff0000 (usable)
 BIOS-e820: 00000000bfff0000 - 00000000bffff000 (ACPI data)
 BIOS-e820: 00000000bffff000 - 00000000c0000000 (ACPI NVS)
 BIOS-e820: 00000000ffb80000 - 0000000100000000 (reserved)
2175MB HIGHMEM available.
896MB LOWMEM available.
found SMP MP-table at 000ff780
Memory for crash kernel (0x0 to 0x0) notwithin permissible range
disabling kdump
Using x86 segment limits to approximate NX protection
On node 0 totalpages: 786416
  DMA zone: 4096 pages, LIFO batch:0
  Normal zone: 225280 pages, LIFO batch:31
  HighMem zone: 557040 pages, LIFO batch:31
DMI 2.3 present.
Using APIC driver default
ACPI: RSDP (v000 ACPIAM                                ) @ 0x000f5380
ACPI: RSDT (v001 A M I  OEMRSDT  0x12000305 MSFT 0x00000097) @ 0xbfff0000
ACPI: FADT (v002 A M I  OEMFACP  0x12000305 MSFT 0x00000097) @ 0xbfff0200
ACPI: MADT (v001 A M I  OEMAPIC  0x12000305 MSFT 0x00000097) @ 0xbfff0300
ACPI: OEMB (v001 A M I  OEMBIOS  0x12000305 MSFT 0x00000097) @ 0xbffff040
ACPI: DSDT (v001  1ABHQ 1ABHQ007 0x00000007 INTL 0x02002026) @ 0x00000000
ACPI: PM-Timer IO Port: 0x808
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:2 APIC version 20
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1 15:2 APIC version 20
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x82] disabled)
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x83] disabled)
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Enabling APIC mode:  Flat.  Using 1 I/O APICs
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at c4000000 (gap: c0000000:3fb80000)
Detected 2666.854 MHz processor.
Built 1 zonelists.  Total pages: 786416
Kernel command line: ro root=/dev/md2
mapped APIC to ffffd000 (fee00000)
mapped IOAPIC to ffffc000 (fec00000)
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
CPU 0 irqstacks, hard=c0769000 soft=c0749000
PID hash table entries: 4096 (order: 12, 16384 bytes)
Console: colour VGA+ 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 3112192k/3145664k available (2180k kernel code, 32200k reserved, 910k data, 228k init, 2228160k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay loop (skipped), value calculated using timer frequency.. 5333.70 BogoMIPS (lpj=2666854)
Security Framework v1.0.0 initialized
SELinux:  Initializing.
SELinux:  Starting in permissive mode
selinux_register_security:  Registering secondary module capability
Capability LSM initialized as secondary
Mount-cache hash table entries: 512
CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000
CPU: After vendor identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: Physical Processor ID: 0
CPU: After all inits, caps: bfebf3ff 00000000 00000000 00000080 00004400 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU0: Intel P4/Xeon Extended MCE MSRs (12) available
CPU0: Thermal monitoring enabled
Checking 'hlt' instruction... OK.
SMP alternatives: switching to UP code
ACPI: Core revision 20060707
CPU0: Intel(R) Xeon(TM) CPU 2.66GHz stepping 05
SMP alternatives: switching to SMP code
Booting processor 1/1 eip 11000
CPU 1 irqstacks, hard=c076a000 soft=c074a000
Initializing CPU#1
Calibrating delay using timer specific routine.. 5332.75 BogoMIPS (lpj=2666377)
CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000
CPU: After vendor identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: Physical Processor ID: 0
CPU: After all inits, caps: bfebf3ff 00000000 00000000 00000080 00004400 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#1.
CPU1: Intel P4/Xeon Extended MCE MSRs (12) available
CPU1: Thermal monitoring enabled
CPU1: Intel(R) Xeon(TM) CPU 2.66GHz stepping 05
Total of 2 processors activated (10666.46 BogoMIPS).
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
Using local APIC timer interrupts.
checking TSC synchronization across 2 CPUs: passed.
Brought up 2 CPUs
sizeof(vma)=84 bytes
sizeof(page)=32 bytes
sizeof(inode)=340 bytes
sizeof(dentry)=136 bytes
sizeof(ext3inode)=492 bytes
sizeof(buffer_head)=52 bytes
sizeof(skbuff)=176 bytes
migration_cost=40
checking if image is initramfs... it is
Freeing initrd memory: 2534k freed
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: PCI BIOS revision 2.10 entry at 0xf0031, last bus=1
PCI: Using configuration type 1
Setting up standard PCI resources
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: No dock devices found.
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI quirk: region 0800-087f claimed by ICH4 ACPI/GPIO/TCO
PCI quirk: region 0480-04bf claimed by ICH4 GPIO
PCI: Ignoring BAR0-3 of IDE controller 0000:00:1f.1
PCI: Firmware left 0000:01:07.0 e100 interrupts enabled, disabling
PCI: Transparent bridge - 0000:00:1e.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P4._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 *5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 *5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 *5 6 7 10 11 12 14 15)
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
pnp: PnP ACPI: found 13 devices
usbcore: registered new driver usbfs
usbcore: registered new driver hub
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
NetLabel: Initializing
NetLabel:  domain hash size = 128
NetLabel:  protocols = UNLABELED CIPSOv4
NetLabel:  unlabeled traffic allowed by default
pnp: 00:09: iomem range 0xfed20000-0xfed8ffff has been reserved
pnp: 00:09: iomem range 0xffb00000-0xffbfffff could not be reserved
pnp: 00:0a: iomem range 0xfec00000-0xfec00fff has been reserved
pnp: 00:0a: iomem range 0xfee00000-0xfee00fff has been reserved
pnp: 00:0b: ioport range 0x680-0x6ff has been reserved
pnp: 00:0b: ioport range 0x295-0x296 has been reserved
pnp: 00:0c: iomem range 0x0-0x9ffff could not be reserved
pnp: 00:0c: iomem range 0xc0000-0xdffff could not be reserved
pnp: 00:0c: iomem range 0xe0000-0xfffff could not be reserved
pnp: 00:0c: iomem range 0x100000-0xbfffffff could not be reserved
PCI: Bridge: 0000:00:1e.0
  IO window: b000-0000
  MEM window: fca00000-00000000
  PREFETCH window 0x00000000c4000000-0x00000000c40fffff
PCI: Setting latency timer of device 0000:00:1e.0 to 64
NET: Registered protocol family 2
IP route cache hash table entries: 32768 (order: 5, 131072 bytes)
TCP established hash table entries: 131072 (order: 8, 1048576 bytes)
TCP bind hash table entries: 65536 (order: 7, 524288 bytes)
TCP: Hash tables configured (established 131072 bind 65536)
TCP reno registered
apm: BIOS not found.
audit: initializing netlink socket (disabled)
type=2000 audit(1312515777.938:1): initialized
highmem bounce pool size: 64 pages
Total HugeTLB memory allocated, 0
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
SELinux:  Registering netfilter hooks
Initializing Cryptographic API
alg: No test for crc32c (crc32c-generic)
ksign: Installing public key data
Loading keyring
- Added public key 59F555817D939EA9
- User ID: CentOS (Kernel Module GPG key)
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
0000:00:1d.7 EHCI: BIOS handoff failed (BIOS bug ?) 01010001
Boot video device is 0000:01:04.0
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
Real Time Clock Driver v1.12ac
Non-volatile memory driver v1.2
Linux agpgart interface v0.101 (c) Dave Jones
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
00:05: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:06: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
brd: module loaded
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ICH5: IDE controller at PCI slot 0000:00:1f.1
PCI: Enabling device 0000:00:1f.1 (0005 -> 0007)
ACPI: PCI Interrupt 0000:00:1f.1[A] -> GSI 18 (level, low) -> IRQ 169
ICH5: chipset revision 2
ICH5: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0xfc00-0xfc07, BIOS settings: hda:pio, hdb:pio
    ide1: BM-DMA at 0xfc08-0xfc0f, BIOS settings: hdc:DMA, hdd:pio
Probing IDE interface ide0...
Probing IDE interface ide1...
hdc: CD-224E, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
Probing IDE interface ide0...
ide-floppy driver 0.99.newide
usbcore: registered new driver hiddev
usbcore: registered new driver usbhid
drivers/usb/input/hid-core.c: v2.6:USB HID core driver
PNP: No PS/2 controller found. Probing ports directly.
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: bitmap version 4.39
TCP bic registered
Initializing IPsec netlink socket
NET: Registered protocol family 1
NET: Registered protocol family 17
Using IPI No-Shortcut mode
ACPI: (supports S0 S1 S4 S5)
Initalizing network drop monitor service
Time: tsc clocksource has been installed.
Freeing unused kernel memory: 228k freed
Write protecting the kernel read-only data: 410k
ACPI: PCI Interrupt 0000:00:1d.7[D] -> GSI 23 (level, low) -> IRQ 177
PCI: Setting latency timer of device 0000:00:1d.7 to 64
ehci_hcd 0000:00:1d.7: EHCI Host Controller
ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 1
ehci_hcd 0000:00:1d.7: debug port 1
PCI: cache line size of 128 is not supported by device 0000:00:1d.7
ehci_hcd 0000:00:1d.7: irq 177, io mem 0xfebffc00
ehci_hcd 0000:00:1d.7: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 6 ports detected
ohci_hcd: 2005 April 22 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
USB Universal Host Controller Interface driver v3.0
ACPI: PCI Interrupt 0000:00:1d.0[A] -> GSI 16 (level, low) -> IRQ 185
PCI: Setting latency timer of device 0000:00:1d.0 to 64
uhci_hcd 0000:00:1d.0: UHCI Host Controller
uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 2
uhci_hcd 0000:00:1d.0: irq 185, io base 0x0000d000
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
ACPI: PCI Interrupt 0000:00:1d.1[B] -> GSI 19 (level, low) -> IRQ 193
PCI: Setting latency timer of device 0000:00:1d.1 to 64
uhci_hcd 0000:00:1d.1: UHCI Host Controller
uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 3
uhci_hcd 0000:00:1d.1: irq 193, io base 0x0000d400
usb usb3: configuration #1 chosen from 1 choice
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
ACPI: PCI Interrupt 0000:00:1d.2[C] -> GSI 18 (level, low) -> IRQ 169
PCI: Setting latency timer of device 0000:00:1d.2 to 64
uhci_hcd 0000:00:1d.2: UHCI Host Controller
uhci_hcd 0000:00:1d.2: new USB bus registered, assigned bus number 4
uhci_hcd 0000:00:1d.2: irq 169, io base 0x0000d800
usb usb4: configuration #1 chosen from 1 choice
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 2 ports detected
md: raid1 personality registered for level 1
SCSI subsystem initialized
libata version 3.00 loaded.
ata_piix 0000:00:1f.2: version 2.12
ACPI: PCI Interrupt 0000:00:1f.2[A] -> GSI 18 (level, low) -> IRQ 169
ata_piix 0000:00:1f.2: MAP [ P0 -- P1 -- ]
PCI: Setting latency timer of device 0000:00:1f.2 to 64
scsi0 : ata_piix
scsi1 : ata_piix
ata1: SATA max UDMA/133 cmd 0xec00 ctl 0xe800 bmdma 0xdc00 irq 169
ata2: SATA max UDMA/133 cmd 0xe400 ctl 0xe000 bmdma 0xdc08 irq 169
ata1.00: ATA-7: SAMSUNG HD322HJ, 1AG01118, max UDMA7
ata1.00: 625142448 sectors, multi 16: LBA48 NCQ (depth 0/32)
ata1.00: configured for UDMA/133
ata2: link is slow to respond, please be patient (ready=0)
ata2: device not ready (errno=-16), forcing hardreset
ata2: link is slow to respond, please be patient (ready=0)
ata2: SRST failed (errno=-16)
ata2: link is slow to respond, please be patient (ready=0)
ata2: SRST failed (errno=-16)
ata2: link is slow to respond, please be patient (ready=0)
ata2: SRST failed (errno=-16)
ata2: SRST failed (errno=-16)
ata2: reset failed, giving up
  Vendor: ATA       Model: SAMSUNG HD322HJ   Rev: 1AG0
  Type:   Direct-Access                      ANSI SCSI revision: 05
SCSI device sda: 625142448 512-byte hdwr sectors (320073 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
SCSI device sda: 625142448 512-byte hdwr sectors (320073 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
 sda: sda1 sda2 sda3
sd 0:0:0:0: Attached scsi disk sda
device-mapper: uevent: version 1.0.3
device-mapper: ioctl: 4.11.5-ioctl (2007-12-12) initialised: dm-devel@redhat.com
device-mapper: dm-raid45: initialized v0.2594l
md: Autodetecting RAID arrays.
md: autorun ...
md: considering sda3 ...
md:  adding sda3 ...
md: sda2 has different UUID to sda3
md: sda1 has different UUID to sda3
md: created md2
md: bind<sda3>
md: running: <sda3>
md: md2: raid array is not clean -- starting background reconstruction
raid1: raid set md2 active with 1 out of 2 mirrors
md: considering sda2 ...
md:  adding sda2 ...
md: sda1 has different UUID to sda2
md: created md1
md: bind<sda2>
md: running: <sda2>
raid1: raid set md1 active with 1 out of 2 mirrors
md: considering sda1 ...
md:  adding sda1 ...
md: created md0
md: bind<sda1>
md: running: <sda1>
raid1: raid set md0 active with 1 out of 2 mirrors
md: ... autorun DONE.
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting.  Commit interval 5 seconds
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
SELinux:  Disabled at runtime.
SELinux:  Unregistering netfilter hooks
type=1404 audit(1312515885.011:2): selinux=0 auid=4294967295 ses=4294967295
EDAC MC: Ver: 2.0.1 Jan  5 2011
EDAC e7xxx: error reporting device not found:vendor 8086 device 0x2541 (broken BIOS?)
Intel(R) PRO/1000 Network Driver - version 7.3.21-k4.1-NAPI
Copyright (c) 1999-2006 Intel Corporation.
ACPI: PCI Interrupt 0000:01:08.0[A] -> GSI 23 (level, low) -> IRQ 177
e100: Intel(R) PRO/100 Network Driver, 3.5.10-k3-NAPI
e100: Copyright(c) 1999-2005 Intel Corporation
e1000: 0000:01:08.0: e1000_probe: (PCI:33MHz:32-bit) 00:30:48:71:e6:03
parport: PnPBIOS parport detected.
parport0: PC-style at 0x378, irq 7 [PCSPP]
intel_rng: FWH not detected
e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
ACPI: PCI Interrupt 0000:01:07.0[A] -> GSI 22 (level, low) -> IRQ 201
input: PC Speaker as /class/input/input0
e100: eth1: e100_probe: addr 0xfeafd000, irq 201, MAC addr 00:30:48:71:E6:92
hdc: ATAPI 24X CD-ROM drive, 128kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
ACPI: PCI Interrupt 0000:00:1f.3[B] -> GSI 17 (level, low) -> IRQ 209
Program cat tried to read /dev/mem between 6f7000->6f7800.
Program cat tried to read /dev/mem between 6f7000->6f9000.
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
sd 0:0:0:0: Attached scsi generic sg0 type 0
Program cat tried to read /dev/mem between 6f7000->6f7800.
Program cat tried to read /dev/mem between 6f7000->6f9000.
Program cat tried to read /dev/mem between 6f7000->6f7800.
Program cat tried to read /dev/mem between 6f7000->6f9000.
Program cat tried to read /dev/mem between 6f7000->6f7800.
Program cat tried to read /dev/mem between 6f7000->6f9000.
lp0: using parport0 (interrupt-driven).
lp0: console ready
ACPI: Power Button (FF) [PWRF]
ACPI: Power Button (CM) [PWRB]
ACPI: Sleep Button (CM) [SLPB]
ACPI: Mapper loaded
dell-wmi: No known WMI GUID found
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
device-mapper: multipath: version 1.0.5 loaded
Program mount tried to read /dev/mem between 6f7000->6f7800.
Program mount tried to read /dev/mem between 6f7000->6f9000.
Program fsck tried to read /dev/mem between 6f7000->6f7800.
Program fsck tried to read /dev/mem between 6f7000->6f9000.
EXT3 FS on md2, internal journal
kjournald starting.  Commit interval 5 seconds
Program mount tried to read /dev/mem between 6f7000->6f7800.
Program mount tried to read /dev/mem between 6f7000->6f9000.
EXT3 FS on md0, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
Program mount tried to read /dev/mem between 6f7000->6f7800.
Program mount tried to read /dev/mem between 6f7000->6f9000.
Program mount tried to read /dev/mem between 6f7000->6f7800.
Program mount tried to read /dev/mem between 6f7000->6f9000.
Program cat tried to read /dev/mem between 6f7000->6f7800.
Program cat tried to read /dev/mem between 6f7000->6f9000.
Adding 2048184k swap on /dev/md1.  Priority:-1 extents:1 across:2048184k
IA-32 Microcode Update Driver: v1.14a <tigran@veritas.com>
Program touch tried to read /dev/mem between 6f7000->6f7800.
Program touch tried to read /dev/mem between 6f7000->6f9000.
Program mount tried to read /dev/mem between 6f7000->6f7800.
Program mount tried to read /dev/mem between 6f7000->6f9000.
microcode: CPU1 updated from revision 0x14 to 0x29, date = 08112004 
microcode: CPU0 updated from revision 0x14 to 0x29, date = 08112004 
Program touch tried to read /dev/mem between 6f7000->6f7800.
Program touch tried to read /dev/mem between 6f7000->6f9000.
Program touch tried to read /dev/mem between 6f7000->6f7800.
Program touch tried to read /dev/mem between 6f7000->6f9000.
Loading iSCSI transport class v2.0-871.
802.1Q VLAN Support v1.8 Ben Greear <greearb@candelatech.com>
All bugs added by David S. Miller <davem@redhat.com>
cxgb3i: tag itt 0x1fff, 13 bits, age 0xf, 4 bits.
iscsi: registered transport (cxgb3i)
Program touch tried to read /dev/mem between 6f7000->6f7800.
Program touch tried to read /dev/mem between 6f7000->6f9000.
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
IPv6 over IPv4 tunneling driver
Broadcom NetXtreme II CNIC Driver cnic v2.1.0 (Oct 10, 2009)
Broadcom NetXtreme II iSCSI Driver bnx2i v2.1.0 (Dec 06, 2009)
iscsi: registered transport (bnx2i)
iscsi: registered transport (tcp)
iscsi: registered transport (iser)
iscsi: registered transport (be2iscsi)
Program touch tried to read /dev/mem between 6f7000->6f7800.
Program touch tried to read /dev/mem between 6f7000->6f9000.
Program basename tried to read /dev/mem between 6f7000->6f7800.
Program basename tried to read /dev/mem between 6f7000->6f9000.
Program basename tried to read /dev/mem between 6f7000->6f7800.
Program basename tried to read /dev/mem between 6f7000->6f9000.
Program basename tried to read /dev/mem between 6f7000->6f7800.
Program basename tried to read /dev/mem between 6f7000->6f9000.
Program basename tried to read /dev/mem between 6f7000->6f7800.
Program basename tried to read /dev/mem between 6f7000->6f9000.
Program ifconfig tried to read /dev/mem between 6f7000->6f7800.
Program ifconfig tried to read /dev/mem between 6f7000->6f9000.
Program basename tried to read /dev/mem between 6f7000->6f7800.
Program basename tried to read /dev/mem between 6f7000->6f9000.
Program basename tried to read /dev/mem between 6f7000->6f7800.
Program basename tried to read /dev/mem between 6f7000->6f9000.
ADDRCONF(NETDEV_UP): eth0: link is not ready
e1000: eth0 NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX
ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Program basename tried to read /dev/mem between 6f7000->6f7800.
Program basename tried to read /dev/mem between 6f7000->6f9000.
Program basename tried to read /dev/mem between 6f7000->6f7800.
Program basename tried to read /dev/mem between 6f7000->6f9000.
Program ifconfig tried to read /dev/mem between 6f7000->6f7800.
Program ifconfig tried to read /dev/mem between 6f7000->6f9000.
Program basename tried to read /dev/mem between 6f7000->6f7800.
Program basename tried to read /dev/mem between 6f7000->6f9000.
Program basename tried to read /dev/mem between 6f7000->6f7800.
Program basename tried to read /dev/mem between 6f7000->6f9000.
ADDRCONF(NETDEV_UP): eth1: link is not ready
Program touch tried to read /dev/mem between 6f7000->6f7800.
Program touch tried to read /dev/mem between 6f7000->6f9000.
Program touch tried to read /dev/mem between 6f7000->6f7800.
Program touch tried to read /dev/mem between 6f7000->6f9000.
Program touch tried to read /dev/mem between 6f7000->6f7800.
Program touch tried to read /dev/mem between 6f7000->6f9000.
Program mount tried to read /dev/mem between 6f7000->6f7800.
Program mount tried to read /dev/mem between 6f7000->6f9000.
Program touch tried to read /dev/mem between 6f7000->6f7800.
Program touch tried to read /dev/mem between 6f7000->6f9000.
Program touch tried to read /dev/mem between 6f7000->6f7800.
Program touch tried to read /dev/mem between 6f7000->6f9000.
Program touch tried to read /dev/mem between 6f7000->6f7800.
Program touch tried to read /dev/mem between 6f7000->6f9000.
Bluetooth: Core ver 2.10
NET: Registered protocol family 31
Bluetooth: HCI device and connection manager initialized
Bluetooth: HCI socket layer initialized
Bluetooth: L2CAP ver 2.8
Bluetooth: L2CAP socket layer initialized
Bluetooth: HIDP (Human Interface Emulation) ver 1.1
Program touch tried to read /dev/mem between 6f7000->6f7800.
Program touch tried to read /dev/mem between 6f7000->6f9000.
Program touch tried to read /dev/mem between 6f7000->6f7800.
Program touch tried to read /dev/mem between 6f7000->6f9000.
Program basename tried to read /dev/mem between 6f7000->6f7800.
Program basename tried to read /dev/mem between 6f7000->6f9000.
Program mount tried to read /dev/mem between 6f7000->6f7800.
Program mount tried to read /dev/mem between 6f7000->6f9000.
Program mount tried to read /dev/mem between 6f7000->6f7800.
Program mount tried to read /dev/mem between 6f7000->6f9000.
Program touch tried to read /dev/mem between 6f7000->6f7800.
Program touch tried to read /dev/mem between 6f7000->6f9000.
Program touch tried to read /dev/mem between 6f7000->6f7800.
Program touch tried to read /dev/mem between 6f7000->6f9000.
Program touch tried to read /dev/mem between 6f7000->6f7800.
Program touch tried to read /dev/mem between 6f7000->6f9000.
IPv4 over IPv4 tunneling driver
Program touch tried to read /dev/mem between 6f7000->6f7800.
Program touch tried to read /dev/mem between 6f7000->6f9000.
Program ifconfig tried to read /dev/mem between 6f7000->6f7800.
Program ifconfig tried to read /dev/mem between 6f7000->6f9000.
Program ifconfig tried to read /dev/mem between 6f7000->6f7800.
Program ifconfig tried to read /dev/mem between 6f7000->6f9000.
Program ifconfig tried to read /dev/mem between 6f7000->6f7800.
Program ifconfig tried to read /dev/mem between 6f7000->6f9000.
Program ifconfig tried to read /dev/mem between 6f7000->6f7800.
Program ifconfig tried to read /dev/mem between 6f7000->6f9000.
Program ifconfig tried to read /dev/mem between 6f7000->6f7800.
Program ifconfig tried to read /dev/mem between 6f7000->6f9000.
Program ifconfig tried to read /dev/mem between 6f7000->6f7800.
Program ifconfig tried to read /dev/mem between 6f7000->6f9000.
Program ifconfig tried to read /dev/mem between 6f7000->6f7800.
Program ifconfig tried to read /dev/mem between 6f7000->6f9000.
Program ifconfig tried to read /dev/mem between 6f7000->6f7800.
Program ifconfig tried to read /dev/mem between 6f7000->6f9000.
Program ifconfig tried to read /dev/mem between 6f7000->6f7800.
Program ifconfig tried to read /dev/mem between 6f7000->6f9000.
Program touch tried to read /dev/mem between 6f7000->6f7800.
Program touch tried to read /dev/mem between 6f7000->6f9000.
Program cat tried to read /dev/mem between 6f7000->6f7800.
Program cat tried to read /dev/mem between 6f7000->6f9000.

Author

Commented:
mdadm --detail
/dev/md0:
        Version : 0.90
  Creation Time : Fri Aug  6 02:05:53 2010
     Raid Level : raid1
     Array Size : 104320 (101.89 MiB 106.82 MB)
  Used Dev Size : 104320 (101.89 MiB 106.82 MB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Thu Aug  4 22:45:01 2011
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           UUID : 17970dc6:4c3e3a99:991a9c32:636a2e01
         Events : 0.274

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       0        0        1      removed

/dev/md1:
        Version : 0.90
  Creation Time : Fri Aug  6 02:00:29 2010
     Raid Level : raid1
     Array Size : 2048192 (2000.52 MiB 2097.35 MB)
  Used Dev Size : 2048192 (2000.52 MiB 2097.35 MB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Sun Jul 31 04:22:09 2011
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           UUID : ae657b4a:bcaa352a:10ffedd4:4de70984
         Events : 0.256

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       0        0        1      removed


/dev/md2:
        Version : 0.90
  Creation Time : Fri Aug  6 02:00:30 2010
     Raid Level : raid1
     Array Size : 310415872 (296.04 GiB 317.87 GB)
  Used Dev Size : 310415872 (296.04 GiB 317.87 GB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Thu Aug  4 23:42:45 2011
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           UUID : c797b71b:d2628d2a:d8b263af:44470f07
         Events : 0.3947138

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       0        0        1      removed

Top Expert 2011

Commented:
I see a lot of
> Program touch tried to read /dev/mem between 6f7000->6f7800.
This seems like a memory problem.
You might want to replace the memory modules first.

Author

Commented:
mdstat
cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda1[0]
      104320 blocks [2/1] [U_]

md1 : active raid1 sda2[0]
      2048192 blocks [2/1] [U_]

md2 : active raid1 sda3[0]
      310415872 blocks [2/1] [U_]

unused devices: <none>

Author

Commented:
The "read /dev/mem between 6f7000->6f7800" messages have been there for more than a year, and the server has worked fine with them.

For now I want to keep working without replacing the hard drive. I just want to disable the bad drive and continue with only the good one.

Top Expert 2011

Commented:
In dmesg
----
ata2: SATA max UDMA/133 cmd 0xe400 ctl 0xe000 bmdma 0xdc08 irq 169
....
ata2: link is slow to respond, please be patient (ready=0)
ata2: device not ready (errno=-16), forcing hardreset
ata2: link is slow to respond, please be patient (ready=0)
ata2: SRST failed (errno=-16)
ata2: link is slow to respond, please be patient (ready=0)
ata2: SRST failed (errno=-16)
ata2: link is slow to respond, please be patient (ready=0)
ata2: SRST failed (errno=-16)
ata2: SRST failed (errno=-16)
ata2: reset failed, giving up
  Vendor: ATA       Model: SAMSUNG HD322HJ   Rev: 1AG0
  Type:   Direct-Access                      ANSI SCSI revision: 05
-----------
ATA2 has a problem. Check the disk cable (replace it with a known-good one) or replace the disk with a good one.

Back up your data ASAP.

Author

Commented:

I have already backed up my data, but I don't want to reinstall the server.
Top Expert 2011

Commented:
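OK,
the good hard drive is
ata1.00: ATA-7: SAMSUNG HD322HJ, 1AG01118, max UDMA7

The bad one is the drive on the second SATA port:
ata2: reset failed, giving up
(The "Model: SAMSUNG HD322HJ   Rev: 1AG0" line that follows in dmesg belongs to sda, the working drive on ata1; the drive on ata2 never responded, so its model is not shown.)

Shut down the server, locate the bad drive, and replace it.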
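
Once the replacement disk is in, the mirrors still have to be rebuilt onto it. Here is a sketch of the usual sequence, under assumptions that are not confirmed in this thread: the new disk shows up as /dev/sdb, it is at least as large as /dev/sda, and the partition layout is copied with sfdisk. The helper only prints the commands so they can be reviewed before anything is run as root; print_rebuild_plan is a made-up name.

```shell
#!/bin/sh
# print_rebuild_plan GOOD_DISK NEW_DISK
# Echo (do not run) the commands that clone the partition table from the
# surviving disk and add the new partitions back into md0, md1 and md2.
# The array-to-partition mapping follows the mdadm --detail output above
# (md0 = partition 1, md1 = partition 2, md2 = partition 3).
print_rebuild_plan() {
    good=$1
    new=$2
    echo "sfdisk -d $good | sfdisk $new"
    n=1
    for md in md0 md1 md2; do
        echo "mdadm /dev/$md --add ${new}${n}"
        n=$((n + 1))
    done
    echo "cat /proc/mdstat"
}
```

Running print_rebuild_plan /dev/sda /dev/sdb lists the steps; the final cat /proc/mdstat is there to watch the resync progress. If the server should also be able to boot from the new disk, the boot loader would need to be installed on it as well, which this sketch leaves out.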

Author

Commented:
And is it possible to keep working with only one hard drive?
Top Expert 2011

Commented:
Yes, you can boot with one hard disk.