dee43 asked:

WARNING: md: d1: read error on /dev/dsk/c1t0d0s0

Our Sun Fire V480 server is reporting errors in the message log. The server is running Solaris 9 with 4 x 900 MHz CPUs, 16 GB of memory, and 2 x 36 GB hard drives (mirrored).

The message log reports the following:  

Aug  8 08:03:47 cells scsi: [ID 243001 kern.warning] WARNING: /pci@9,600000/SUNW,qlc@2/fp@0,0 (fcp0):
Aug  8 08:03:47 cells   FCP: WWN 0x21000004cf9666db reset successfully
Aug  8 08:03:54 cells md_stripe: [ID 641072 kern.warning] WARNING: md: d1: read error on /dev/dsk/c1t0d0s0
Aug  8 08:03:54 cells md_mirror: [ID 104909 kern.warning] WARNING: md: d1: /dev/dsk/c1t0d0s0 needs maintenance
Aug  8 08:19:57 cells scsi: [ID 243001 kern.warning] WARNING: /pci@9,600000/SUNW,qlc@2/fp@0,0 (fcp0):
Aug  8 08:19:57 cells   FCP: WWN 0x21000004cf9666db reset successfully
Aug  8 08:20:01 cells scsi: [ID 243001 kern.warning] WARNING: /pci@9,600000/SUNW,qlc@2/fp@0,0 (fcp0):
Aug  8 08:20:01 cells   FCP: WWN 0x21000004cf9666db reset successfully
Aug  8 08:20:01 cells md_stripe: [ID 641072 kern.warning] WARNING: md: d3: read error on /dev/dsk/c1t0d0s5
Aug  8 08:20:08 cells last message repeated 1 time
Aug  8 08:20:08 cells md_mirror: [ID 104909 kern.warning] WARNING: md: d3: /dev/dsk/c1t0d0s5 needs maintenance

AND ERRORS REPORTED ON 8/15
-----------------------------------------
Aug 15 08:36:05 cells scsi: [ID 243001 kern.warning] WARNING: /pci@9,600000/SUNW,qlc@2/fp@0,0 (fcp0):
Aug 15 08:36:05 cells   FCP: WWN 0x21000004cf9666db reset successfully
Aug 15 08:36:05 cells scsi: [ID 107833 kern.warning] WARNING: /pci@9,600000/SUNW,qlc@2/fp@0,0/ssd@w21000004cf9666db,0 (ssd1):
Aug 15 08:36:05 cells   SCSI transport failed: reason 'reset': retrying command
Aug 15 08:36:21 cells scsi: [ID 243001 kern.warning] WARNING: /pci@9,600000/SUNW,qlc@2/fp@0,0 (fcp0):
Aug 15 08:36:21 cells   FCP: WWN 0x21000004cf9666db reset successfully
Aug 15 08:36:24 cells scsi: [ID 243001 kern.warning] WARNING: /pci@9,600000/SUNW,qlc@2/fp@0,0 (fcp0):
Aug 15 08:36:24 cells   FCP: WWN 0x21000004cf9666db reset successfully
Aug 15 08:36:25 cells md_stripe: [ID 641072 kern.warning] WARNING: md: d6: read error on /dev/dsk/c1t0d0s3

Also, the Solaris Management Console does not work any more. When you start the program, it says: "Starting the server for the first time. May take a few minutes. Please allow configuration to continue until you see 'Welcome to the Solaris Management Console.'"

Can someone tell me what these errors mean? Can this be fixed without losing data on the file systems?

Help!

rugdog

It seems that one of the hard drives in your mirror set is failing. You shouldn't lose data because the drives are mirrored, but you certainly have to check the failing disk. Is it hot-swappable?
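
To see at a glance which components the volume manager has flagged, the md_mirror warnings can be filtered out of the log. This is only a rough sketch: the function name md_failing_components is made up for illustration, and it assumes each message sits on a single line, as it does in /var/adm/messages (not wrapped as in a pasted terminal excerpt).

```shell
#!/bin/sh
# Sketch: list metadevice components reported as "needs maintenance".
# Reads syslog-style text on stdin, one message per line (assumption).
# md_failing_components is a hypothetical helper name, not a Solaris tool.
md_failing_components() {
    awk '/WARNING: md: .* needs maintenance/ {
        # Message tail looks like: WARNING: md: d1: /dev/dsk/c1t0d0s0 needs maintenance
        for (i = 2; i <= NF; i++)
            if ($i ~ /^\/dev\/dsk\//) print $(i - 1), $i   # e.g. "d1: /dev/dsk/c1t0d0s0"
    }' | sort -u
}
```

A typical call would be md_failing_components < /var/adm/messages. If every flagged slice sits on the same physical disk, that points at the disk (or its path) rather than at a single file system.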

You can get these types of errors even if your disk has not actually failed.

d0 -m d1 d2
d1 1 1 c0t0d0s0
d2 1 1 c0t1d0s0

When you run metastat d0, you will actually find a metasync command suggested for d2 and a metareplace command suggested for d1. In this case, clear the mirrors and recreate them.

In /etc/lvm/md.cf, comment out the three lines above, run metaroot /dev/dsk/c0t0d0s0, and issue reboot -- -s.

Then run metaclear -r d0, followed by the commands below:

metainit -f d1 1 1 c0t0d0s0
metainit d2 1 1 c0t1d0s0
metainit d0 -m d2
metaroot d0
lockfs -fa
reboot
metattach d0 d1 (Note: you are making d2 the primary submirror and d1 the secondary.)
metastat d2 (It should be okay now.)

If you want to make d1 the primary and d2 the secondary, follow the same procedure again.

Thanks
Mada


dee43 (ASKER)

Why can't I open the Solaris Management console?

Hanno P.S.

a) What does this show?
     metastat -p      (shows your setup)

I assume this:
d0 -m d1 d2
d1 1 1 c1t0d0s0
d2 1 1 c1t1d0s0
d5 -m d3 d4
d3 1 1 c1t0d0s5
d4 1 1 c1t1d0s5

b) Check the status of the mirrored metadevice d0 (and d5, respectively):
metastat d0

If you get a message like
" ... maintenance... "
and something like
" ... run metareplace c1t1d0s0 <device2>"

Try first to resync by re-enabling the component named in that message:
metareplace -e d0 c1t1d0s0

Could you post output from "metastat -p" please?
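
When a submirror needs maintenance, metastat prints an "Invoke:" line with the command it recommends. A small filter can collect those recommendations for review before anything is run. This is a sketch only; metastat_suggestions is a hypothetical helper name, and it prints the suggested commands without executing them.

```shell
#!/bin/sh
# Sketch: show the repair commands that metastat recommends.
# Reads metastat output on stdin and prints only the text after "Invoke:".
# metastat_suggestions is a hypothetical helper name; nothing is executed.
metastat_suggestions() {
    sed -n 's/^ *Invoke: *//p'
}
```

Usage would be metastat | metastat_suggestions. Note that the suggested metareplace line contains a placeholder for the new device, so it is meant to be read and adapted, not pasted as-is.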

dee43 (ASKER)

JustUNIX,

Here's the output from metastat -p and metastat d0.

metastat -p
d20 -m d16 d18 1
d16 1 2 c4t3d0s7 c4t4d0s7 -i 32b
d18 1 1 c4t6d0s6
d19 -m d15 d17 1
d15 1 2 c4t1d0s5 c4t2d0s5 -i 32b
d17 1 1 c4t5d0s6
d14 -m d12 d13 1
d12 1 1 c1t0d0s1
d13 1 1 c1t1d0s1
d11 -m d9 d10 1
d9 1 1 c1t0d0s4
d10 1 1 c1t1d0s4
d8 -m d6 d7 1
d6 1 1 c1t0d0s3
d7 1 1 c1t1d0s3
d5 -m d3 d4 1
d3 1 1 c1t0d0s5
d4 1 1 c1t1d0s5
d0 -m d1 d2 1
d1 1 1 c1t0d0s0
d2 1 1 c1t1d0s0
hsp001

---------------------------

 metastat d0
d0: Mirror
    Submirror 0: d1
      State: Needs maintenance
    Submirror 1: d2
      State: Okay        
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 6292242 blocks

d1: Submirror of d0
    State: Needs maintenance
    Invoke: metareplace d0 c1t0d0s0 <new device>
    Size: 6292242 blocks
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c1t0d0s0          0     No     Maintenance   Yes


d2: Submirror of d0
    State: Okay        
    Size: 6292242 blocks
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c1t1d0s0          0     No            Okay   Yes


Device Relocation Information:
Device   Reloc  Device ID
c1t0d0   Yes    id1,ssd@w20000004cf9666db
c1t1d0   Yes    id1,ssd@w20000004cf966034

---------------------------------
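
Since several mirrors share the same physical disk, it can help to list every mirror that has a submirror on the suspect drive. The sketch below works on "metastat -p" style text like the listing above. The function name mirrors_on_disk is made up for illustration, and on Solaris the awk call would likely need nawk or /usr/xpg4/bin/awk, because the -v option is assumed.

```shell
#!/bin/sh
# Sketch: given "metastat -p" output on stdin and a disk name (e.g. c1t0d0),
# print "mirror submirror slice" for every submirror built on that disk.
# mirrors_on_disk is a hypothetical helper name, not a Solaris command.
mirrors_on_disk() {
    disk=$1
    awk -v disk="$disk" '
        $2 == "-m" {                       # mirror line: dN -m sub1 sub2 ... [pass]
            for (i = 3; i <= NF; i++)
                if ($i ~ /^d[0-9]+$/) parent[$i] = $1
            next
        }
        {                                  # stripe/concat line: dN count width slice...
            for (i = 4; i <= NF; i++)
                if (index($i, disk "s") == 1) slice[$1] = $i
        }
        END {
            for (s in slice)
                print ((s in parent) ? parent[s] : "-"), s, slice[s]
        }
    ' | sort
}
```

For example, metastat -p | mirrors_on_disk c1t0d0 on the configuration above should report the submirrors of d0, d5, d8, d11 and d14, which are the mirrors that depend on the disk named in the warnings.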
ASKER CERTIFIED SOLUTION
Hanno P.S.

This solution is only available to members of Experts Exchange.