WARNING: md: d1: read error on /dev/dsk/c1t0d0s0

Our Sun Fire V480 server is reporting errors in the message log. The server runs Solaris 9 and has 4 x 900 MHz CPUs, 16 GB of memory, and 2 x 36 GB hard drives (mirrored).

The message log reports the following:  

Aug  8 08:03:47 cells scsi: [ID 243001 kern.warning] WARNING: /pci@9,600000/SUNW,qlc@2/fp@0,0 (fcp0):
Aug  8 08:03:47 cells   FCP: WWN 0x21000004cf9666db reset successfully
Aug  8 08:03:54 cells md_stripe: [ID 641072 kern.warning] WARNING: md: d1: read error on /dev/dsk/c1t0d0s0
Aug  8 08:03:54 cells md_mirror: [ID 104909 kern.warning] WARNING: md: d1: /dev/dsk/c1t0d0s0 needs maintenance
Aug  8 08:19:57 cells scsi: [ID 243001 kern.warning] WARNING: /pci@9,600000/SUNW,qlc@2/fp@0,0 (fcp0):
Aug  8 08:19:57 cells   FCP: WWN 0x21000004cf9666db reset successfully
Aug  8 08:20:01 cells scsi: [ID 243001 kern.warning] WARNING: /pci@9,600000/SUNW,qlc@2/fp@0,0 (fcp0):
Aug  8 08:20:01 cells   FCP: WWN 0x21000004cf9666db reset successfully
Aug  8 08:20:01 cells md_stripe: [ID 641072 kern.warning] WARNING: md: d3: read error on /dev/dsk/c1t0d0s5
Aug  8 08:20:08 cells last message repeated 1 time
Aug  8 08:20:08 cells md_mirror: [ID 104909 kern.warning] WARNING: md: d3: /dev/dsk/c1t0d0s5 needs maintenance

AND ERRORS REPORTED ON 8/15
-----------------------------------------
Aug 15 08:36:05 cells scsi: [ID 243001 kern.warning] WARNING: /pci@9,600000/SUNW,qlc@2/fp@0,0 (fcp0):
Aug 15 08:36:05 cells   FCP: WWN 0x21000004cf9666db reset successfully
Aug 15 08:36:05 cells scsi: [ID 107833 kern.warning] WARNING: /pci@9,600000/SUNW,qlc@2/fp@0,0/ssd@w21000004cf9666db,0 (ssd1):
Aug 15 08:36:05 cells   SCSI transport failed: reason 'reset': retrying command
Aug 15 08:36:21 cells scsi: [ID 243001 kern.warning] WARNING: /pci@9,600000/SUNW,qlc@2/fp@0,0 (fcp0):
Aug 15 08:36:21 cells   FCP: WWN 0x21000004cf9666db reset successfully
Aug 15 08:36:24 cells scsi: [ID 243001 kern.warning] WARNING: /pci@9,600000/SUNW,qlc@2/fp@0,0 (fcp0):
Aug 15 08:36:24 cells   FCP: WWN 0x21000004cf9666db reset successfully
Aug 15 08:36:25 cells md_stripe: [ID 641072 kern.warning] WARNING: md: d6: read error on /dev/dsk/c1t0d0s3
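
To summarize which metadevices and slices the md warnings point at, a small filter along these lines could be run against the messages file. This is only a sketch: the function name md_errors and the output labels (read-error, needs-maintenance) are made up for illustration, and it assumes each syslog entry sits on a single unwrapped line.

```shell
#!/bin/sh
# md_errors: read syslog lines on stdin and print one line per distinct
# SVM warning, in the form "<metadevice> <component> <kind>".
# The function name and the <kind> labels are illustrative only.
md_errors() {
    sed -n \
        -e 's|.*WARNING: md: \(d[0-9]*\): read error on \(/dev/dsk/[^ ]*\).*|\1 \2 read-error|p' \
        -e 's|.*WARNING: md: \(d[0-9]*\): \(/dev/dsk/[^ ]*\) needs maintenance.*|\1 \2 needs-maintenance|p' |
        sort -u
}

# Demo with two entries taken from the log above (unwrapped).
md_errors <<'EOF'
Aug  8 08:03:54 cells md_stripe: [ID 641072 kern.warning] WARNING: md: d1: read error on /dev/dsk/c1t0d0s0
Aug  8 08:20:08 cells md_mirror: [ID 104909 kern.warning] WARNING: md: d3: /dev/dsk/c1t0d0s5 needs maintenance
EOF
```

On the server itself the input would be the log file, for example md_errors < /var/adm/messages.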

Also, the Solaris Management Console does not work any more. When you start the program, it says: "Starting the server for the first time. May take a few minutes. Please allow configuration to continue until you see 'Welcome to the Solaris Management Console.'"

Can someone tell me what these errors mean? Can this be fixed without losing data on the file systems?

Help!
dee43 Asked:
 
Hanno P.S. (IT Consultant and Infrastructure Architect) Commented:
OK, that's not too bad. Now try
 metareplace -e d0 c1t0d0s0
and see if it resynchronizes the mirror. (It should copy the data from d2 to d1.)
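
As a rough sketch of that re-enable step, the sequence could be wrapped so that it only prints the commands until you are ready to run them. The helper names run and reenable and the DRY_RUN variable are invented for this example; the only real commands involved are metastat and metareplace -e <mirror> <component>.

```shell
#!/bin/sh
# run: print the command instead of executing it unless DRY_RUN=0 is set.
# (run, reenable and DRY_RUN are hypothetical names for this sketch.)
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# reenable MIRROR COMPONENT
# Check the mirror, re-enable the errored component in place, check again.
reenable() {
    run metastat "$1"
    run metareplace -e "$1" "$2"
    run metastat "$1"
}

# Dry run for the root mirror from this thread.
reenable d0 c1t0d0s0
```

Running it with DRY_RUN=0 as root would execute the commands for real. The same call would presumably be repeated for the other mirrors with a slice on c1t0d0 that report maintenance, such as d5 with c1t0d0s5.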
 
rugdog Commented:
It seems like one of the hard drives in your mirror set is failing. You shouldn't lose data because the drives are mirrored, but you certainly have to check that failing disk. Is it hot-swappable?
 
madan1278 Commented:
You can get these types of errors even if your disk has not actually failed.

For example, with this layout:

d0 -m d1 d2
d1 1 1 c0t0d0s0
d2 1 1 c0t1d0s0

When you run metastat d0, you will find a metasync command suggested for d2 and a metareplace command suggested for d1. In this case, clear the mirrors and recreate them.

In /etc/lvm/md.cf, comment out the three lines above, run metaroot /dev/dsk/c0t0d0s0, and issue reboot -- -s.

Then run metaclear -r d0, followed by the commands below:

metainit -f d1 1 1 c0t0d0s0
metainit d2 1 1 c0t1d0s0
metainit d0 -m d2
metaroot d0
lockfs -fa
reboot
metattach d0 d1 (Note: this makes d2 the primary submirror and d1 the secondary)
metastat d2 (it should be okay now)

If you want d1 as the primary and d2 as the secondary, follow the same procedure again.

Thanks
Mada

 
dee43 (Author) Commented:
Why can't I open the Solaris Management console?
 
Hanno P.S. (IT Consultant and Infrastructure Architect) Commented:
a) What does
     metastat -p      (shows your setup)
show?

I assume something like this:
d0 -m d1 d2
d1 1 1 c1t0d0s0
d2 1 1 c1t1d0s0
d5 -m d3 d4
d3 1 1 c1t0d0s5
d4 1 1 c1t1d0s5

b) Check the status of the mirrored metadevice d0 (and d5, respectively):
metastat d0

If you get a message like
" ... maintenance ... "
and something like
" ... metareplace d0 c1t0d0s0 <new device>"

first try to resync with
metareplace -e d0 c1t0d0s0

Could you post the output from "metastat -p", please?
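
Once the metastat -p listing is available, something like the following could show which mirrors and submirrors have a slice on the suspect disk. It is a sketch under assumptions: the function name slices_on_disk is invented, and it relies on each mirror line (the one with -m) appearing before its submirror lines, as in the assumed layout above.

```shell
#!/bin/sh
# slices_on_disk DISK
# Read "metastat -p" output on stdin and print "<mirror> <submirror> <slice>"
# for every component that lives on DISK (e.g. c1t0d0).
# A "-" in the first column means no parent mirror line was seen.
# (slices_on_disk is a hypothetical helper name.)
slices_on_disk() {
    awk -v disk="$1" '
        $2 == "-m" {
            # Mirror definition: remember which mirror owns each submirror.
            for (i = 3; i <= NF; i++)
                if ($i ~ /^d[0-9]+$/) parent[$i] = $1
            next
        }
        {
            # Stripe/concat definition: report components on the given disk.
            p = ($1 in parent) ? parent[$1] : "-"
            for (i = 2; i <= NF; i++)
                if (index($i, disk "s") == 1)
                    print p, $1, $i
        }
    '
}

# Demo with the assumed layout from this post.
slices_on_disk c1t0d0 <<'EOF'
d0 -m d1 d2
d1 1 1 c1t0d0s0
d2 1 1 c1t1d0s0
d5 -m d3 d4
d3 1 1 c1t0d0s5
d4 1 1 c1t1d0s5
EOF
```

On the server this would be fed directly, for example metastat -p | slices_on_disk c1t0d0.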
 
dee43 (Author) Commented:
JustUNIX,

Here's the output from metastat -p and metastat d0.

metastat -p
d20 -m d16 d18 1
d16 1 2 c4t3d0s7 c4t4d0s7 -i 32b
d18 1 1 c4t6d0s6
d19 -m d15 d17 1
d15 1 2 c4t1d0s5 c4t2d0s5 -i 32b
d17 1 1 c4t5d0s6
d14 -m d12 d13 1
d12 1 1 c1t0d0s1
d13 1 1 c1t1d0s1
d11 -m d9 d10 1
d9 1 1 c1t0d0s4
d10 1 1 c1t1d0s4
d8 -m d6 d7 1
d6 1 1 c1t0d0s3
d7 1 1 c1t1d0s3
d5 -m d3 d4 1
d3 1 1 c1t0d0s5
d4 1 1 c1t1d0s5
d0 -m d1 d2 1
d1 1 1 c1t0d0s0
d2 1 1 c1t1d0s0
hsp001

---------------------------

 metastat d0
d0: Mirror
    Submirror 0: d1
      State: Needs maintenance
    Submirror 1: d2
      State: Okay        
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 6292242 blocks

d1: Submirror of d0
    State: Needs maintenance
    Invoke: metareplace d0 c1t0d0s0 <new device>
    Size: 6292242 blocks
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c1t0d0s0          0     No     Maintenance   Yes


d2: Submirror of d0
    State: Okay        
    Size: 6292242 blocks
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c1t1d0s0          0     No            Okay   Yes


Device Relocation Information:
Device   Reloc  Device ID
c1t0d0   Yes    id1,ssd@w20000004cf9666db
c1t1d0   Yes    id1,ssd@w20000004cf966034

---------------------------------
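
With several mirrors on the same pair of disks, it may help to pull just the submirrors in "Needs maintenance" out of the metastat output. The filter below is a sketch that assumes the layout shown above (a "dN: Submirror of dM" header followed by an indented "State:" line); the function name failed_submirrors is made up.

```shell
#!/bin/sh
# failed_submirrors: read metastat output on stdin and print
# "<mirror> <submirror>" for each submirror whose State is
# "Needs maintenance". (failed_submirrors is a hypothetical name.)
failed_submirrors() {
    awk '
        /^d[0-9]+:/ {
            # New metadevice section; only track "dN: Submirror of dM".
            name = ""
            if ($2 == "Submirror" && $3 == "of") {
                name = $1
                sub(/:$/, "", name)
                mirror = $4
            }
            next
        }
        name != "" && $1 == "State:" && /Needs maintenance/ {
            print mirror, name
            name = ""
        }
    '
}

# Demo with an excerpt of the metastat d0 output above.
failed_submirrors <<'EOF'
d0: Mirror
    Submirror 0: d1
      State: Needs maintenance
    Submirror 1: d2
      State: Okay
d1: Submirror of d0
    State: Needs maintenance
    Invoke: metareplace d0 c1t0d0s0 <new device>
d2: Submirror of d0
    State: Okay
EOF
```

Running plain metastat with no arguments through this filter should cover all mirrors at once.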
Question has a verified solution.
