Kong

"Device sdd not ready" in /var/log/messages

Hi all,

We restarted the 2-node cluster (both nodes and the SAN) and now get a lot of "Device not ready" messages in /var/log/messages:

Jul  2 04:02:30 lu3cduddb1 syslogd 1.4.1: restart.
Jul  2 04:02:31 lu3cduddb1 kernel: Device sdb not ready.
Jul  2 04:02:31 lu3cduddb1 kernel: Device sdd not ready.
Jul  2 04:02:31 lu3cduddb1 kernel: Device sdf not ready.
Jul  2 04:02:31 lu3cduddb1 kernel: Device sdh not ready.
Jul  2 04:02:31 lu3cduddb1 kernel: Device sdj not ready.
...

How do I track down and resolve this problem? Thanks in advance.
 
pjedmond

There are a number of possible reasons for this:

1.   The device is not connected.
2.   The device takes time to 'spin up to speed'.
3.   The device is not accessible for some other reason, e.g. a RAID array is rebuilding.
4.   The device is a removable device such as a CD drive.

Fault-finding:

1.    Look at /etc/fstab                      # Can you identify the device?
2.    cat /proc/scsi/scsi                     # May give you something useful
3.    cdrecord -scanbus                       # Checks what's on the SCSI bus

The reason for the 3rd one is that I was wondering whether you had a CD tower of some sort connected. Any drive without a CD in it could be reported as 'not ready'.
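
To see which devices are affected and how often, the log can be summarised with a short script. This is only a minimal sketch in Python, assuming the messages look exactly like the lines quoted in the question. The function name `count_not_ready` and the sample list (including its repeated sdd line) are made up for illustration.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   Jul  2 04:02:31 lu3cduddb1 kernel: Device sdd not ready.
NOT_READY = re.compile(r"kernel: Device (\w+) not ready")


def count_not_ready(lines):
    """Count 'Device <name> not ready' messages per device name."""
    counts = Counter()
    for line in lines:
        match = NOT_READY.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Illustrative sample only; the duplicated sdd line is invented for the demo.
sample = [
    "Jul  2 04:02:30 lu3cduddb1 syslogd 1.4.1: restart.",
    "Jul  2 04:02:31 lu3cduddb1 kernel: Device sdb not ready.",
    "Jul  2 04:02:31 lu3cduddb1 kernel: Device sdd not ready.",
    "Jul  2 04:02:31 lu3cduddb1 kernel: Device sdd not ready.",
]

for device, n in sorted(count_not_ready(sample).items()):
    print(device, n)
```

On the real system the same function could be fed the log file directly, e.g. `count_not_ready(open("/var/log/messages"))`, since a file object yields one line at a time.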

(   (()
(`-' _\
 ''  ''
Kong

ASKER

Wow, thanks for the very quick response!

Here's the output. I'm not a Linux admin by any stretch of the imagination, so I can't tell from it what's wrong:

fstab:

[root@lu3cduddb1 ~]# cat /etc/fstab
# This file is edited by fstab-sync - see 'man fstab-sync' for details
LABEL=/                 /                        ext3    defaults        1 1
LABEL=/boot             /boot                    ext3    defaults        1 2
none                    /dev/pts                 devpts  gid=5,mode=620  0 0
none                    /dev/shm                 tmpfs   defaults        0 0
LABEL=/home             /home                    ext3    defaults        1 2
LABEL=/opt              /opt                     ext3    defaults        1 2
none                    /proc                    proc    defaults        0 0
none                    /sys                     sysfs   defaults        0 0
LABEL=/tmp              /tmp                     ext3    defaults        1 2
LABEL=/usr              /usr                     ext3    defaults        1 2
LABEL=/var              /var                     ext3    defaults        1 2
LABEL=SW-cciss/c0d0p2   swap                     swap    defaults        0 0
#
# OCFS RAC File Systems
#
/dev/mpath/mpath0p1     /oracle/RACConfig        ocfs2   _netdev,datavolume 0 0
/dev/mpath/mpath1p1     /oracle/oradata/TEST_DBs ocfs2   _netdev,datavolume 0 0
/dev/mpath/mpath2p1     /oracle/oradata/prod     ocfs2   _netdev,datavolume 0 0
#

/dev/hda                /media/cdrom            auto    pamconsole,exec,noauto,managed 0 0
/dev/fd0                /media/floppy           auto    pamconsole,exec,noauto,managed 0 0

------

[root@lu3cduddb1 ~]# cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: HP       Model: HSV100           Rev: 3028
  Type:   RAID                             ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 00 Lun: 01
  Vendor: HP       Model: HSV100           Rev: 3028
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 00 Lun: 02
  Vendor: HP       Model: HSV100           Rev: 3028
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 00 Lun: 03
  Vendor: HP       Model: HSV100           Rev: 3028
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 01 Lun: 00
  Vendor: HP       Model: HSV100           Rev: 3028
  Type:   RAID                             ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 01 Lun: 01
  Vendor: HP       Model: HSV100           Rev: 3028
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 01 Lun: 02
  Vendor: HP       Model: HSV100           Rev: 3028
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 01 Lun: 03
  Vendor: HP       Model: HSV100           Rev: 3028
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 00
  Vendor: HP       Model: HSV100           Rev: 3028
  Type:   RAID                             ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 01
  Vendor: HP       Model: HSV100           Rev: 3028
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 02
  Vendor: HP       Model: HSV100           Rev: 3028
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 03
  Vendor: HP       Model: HSV100           Rev: 3028
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 01 Lun: 00
  Vendor: HP       Model: HSV100           Rev: 3028
  Type:   RAID                             ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 01 Lun: 01
  Vendor: HP       Model: HSV100           Rev: 3028
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 01 Lun: 02
  Vendor: HP       Model: HSV100           Rev: 3028
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 01 Lun: 03
  Vendor: HP       Model: HSV100           Rev: 3028
  Type:   Direct-Access                    ANSI SCSI revision: 02

-----

[root@lu3cduddb1 ~]# cdrecord -scanbus
Cdrecord-Clone 2.01-dvd (i686-pc-linux-gnu) Copyright (C) 1995-2004 Jörg Schilling
Note: This version is an unofficial (modified) version with DVD support
Note: and therefore may have bugs that are not present in the original.
Note: Please send bug reports or support requests to http://bugzilla.redhat.com/bugzilla
Note: The author of cdrecord should not be bothered with problems in this version.
scsidev: 'ATA'
devname: 'ATA'
scsibus: -2 target: -2 lun: -2
Linux sg driver version: 3.5.27
Using libscg version 'schily-0.8'.
cdrecord: Warning: using inofficial libscg transport code version (schily - Red Hat-scsi-linux-sg.c-1.83-RH '@(#)scsi-linux-sg.c      1.83 04/05/20 Copyright 1997 J. Schilling').
scsibus0:
        0,0,0     0) 'TEAC    ' 'DV-28E-C        ' 'B.4F' Removable CD-ROM
        0,1,0     1) *
        0,2,0     2) *
        0,3,0     3) *
        0,4,0     4) *
        0,5,0     5) *
        0,6,0     6) *
        0,7,0     7) *
Kong

ASKER

Not sure if this output helps, but there shouldn't be any [failed] lines:

[root@lu3cduddb1 ~]# multipath -l
mpath2 (3600508b4001026000000600000420000)
[size=150 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [active]
 \_ 0:0:0:3 sdc 8:32  [active][ready]
\_ round-robin 0 [enabled]
 \_ 0:0:1:3 sdf 8:80  [failed][faulty]
\_ round-robin 0 [enabled]
 \_ 1:0:0:3 sdi 8:128 [active][ready]
\_ round-robin 0 [enabled]
 \_ 1:0:1:3 sdl 8:176 [failed][faulty]

mpath1 (3600508b40010260000006000003b0000)
[size=20 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [enabled]
 \_ 0:0:0:2 sdb 8:16  [failed][faulty]
\_ round-robin 0 [active]
 \_ 0:0:1:2 sde 8:64  [active][ready]
\_ round-robin 0 [enabled]
 \_ 1:0:0:2 sdh 8:112 [failed][faulty]
\_ round-robin 0 [enabled]
 \_ 1:0:1:2 sdk 8:160 [active][ready]

mpath0 (3600508b4001026000000600000340000)
[size=1 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [active]
 \_ 0:0:0:1 sda 8:0   [active][ready]
\_ round-robin 0 [enabled]
 \_ 0:0:1:1 sdd 8:48  [failed][faulty]
\_ round-robin 0 [enabled]
 \_ 1:0:0:1 sdg 8:96  [active][ready]
\_ round-robin 0 [enabled]
 \_ 1:0:1:1 sdj 8:144 [failed][faulty]
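
The failed paths can be pulled out of this listing mechanically. The following is a rough sketch in Python, assuming the `multipath -l` layout is exactly the one pasted above (other multipath-tools versions may print a different layout). The function name `failed_paths` is hypothetical, and the sample is just the mpath0 section from the output.

```python
import re

# A map header looks like:  mpath0 (3600508b4001026000000600000340000)
MAP_LINE = re.compile(r"^(\S+) \(([0-9a-f]+)\)")
# A path line looks like:    \_ 0:0:1:1 sdd 8:48  [failed][faulty]
PATH_LINE = re.compile(
    r"\\_ (\d+:\d+:\d+:\d+) (\w+)\s+\d+:\d+\s+\[(\w+)\]\[(\w+)\]"
)


def failed_paths(text):
    """Return {map name: [device names whose first state flag is 'failed']}."""
    result = {}
    current = None
    for line in text.splitlines():
        header = MAP_LINE.match(line)
        if header:
            current = header.group(1)
            result[current] = []
            continue
        path = PATH_LINE.search(line)
        if path and current is not None and path.group(3) == "failed":
            result[current].append(path.group(2))
    return result


# Raw string so the backslashes in the listing are kept literally.
sample = r"""mpath0 (3600508b4001026000000600000340000)
[size=1 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [active]
 \_ 0:0:0:1 sda 8:0   [active][ready]
\_ round-robin 0 [enabled]
 \_ 0:0:1:1 sdd 8:48  [failed][faulty]
\_ round-robin 0 [enabled]
 \_ 1:0:0:1 sdg 8:96  [active][ready]
\_ round-robin 0 [enabled]
 \_ 1:0:1:1 sdj 8:144 [failed][faulty]
"""

print(failed_paths(sample))
```

For the full listing above, the failed devices are sdb, sdd, sdf, sdh, sdj and sdl, which covers every device named in the "Device not ready" lines quoted from /var/log/messages.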
ASKER CERTIFIED SOLUTION
pjedmond
Kong

ASKER

No. An engineer came around, and apparently the Fibre Channel setup was configured incorrectly: the HP EVA (3000?) SAN doesn't support dual active channels. It's now set to Active-Passive; however, the display still shows:

[root@lu3cduddb2 ~]# multipath -l
mpath2 (3600508b4001026000000600000420000)
[size=150 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [active]
 \_ 0:0:0:3 sdc 8:32  [active][ready]
\_ round-robin 0 [enabled]
 \_ 0:0:1:3 sdf 8:80  [failed][faulty]
\_ round-robin 0 [enabled]
 \_ 1:0:0:3 sdi 8:128 [active][ready]
\_ round-robin 0 [enabled]
 \_ 1:0:1:3 sdl 8:176 [failed][faulty]

mpath1 (3600508b40010260000006000003b0000)
[size=20 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [enabled]
 \_ 0:0:0:2 sdb 8:16  [failed][faulty]
\_ round-robin 0 [active]
 \_ 0:0:1:2 sde 8:64  [active][ready]
\_ round-robin 0 [enabled]
 \_ 1:0:0:2 sdh 8:112 [failed][faulty]
\_ round-robin 0 [enabled]
 \_ 1:0:1:2 sdk 8:160 [active][ready]

mpath0 (3600508b4001026000000600000340000)
[size=1 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [active]
 \_ 0:0:0:1 sda 8:0   [active][ready]
\_ round-robin 0 [enabled]
 \_ 0:0:1:1 sdd 8:48  [failed][faulty]
\_ round-robin 0 [enabled]
 \_ 1:0:0:1 sdg 8:96  [active][ready]
\_ round-robin 0 [enabled]
 \_ 1:0:1:1 sdj 8:144 [failed][faulty]

The engineer thinks the driver is faulty because it shouldn't show the [failed][faulty] paths, but says the setup is working as expected...

We're still getting spammed with "Device not ready" messages in /var/log/messages, but at least the two nodes can see the SAN now and I can create my database - for how long, I don't know... It looks a bit suss to me, but I'm not a sys admin...
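
One way to sanity-check the Active-Passive explanation is to group the paths by SCSI target and see whether the failures follow a pattern. The sketch below is only that, a sketch: the path data is copied by hand from the `multipath -l` listing, the helper name `states_by_target` is made up, and it assumes each target id (the third number in host:channel:id:lun) corresponds to one controller port on the SAN, which I have not verified.

```python
from collections import defaultdict

# (map, host:channel:id:lun, device, state), copied from the multipath -l output.
PATHS = [
    ("mpath2", "0:0:0:3", "sdc", "active"),
    ("mpath2", "0:0:1:3", "sdf", "failed"),
    ("mpath2", "1:0:0:3", "sdi", "active"),
    ("mpath2", "1:0:1:3", "sdl", "failed"),
    ("mpath1", "0:0:0:2", "sdb", "failed"),
    ("mpath1", "0:0:1:2", "sde", "active"),
    ("mpath1", "1:0:0:2", "sdh", "failed"),
    ("mpath1", "1:0:1:2", "sdk", "active"),
    ("mpath0", "0:0:0:1", "sda", "active"),
    ("mpath0", "0:0:1:1", "sdd", "failed"),
    ("mpath0", "1:0:0:1", "sdg", "active"),
    ("mpath0", "1:0:1:1", "sdj", "failed"),
]


def states_by_target(paths):
    """Map (multipath map, SCSI target id) -> set of path states seen there."""
    grouped = defaultdict(set)
    for mpath, hctl, _device, state in paths:
        target_id = hctl.split(":")[2]  # third field of host:channel:id:lun
        grouped[(mpath, target_id)].add(state)
    return dict(grouped)


for (mpath, target_id), states in sorted(states_by_target(PATHS).items()):
    print(mpath, "target", target_id, sorted(states))
```

Every (map, target) pair ends up with a single state: for each LUN, both HBAs see one target as active and the other as failed. That would be consistent with each LUN being served through one controller at a time, though it doesn't say whether the [failed][faulty] display or the log messages are expected behaviour.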