assistunix asked:

AIX: "some mpio disks are in state failed and could not be enabled" error

Hello,
I got the following error on my LPAR:

some mpio disks are in state failed and could not be enabled

Can someone provide some information on it: what it means, what could cause it, and what could be done to troubleshoot it?
SOLUTION
woolmilkporc
This solution is only available to members.

ASKER

Hello again, and thank you for the quick reply (as always).

The issue was a temporary one. I got a "path failed" error in the error log on the LPAR:
some mpio disks are in state failed and could not be enabled

Then I ran lspath on the LPAR and everything was Enabled.

Then I went to the VIO server.

The issue was with the VIO server. I checked the error log there and found this error:
 "vhostx Virtual SCSI Host Adapter detected an error"

I worked with IBM, who, after their analysis, recommended increasing the memory because the VIO server appeared to have too little.

So the temporary VIO error that caused the failed-path alert is resolved, and hopefully adding more memory will prevent it from happening again.

HOWEVER, I am keen to learn:

Suppose this had not been a temporary error, and the paths on the LPAR had not switched back to Enabled as they did but were in fact shown as Failed in the lspath output, as the earlier error stated.
Would I then do the following?

"If so, and if "vhostx" is indeed the host adapter responsible for the client adapter "vscsix" at the LPAR, you'll have to issue (again as padmin on the concerned VIOS):"

Do you mean the vhostx found in the VIO error log entry "vhostx Virtual SCSI Host Adapter detected an error"?
And do you mean the "vscsix" shown by lspath on the LPAR for the disks?
    And "vscsix" indicates which VIO that disk path comes from, right?

How can I work out which "vscsix" goes to which "vhostx"?

rmdev -dev vhostx -ucfg -recursive   (would this remove the vhost and "clear the cache" or something?)
cfgdev -dev vhostx                   (would this add the vhost back for the disk to that LPAR?)

"If there are several VIO servers providing the disks for the LPAR you can leave it running. If there is only one VIOS, shutdown the partition beforehand!"

(So if only one VIO serves the disks on the LPAR, shut the LPAR down BEFORE running rmdev and cfgdev on the VIO?)

"-ucfg is very important, else the complete vhost config will be lost!" (What do these flags mean?)

"Now start the partition if you stopped it before, or issue "cfgmgr" there and enable the failing paths if you left it running." (How do I enable the failing paths? Would cfgmgr enable the paths itself?)
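
On the question of enabling a failing path by hand: AIX has a chpath command that changes the state of a single path (chpath -l <disk> -p <parent adapter> -s enable). The sketch below is only an illustration, not something taken from the hidden answer. It reads lspath-style lines and prints the chpath command for each path reported as Failed, without running anything. The function name enable_cmds and the two input lines are made up for the example, and whether a path also recovers on its own depends on the disk's health-check settings, so treat this as a starting point.

```shell
# Turn lspath-style lines ("<status> <disk> <parent>") into chpath commands
# for every path reported as Failed. The commands are only printed (dry run),
# so nothing is changed on the system.
enable_cmds() {
    awk '$1 == "Failed" { printf "chpath -l %s -p %s -s enable\n", $2, $3 }'
}

# Illustrative input only; on the LPAR this would be:  lspath | enable_cmds
printf 'Enabled hdisk3 vscsi0\nFailed  hdisk3 vscsi1\n' | enable_cmds
```

Review the printed commands before running them as root on the LPAR, then check the result with lspath again.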


THANK YOU!!!
SOLUTION
thank you
Hello wmp.

root # lspath
Enabled hdisk0 vscsi0
Enabled hdisk1 vscsi1
Enabled hdisk2 vscsi6
Enabled hdisk3 vscsi0
Failed  hdisk3 vscsi1


hdisk3 only has two user-created file systems, and both are accessible for read and write, but one of its paths has failed. Please help me enable the path.
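
For longer lspath listings it can help to filter out everything that is Enabled. A minimal sketch, assuming the three-column "<status> <disk> <parent>" layout shown above; the function name failed_paths is made up for the example, and the listing is pasted in as sample data so the snippet runs anywhere:

```shell
# Print every path that is not in the Enabled state as "<disk> <parent> <status>".
# Expects lspath-style lines on stdin: "<status> <disk> <parent>".
failed_paths() {
    awk 'NF >= 3 && $1 != "Enabled" { print $2, $3, $1 }'
}

# On the LPAR this would be run as:  lspath | failed_paths
# Here the listing from this post is fed in instead.
sample_lspath='Enabled hdisk0 vscsi0
Enabled hdisk1 vscsi1
Enabled hdisk2 vscsi6
Enabled hdisk3 vscsi0
Failed  hdisk3 vscsi1'

printf '%s\n' "$sample_lspath" | failed_paths
```

To look at a single disk only, lspath -l hdisk3 limits the listing to that device.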

The LPAR is attached to two VIO servers.
I checked the errlog on both:
On VIO1 there is no new "Virtual SCSI Host Adapter detected an error" entry. There is one from about a week ago, but the failed path appeared today.
On VIO2 there is no "Virtual SCSI Host Adapter detected an error" entry at all.

I was able to find the vhost information for the LPAR on the VIO servers using the client partition ID.

VIO1:

$ lsmap -all |grep ^vhost  | grep 007
vhost0          U9119.FHA.0292674-V5-C75                     0x00000007
vhost5          U9119.FHA.0292674-V5-C71                     0x00000007
vhost6          U9119.FHA.0292674-V5-C72                     0x00000007
vhost7          U9119.FHA.0292674-V5-C73                     0x00000007
$

VIO2:

$ lsmap -all |grep ^vhost  | grep 007
vhost0          U9119.FHA.0292674-V6-C75                     0x00000007
vhost5          U9119.FHA.0292674-V6-C71                     0x00000007
vhost6          U9119.FHA.0292674-V6-C72                     0x00000007
vhost7          U9119.FHA.0292674-V6-C73                     0x00000007
$

However, how do I match those vhosts to the vscsi adapters on the LPAR?

How can I determine which vhost is attached to the vscsi1 that has failed, and which VIO has the failed vscsi path?

I believe it is generally assumed in AIX that vscsi0 comes from VIO1 and vscsi1 comes from VIO2, although that can vary between environments depending on how the configuration was done.
Can you tell me how I can verify, on my system, which VIO vscsi1 is coming from?
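
One thing that helps with the matching is reading the location codes in the lsmap output: in U9119.FHA.0292674-V5-C71 the -V5 part is the partition ID of the VIOS itself (V5 on VIO1, V6 on VIO2) and -C71 is the virtual slot of the server adapter, while the last column (0x00000007) is the client partition ID. The sketch below just splits those fields apart. The function name parse_lsmap is made up, and it assumes the three-column summary lines shown above. It does not tell you which client slot is paired with which server slot; that pairing comes from the partition profile on the HMC. On the LPAR, lscfg -l vscsi1 shows the client adapter's own location code, and echo "cvai" | kdb is often suggested for listing each vscsi adapter with its VIOS and vhost, though it is worth verifying that on your AIX level.

```shell
# Split lsmap summary lines ("<vhost> <physloc> <client partition id>") into
# the adapter name, the VIOS partition ID (-V), the server slot (-C) and the
# client partition ID converted from hex to decimal.
parse_lsmap() {
    while read -r vhost physloc client; do
        # Skip anything that is not a vhost summary line.
        case "$vhost" in vhost*) ;; *) continue ;; esac
        vios_id=${physloc##*-V}; vios_id=${vios_id%%-*}
        slot=${physloc##*-C};    slot=${slot%%-*}
        printf '%s vios_lpar=%s server_slot=%s client_lpar=%d\n' \
            "$vhost" "$vios_id" "$slot" "$client"
    done
}

# On the VIOS (as padmin) the input would come from:  lsmap -all | grep ^vhost
# Here two lines from the listings above are used as sample data.
printf '%s\n' \
    'vhost5          U9119.FHA.0292674-V5-C71                     0x00000007' \
    'vhost5          U9119.FHA.0292674-V6-C71                     0x00000007' \
    | parse_lsmap
```

Matching on the third column this way is also a bit stricter than grep 007, which would hit any line that happens to contain those digits.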
It turns out hdisk3 was not mapped on VIO2, so I mapped it again with the mkvdev command and then enabled the path with smitty mpiopath_enable_all, and all was well.
ASKER CERTIFIED SOLUTION
Yes, I can tell by the amount of knowledge you have. On behalf of myself and others like me on this site, we are really grateful to have a genius like you addicted to AIX. :) Thank you.
Thank you for the compliments!

Do you need further assistance in solving this issue?

wmp
Yes, I am having an issue running lsmap. Let me provide the output.
lspath*