Solved

Fixing dead paths (powerpath) on AIX 5.3 and AIX 5.2

Posted on 2013-01-21
Last Modified: 2013-01-23
Recently I was involved in a SAN migration. Our storage folks are replacing their old SAN switches with new switches. New cables were laid out from the new switch, and all I had to do was remove the old connection on fcs0 and replace it with the new connection from the new switch, then do the same on fcs1. So this is what happens:

1.      Physically disconnect OLD fibre from fcs0 and reconnect it with the new fibre
2.      Fix the dead paths
3.      Physically disconnect OLD fibre from fcs1 and reconnect it with the new fibre
4.      Fix the dead paths.

Since PowerPath is installed on these servers, I should be able to run the commands below to fix the dead paths in steps 2 and 4:
  # powermt check
  # powermt config

Now the question: I am able to run these commands and fix the dead paths on AIX 5.2, but on AIX 5.3 my "powermt check" hangs. This is not an isolated case; it happens on all the AIX 5.3 servers I am working on, and the only resolution is to reboot the servers for the paths to come alive.

Why is this happening on AIX 5.3? Is there any way to fix the dead paths other than rebooting?
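For readers following along, a minimal sketch of how the number of dead paths could be counted before and after each cable swap. It parses the text that `powermt display dev=all` prints; the helper name `count_dead_paths` is hypothetical, and the row layout (index, fscsiN, hdisk, interface, mode, state, counters) is an assumption that may differ between PowerPath versions.

```shell
#!/bin/sh
# count_dead_paths (hypothetical helper): reads `powermt display dev=all`
# output on stdin and prints how many path rows report the state "dead".
# Assumption: each path row has the HW path (fscsiN) in the second column
# and the path state as a separate word later on the same row.
count_dead_paths() {
    awk '
        $2 ~ /^fscsi[0-9]+$/ {
            for (i = 3; i <= NF; i++) {
                if ($i == "dead") { n++; break }
            }
        }
        END { print n + 0 }
    '
}

# Usage on a host with PowerPath installed (not executed here):
#   powermt display dev=all | count_dead_paths
```

A result of 0 after step 2 would suggest it is safe to move on to fcs1; anything else means paths are still dead.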
Question by:mnis2008
LVL 47

Expert Comment

by:dlethe
ID: 38801881
Do you have any processes running with open files/handles that use the other path?   If so, MAYBE a kill -9 will let you run the powermt  successfully.

But bottom line, you'll most likely have to reboot.  Reconfiguring FC and for that matter, any peripherals often requires a reboot.  This isn't an AIX thing, I've had to do this with every mainstream O/S you can probably think of (as I develop storage-centric configurators and diagnostics).

Bottom line, you got lucky not having to reboot 5.2.  Consider this just par for the course.  Also just double-check switches first and make sure they reboot switches after you reboot the AIX boxes before you sign off that the job has been properly done.

Author Comment

by:mnis2008
ID: 38801922
I don't have any running processes or open files on the application side or the OS side, as all the applications were properly shut down.

Also, according to my understanding, AIX should be resilient to SAN changes if the "dyntrk" attribute is turned on. It is turned on in my environment.

http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/TD105839
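One way to confirm the setting on each adapter is to read it from `lsattr`. The sketch below assumes the usual `lsattr -El` column layout (attribute, value, description, user-settable); the function name `dyntrk_enabled` is hypothetical.

```shell
#!/bin/sh
# dyntrk_enabled (hypothetical helper): reads `lsattr -El fscsiN` output on
# stdin and exits 0 only if the dyntrk attribute is present and set to "yes".
# Assumption: attribute name is column 1 and its value is column 2.
dyntrk_enabled() {
    awk '
        $1 == "dyntrk" { found = ($2 == "yes") }
        END { exit (found ? 0 : 1) }
    '
}

# Usage on an AIX host (not executed here):
#   for a in fscsi0 fscsi1; do
#       lsattr -El "$a" | dyntrk_enabled && echo "$a: dyntrk is on"
#   done
```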
LVL 47

Assisted Solution

by:dlethe
dlethe earned 500 total points
ID: 38801987
No, you probably just don't have any user-level apps that have files mounted on any of these devices open.

I'm talking about system-level code that might have /dev/rhdisk[n] open. The raw device handles are what kill you. Can you get away with going to single-user mode and not rebooting? That *MAY* do the trick.

But if you have any of these disks as part of your rootvg then you probably will have to reboot.
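To see whether any of the affected disks belong to rootvg, the output of `lspv` can be filtered by volume group. This is only a sketch: the helper name `disks_in_vg` is hypothetical, and it assumes the usual `lspv` layout with the disk name in column 1 and the volume group in column 3.

```shell
#!/bin/sh
# disks_in_vg (hypothetical helper): reads `lspv` output on stdin and prints
# the physical volumes that belong to the volume group named in $1.
# Assumption: columns are disk name, PVID, volume group, state.
disks_in_vg() {
    awk -v vg="$1" '$3 == vg { print $1 }'
}

# Usage on an AIX host (not executed here):
#   lspv | disks_in_vg rootvg
```

If the list contains SAN disks or PowerPath pseudo devices, the reboot described above is probably unavoidable.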

A less drastic workaround (I know, rebooting can be very painful at times) is to go to the switch and temporarily rezone the LUNs in question. If AIX can't see them, then typically any process that uses them moves on and releases them. Then you can satisfy your curiosity and look at the various system logs to see which services complain that the disks went away.

But don't even *THINK* about it if the HDDs in question are part of a mounted volume group. You risk data loss if you don't dismount them first.

Author Comment

by:mnis2008
ID: 38802445
Would this work? I was just trying to make sure I don't need a reboot :). If I have to, I have to.

Before the cable is disconnected, e.g. for fcs0 (same for fcs1):

            Remove the paths on the failing adapter from PowerPath:
            # sudo powermt remove hba=0

            Unconfigure the port and all attached devices:
            # sudo rmdev -Rdl fcs0

Disconnect the old cable and attach the new cable from the new SAN.

            Run the configuration manager; the fcs and fscsi devices should be Available (the disks will not yet be Available):
            # cfgmgr

            Run the EMC cfgmgr. It should log in to the storage array and configure all disks and redundant paths:
            # sudo emc_cfgmgr

            Update the PowerPath configuration:
            # powermt config
LVL 47

Accepted Solution

by:dlethe
dlethe earned 500 total points
ID: 38802527
I just can't tell you if that will or won't work.   But I will tell you that if any mounted disks are attached to hba0 then your system may crash, so make darned sure that nothing uses that port.

Rebooting IS best practice. If this were my data, I would just wait for a downtime window to reboot and be safe.
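If the removal is attempted anyway, one possible pre-check for "nothing uses that port" is to confirm that every PowerPath pseudo device still has an alive path through the other adapter. The sketch below parses `powermt display dev=all` output; the helper name `devices_without_other_path` is hypothetical, and the assumed layout ("Pseudo name=..." headers followed by path rows with fscsiN in column 2) may vary by PowerPath version. It is not a substitute for checking mounted volume groups.

```shell
#!/bin/sh
# devices_without_other_path (hypothetical helper): reads
# `powermt display dev=all` output on stdin and prints every pseudo device
# that has no alive path on any adapter other than the one named in $1.
# Assumptions: a device block starts with "Pseudo name=<dev>", and path rows
# carry the HW path (fscsiN) in column 2 and the word "alive" when usable.
devices_without_other_path() {
    awk -v hw="$1" '
        function report() { if (dev != "" && !ok) print dev }
        /^Pseudo name=/ {
            report()
            dev = $2
            sub(/^name=/, "", dev)
            ok = 0
            next
        }
        $2 ~ /^fscsi[0-9]+$/ && $2 != hw {
            for (i = 3; i <= NF; i++) if ($i == "alive") ok = 1
        }
        END { report() }
    '
}

# Usage on a host with PowerPath installed (not executed here):
#   powermt display dev=all | devices_without_other_path fscsi0
```

Empty output means each device has at least one alive path elsewhere; any device listed would lose all access once the named adapter is removed.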
LVL 30

Expert Comment

by:Duncan Meyers
ID: 38804081
Since you have new SAN switches, you'll need to reboot. AIX uses the FCID (Fibre Channel ID) of the storage port in the device descriptor (so does HP-UX), so if you change the switch port that the storage is plugged into, you change the device descriptor. Reboot the server and you'll see the PowerPath pseudo devices come back with new PowerPath IDs. You'll just need to fix up the mount points and you're done.