Solved

fcs5 ADAPTER ERROR in AIX 5.3

Posted on 2013-01-06
Last Modified: 2013-01-16
Hi All,

I am getting the following errors for the fcs5 adapter, which is connected to the tape drives on my AIX 5.3 TSM server. The errors are becoming more frequent. How can I check whether the adapter is bad? I have verified the zoning on the switch side and don't see any issues there, but there are some loss-of-sync incidents on the ports. How should I troubleshoot whether this is a hardware error or a problem on the switch side?

errpt shows the following error:

---------------------------------------------------------------------------
LABEL:          FSCSI_ERR6
IDENTIFIER:     B8FBD189

Date/Time:       Sun Jan  3 05:20:44 CST 2013
Sequence Number: 1337815
Machine Id:      0112179A4C40
Node Id:         servername
Class:           S
Type:            TEMP
Resource Name:   fscsi5

Description
SOFTWARE PROGRAM ERROR

Probable Causes
ADAPTER MICROCODE
SOFTWARE PROGRAM
SOFTWARE DEVICE DRIVER

Failure Causes
ADAPTER MICROCODE
SOFTWARE PROGRAM
SOFTWARE DEVICE DRIVER

        Recommended Actions
        IF PROBLEM PERSISTS THEN DO THE FOLLOWING
        CONTACT APPROPRIATE SERVICE REPRESENTATIVE

Detail Data
SENSE DATA
0000 0000 0000 00A1 0000 0006 0200 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0062 0413 0000 0000
0062 0D13 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 C02F 0000 1612 0002 0000 0000 0000 0000 0001 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001 5005 0763
0240 6024 5005 0763 0200 6024 0400 0000 0000 0000 0000 0000 0000 0000 0000 0000
0FF6 B000
Question by:virgo0880
 
LVL 3

Expert Comment

by:gronblom
ID: 38749873
The error does not point to any external problem. It is reported as a software error in the firmware (Licensed Internal Code). The error code indicates that a call is being made to the controller that the controller does not understand. This is likely due to down-level firmware on the controller card.

You need to update the firmware on the card or the driver in the O/S (or both). Start with the card firmware. If that doesn't help, update the driver in case it is sending bad requests, and if you still have the issue, the only other source would be the firmware on the switch. You can do the upgrades in whatever order you see fit, but the order I presented should give you the best chance of success with the fewest upgrades.

This does not present as, nor does it look like, a hardware error. There is a small chance that the switch is sending bad packets; you can check this by zoning another port and moving the HBA connection to the new port.

Ernie
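To judge whether the error really is becoming more frequent, the one-line-per-entry errpt summary can be tallied per day. This is a minimal sketch, assuming the usual summary layout (IDENTIFIER, TIMESTAMP in MMDDhhmmYY form, T, C, RESOURCE_NAME, DESCRIPTION); errpt_per_day is a hypothetical helper name, not an AIX command.

```shell
# Count errpt summary entries per day for one resource.
# Reads `errpt` summary output on stdin and prints "YY-MM-DD count".
# errpt_per_day is a hypothetical helper name, not part of AIX.
errpt_per_day() {
    res="$1"
    awk -v res="$res" '
        $5 == res {
            ts = $2    # assumed format: MMDDhhmmYY
            day = substr(ts, 9, 2) "-" substr(ts, 1, 2) "-" substr(ts, 3, 2)
            n[day]++
        }
        END { for (d in n) print d, n[d] }
    ' | sort
}

# Assumed usage on the AIX host:
#   errpt -N fscsi5 | errpt_per_day fscsi5
```

A rising daily count on fscsi5 alone, with the other fscsi resources quiet, would suggest the problem sits on that one path rather than in the driver as a whole.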
 
LVL 20

Expert Comment

by:carlmd
ID: 38750377
I suggest you upgrade to the last AIX 5.3 version, which is 5300-12-05-1140, along with the firmware on the device.
 

Author Comment

by:virgo0880
ID: 38763899
I am already at that version. We have opened a hardware case with IBM and they will be replacing the HBA card. This card is connected to the tape drives as one of the paths. Before the card is changed, what is the procedure for unconfiguring and removing it without disturbing anything else? Can somebody share that information?

Thanks
 
LVL 3

Expert Comment

by:gronblom
ID: 38763955
Unless you are going to be removing the card hot (without taking the system down), there isn't any unconfiguring on the host that must be done.

If you are doing it hot, you will need to use the 'hot-plug' task in the 'diag' utility (Task selection) which will step you through powering down the slot and removal/replacement of the card.  The IBM FE should be well versed on this procedure.

When you bring the server back up it will use the same device files as the old card. The WWN of the card will change, so if the zoning is WWN-based you will need to rezone the switch, and you need to have the new card in hand to know what the new WWN is.

After the repair, you can go into the 'diag' utility and select "Log Repair Action" from the Task Selection menu to prevent the errpt entry from triggering further diagnostics on the new controller.
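For the rezoning step, the replacement card's WWPN can be read on the host once the adapter is configured. This is a sketch, assuming the usual `lscfg -vl` VPD layout in which the WWPN appears on the "Network Address" line padded with dots; wwpn_from_lscfg is a hypothetical helper name.

```shell
# Extract the WWPN from `lscfg -vl fcsN` output read on stdin.
# Assumes the VPD line looks like "Network Address.....<16 hex digits>".
# wwpn_from_lscfg is a hypothetical helper name, not part of AIX.
wwpn_from_lscfg() {
    awk -F. '/Network Address/ { print $NF }'
}

# Assumed usage on the AIX host:
#   lscfg -vl fcs5 | wwpn_from_lscfg
```

The value printed is what the switch zone would need in place of the old card's WWPN.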
 

Author Comment

by:virgo0880
ID: 38764472
When I try to unconfigure that device, it throws the following errors:

Command: failed        stdout: yes           stderr: no

Before command completion, additional instructions may appear below.

Method error (/etc/methods/ucfgAtape):
        0514-053 Error returned from sys_config.
Unable to unconfigure device: Device busy

fcnet4 deleted
rmt21 deleted
rmt22 deleted
rmt23 deleted
rmt26 deleted

How can I free up that device so that we can run diagnostics to check whether the card is bad? I also tried rmdev but got a device busy error.

 
LVL 3

Accepted Solution

by:gronblom (earned 500 total points)
ID: 38764528
There is a process that has the device open or the tape drive is just not responding.  Do you have backup software that may be holding the device open?

A tape device on the bus sending malformed packets could be the initial cause of the error message and the HBA may be a victim of the drive failure.  Perhaps you should pull the tape drive from the bus, rescan the bus for the original devices (cfgmgr), and see if you still get the errors.
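One way to look for an open handle is to run fuser against the tape special files. This is a rough sketch; who_holds is a hypothetical helper name, and on a TSM server the holder is most likely the TSM server process (an assumption, not confirmed in this thread).

```shell
# Report which processes hold the given tape devices open.
# who_holds is a hypothetical helper name; fuser -u lists PIDs with login names.
who_holds() {
    for dev in "$@"; do
        if [ -e "/dev/$dev" ]; then
            fuser -u "/dev/$dev"
        else
            echo "$dev: no such device file"
        fi
    done
}

# Assumed usage on the AIX host, checking every tape device:
#   who_holds $(lsdev -Cc tape -F name)
```

If a process shows up, stopping it (or taking the drive and its paths offline in the backup software) before retrying the unconfigure should clear the "Device busy" condition.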
 

Author Comment

by:virgo0880
ID: 38764541
The tape drive is working fine: my backups are reading and writing properly on all the tape drives. I think the issue is that the HBA I am trying to unconfigure is tied to a tape path, and that's why I am not able to unconfigure it. Is there a way to see which paths use this HBA and remove them, so that the HBA is freed up?
 
LVL 3

Expert Comment

by:gronblom
ID: 38765637
You shouldn't need the path because you can simply reference the device and have the O/S determine what is being removed. Try these commands:
lsdev           (this will show the device paths as well)
rmdev -l rmt123    (this is a lowercase L; substitute the device listed in lsdev for rmt123)

Please note that the message "rmt123 defined" is a successful message.  This means that the device is still in the Customized Devices definition database so that you can add it back later without reinstalling the drivers.
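To see which rmt devices hang off one adapter before removing anything, the `lsdev -Cc tape` listing can be filtered by location code and turned into a dry-run list of rmdev commands. This is a sketch only: rmdev_dry_run is a hypothetical helper name, and the 2M-08 prefix in the usage line is an assumed location for fcs5 that should be confirmed on the host.

```shell
# Print (do not run) an rmdev command for every tape device whose location
# code starts with the given prefix. Reads `lsdev -Cc tape` output on stdin.
# rmdev_dry_run is a hypothetical helper name, not part of AIX.
rmdev_dry_run() {
    loc="$1"
    awk -v loc="$loc" 'index($3, loc) == 1 { print "rmdev -l " $1 }'
}

# Assumed usage on the AIX host. 2M-08 is a guessed location prefix for the
# adapter; check the real one first with: lsdev -Cl fcs5
#   lsdev -Cc tape | rmdev_dry_run 2M-08
```

Printing the commands first makes it easy to review exactly which paths will go to Defined. As far as I know, `rmdev -R -l fscsi5` does the same for all children of the adapter in one step.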
 

Author Comment

by:virgo0880
ID: 38767968
I tried rmdev and cfgmgr, but the drive shows only 2 paths instead of 4. In the output below you can see that the paths rmt24 and rmt34 are missing. I tried taking the drive offline in TSM, running rmdev on all four paths, and running cfgmgr again, but that did not work. This has worked several times before, but not this time. The only other thing I did was reboot the system.

Output of the lsdev -Cc tape command:

rmt1  Available 1n-08-02     IBM 3592 Tape Drive (FCP)
rmt2  Available 1n-08-02-PRI IBM 3592 Tape Drive (FCP)
rmt3  Available 1n-08-02     IBM 3592 Tape Drive (FCP)
rmt4  Available 1A-08-02     IBM 3592 Tape Drive (FCP)
rmt5  Available 1n-08-02-PRI IBM 3592 Tape Drive (FCP)
rmt6  Available 1n-08-02     IBM 3592 Tape Drive (FCP)
rmt7  Available 1n-08-02     IBM 3592 Tape Drive (FCP)
rmt8  Available 1n-08-02     IBM 3592 Tape Drive (FCP)
rmt9  Available 1n-08-02     IBM 3592 Tape Drive (FCP)
rmt10 Available 1n-08-02     IBM 3592 Tape Drive (FCP)
rmt11 Available 1A-08-02     IBM 3592 Tape Drive (FCP)
rmt12 Available 1A-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt13 Available 1A-08-02     IBM 3592 Tape Drive (FCP)
rmt14 Available 2U-08-02     IBM 3592 Tape Drive (FCP)
rmt15 Available 1A-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt16 Available 1A-08-02     IBM 3592 Tape Drive (FCP)
rmt17 Available 1A-08-02     IBM 3592 Tape Drive (FCP)
rmt18 Available 1A-08-02     IBM 3592 Tape Drive (FCP)
rmt19 Available 1A-08-02     IBM 3592 Tape Drive (FCP)
rmt20 Available 1A-08-02     IBM 3592 Tape Drive (FCP)
rmt21 Available 2M-08-02     IBM 3592 Tape Drive (FCP)
rmt22 Available 2M-08-02     IBM 3592 Tape Drive (FCP)
rmt23 Available 2M-08-02     IBM 3592 Tape Drive (FCP)
rmt25 Available 2M-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt26 Available 2M-08-02     IBM 3592 Tape Drive (FCP)
rmt27 Available 2M-08-02     IBM 3592 Tape Drive (FCP)
rmt28 Available 2M-08-02     IBM 3592 Tape Drive (FCP)
rmt29 Available 2M-08-02     IBM 3592 Tape Drive (FCP)
rmt30 Available 2M-08-02     IBM 3592 Tape Drive (FCP)
rmt31 Available 2U-08-02     IBM 3592 Tape Drive (FCP)
rmt32 Available 2U-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt33 Available 2U-08-02     IBM 3592 Tape Drive (FCP)
rmt35 Available 2U-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt36 Available 2U-08-02     IBM 3592 Tape Drive (FCP)
rmt37 Available 2U-08-02     IBM 3592 Tape Drive (FCP)
rmt38 Available 2U-08-02     IBM 3592 Tape Drive (FCP)
rmt39 Available 2U-08-02     IBM 3592 Tape Drive (FCP)
rmt40 Available 2U-08-02     IBM 3592 Tape Drive (FCP)
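A quick way to spot which adapters have lost a path is to count the tape devices per location code. This is a minimal sketch, assuming the three-column layout shown in the listing above (name, state, location); paths_per_location is a hypothetical helper name.

```shell
# Count tape devices per location code, ignoring the -PRI/-ALT suffix.
# Reads `lsdev -Cc tape` output on stdin and prints "location count".
# paths_per_location is a hypothetical helper name, not part of AIX.
paths_per_location() {
    awk '{
            loc = $3
            sub(/-(PRI|ALT)$/, "", loc)   # fold PRI/ALT paths into the base location
            n[loc]++
         }
         END { for (l in n) print l, n[l] }' | sort
}

# Assumed usage on the AIX host:
#   lsdev -Cc tape | paths_per_location
```

Run against the listing above, this should report 9 devices on 1n-08-02 and 2M-08-02 and 10 on 1A-08-02 and 2U-08-02, which matches the two missing paths.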

 

Author Comment

by:virgo0880
ID: 38783251
The problem was a bad FC port on one of the tape drives. After the tape drive was replaced, the issue was resolved and the errors went away. All the paths now show OK for this drive.

Giving points

Thanks
virgo