Solved

fcs5 ADAPTER ERROR in AIX 5.3

Posted on 2013-01-06
Last Modified: 2013-01-16
Hi All,

I am getting the following errors for the fcs5 adapter, which is connected to the tape drives on my AIX 5.3 TSM server. The errors are becoming more frequent. How can I check whether the adapter is bad? I have verified the zoning on the switch side and don't see any issues there, but there are some loss-of-sync incidents on the ports. How should I troubleshoot whether this is a hardware error or a problem on the switch side?

errpt shows the following error:

---------------------------------------------------------------------------
LABEL:          FSCSI_ERR6
IDENTIFIER:     B8FBD189

Date/Time:       Sun Jan  3 05:20:44 CST 2013
Sequence Number: 1337815
Machine Id:      0112179A4C40
Node Id:         servername
Class:           S
Type:            TEMP
Resource Name:   fscsi5

Description
SOFTWARE PROGRAM ERROR

Probable Causes
ADAPTER MICROCODE
SOFTWARE PROGRAM
SOFTWARE DEVICE DRIVER

Failure Causes
ADAPTER MICROCODE
SOFTWARE PROGRAM
SOFTWARE DEVICE DRIVER

        Recommended Actions
        IF PROBLEM PERSISTS THEN DO THE FOLLOWING
        CONTACT APPROPRIATE SERVICE REPRESENTATIVE

Detail Data
SENSE DATA
0000 0000 0000 00A1 0000 0006 0200 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0062 0413 0000 0000
0062 0D13 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 C02F 0000 1612 0002 0000 0000 0000 0000 0001 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001 5005 0763
0240 6024 5005 0763 0200 6024 0400 0000 0000 0000 0000 0000 0000 0000 0000 0000
0FF6 B000
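
To judge how often an entry like this recurs, one option is to tally the one-line errpt summary per resource. The sketch below assumes the usual summary layout (identifier, timestamp, type, class, resource name, description); `count_by_resource` is a hypothetical helper name, not an AIX command.

```shell
# Hypothetical helper: count errpt summary lines per resource name.
# Assumes summary columns: IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
count_by_resource() {
  awk '$1 != "IDENTIFIER" && NF >= 5 { c[$5]++ }
       END { for (r in c) print r, c[r] }' | sort
}

# Usage on the AIX host (B8FBD189 is the identifier from the entry above):
#   errpt -j B8FBD189 | count_by_resource
```

A count that keeps growing for fscsi5 alone would suggest the problem follows that one path rather than the whole fabric.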
Question by:virgo0880
10 Comments
 
LVL 3

Expert Comment

by:gronblom
ID: 38749873
The error does not point to an external problem. It is reported as a software error in the firmware (Licensed Internal Code). The error code indicates that a call is being made to the controller that the controller does not understand. This is likely due to down-level firmware on the controller card.

You need to update the firmware on the card or the driver in the O/S (or both). If the firmware update doesn't help, update the driver in case it is sending bad commands; if you still have the issue, the only other likely source is the firmware on the switch. You can do the upgrades in whatever order you see fit, but the order I presented should give you the best chance of success with the fewest upgrades.

This does not present as, nor does it look like, a hardware error. There is a small chance that the switch is sending bad packets; you can check this by zoning another port and moving the HBA connection to the new port.

Ernie
 
LVL 20

Expert Comment

by:carlmd
ID: 38750377
I suggest you upgrade to the last AIX 5.3 version, which is 5300-12-05-1140, along with the firmware on the device.
 

Author Comment

by:virgo0880
ID: 38763899
I am already at that version. We have opened a hardware case with IBM and they will be replacing the HBA card. This card is one of the paths to the tape drives. Before changing the card, what is the procedure for unconfiguring and removing it without disturbing other components? Can somebody share that information?

Thanks
 
LVL 3

Expert Comment

by:gronblom
ID: 38763955
Unless you are going to be removing the card hot (without taking the system down), there isn't any unconfiguring on the host that must be done.

If you are doing it hot, you will need to use the 'hot-plug' task in the 'diag' utility (Task selection) which will step you through powering down the slot and removal/replacement of the card.  The IBM FE should be well versed on this procedure.

When you bring the server back up it will use the same device files as the old card.  You will need to rezone the switch because the WWN of the card will be changing, but you need to have the new card to know what the new WWN is to do the rezoning.

After the repair, you can go into the 'diag' utility and select "Log Repair Action" from the Task Selection menu to prevent the errpt entry from triggering further diagnostics on the new controller.
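
For the rezoning step, the new card's WWN can usually be read from the adapter's VPD once the card is configured. The sketch below assumes `lscfg -vl` prints a dotted "Network Address" field for the adapter; `wwpn_from_lscfg` is a hypothetical helper name.

```shell
# Hypothetical helper: extract the WWN from `lscfg -vl <adapter>` output on stdin.
# Assumes a line of the form "Network Address.............<16 hex digits>".
wwpn_from_lscfg() {
  sed -n 's/^.*Network Address\.*//p'
}

# Usage on the AIX host after the new card is in place:
#   lscfg -vl fcs5 | wwpn_from_lscfg
```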
 

Author Comment

by:virgo0880
ID: 38764472
When I try to unconfigure the device, it throws the following errors:

Command: failed        stdout: yes           stderr: no

Before command completion, additional instructions may appear below.

Method error (/etc/methods/ucfgAtape):
        0514-053 Error returned from sys_config.
Unable to unconfigure device: Device busy

fcnet4 deleted
rmt21 deleted
rmt22 deleted
rmt23 deleted
rmt26 deleted

How can I free up the device so that we can run diagnostics to check whether the card is bad? I also tried rmdev but got a device busy error.
 
LVL 3

Accepted Solution

by:gronblom (earned 2000 total points)
ID: 38764528
There is a process that has the device open or the tape drive is just not responding.  Do you have backup software that may be holding the device open?

A tape device on the bus sending malformed packets could be the initial cause of the error message and the HBA may be a victim of the drive failure.  Perhaps you should pull the tape drive from the bus, rescan the bus for the original devices (cfgmgr), and see if you still get the errors.
 

Author Comment

by:virgo0880
ID: 38764541
The tape drive is working fine; my backups are reading from and writing to all the tape drives properly. I think the issue is that the HBA I am trying to unconfigure is tied to a tape path, and that's why I cannot unconfigure it. Is there a way to see which paths use this HBA and remove them so that the HBA is freed up?
 
LVL 3

Expert Comment

by:gronblom
ID: 38765637
You shouldn't need the path, because you can simply reference the device and let the O/S determine what is being removed. Try these commands:
lsdev           (this will show the device paths as well)
rmdev -l rmt123    (this is lowercase L - substitute the rmt123 for the device listed in lsdev)

Please note that the message "rmt123 defined" indicates success. It means that the device is still in the Customized Devices definition database, so you can add it back later without reinstalling the drivers.
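
When several tape paths hang off one adapter, it can help to generate the rmdev commands from a child listing and review them before running anything. This is a minimal sketch: `plan_rmdev` is a hypothetical helper name, and using `lsdev -C -p fscsi5` to list the children assumes fscsi5 is the parent of the affected rmt devices.

```shell
# Hypothetical helper: turn a device listing on stdin into rmdev commands.
# It only prints the commands so they can be reviewed before running.
plan_rmdev() {
  awk '$1 ~ /^rmt[0-9]+$/ { print "rmdev -l " $1 }'
}

# Usage on the AIX host, assuming fscsi5 is the parent of the tape paths:
#   lsdev -C -p fscsi5 | plan_rmdev
```

Non-tape children in the listing are skipped, so only rmt devices end up in the plan.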
 

Author Comment

by:virgo0880
ID: 38767968
I tried rmdev and cfgmgr, but the drive shows only 2 paths instead of 4. As you can see in the output, the paths rmt24 and rmt34 are missing. I tried taking the drive offline in TSM, running rmdev on all four paths, and running cfgmgr again, but that did not work. This has worked several times before, but not this time. I just rebooted the system and did nothing else.

Output of the lsdev -Cc tape command:

rmt1  Available 1n-08-02     IBM 3592 Tape Drive (FCP)
rmt2  Available 1n-08-02-PRI IBM 3592 Tape Drive (FCP)
rmt3  Available 1n-08-02     IBM 3592 Tape Drive (FCP)
rmt4  Available 1A-08-02     IBM 3592 Tape Drive (FCP)
rmt5  Available 1n-08-02-PRI IBM 3592 Tape Drive (FCP)
rmt6  Available 1n-08-02     IBM 3592 Tape Drive (FCP)
rmt7  Available 1n-08-02     IBM 3592 Tape Drive (FCP)
rmt8  Available 1n-08-02     IBM 3592 Tape Drive (FCP)
rmt9  Available 1n-08-02     IBM 3592 Tape Drive (FCP)
rmt10 Available 1n-08-02     IBM 3592 Tape Drive (FCP)
rmt11 Available 1A-08-02     IBM 3592 Tape Drive (FCP)
rmt12 Available 1A-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt13 Available 1A-08-02     IBM 3592 Tape Drive (FCP)
rmt14 Available 2U-08-02     IBM 3592 Tape Drive (FCP)
rmt15 Available 1A-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt16 Available 1A-08-02     IBM 3592 Tape Drive (FCP)
rmt17 Available 1A-08-02     IBM 3592 Tape Drive (FCP)
rmt18 Available 1A-08-02     IBM 3592 Tape Drive (FCP)
rmt19 Available 1A-08-02     IBM 3592 Tape Drive (FCP)
rmt20 Available 1A-08-02     IBM 3592 Tape Drive (FCP)
rmt21 Available 2M-08-02     IBM 3592 Tape Drive (FCP)
rmt22 Available 2M-08-02     IBM 3592 Tape Drive (FCP)
rmt23 Available 2M-08-02     IBM 3592 Tape Drive (FCP)
rmt25 Available 2M-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt26 Available 2M-08-02     IBM 3592 Tape Drive (FCP)
rmt27 Available 2M-08-02     IBM 3592 Tape Drive (FCP)
rmt28 Available 2M-08-02     IBM 3592 Tape Drive (FCP)
rmt29 Available 2M-08-02     IBM 3592 Tape Drive (FCP)
rmt30 Available 2M-08-02     IBM 3592 Tape Drive (FCP)
rmt31 Available 2U-08-02     IBM 3592 Tape Drive (FCP)
rmt32 Available 2U-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt33 Available 2U-08-02     IBM 3592 Tape Drive (FCP)
rmt35 Available 2U-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt36 Available 2U-08-02     IBM 3592 Tape Drive (FCP)
rmt37 Available 2U-08-02     IBM 3592 Tape Drive (FCP)
rmt38 Available 2U-08-02     IBM 3592 Tape Drive (FCP)
rmt39 Available 2U-08-02     IBM 3592 Tape Drive (FCP)
rmt40 Available 2U-08-02     IBM 3592 Tape Drive (FCP)
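
A long listing like this is easier to check with a small filter. The sketch below assumes the three leading columns shown above (name, state, location code); `paths_per_location` and `missing_rmt` are hypothetical helper names. A gap in the rmtN numbering is only a hint that a path is missing, not proof.

```shell
# Hypothetical helpers; both read `lsdev -Cc tape` output on stdin.

# Count rmt devices per location code, ignoring -PRI/-ALT suffixes.
paths_per_location() {
  awk '$1 ~ /^rmt[0-9]+$/ { loc = $3; sub(/-(PRI|ALT)$/, "", loc); c[loc]++ }
       END { for (l in c) print l, c[l] }' | sort
}

# List rmtN numbers absent between 1 and the highest number seen.
missing_rmt() {
  awk '$1 ~ /^rmt[0-9]+$/ { n = substr($1, 4) + 0; seen[n] = 1; if (n > max) max = n }
       END { for (i = 1; i <= max; i++) if (!(i in seen)) print "rmt" i }'
}

# Usage on the AIX host:
#   lsdev -Cc tape | paths_per_location
#   lsdev -Cc tape | missing_rmt
```

Run against the listing above, `missing_rmt` would print rmt24 and rmt34, the two paths reported missing.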

 

Author Comment

by:virgo0880
ID: 38783251
One of the FC ports on the tape drive was bad. After replacing the tape drive, the issue was resolved and the errors were gone. All the paths now show OK for this drive.

Giving points

Thanks
virgo
Question has a verified solution.