techie27

Tons of TAPE_ERR4 errors

There are a lot of errors reported in AIX and TSM that appear to be related to each other. During migration jobs, TSM logs the errors below:
01/06/09   13:49:16  ANR8311E An I/O error occurred while accessing drive
                      tapedrv3 (/dev/rmt19) for WRITE operation, errno = 78.
                      (PROCESS: 112)
01/06/09   14:17:28  ANR8311E An I/O error occurred while accessing drive
                      tapedrv8 (/dev/rmt16) for WRITE operation, errno = 78.
                      (PROCESS: 172)
01/03/09   05:37:55  ANR8311E An I/O error occurred while accessing drive
                      tapedrv9 (/dev/rmt15) for READ operation, errno = 5.
                      (PROCESS: 85)
While the migration takes place, AIX logs errors like the one below:

LABEL:          TAPE_ERR4
IDENTIFIER:     5537AC5F

Date/Time:       Tue Jan  6 13:49:16 CST 2009
Sequence Number: 7972
Machine Id:      0006AADBD600
Node Id:         duke01
Class:           H
Type:            PERM
Resource Name:   rmt19
Resource Class:  tape
Resource Type:   LTO
Location:        U7311.D20.06042DC-P1-C08-T1-W224108001BC0BEC6-L1000000000000
VPD:
        Manufacturer................IBM
        Machine Type and Model......ULTRIUM-TD1
        Serial Number...............VD3ASV0823BVA01785
        Device Specific.(FW)........5AU1

Description
TAPE DRIVE FAILURE

Probable Causes
ADAPTER
TAPE DRIVE

Failure Causes
ADAPTER
TAPE DRIVE

        Recommended Actions
        PERFORM PROBLEM DETERMINATION PROCEDURES

Detail Data
SENSE DATA
0600 0000 0A00 0400 0000 0000 0000 0000 0200 0300 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000

While checking the HBAs, I noticed that all of them have the same FRU number except for two. Could this be causing the problem? Is this normal?

fcs0 and fcs1 have FRU number 03N5029, while the rest (fcs2 through fcs8) have 10N8620.

I would also like some suggestions on tuning the HBAs so they achieve their maximum performance on our system.


Please shed some light on it.

AIX 5.3 9133-55A box.


ASKER CERTIFIED SOLUTION
woolmilkporc

techie27

ASKER

How can I check the SAN switches' microcode/firmware level? Is there a command on the AIX side to check it, or is it something the SAN tech will have to provide? Do you have a PDF that explains the meaning of all the error numbers in AIX? Thanks
Hi,
regarding the SAN switch, you'll have to use its user interface (web or telnet or ssh ...)
and issue the appropriate query to find out the firmware level.
What SAN switch do you use? If it's IBM or Brocade, I probably will be able to say more.

The AIX error numbers are contained in the header file /usr/include/sys/errno.h
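
As a rough sketch, a small shell function can look up the symbolic name for a given number in that header. The function name `errno_name` is made up for illustration, and it assumes the header contains entries of the form `#define NAME number`; lines written differently will not be matched.

```shell
# Hypothetical helper: print the symbolic name(s) defined for an errno value.
# Usage: errno_name NUMBER [HEADER]
# HEADER defaults to the file named above.
errno_name() {
    num=$1
    hdr=${2:-/usr/include/sys/errno.h}
    # Match lines like "#define ENAME 78 /* comment */" and print ENAME.
    awk -v n="$num" '$1 == "#define" && $3 == n { print $2 }' "$hdr"
}
```

For the TSM messages above, that would be `errno_name 78` for the WRITE errors and `errno_name 5` for the READ error.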

wmp


All of the switches are running at 3.2(3a).

How can I relate the VTL ports and their WWN to the specific adapter defined in AIX?


20:01:08:00:1b:e0:63:f9-S  TVTL2_F4  22:01:08:00:1b:c0:63:f9  0xb10008  DANSAN002  fc9/3
20:01:08:00:1b:e0:63:f9-S  TVTL2_F3  21:01:08:00:1b:c0:63:f9  0xb10009  DANSAN002  fc9/4
20:41:08:00:1b:e0:be:c6-S  TVTL1_F8  25:41:08:00:1b:c0:be:c6  0x7d0000  DANSAN001  fc3/47
20:41:08:00:1b:e0:be:c6-S  TVTL1_F7  24:41:08:00:1b:c0:be:c6  0x7d0001  DANSAN001  fc3/48
20:41:08:00:1b:e0:be:c6-S  TVTL1_F4  22:41:08:00:1b:c0:be:c6  0xb10000  DANSAN002  fc3/47
20:41:08:00:1b:e0:be:c6-S  TVTL1_F3  21:41:08:00:1b:c0:be:c6  0xb10001  DANSAN002  fc3/48
Thanks

 

Which command produced the above output?
I did not use any command; the info was provided by the SAN techs.
Thanks
You could compare the WWNs in the above list (first and third columns)
with the WWNs of your FC adapters, obtained via

lscfg -vl fcs[n] | grep Z8
or
lscfg -vl fcs[n] | grep Address

There might be a relation, but I'm afraid there isn't.

As I'm not familiar with your VTL, I guess there is not more I can say.
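
One more thing that may help: the Location field of the errpt entry for rmt19 above contains `-W224108001BC0BEC6-`, which appears to be the same 16 hex digits as the third-column entry `22:41:08:00:1b:c0:be:c6` (TVTL1_F4) in the SAN list, while the `P1-C08-T1` part of the same location code names the adapter slot and port. A minimal sketch for converting such a location code into the colon-separated form used in the list follows. The function name `wwpn_from_location` is hypothetical, and it assumes the location code always carries the WWN as `-W<16 hex digits>-L...`.

```shell
# Hypothetical helper: extract the WWN embedded in a device location code
# and print it in lower-case, colon-separated form.
# Prints nothing if the location code has no "-W<16 hex digits>-L" part.
wwpn_from_location() {
    printf '%s\n' "$1" |
        sed -n 's/.*-W\([0-9A-Fa-f]\{16\}\)-L.*/\1/p' |
        tr 'A-F' 'a-f' |
        sed 's/../&:/g; s/:$//'
}
```

The result can then be searched for in the list from the SAN techs, once per rmt device, to see which VTL port each drive sits behind.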

wmp


How can I know for sure that updating the microcode/firmware on the adapters to the latest level will fix the issue with the time-out errors?

The release notes for the new microcode level do not really mention anything about it.

"Adapter timeouts could be related to microcode. so check microcode levels of the FC adapters by using 'lsmcode -d fcs[n]'.
Afaik the latest mcode is 271304 (1.50x1). If your mcode is older, you should install the newest one.
Please look here for instructions and download - "

Thanks
As I wrote: "... could be related ..."

You can't know anything for sure.

As I wrote, too: "... If you still encounter the above errors, please contact IBM support to fix it..."

Firmware is just one possibility, and I swear, the first thing IBM will tell you is: "Update your adapters to the latest firmware level and we'll see."
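
To collect the current levels of all FC adapters in one go before opening a call, something like the following sketch could be used. The function name `fc_mcode_levels` is made up, and it assumes the FC adapters are all named `fcs<n>` as in this thread; the `lsmcode -d` call is the one quoted above.

```shell
# Hypothetical helper: run the microcode query for every fcs adapter.
fc_mcode_levels() {
    # lsdev lists all adapter names; keep only the Fibre Channel ones.
    for adapter in $(lsdev -Cc adapter -F name | grep '^fcs'); do
        lsmcode -d "$adapter"
    done
}
```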

wmp

Hi,
why wouldn't you recommend accepting my answer http:#a23323541 ?
It's a valid and comprehensive answer to the original question.
A lot of follow-ups were asked afterwards, which I answered as precisely as I could, at least in my opinion. Maybe techie27 was not quite satisfied with my answers to these follow-ups, but as I said, the original question has been answered correctly!
wmp