Solved

TSM Tape drive failure Error

Posted on 2011-02-17
Last Modified: 2013-11-15
Hi All,

I got the following messages in errpt on my TSM server:

"daemon:notice root: IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION 5537AC5F 0217171511 P H rmt26 TAPE DRIVE FAILURE"
"daemon:notice root: IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION 5537AC5F 0217171511 P H rmt16 TAPE DRIVE FAILURE"

The problem is that a space reclamation process was running and seems to be using this tape drive. My other migration processes are now waiting for tape drives, because 8 tape drives are in use by other backups. I tried to cancel the reclamation, but it will not cancel. Please let me know what can be done in this situation. The tape library is an IBM L3494.

Thanks
virgo
Question by:virgo0880
10 Comments
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 34924080
Hi and welcome back!

First, you should issue "errpt -a -j 5537AC5F " on the TSM server to get the full error messages.
The error identifier points to some drive or adapter failure. It's not a media problem.
Are there accompanying 0BA49C99 errors (check with "errpt")? If so, the culprit could be the adapter of either the AIX box or (more likely) the drive. A very common cause of such failures is cabling (plugs/pins). Please check!
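
To illustrate the check above, here is a minimal Python sketch that filters errpt summary lines by identifier. The sample text is the two entries from the question, laid out in errpt's column form. The function name find_errors is hypothetical, and the sketch assumes the six-column summary layout (IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION).

```python
# Sample summary lines taken from the question, arranged in columns.
SAMPLE = """\
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
5537AC5F   0217171511 P H rmt26          TAPE DRIVE FAILURE
5537AC5F   0217171511 P H rmt16          TAPE DRIVE FAILURE
"""


def find_errors(errpt_text, identifiers):
    """Return (identifier, resource, description) for matching summary lines."""
    wanted = {ident.upper() for ident in identifiers}
    hits = []
    for line in errpt_text.splitlines():
        # Split into at most six fields; the description may contain spaces.
        fields = line.split(None, 5)
        if len(fields) == 6 and fields[0].upper() in wanted:
            hits.append((fields[0], fields[4], fields[5].strip()))
    return hits


# Look for the drive failure and for the adapter error mentioned above.
for ident, resource, desc in find_errors(SAMPLE, ["5537AC5F", "0BA49C99"]):
    print(ident, resource, desc)
```

With the sample above only the two 5537AC5F entries are reported; no 0BA49C99 lines are present in the data the question shows.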

Next, make sure that the drive is still online.

Use the library's web interface, if you enabled it:
http://library_hostname/srvrroot/en/en-us/wsindex.htm

Select "Monitor Library Manager" on the left, then "Component Availability".
All drives available?

Check Operator Interventions:
Select "Monitor Library Manager" on the left, then "Operator Interventions"
What do you see?

If you didn't enable the web interface you will have to walk to your library to perform the above checks.

Please report back the results!

wmp
 

Author Comment

by:virgo0880
ID: 34924121
I logged a call with IBM, and the IBM engineer is going to replace the tape drive now because a tape got stuck in it. The problem is that he is only a library engineer and doesn't know how to make the drive online/visible to TSM. Can you tell me the steps for that? Currently "q path" shows that the drive is not online.

Thanks
virgo
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 34924290
I hope your engineer is capable of making the drive known to the library, since that's too complex to explain here.

Once the drive is replaced, remove its definition from AIX with "rmdev -dl rmt26" and "rmdev -dl rmt16" (it's obviously an "alt_pathing" device).

It seems that the drive is not in use by TSM, but if it is, you must free it somehow. In the worst case, if everything else fails, you will have to restart TSM.

Now run "cfgmgr". Does the new drive show up with the same device name(s) as the old one?

I don't know if your engineer will update the serial number of the new drive to be the same as the one of the old drive.

If you can keep the old number, you'll just have to issue on TSM (dsmadmc):

Upd path servername drivename srctype=server desttype=drive library=libraryname online=yes
and
Upd drive libraryname  drivename online=yes

servername, drivename and libraryname are the TSM internal names, not things like hostname or /dev/rmtx!
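
As a small sketch of how the two commands fit together, the Python snippet below builds them from the three TSM-internal names. The function name online_commands is hypothetical, and the example values (TSM, TDRV6, L3494B) are the names that show up in this thread's "Q PATH" output; substitute your own.

```python
def online_commands(server, drive, library):
    """Build the two dsmadmc commands that set a path and its drive online."""
    return [
        f"UPDATE PATH {server} {drive} SRCTYPE=SERVER DESTTYPE=DRIVE "
        f"LIBRARY={library} ONLINE=YES",
        f"UPDATE DRIVE {library} {drive} ONLINE=YES",
    ]


# Example with the TSM-internal names seen in this thread.
for command in online_commands("TSM", "TDRV6", "L3494B"):
    print(command)
```

Note the argument order: the path is addressed by source and destination, while the drive is addressed by library and drive name.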

If you get a new serial number it could well be that you'll have to remove the complete path/drive definition from TSM and recreate it, so TSM will recognize the new serial number.

If you don't know how to do that please let me know. I'll assist you.

wmp
 

Author Comment

by:virgo0880
ID: 34924375
Yes, he has used the old serial number with the new drive. Also, since he removed the drive, I have 4 paths showing in the Defined state. Here is the output of lsdev -Cc tape:

snbc108:/# lsdev -Cc tape
lmcp0 Available              LAN/TTY Library Management Control Point
rmt0  Available 1Z-08-00-1,0 LVD SCSI Tape Drive
rmt1  Available 1n-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt2  Available 1n-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt3  Available 1n-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt4  Available 1n-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt5  Available 1n-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt6  Defined   1n-08-02     IBM 3592 Tape Drive (FCP)
rmt7  Available 1n-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt8  Available 1n-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt9  Available 1n-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt10 Available 1n-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt11 Available 1A-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt12 Available 1A-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt13 Available 1A-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt14 Available 1A-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt15 Available 1A-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt16 Defined   1A-08-02     IBM 3592 Tape Drive (FCP)
rmt17 Available 1A-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt18 Available 1A-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt19 Available 1A-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt20 Available 1A-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt21 Available 2M-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt22 Available 2M-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt23 Available 2M-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt24 Available 2M-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt25 Available 2M-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt26 Defined   2M-08-02     IBM 3592 Tape Drive (FCP)
rmt27 Available 2M-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt28 Available 2M-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt29 Available 2M-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt30 Available 2M-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt31 Available 2U-08-02-PRI IBM 3592 Tape Drive (FCP)
rmt32 Available 2U-08-02-PRI IBM 3592 Tape Drive (FCP)
rmt33 Available 2U-08-02-PRI IBM 3592 Tape Drive (FCP)
rmt34 Available 2U-08-02-PRI IBM 3592 Tape Drive (FCP)
rmt35 Available 2U-08-02-PRI IBM 3592 Tape Drive (FCP)
rmt36 Defined   2U-08-02     IBM 3592 Tape Drive (FCP)
rmt37 Available 2U-08-02-PRI IBM 3592 Tape Drive (FCP)
rmt38 Available 2U-08-02-PRI IBM 3592 Tape Drive (FCP)
rmt39 Available 2U-08-02-PRI IBM 3592 Tape Drive (FCP)
rmt40 Available 2U-08-02-PRI IBM 3592 Tape Drive (FCP)

Also, can you confirm the order: first I remove the paths, then run cfgmgr, and then issue the TSM commands to update the drive, right? Will this affect any backups to tape or migration processes that are currently running? And where can I get the servername, drivename and libraryname details? Is there a command for that?
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 34924407
lsdev does not show paths, but devices. Please avoid misunderstandings!

OK, your lsdev shows four primary devices which are defined, but not present.
Why is that? Which is the failing one?
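
The Defined devices can be picked out of such a listing mechanically. Below is a minimal Python sketch, assuming the "name state location description" column layout shown in the lsdev output above; the function name devices_in_state is hypothetical and the sample is a shortened excerpt of that output.

```python
# Shortened excerpt of the "lsdev -Cc tape" output posted above.
SAMPLE = """\
lmcp0 Available              LAN/TTY Library Management Control Point
rmt5  Available 1n-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt6  Defined   1n-08-02     IBM 3592 Tape Drive (FCP)
rmt16 Defined   1A-08-02     IBM 3592 Tape Drive (FCP)
rmt26 Defined   2M-08-02     IBM 3592 Tape Drive (FCP)
rmt35 Available 2U-08-02-PRI IBM 3592 Tape Drive (FCP)
rmt36 Defined   2U-08-02     IBM 3592 Tape Drive (FCP)
"""


def devices_in_state(lsdev_text, state="Defined"):
    """Return the device names whose second column equals the given state."""
    names = []
    for line in lsdev_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == state:
            names.append(fields[0])
    return names


print(devices_in_state(SAMPLE))
```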

On TSM issue

Q PATH * failing_tsm_drive

to see the rmt number.

 

Author Comment

by:virgo0880
ID: 34924426
lsdev shows 1 PRI and 3 ALT tape devices (rmt6, rmt16, rmt26, rmt36) in the Defined state, as you can see in the output. The output of the command you gave is:

Source Name     Source Type     Destination     Destination     On-Line
                                Name            Type            
-----------     -----------     -----------     -----------     -------
TSM             SERVER          TDRV6           DRIVE           No    


 
LVL 68

Expert Comment

by:woolmilkporc
ID: 34924499
Sorry, the command should have been
Q PATH * TDRV6 F=D

Where do you see ALT paths?
Please verify with

lscfg -vl rmt6
lscfg -vl rmt16
lscfg -vl rmt26
lscfg -vl rmt36

and compare "Serial Number"

If they're identical remove all four devices using "rmdev -dl ...", run cfgmgr, then issue the TSM commands I gave you.
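
The serial number comparison can be sketched in Python as follows. This assumes the VPD line appears as "Serial Number" followed by a run of dots and the value, which is the usual lscfg -v layout; the sample fragments and the serial number in them are made up for illustration, and both function names are hypothetical.

```python
import re


def serial_number(lscfg_text):
    """Extract the value of the 'Serial Number' VPD line, or None if absent."""
    match = re.search(r"Serial Number\.*\s*(\S+)", lscfg_text)
    return match.group(1) if match else None


def same_drive(outputs):
    """True if every device reports one and the same serial number."""
    serials = {serial_number(text) for text in outputs.values()}
    return len(serials) == 1 and None not in serials


# Hypothetical "lscfg -vl" fragments, one per device; the serial is invented.
outputs = {
    name: (
        f"  {name}  IBM 3592 Tape Drive (FCP)\n"
        "        Serial Number...............000001234567\n"
    )
    for name in ("rmt6", "rmt16", "rmt26", "rmt36")
}

print(same_drive(outputs))  # True: all four devices belong to one drive
```

If same_drive returns False, the devices do not all point at the same physical drive and should not be removed as a group.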

Currently running processes/sessions will not be affected.
 

Author Comment

by:virgo0880
ID: 34924615
Yes, the drive is showing online now in "q path". I followed the steps you gave and they worked fine.
Regarding your question: in the lsdev output I can see the keywords PRI and ALT, which I think mean primary and alternate paths. You can see them in the output above.

Is there any other way to find out whether the drive is online? I can see that it is online using the command you gave. Here is the output:

                   Source Name: TSM
                   Source Type: SERVER
              Destination Name: TDRV6
              Destination Type: DRIVE
                       Library: L3494B
                     Node Name:
                        Device: /dev/rmt16
              External Manager:
                           LUN:
                     Initiator: 0
                     Directory:
                       On-Line: Yes
Last Update by (administrator):
         Last Update Date/Time: 02/18/11 04:23:00

So, do you think the tape drive is online now..?

Thanks
virgo

 
LVL 68

Accepted Solution

by:
woolmilkporc earned 2000 total points
ID: 34924716
The above command shows that the PATH is online, which is the most important part of the whole chain.

TSM can set a drive offline regardless of PATH status, but that's not typical behaviour.

See the drive's status with

Q DRIVE * TDRV6 F=D

If this shows "On-Line: Yes" everything is fine. Mission accomplished.
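
For scripted checks, the detailed (F=D) output can be read as "key: value" lines. The Python sketch below parses one such record and tests the On-Line field. It uses a shortened copy of the "Q PATH * TDRV6 F=D" output posted above, handles a single record only, and assumes "Q DRIVE ... F=D" prints its On-Line field in the same style; parse_detail and is_online are hypothetical names.

```python
# Shortened copy of the detailed path output posted earlier in the thread.
SAMPLE = """\
                   Source Name: TSM
                   Source Type: SERVER
              Destination Name: TDRV6
              Destination Type: DRIVE
                       Library: L3494B
                        Device: /dev/rmt16
                       On-Line: Yes
         Last Update Date/Time: 02/18/11 04:23:00
"""


def parse_detail(text):
    """Parse one 'key: value' record into a dict (split on the first colon)."""
    record = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            record[key.strip()] = value.strip()
    return record


def is_online(text):
    """True if the record's On-Line field says Yes (case-insensitive)."""
    return parse_detail(text).get("On-Line", "").lower() == "yes"


record = parse_detail(SAMPLE)
print(record["Destination Name"], record["Device"], is_online(SAMPLE))
```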

I once wrote a script to combine the display of PATH and DRIVE status. Find it in the attachment.

The command

Q SAN T=DR F=D

will give you some info on the SAN related properties of your drives.

To make the above command work you must have set "SANDISCOVERY ON" in the server options.
If it's disabled you can turn it on dynamically (and for the future) with

SETOPT SANDISCOVERY ON

Have fun with TSM!

wmp
DRIVES_PATHS script:

select -
  cast(paths.DESTINATION_NAME as char(11)) "Name", -
  cast(paths.ONLINE as char(12)) "PATH Online", -
  cast(drives.ONLINE as char(13)) "DRIVE Online", -
  cast(paths.DEVICE as char(10)) "DEVICE", -
  cast(drives.DRIVE_SERIAL as char(13)) "SERIAL" -
from paths, drives -
where -
  paths.DESTINATION_TYPE='DRIVE' -
  and -
  drives.DRIVE_NAME=paths.DESTINATION_NAME

 

Author Closing Comment

by:virgo0880
ID: 34924872
OK