Solved

TSM Tape drive failure Error

Posted on 2011-02-17
Last Modified: 2013-11-15
Hi All,

I got the following messages in errpt on my TSM server:

"daemon:notice root: IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION 5537AC5F 0217171511 P H rmt26 TAPE DRIVE FAILURE"
"daemon:notice root: IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION 5537AC5F 0217171511 P H rmt16 TAPE DRIVE FAILURE"

The problem is that a space reclamation process was running, and it seems to be using this tape drive. My other migration processes are now waiting for tape drives, because 8 tape drives are in use for other backups. I tried to cancel the reclamation, but it will not cancel. Kindly let me know what can be done in this situation. The tape library is an IBM L3494.

Thanks
virgo
Question by:virgo0880
LVL 68

Expert Comment

by:woolmilkporc
ID: 34924080
Hi and welcome back!

First, you should issue "errpt -a -j 5537AC5F" on the TSM server to get the full error messages.
The error identifier points to some drive or adapter failure. It's not a media problem.
Are there accompanying 0BA49C99 errors (check with "errpt")? If so, the culprit could be an adapter, either the one in the AIX box or (more likely) the one in the drive. A very common cause of such failures is cabling (plugs/pins). Please check!
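
To see at a glance which identifiers were logged against which device, the errpt summary can be grouped by resource name. The following is only a minimal sketch in Python: it assumes the usual six-column summary layout (IDENTIFIER, TIMESTAMP, T, C, RESOURCE_NAME, DESCRIPTION), the function name group_errpt is made up for illustration, and the sample rows are the two entries quoted in the question.

```python
from collections import defaultdict


def group_errpt(summary_text):
    """Map each RESOURCE_NAME to the set of error identifiers logged for it."""
    by_resource = defaultdict(set)
    for line in summary_text.splitlines():
        # The description may contain spaces, so split into at most 6 fields.
        fields = line.split(None, 5)
        # Skip blank or short lines and the column header row.
        if len(fields) < 6 or fields[0] == "IDENTIFIER":
            continue
        identifier, _timestamp, _type, _class, resource, _description = fields
        by_resource[resource].add(identifier)
    return dict(by_resource)


# Sample rows copied from the question; a real run would read "errpt" output.
SAMPLE = """\
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
5537AC5F   0217171511 P H rmt26          TAPE DRIVE FAILURE
5537AC5F   0217171511 P H rmt16          TAPE DRIVE FAILURE
"""

if __name__ == "__main__":
    for resource, identifiers in sorted(group_errpt(SAMPLE).items()):
        print(resource, sorted(identifiers))
```

If 0BA49C99 turns up next to 5537AC5F in this grouping, that is the case described above where the adapter or cabling deserves a closer look.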

Next, make sure that the drive is still online.

Use the library's web interface, if you enabled it:
http://library_hostname/srvrroot/en/en-us/wsindex.htm

Select "Monitor Library Manager" on the left, then "Component Availability".
All drives available?

Check Operator Interventions:
Select "Monitor Library Manager" on the left, then "Operator Interventions"
What do you see?

If you didn't enable the web interface you will have to walk to your library to perform the above checks.

Please report back the results!

wmp

Author Comment

by:virgo0880
ID: 34924121
I logged a call with IBM, and the IBM engineer is going to replace the tape drive now, as a tape got stuck in the drive. The problem is that he is only a library engineer and doesn't know how to make the drive online/visible to TSM. Can you tell me the steps for that? Currently "q path" shows that the drive is not online.

Thanks
virgo
LVL 68

Expert Comment

by:woolmilkporc
ID: 34924290
I hope your engineer is capable of making the drive known to the library, since that's too complex to explain here.

Once the drive is replaced, remove its definition from AIX with "rmdev -dl rmt26" and "rmdev -dl rmt16" (it's obviously an "alt_pathing" device).

It seems that the drive is not in use by TSM, but if it is, you must free it somehow. In the worst case, if everything else fails, you will have to restart TSM.

Now run "cfgmgr". Does the new drive show up with the same device name(s) as the old one?

I don't know whether your engineer will set the serial number of the new drive to match that of the old drive.

If you can keep the old number, you'll just have to issue on TSM (dsmadmc):

Upd path servername drivename srctype=server desttype=drive library=libraryname online=yes
and
Upd drive libraryname drivename online=yes

servername, drivename and libraryname are the TSM internal names, not things like hostname or /dev/rmtx!
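
To make the placeholders concrete, here is a small Python sketch that fills the two commands in from the TSM internal names. It only builds the command strings and does not talk to the server; the helper name online_commands is hypothetical, and the example values TSM, TDRV6 and L3494B are the ones that appear in the Q PATH output further down this thread.

```python
def online_commands(server, drive, library):
    """Build the two dsmadmc commands that set a path and its drive online.

    All three arguments are TSM internal names, not hostnames or /dev/rmtX.
    """
    return [
        "UPDATE PATH {0} {1} SRCTYPE=SERVER DESTTYPE=DRIVE "
        "LIBRARY={2} ONLINE=YES".format(server, drive, library),
        "UPDATE DRIVE {0} {1} ONLINE=YES".format(library, drive),
    ]


if __name__ == "__main__":
    # Names as shown by "Q PATH * TDRV6 F=D" later in this thread.
    for command in online_commands("TSM", "TDRV6", "L3494B"):
        print(command)
```

Note the argument order: the path command names the source server first, while the drive command names the library first.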

If you get a new serial number it could well be that you'll have to remove the complete path/drive definition from TSM and recreate it, so TSM will recognize the new serial number.

If you don't know how to do that please let me know. I'll assist you.

wmp

Author Comment

by:virgo0880
ID: 34924375
Yes, he has used the old serial number with the new drive. Also, I have 4 paths that have been showing in the Defined state since he removed the drive. Here is the output of lsdev -Cc tape:

snbc108:/# lsdev -Cc tape
lmcp0 Available              LAN/TTY Library Management Control Point
rmt0  Available 1Z-08-00-1,0 LVD SCSI Tape Drive
rmt1  Available 1n-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt2  Available 1n-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt3  Available 1n-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt4  Available 1n-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt5  Available 1n-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt6  Defined   1n-08-02     IBM 3592 Tape Drive (FCP)
rmt7  Available 1n-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt8  Available 1n-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt9  Available 1n-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt10 Available 1n-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt11 Available 1A-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt12 Available 1A-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt13 Available 1A-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt14 Available 1A-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt15 Available 1A-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt16 Defined   1A-08-02     IBM 3592 Tape Drive (FCP)
rmt17 Available 1A-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt18 Available 1A-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt19 Available 1A-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt20 Available 1A-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt21 Available 2M-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt22 Available 2M-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt23 Available 2M-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt24 Available 2M-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt25 Available 2M-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt26 Defined   2M-08-02     IBM 3592 Tape Drive (FCP)
rmt27 Available 2M-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt28 Available 2M-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt29 Available 2M-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt30 Available 2M-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt31 Available 2U-08-02-PRI IBM 3592 Tape Drive (FCP)
rmt32 Available 2U-08-02-PRI IBM 3592 Tape Drive (FCP)
rmt33 Available 2U-08-02-PRI IBM 3592 Tape Drive (FCP)
rmt34 Available 2U-08-02-PRI IBM 3592 Tape Drive (FCP)
rmt35 Available 2U-08-02-PRI IBM 3592 Tape Drive (FCP)
rmt36 Defined   2U-08-02     IBM 3592 Tape Drive (FCP)
rmt37 Available 2U-08-02-PRI IBM 3592 Tape Drive (FCP)
rmt38 Available 2U-08-02-PRI IBM 3592 Tape Drive (FCP)
rmt39 Available 2U-08-02-PRI IBM 3592 Tape Drive (FCP)
rmt40 Available 2U-08-02-PRI IBM 3592 Tape Drive (FCP)

Also, can you confirm the order: first I have to remove the paths, run cfgmgr, and then issue the TSM commands to update the drive, right? Will it affect any tape backups or migration processes that are currently running? Also, where can I get the servername, drivename and libraryname details? Is there a command for that?
LVL 68

Expert Comment

by:woolmilkporc
ID: 34924407
lsdev does not show paths, but devices. Please avoid misunderstandings!

OK, your lsdev shows four primary devices which are defined, but not present.
Why is that? Which is the failing one?

On TSM issue

Q PATH * failing_tsm_drive

to see the rmt number.


Author Comment

by:virgo0880
ID: 34924426
lsdev is showing 1 PRI and 3 ALT paths (tape devices rmt6, rmt16, rmt26, rmt36) in the Defined state, as we can see in the output. Also, the output of the command you gave is:

Source Name     Source Type     Destination     Destination     On-Line
                                Name            Type            
-----------     -----------     -----------     -----------     -------
TSM             SERVER          TDRV6           DRIVE           No    


LVL 68

Expert Comment

by:woolmilkporc
ID: 34924499
Sorry, the command should have been
Q PATH * TDRV6 F=D

Where do you see ALT paths?
Please verify with

lscfg -vl rmt6
lscfg -vl rmt16
lscfg -vl rmt26
lscfg -vl rmt36

and compare "Serial Number"

If they're identical remove all four devices using "rmdev -dl ...", run cfgmgr, then issue the TSM commands I gave you.

Currently running processes/sessions will not be affected.
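
The cleanup step can be sketched in Python as follows. This is an illustration only: it assumes the lsdev -Cc tape layout shown earlier in the thread (name, state, location, description), the function name defined_tape_devices is made up, and the script merely prints the commands for review instead of running them. Compare the serial numbers with lscfg first, as described above.

```python
def defined_tape_devices(lsdev_text):
    """Return the names of rmt devices whose state column is 'Defined'."""
    names = []
    for line in lsdev_text.splitlines():
        fields = line.split()
        # Column 1 is the device name, column 2 its state.
        if len(fields) >= 2 and fields[0].startswith("rmt") and fields[1] == "Defined":
            names.append(fields[0])
    return names


# Excerpt of the "lsdev -Cc tape" output posted earlier in this thread.
SAMPLE = """\
rmt5  Available 1n-08-02-ALT IBM 3592 Tape Drive (FCP)
rmt6  Defined   1n-08-02     IBM 3592 Tape Drive (FCP)
rmt16 Defined   1A-08-02     IBM 3592 Tape Drive (FCP)
rmt26 Defined   2M-08-02     IBM 3592 Tape Drive (FCP)
rmt36 Defined   2U-08-02     IBM 3592 Tape Drive (FCP)
rmt37 Available 2U-08-02-PRI IBM 3592 Tape Drive (FCP)
"""

if __name__ == "__main__":
    for name in defined_tape_devices(SAMPLE):
        # Printed for review only; nothing is executed here.
        print("rmdev -dl " + name)
    print("cfgmgr")
```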

Author Comment

by:virgo0880
ID: 34924615
Yes, the drive is showing online now in q path. I followed the steps you gave and they worked fine for me.
Regarding your question: in the lsdev output I can see the keyword PRI or ALT, which I think indicates the primary and alternate paths. You can see this in the output above.

Is there any other way to find out whether the drive is online or not? I can see the drive is online using the command you gave; here's the output:

                   Source Name: TSM
                   Source Type: SERVER
              Destination Name: TDRV6
              Destination Type: DRIVE
                       Library: L3494B
                     Node Name:
                        Device: /dev/rmt16
              External Manager:
                           LUN:
                     Initiator: 0
                     Directory:
                       On-Line: Yes
Last Update by (administrator):
         Last Update Date/Time: 02/18/11 04:23:00

So, do you think the tape drive is online now..?

Thanks
virgo

LVL 68

Accepted Solution

by:
woolmilkporc earned 500 total points
ID: 34924716
The above command shows that the PATH is online, which is the most important part of the whole chain.

TSM can set a drive offline regardless of PATH status, but that's not typical behaviour.

See the drive's status with

Q DRIVE * TDRV6 F=D

If this shows "On-Line: Yes" everything is fine. Mission accomplished.
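
If you want to check this from a script rather than by eye, the detailed (F=D) output can be read as simple "Field: value" lines. Below is a minimal Python sketch under that assumption; the function names parse_detail and is_online are hypothetical, and the sample text is the Q PATH output posted above. The same parsing should apply to Q DRIVE output as long as it carries an "On-Line" field in this layout.

```python
def parse_detail(text):
    """Parse 'Field: value' lines from detailed (F=D) query output into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as "04:23:00" stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def is_online(fields):
    """True if the parsed output reports 'On-Line: Yes'."""
    return fields.get("On-Line", "").lower() == "yes"


# Output of "Q PATH * TDRV6 F=D" as posted in this thread.
PATH_DETAIL = """\
                   Source Name: TSM
                   Source Type: SERVER
              Destination Name: TDRV6
              Destination Type: DRIVE
                       Library: L3494B
                        Device: /dev/rmt16
                       On-Line: Yes
         Last Update Date/Time: 02/18/11 04:23:00
"""

if __name__ == "__main__":
    path = parse_detail(PATH_DETAIL)
    print(path["Destination Name"], path["Device"], "online:", is_online(path))
```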

I once wrote a script to combine the display of PATH and DRIVE status. Find it in the attachment.

The command

Q SAN T=DR F=D

will give you some info on the SAN related properties of your drives.

To make the above command work you must have set "SANDISCOVERY ON" in the server options.
If it's disabled you can turn it on dynamically (and for the future) with

SETOPT SANDISCOVERY ON

Have fun with TSM!

wmp
Script DRIVES_PATHS:

select -
  cast(paths.DESTINATION_NAME as char(11)) "Name", -
  cast(paths.ONLINE as char(12)) "PATH Online", -
  cast(drives.ONLINE as char(13)) "DRIVE Online", -
  cast(paths.DEVICE as char(10)) "DEVICE", -
  cast(drives.DRIVE_SERIAL as char(13)) "SERIAL" -
from paths, drives -
where -
  paths.DESTINATION_TYPE='DRIVE' -
  and -
  drives.DRIVE_NAME=paths.DESTINATION_NAME


Author Closing Comment

by:virgo0880
ID: 34924872
OK