Solved

tsm client taking more time for backup

Posted on 2010-11-15
Last Modified: 2013-11-14
Hi All,

I have a node for which I have scheduled an incremental backup, but it takes 2-3 hours to back up only about 126 files on average. How can I find out why the backup takes this long instead of completing in a matter of minutes? Other systems in the same schedule complete in 5-10 minutes.

Thanks
virgo
Question by:virgo0880
LVL 25

Expert Comment

by:madunix
Did you check the connection? It could be running half duplex instead of full duplex, i.e. mismatched settings between the host NIC and the switch port.
 
LVL 68

Expert Comment

by:woolmilkporc
What do the statistics displayed at the end of each backup run say?

Particularly contrast "Network data transfer rate:" with "Aggregate data transfer rate:"!

If there is not a huge difference, there might indeed be a network misconfiguration, but if the network rate seems acceptable compared to those of your other systems, we should suspect a client problem here!

- System overloaded, e.g. short on memory (high paging rates) or CPU?
- Stale NFS volumes involved and the client has to wait for the timeout?

If in doubt please post the statistics from "--- SCHEDULEREC STATUS BEGIN" up to "--- SCHEDULEREC STATUS END"
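
The comparison of the two rates can also be scripted. The following is only a minimal sketch: it assumes the status lines have the layout of the log the author posts in the next reply (date, time, label, colon, value), and the helper names `parse_status_block` and `leading_number` are made up for this illustration.

```python
import re

# Sample lines copied from the scheduler log the author posts in the next reply.
SAMPLE_LOG = """\
11/15/10   22:32:06 --- SCHEDULEREC STATUS BEGIN
11/15/10   22:32:06 Total number of objects inspected:  788,139
11/15/10   22:32:06 Total number of objects backed up:      144
11/15/10   22:32:06 Network data transfer rate:        10,305.46 KB/sec
11/15/10   22:32:06 Aggregate data transfer rate:          8.65 KB/sec
11/15/10   22:32:06 Elapsed processing time:           02:59:55
11/15/10   22:32:06 --- SCHEDULEREC STATUS END
"""

# Two timestamp tokens (date, time), then "label: value".
STAT_LINE = re.compile(r"^\S+\s+\S+\s+(?P<label>.+?):\s+(?P<value>\S.*?)\s*$")


def parse_status_block(log_text):
    """Return the label -> value pairs of the last SCHEDULEREC STATUS block."""
    stats, inside = {}, False
    for line in log_text.splitlines():
        if "SCHEDULEREC STATUS BEGIN" in line:
            stats, inside = {}, True
        elif "SCHEDULEREC STATUS END" in line:
            inside = False
        elif inside:
            match = STAT_LINE.match(line)
            if match:
                stats[match.group("label")] = match.group("value")
    return stats


def leading_number(value):
    """Convert a value such as '10,305.46 KB/sec' to the float 10305.46."""
    return float(value.split()[0].replace(",", ""))


stats = parse_status_block(SAMPLE_LOG)
network = leading_number(stats["Network data transfer rate"])
aggregate = leading_number(stats["Aggregate data transfer rate"])
ratio = network / aggregate
print(f"network {network} KB/sec, aggregate {aggregate} KB/sec, ratio {ratio:.0f}x")
```

A ratio in the thousands, as with these sample lines, means the network was busy for only a tiny part of the run, which points at the client rather than the network.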

wmp
 

Author Comment

by:virgo0880
I checked the NIC speed; it is set to auto-negotiation.

Hi wmp,

Please find the output :

11/15/10   22:32:06 --- SCHEDULEREC STATUS BEGIN
11/15/10   22:32:06 Total number of objects inspected:  788,139
11/15/10   22:32:06 Total number of objects backed up:      144
11/15/10   22:32:06 Total number of objects updated:          0
11/15/10   22:32:06 Total number of objects rebound:          0
11/15/10   22:32:06 Total number of objects deleted:          0
11/15/10   22:32:06 Total number of objects expired:         32
11/15/10   22:32:06 Total number of objects failed:           0
11/15/10   22:32:06 Total number of bytes transferred:   91.23 MB
11/15/10   22:32:06 Data transfer time:                    9.06 sec
11/15/10   22:32:06 Network data transfer rate:        10,305.46 KB/sec
11/15/10   22:32:06 Aggregate data transfer rate:          8.65 KB/sec
11/15/10   22:32:06 Objects compressed by:                    0%
11/15/10   22:32:06 Elapsed processing time:           02:59:55
11/15/10   22:32:06 --- SCHEDULEREC STATUS END
11/15/10   22:32:06 --- SCHEDULEREC OBJECT END MIS_UNIX_XXXX 11/15/10   19:30:00
11/15/10   22:32:06 Scheduled event 'MIS_UNIX_XXXX' completed successfully.
11/15/10   22:32:06 Sending results for scheduled event 'MIS_UNIX_XXXX'.
11/15/10   22:32:06 Results sent to server for scheduled event 'MIS_UNIX_XXXX'.

 
LVL 25

Expert Comment

by:madunix
On the client, check dsmsched.log and dsmerror.log. Clear the client logs, take a manual backup and see what happens. Also check "q act"; I usually prefer to watch what happens on the server side with "dsmadmc -console".
If you are using the TSM journal, also verify that the journal service is operational.
 
LVL 68

Expert Comment

by:woolmilkporc
OK,

as you can see your network transfer rate is not the cause of your issue.

10 MB/sec is fantastic for a 100 Mb network, and is still tolerable (although not really good) for a GE network.
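
Both rates can be re-derived from the other figures in the posted statistics. This is a rough check only; it assumes that the client counts 1 MB as 1024 KB and that the aggregate rate is simply bytes transferred divided by elapsed time, which is consistent with the numbers but is not taken from any documentation quoted in this thread.

```python
def hms_to_seconds(hms):
    """Convert 'HH:MM:SS' to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Figures from the posted SCHEDULEREC STATUS block.
transferred_kb = 91.23 * 1024               # "91.23 MB", assuming 1 MB = 1024 KB
transfer_sec = 9.06                         # "Data transfer time"
elapsed_sec = hms_to_seconds("02:59:55")    # "Elapsed processing time"

network_rate = transferred_kb / transfer_sec    # KB/sec while data was moving
aggregate_rate = transferred_kb / elapsed_sec   # KB/sec over the whole run
idle_share = 1 - transfer_sec / elapsed_sec     # share of the run spent not transferring

print(f"network   {network_rate:,.0f} KB/sec")
print(f"aggregate {aggregate_rate:.2f} KB/sec")
print(f"time not spent transferring: {idle_share:.2%}")
```

The recomputed values land close to the reported 10,305.46 KB/sec and 8.65 KB/sec, and they show that more than 99.9% of the three hours was spent on something other than moving data.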

I'd rather suspect that the number of files to be inspected is the culprit!

Nearly 800,000 files is not that uncommon, but on the other hand, if your client machine is a bit short on memory or CPU, inspecting this many files can be time consuming.

What are the values of your other, well-performing machines? Is there a comparably high number of files, and what are the transfer rates?

Does the client show this bad performance consistently? If not, there could also be other effects like media waits or the like.

Anyway, should it turn out that the number of files is actually the cause, you could consider implementing journal based backup!

wmp
 

Author Comment

by:virgo0880
I don't think CPU/memory is the issue, as this system (IBM,7038-6M2) has 4 CPUs and 24 GB of memory. The other, well-performing machines have values such as 1431585, 1325417 etc. Yes, I have taken an average over the last 10 days and found that it takes approx. 3 hours on a daily basis, even though the number of files to be backed up is very small. The other machines complete their backups in 7-10 minutes. This is AIX 5.3.

Thanks
virgo
 
LVL 68

Expert Comment

by:woolmilkporc
I fear you're a bit unclear - and a bit reticent!

1431585, 1325417 etc

What are these values? Number of files ...?

The number of files actually backed up is by no means a criterion here (as your network seems to work well), it's the number of files inspected which counts!

What are the transfer rates (network/aggregated) of those well-performing machines?
Does the slow machine have to back up NFS/automount volumes?
If so, what is your NFS performance in general, apart from TSM?

Do you have I/O waits, maybe due to disks being slower than the ones of the other machines? Or is it all the same SAN?



 

Author Comment

by:virgo0880
Yes, these are the numbers of files examined; the files actually backed up are far fewer, say 150-200 files, for some of the other nodes. Also, there are no NFS mount points configured for the backup; only the OS filesystem backup is scheduled.
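
With the file counts confirmed, the inspection throughput of the slow node can be set against a well-performing one. This is a back-of-the-envelope sketch: the 10-minute run time for the fast node is an assumption (the upper end of the 7-10 minutes mentioned above), so the real gap is probably even larger.

```python
# Slow node: figures from the posted status block.
slow_inspected = 788_139
slow_elapsed_sec = 2 * 3600 + 59 * 60 + 55      # 02:59:55

# Fast node: object count quoted in the thread; run time ASSUMED to be 10 minutes.
fast_inspected = 1_431_585
fast_elapsed_sec = 10 * 60

slow_rate = slow_inspected / slow_elapsed_sec   # objects inspected per second
fast_rate = fast_inspected / fast_elapsed_sec
slowdown = fast_rate / slow_rate

print(f"slow node: {slow_rate:.0f} objects/sec")
print(f"fast node: {fast_rate:.0f} objects/sec")
print(f"the slow node inspects about {slowdown:.0f} times fewer objects per second")
```

A gap of this size on comparable hardware suggests a per-object overhead on the slow node rather than a difference in the amount of data.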

virgo
 
LVL 68

Expert Comment

by:woolmilkporc
So there are not many options left.

- Bad disk I/O performance (you didn't answer my question about that above).
-- Record an iostat during backup

- Heavy batch jobs in parallel to the backup (what's the task of the slow system anyway?)
-- Record a vmstat during backup

- Other hardware/software issues which might be reflected in the error log.
-- Examine errpt

- TSM client configuration
-- Compare dsm.sys/opt, TSM node definition with the corresponding data of the "good" systems.
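
The last point, comparing the option files, could be sketched as follows. This is an illustration under assumptions: it treats an option file as flat "name value" lines with `*` starting a comment, which ignores the server stanzas in dsm.sys; the sample texts mirror the two dsm.opt files the author posts later in this thread; and `parse_options` and `diff_options` are hypothetical helper names.

```python
def parse_options(text):
    """Map lower-cased option names to their values, skipping blanks and '*' comments."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("*"):
            continue
        parts = line.split(None, 1)
        options[parts[0].lower()] = parts[1].strip() if len(parts) > 1 else ""
    return options


def diff_options(good_text, slow_text):
    """Report options that exist on only one side or differ in value."""
    good, slow = parse_options(good_text), parse_options(slow_text)
    return {
        "only_in_slow": sorted(set(slow) - set(good)),
        "only_in_good": sorted(set(good) - set(slow)),
        "changed": sorted(k for k in set(good) & set(slow) if good[k] != slow[k]),
    }


# Samples mirroring the dsm.opt files posted later in this thread.
GOOD_DSM_OPT = """\
*  leading asterisk (*).
* SErvername       A server name defined in the dsm.sys file
"""

SLOW_DSM_OPT = """\
* SErvername       A server name defined in the dsm.sys file
Servername                      TSM
Tracefile   /tmp/Encrtrace.out
traceflags  service
tracemax 1024
"""

result = diff_options(GOOD_DSM_OPT, SLOW_DSM_OPT)
print(result)
```

Option names are compared case-insensitively because the files in this thread mix spellings such as `Tracefile` and `traceflags`.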

wmp

 

Author Comment

by:virgo0880
Sorry for the delayed response, I will check these things and get back to you.

Thanks
virgo
 

Author Comment

by:virgo0880
Hi, I have checked the NIC settings on the switch and on the system; both are set to 1 Gb full duplex. But when I compare dsm.opt with that of one of my working systems, here is the difference:

Working system : dsm.opt
$ cat dsm.opt
************************************************************************
* ADSTAR Distributed Storage Manager                                   *
*                                                                      *
* Sample Client User Options file for AIX and SunOS (dsm.opt.smp)      *
************************************************************************

*  This file contains an option you can use to specify the ADSM
*  server to contact if more than one is defined in your client
*  system options file (dsm.sys).  Copy dsm.opt.smp to dsm.opt.
*  If you enter a server name for the option below, remove the
*  leading asterisk (*).

*  For information about additional options you can set in this file,
*  see the options.doc file in the directory where ADSM was installed.

************************************************************************

* SErvername       A server name defined in the dsm.sys file

The node on which I have the problem has this dsm.opt:

* SErvername       A server name defined in the dsm.sys file
Servername                      TSM
Tracefile   /tmp/Encrtrace.out
traceflags  service
tracemax 1024

What are these extra options (tracefile, traceflags, tracemax)? Are they the reason the backups are taking more time?

thanks
virgo
 
LVL 68

Accepted Solution

by:
woolmilkporc earned 250 total points
Yes, of course!

You're running a fat service trace with each backup, which costs a lot of time (and disk space).

Check /tmp/Encrtrace.out! This file should be rather big - TRACEMAX is set to 1 GB.

Take out these three options if you don't need the trace and you'll see.

Btw. for completeness you should add the servername to dsm.opt of the "working" system. Although this value defaults to the first "Servername" entry in dsm.sys, it's better to have it in dsm.opt as well.
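
Instead of deleting the three trace options, they can be commented out with a leading asterisk, the comment marker the sample dsm.opt itself uses. The sketch below only transforms text and leaves reading and writing the real file to the reader; `disable_trace_options` is a hypothetical helper name, and the sample is the dsm.opt of the slow node as posted above.

```python
TRACE_OPTIONS = {"tracefile", "traceflags", "tracemax"}


def disable_trace_options(text):
    """Return the option file text with the trace options commented out."""
    result = []
    for line in text.splitlines():
        stripped = line.strip()
        first_word = stripped.split(None, 1)[0].lower() if stripped else ""
        if first_word in TRACE_OPTIONS:
            result.append("* " + line)   # a leading asterisk turns the line into a comment
        else:
            result.append(line)
    return "\n".join(result) + "\n"


# dsm.opt of the slow node, as posted in this thread.
SLOW_DSM_OPT = """\
* SErvername       A server name defined in the dsm.sys file
Servername                      TSM
Tracefile   /tmp/Encrtrace.out
traceflags  service
tracemax 1024
"""

cleaned = disable_trace_options(SLOW_DSM_OPT)
print(cleaned)
```

Keeping the lines as comments makes it easy to switch the trace back on if support ever asks for it.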





 

Author Comment

by:virgo0880
I don't see a /tmp/Encrtrace.out file on the system. Also, if I remove these options, do I need to restart the client scheduler for the changes to take effect?

After removing the options, can I check from the command line that the backups no longer take so long? We have scheduled incremental backups for this system; what command should I run to check whether the backup finishes earlier?

thanks
virgo
 
LVL 68

Expert Comment

by:woolmilkporc
You must restart the client scheduler.

To check the effect just examine the statistics at the end of the scheduler log,
that's the easiest way.
If you didn't configure SCHEDLOGNAME in dsm.sys
it defaults to /usr/tivoli/tsm/client/ba/bin/dsmsched.log (or .../bin64/... with the 64bit client).
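
To compare runs before and after the change, the elapsed times can be pulled out of the scheduler log. A minimal sketch, assuming the "Elapsed processing time" line looks like the one posted earlier in this thread; the second sample line is hypothetical and only stands in for a later, faster run.

```python
import re

# Default log location named above; pass your own path if SCHEDLOGNAME is set.
DEFAULT_SCHEDLOG = "/usr/tivoli/tsm/client/ba/bin/dsmsched.log"

ELAPSED = re.compile(r"Elapsed processing time:\s+(\d+):(\d{2}):(\d{2})")


def elapsed_seconds_per_run(log_text):
    """Return the elapsed time of every run found in the log text, in seconds."""
    return [int(h) * 3600 + int(m) * 60 + int(s) for h, m, s in ELAPSED.findall(log_text)]


SAMPLE_LOG = """\
11/15/10   22:32:06 Elapsed processing time:           02:59:55
11/16/10   19:39:12 Elapsed processing time:           00:09:12
"""
# The first line is from this thread; the second line is a HYPOTHETICAL later run.

runs = elapsed_seconds_per_run(SAMPLE_LOG)
print(runs)

# On a real client one would read the log file instead, for example:
#   with open(DEFAULT_SCHEDLOG) as handle:
#       runs = elapsed_seconds_per_run(handle.read())
```

The last entry of the returned list belongs to the most recent run, so it is the one to compare against the earlier three-hour runs.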
 

Author Comment

by:virgo0880
OK, I have removed the trace options. Now, can I check manually by issuing the backup command on the system, instead of waiting until the evening when the scheduled backup starts?

Regards
virgo
 
LVL 68

Expert Comment

by:woolmilkporc
Yes, of course.

Issue "dsmc i"

If you want a more verbose log in a file and watch the protocol simultaneously at the console issue

"dsmc i -verbose | tee /tmp/dsmincr.log" (filename is just an example)

 

Author Comment

by:virgo0880
I ran the backup again and now it takes much less time; in particular, one of the filesystems used to take 2 hours and now completed within 4 minutes. I will monitor the backup today and see how long it takes to back up the whole system.

Thanks, wmp, for your great help.

virgo
 

Author Closing Comment

by:virgo0880
OK