NetBackup Master and media servers

Hi Friends,

I need some suggestions on NetBackup. We have been using it for many years, but lately we are facing issues with Exchange backups and VM backups (snapshot errors, error 130, error 156, partial backups).
We consulted Symantec, and they suggest that we add media servers. Right now we have 3 master servers in 3 regions, and each one also acts as a media server.
Do we really need to add media servers? I am attaching an Excel file with job count details. Let me know if you need more details about our environment.
Duncan Meyers Commented:
That's very slow. You should be able to complete 537GB in a couple of hours, assuming LTO-5 or LTO-6 tapes. Even LTO-4 should complete within a couple of hours. Even if you were running your backups across a single 1Gb Ethernet link, they should finish within 3 or 4 hours.

There could be many causes of the poor performance:
- file servers with millions of small files are *always* slow
- Exchange brick-level backups are *always* slow. Use a database backup and Recovery Storage Groups instead.
- defrag Windows servers. File system fragmentation can really hurt backups. Diskeeper is available here:
- For Windows backups, try and group a number of clients into the same policy so you can get some concurrency happening
- read this:
- try these tips:

Test read-speed from disk on media server to verify that data streams can be read as fast as the tape can write. Use the bpbkar test as per the Performance Tuning Guide:

Choose a folder/file system of at least 2GB in size.

1. Turn on the legacy bpbkar log by ensuring that the bpbkar directory exists.
2. Set logging level to 1.
3. Enter the following:
UNIX:
/usr/openv/netbackup/bin/bpbkar -nocont -dt 0 -nofileinfo -nokeepalives file_system > /dev/null
Where file_system is the path being backed up.
Windows:
install_path\NetBackup\bin\bpbkar32 -nocont X:\ > NUL
Where X:\ is the path being backed up.
4. Check how long it took NetBackup to move the data from the client disk:
UNIX: The start time is the first PrintFile entry in the bpbkar log. The end time is the entry "Client completed sending data for backup." The amount of data is given in the entry "Total Size."
Windows: Check the bpbkar log for the entry "Elapsed time."
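If you want a rough cross-check that does not depend on NetBackup at all, something like the sketch below reads every file under a folder, throws the data away and reports MB/s. This is not bpbkar and is not from the Performance Tuning Guide; the function name is made up for illustration. Run it against a folder that has not been read recently, otherwise the OS file cache can inflate the number.

```python
import os
import sys
import time


def measure_read_throughput(root, chunk_size=1024 * 1024):
    """Read every regular file under root and discard the data.

    Returns (total_bytes, elapsed_seconds, mb_per_second).
    Hypothetical helper for illustration, not part of NetBackup.
    """
    total_bytes = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                # Skip sockets, FIFOs and broken links.
                continue
            try:
                with open(path, "rb") as handle:
                    while True:
                        chunk = handle.read(chunk_size)
                        if not chunk:
                            break
                        total_bytes += len(chunk)
            except OSError:
                # Skip files that are locked or unreadable.
                continue
    elapsed = time.perf_counter() - start
    mb_per_second = (total_bytes / 1_000_000) / elapsed if elapsed > 0 else 0.0
    return total_bytes, elapsed, mb_per_second


if __name__ == "__main__" and len(sys.argv) > 1:
    size, seconds, rate = measure_read_throughput(sys.argv[1])
    print(f"Read {size / 1e9:.2f} GB in {seconds:.1f} s ({rate:.1f} MB/s)")
```

Pass the folder to test as the only argument. If the figure is well below what your tape drive can write, the disk or file system is the first place to look.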


    1. Check client settings:
        - Confirm your data is not thousands of small files; if it is, you need a different solution.
        - Confirm your data is traveling through the backup NIC, and that the NIC is configured as 1gig/auto at both the switch and OS levels.
        - Review your NETWORK_BUFFER_SIZE (Buffer_Size on Windows); the highest value can be 256KB.
        - Logging enabled at verbose 5 can impact performance; lower it to 0.
        - Where is the data stored: DAS, SAN or NAS? For NAS, confirm you are not doing Cross Mount Points for Qtrees. For SAN, check the throughput, look for errors at the switch level, and check which RAID configuration you have. Normally the SAN is never the problem, but don't leave anything unchecked. DAS can be tested with iostat commands on UNIX and Performance Monitor on Windows; a bad drive can give you some pain.
        - Measure the network speed with a simple FTP copy of a 10GB file to your media server; this will tell you how the network is performing. You can also open a ticket with Symantec to run SAS and detect network problems.

    2. Media Server Settings:
        - Check NUMBER_DATA_BUFFERS and SIZE_DATA_BUFFERS; 128 and 256 are the average values, but ensure your shared memory is configured to support them:
NUMBER_OF_DRIVES * MULTIPLEXING * (NUMBER_DATA_BUFFERS * SIZE_DATA_BUFFERS) = TOTAL_MEMORY. If you don't have enough memory, then you cannot set these values.
        - Same deal as the clients: check NIC speed settings and switch configs.
        - Check media server I/O by using dd and mt commands to back up files to your drives, checking each drive in turn; this measures your throughput between media server and drive.
        - Confirm your HBA's speed is set to the highest value supported by your fabric.
        - Search for errors at SAN switch level for your tape drives.
        - With iostat/kstat you can see the speed of all your drives and look for I/O and transport errors.
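To put a number on the shared memory formula above, here is a minimal sketch. It assumes the "256" means a buffer size of 256 KB (262144 bytes), and the drive count and multiplexing level are made-up example values, not figures from this environment.

```python
def shared_memory_bytes(number_of_drives, multiplexing,
                        number_data_buffers, size_data_buffers):
    """TOTAL_MEMORY as given in the formula above, in bytes."""
    return (number_of_drives * multiplexing
            * number_data_buffers * size_data_buffers)


# Example values only: 4 drives and multiplexing of 4 are assumptions.
total = shared_memory_bytes(
    number_of_drives=4,
    multiplexing=4,
    number_data_buffers=128,
    size_data_buffers=256 * 1024,  # assuming "256" means 256 KB
)
print(f"{total} bytes = {total / (1024 * 1024):.0f} MiB")  # 536870912 bytes = 512 MiB
```

If the result is more shared memory than the media server can provide, lower the buffer count or size before changing anything else.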

Duncan Meyers Commented:
Backup environments tend to get neglected as IT environments grow, so backup infrastructure tends to bend, then break under the increased load as years go by. Symantec is almost certainly correct, and there's an easy way to check:

First you need to work out how many TB there are at each site. You also need to know your backup window - this mustn't exceed 12 hours. Once you have that, divide the number of TB by the backup window and multiply by 1000 to get the number of GB per hour (the throughput) you need to achieve to hit the backup window. The next step is to work out how many of the 'bottlenecks' you need to hit the throughput figure, so divide the throughput by:
504 GB/hr for LTO-5 tape drives
576 GB/hr for LTO-6 tape drives
You now have the number of tape drives going absolutely flat-out required to hit the backup window. Make sure you round up.
Compare this number to the number of tape drives you actually have. Now add 50% to that number as you'll not be able to drive tapes flat-out all the time (backup clients being what they are), so you need headroom.
For network links, divide the throughput figure by
180 GB/hr for 1Gb Ethernet
1800 GB/hr for 10Gb Ethernet
You now have the number of NICs and switch ports you need to hit the required throughput.
If you exceed, say, four to eight 1Gb/sec NICs, then it's time to add a media server.
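The arithmetic above can be sketched as follows. The per-drive and per-link rates are the ones quoted in this comment; the 20 TB and 10-hour window are made-up example inputs, and I am reading "add 50%" as applying to the number of drives required, which is an assumption.

```python
import math

# GB/hr figures as quoted in the comment above.
TAPE_GB_PER_HOUR = {"LTO-5": 504, "LTO-6": 576}
NIC_GB_PER_HOUR = {"1GbE": 180, "10GbE": 1800}


def required_throughput(total_tb, window_hours):
    """GB per hour needed to finish total_tb within the backup window."""
    return total_tb * 1000 / window_hours


def drives_needed(throughput, drive_type, headroom=0.5):
    """Return (drives running flat-out, drives including headroom), rounded up."""
    flat_out = math.ceil(throughput / TAPE_GB_PER_HOUR[drive_type])
    return flat_out, math.ceil(flat_out * (1 + headroom))


def nics_needed(throughput, nic_type):
    """NICs and switch ports needed for the throughput, rounded up."""
    return math.ceil(throughput / NIC_GB_PER_HOUR[nic_type])


# Example inputs only: 20 TB at one site, 10-hour window.
throughput = required_throughput(total_tb=20, window_hours=10)
print(throughput)                          # 2000.0 GB/hr
print(drives_needed(throughput, "LTO-5"))  # (4, 6)
print(nics_needed(throughput, "1GbE"))     # 12
print(nics_needed(throughput, "10GbE"))    # 2
```

In this example 12 x 1Gb NICs is past the four-to-eight threshold, so it would be time to add a media server or move to 10Gb.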

Symantec has a nifty solution with their NBU53x0 appliances - they give you de-duped backup to disk with a built-in media server and integrated tape-out, so they solve a bunch of problems at once.
TechJoshi (Author) Commented:
Thanks meyersd! I will check these parameters and get back to you in case of any confusion.

TechJoshi (Author) Commented:
Attached is the data backed up in a 12-hour window. Please advise: is it moderate or high?
Duncan Meyers Commented:
The bottom line, though, is that another media server or an NBU5320 appliance is unlikely to help. If you're only getting 44GB per hour throughput, you have a problem in the backup environment that you need to fix first.
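For reference, a quick sketch of where a figure like 44GB per hour sits against the bottleneck rates. It assumes the 537GB mentioned elsewhere in this thread is what was backed up in the 12-hour window; the 180 and 504 GB/hr rates are the 1Gb Ethernet and LTO-5 figures quoted earlier in the thread.

```python
def observed_gb_per_hour(data_gb, hours):
    """Average throughput actually achieved over the window."""
    return data_gb / hours


# Assumption: 537 GB moved in a 12-hour window.
rate = observed_gb_per_hour(537, 12)
print(f"{rate:.2f} GB/hr")                            # 44.75 GB/hr
print(f"{rate / 180:.1%} of one 1Gb Ethernet link")   # 24.9%
print(f"{rate / 504:.1%} of one LTO-5 drive")         # 8.9%
```

A single link or drive is nowhere near saturated at that rate, which is why adding hardware would not be the first fix.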
TechJoshi (Author) Commented:
Thanks! Very helpful.
What should we do for the servers with thousands of files being backed up every week? I have seen that it takes almost a week for these clients, and the next full schedule ends up waiting on the same client.
Duncan Meyers Commented:
Almost a week?? That's crazy. Can you open another question, please, and we should get to the bottom of it. Please post a link to your new question here.
Question has a verified solution.
