Solved

Arcserve 11.5 - Scheduling Multiple backup jobs - Order of execution

Posted on 2006-07-22
Last Modified: 2008-01-09
Using Arcserve 11.5 on W2K3 with Ultrium 2 LTO library.

I have a GFS scheme that does daily differentials with Friday fulls, backing up 4 NAS boxes plus numerous servers and the odd workstation. I've scheduled 8 jobs which together back up everything needed in approximately 70 hours, which just nudges into Monday morning. The separate jobs mean that if a job fails, it can be rerun in isolation without running another 70-hour job!

The jobs are scheduled at hourly intervals, with the first at 1700 and the last at 2359, and are in a specific order (most critical data first). The daily differential jobs usually take between an hour and an hour and a half, so sometimes a job is scheduled to start whilst another job is in progress, and ARCserve logs the fact that the job couldn't start every hour until it is able to begin. On weekdays this isn't a problem: the jobs run in order in most cases, and there is rarely more than one job waiting to run whilst a job is in progress.

My issue is with the full backups. The first job takes about 15 hours, and after 7 of those hours have elapsed there are 7 backups waiting to run whilst the first is still in progress. This gives me various problems.

1) sometimes, some of the jobs never start all weekend.
2) when the first job finishes the next in the queue is not necessarily the one that runs next and my careful job ordering goes right out of the window.


So my question is:
Faced with multiple jobs that have passed their scheduled start time, in what order will ARCserve start the waiting jobs, and why will it sometimes give up trying to start some jobs altogether?
Question by:jahboite
9 Comments
 
LVL 22

Expert Comment

by:dovidmichel
ID: 17162195
There is no way to determine which job will get picked first when there are multiple jobs ready to go. In any case, all of the jobs should run.

Set the Job Engine in debug mode and perhaps something will show up in the activity log indicating why those jobs are not running.

If not already installed, install ARCserve 11.5 SP1 and Device Support Update 6. If your NAS is OnStor, there is an update for that as well.
http://supportconnectw.ca.com/public/storage/downloads/nt/115/basbr115-solpatch.asp

 
LVL 1

Accepted Solution

by:markthomasrosenecker (earned 500 total points)
ID: 17237998
With ARCserve 11 (and pretty much any version since 2000, or perhaps 6.5), you can set scheduling priority for the nodes. This means that you can have a SINGLE job, rather than many, for all of your hosts, and you can set the most important ones to go first. Additionally, since you're using v11.5, you have the option of multiplexing your backups: having more than one client stream data to the backup server at one time, which significantly decreases the backup window and maximizes network and tape drive utilization. LTO drives (or pretty much any tape drive, for that matter) like to run at full speed, with data streaming to them constantly. If your single client hiccups or stutters and doesn't send data as fast as the drive wants it, the drive has to stop, rewind, reposition, and then begin writing again. This can add a very significant amount of time to the individual backup job. Having multiple servers sending data at the same time generally resolves this slowdown, which may decrease that 70-hour backup window.

How much data are you backing up nightly, and what kind of networking equipment do you have (100Mb, 1000Mb, etc?)


          --Rosey
 
LVL 22

Expert Comment

by:dovidmichel
ID: 17238683
Source Priority applies to multiple targets within the same job and will not affect the order in which multiple jobs in the queue are run.

 
LVL 1

Expert Comment

by:markthomasrosenecker
ID: 17241312
Yes, that was my point.  Combine all of the jobs into one main job, then set the source priority, so that the important systems get backed up first.

If a specific target is missed, just do a one-off backup job after the main job is done, rather than auto-rescheduling it to run the full job again.
 
LVL 12

Author Comment

by:jahboite
ID: 17253823
Thank you for your input, folks. Very helpful.

dovidmichel:
I think you're right that there's no way to choose which job will get picked first when there are multiple jobs that have reached their scheduled start time. I'm still trying to work out whether ARCserve chooses based on certain criteria, such as the lowest Job No. I haven't been able to find anything written down so far, so I'll collect some more data and see if I can spot a pattern.
Thanks also for putting me on to the update 6.

Rosey:
The reason I've got multiple jobs is basically because we've moved from ARCserve 2000 and a 100GB/tape native AIT library (and not backing up anywhere near as much as we needed to) to 11.5 and a 200GB/tape native LTO2 library.  I'm still getting to grips with the whole thing, and having multiple jobs means more flexibility when things go wrong.
Having said that, I think that I've ironed out most of our issues which basically arose because of the huge amounts of superfluous data, so I'm thinking that your suggestion of 1 job is more practicable than it was at the start. I'm going to try it next full backup!

I'm regularly archiving data as it becomes redundant so a full backup is currently just over 1.3TB in 83 hours!

I'm also very interested in the multiplexing since you suggest the tape library will thank me for it like a dog with two dicks!  I'd kind of dismissed the idea of multiplexing partly because I thought the NIC on the backup server (which due to constraints is also storage for 0.6TB of live data) couldn't cope and partly because I don't currently know enough about it.  But I'm going to make the effort to find out more.
Most of the targets are on 1000Mb NICs and we've got two 1000Mb switches (as well as two 100Mb ones) - I'm going to have to make sure of this and perhaps put all targets through the same switch if they're not already.
The one backup target which has a SCSI pipe straight to the tape library is reported as backed up at a rate of more than 1000MB/min. Other targets are lucky to achieve 300MB/min, so if the NIC on that machine could handle it, I could possibly see a 45-hour reduction in time.
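To put the figures above on a common scale, here is a rough Python sketch that converts the reported MB/min rates to GB/hr and works out the average rate implied by 1.3TB in 83 hours. It assumes decimal units (1 GB = 1000 MB) and that the reported rates are sustained averages; the function and variable names are made up for illustration.

```python
# Rough unit conversion for the rates quoted in this comment.
# Assumption: decimal units throughout (1 GB = 1000 MB, 1.3 TB = 1300 GB).

def mb_per_min_to_gb_per_hr(mb_per_min: float) -> float:
    """Convert a backup rate from MB/min to GB/hr."""
    return mb_per_min * 60 / 1000

FULL_BACKUP_GB = 1300    # "just over 1.3TB"
OBSERVED_HOURS = 83      # duration of the current full backup

scsi_rate = mb_per_min_to_gb_per_hr(1000)   # directly attached target
lan_rate = mb_per_min_to_gb_per_hr(300)     # typical network target

# Average rate actually achieved over the whole full backup.
average_gb_per_hr = FULL_BACKUP_GB / OBSERVED_HOURS
average_mb_per_min = average_gb_per_hr * 1000 / 60

print(f"SCSI target: {scsi_rate:.0f} GB/hr")
print(f"LAN target:  {lan_rate:.0f} GB/hr")
print(f"Overall average: {average_gb_per_hr:.1f} GB/hr "
      f"({average_mb_per_min:.0f} MB/min)")
```

The overall average comes out a little below the 300MB/min that the network targets manage at best, which fits the picture of the network targets dominating the total time.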

Thanks again for the info and I'll post the results of your suggestions when I've had a go.
 
LVL 1

Expert Comment

by:markthomasrosenecker
ID: 17254596
Theoretically, gigabit ethernet should max out at 450GB/hr, and assuming only 30% efficiency (which is just about right for ethernet), you should still get about 135GB/hr.  Your LTO-2 tape drive can max out (for very large database files, streamed continuously without delay) at about 110GB/hr, so network bandwidth shouldn't be the issue there.  
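The arithmetic behind those figures can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the 30% efficiency and the 110GB/hr drive rate are the estimates quoted above, not measured values, and decimal units are assumed.

```python
# Back-of-the-envelope check of network vs. tape drive throughput.
# Assumptions: decimal units (1 GB = 1e9 bytes); the 30% efficiency and
# 110 GB/hr drive rate are the estimates from this comment, not measurements.

GIGABIT_BITS_PER_SEC = 1_000_000_000
ETHERNET_EFFICIENCY = 0.30
LTO2_GB_PER_HR = 110

def line_rate_gb_per_hr(bits_per_sec: float) -> float:
    """Convert a raw line rate in bits/s to GB/hr."""
    bytes_per_sec = bits_per_sec / 8
    return bytes_per_sec * 3600 / 1e9

theoretical = line_rate_gb_per_hr(GIGABIT_BITS_PER_SEC)
usable = theoretical * ETHERNET_EFFICIENCY
network_is_bottleneck = usable < LTO2_GB_PER_HR

print(f"Gigabit theoretical: {theoretical:.0f} GB/hr")
print(f"At 30% efficiency:   {usable:.0f} GB/hr")
print(f"Network slower than tape drive? {network_is_bottleneck}")
```

Even at the pessimistic efficiency figure, the network estimate stays above the drive's rate, which is the point being made: the link itself should not be what limits the backup.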

Send as much as you can to it as fast as you can.  Backup servers are meant to be punished and pummelled whenever possible.  :-)

I always tell my network engineers that if they screw anything up, I'll find out by the following morning!  :-)

Let us know how it goes, and if you need any help configuring the job, feel free to ask.  Basically, though, you add all of your clients, set the multiplexing level (defaults to 4, and that's the max, I believe, without purchasing more licenses for it), and then set the source priority, and you should be golden.
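To illustrate the idea of one job with a source priority and a multiplexing level of 4, here is a toy Python simulation. It is not how ARCserve schedules streams internally and uses no ARCserve API; the source names, priorities and per-source durations are all hypothetical, and it assumes each source takes the same time regardless of how many other streams are running (in reality they share network and drive bandwidth).

```python
import heapq
from typing import NamedTuple

class Source(NamedTuple):
    name: str
    priority: int   # assumed convention: lower number is backed up earlier
    hours: float    # hypothetical duration for this source on its own

def simulate(sources, streams=4):
    """Assign sources to streams in priority order.

    Returns {name: (start_hour, finish_hour)}. Each time a stream
    frees up, the next source in priority order starts on it.
    """
    free_at = [0.0] * streams          # min-heap of stream free times
    heapq.heapify(free_at)
    schedule = {}
    for src in sorted(sources, key=lambda s: s.priority):
        start = heapq.heappop(free_at)  # earliest available stream
        finish = start + src.hours
        schedule[src.name] = (start, finish)
        heapq.heappush(free_at, finish)
    return schedule

# Hypothetical sources, most critical first.
sources = [
    Source("nas1", 1, 6.0), Source("nas2", 2, 4.0),
    Source("nas3", 3, 5.0), Source("nas4", 4, 3.0),
    Source("srv1", 5, 2.0), Source("srv2", 6, 1.0),
]

schedule = simulate(sources, streams=4)
for name, (start, finish) in schedule.items():
    print(f"{name}: starts at {start:.0f}h, finishes at {finish:.0f}h")
print(f"Total: {max(f for _, f in schedule.values()):.0f}h with 4 streams, "
      f"{sum(s.hours for s in sources):.0f}h one at a time")
```

The four highest-priority sources start immediately, and the lower-priority ones pick up streams as they free, so the ordering is preserved without any queue of separate jobs.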

Theoretical finish time should be within 11 hours, but real-world situations dictate that it'll probably take you closer to 20-30.  Still considerably better than the 83 it is/was taking.
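As a rough check on those numbers, a short Python sketch, assuming a full backup of 1300GB (decimal units) and the drive streaming at the 110GB/hr estimated earlier with no compression:

```python
# Ideal backup window if the tape drive streams flat out the whole time.
# Assumptions: 1.3 TB = 1300 GB, 110 GB/hr drive rate, no compression.

FULL_BACKUP_GB = 1300
LTO2_GB_PER_HR = 110

ideal_hours = FULL_BACKUP_GB / LTO2_GB_PER_HR

def implied_rate(hours: float) -> float:
    """Average GB/hr needed to finish the full backup in `hours`."""
    return FULL_BACKUP_GB / hours

print(f"Ideal window: {ideal_hours:.1f} hours")
for hours in (20, 30, 83):
    print(f"{hours} hours implies an average of {implied_rate(hours):.1f} GB/hr")
```

At native speed the ideal window comes out a little under 12 hours, in the same region as the figure quoted above, and a 20 to 30 hour real-world run corresponds to the drive averaging somewhere between roughly 40% and 60% of its streaming rate.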

Good luck!
 
LVL 22

Expert Comment

by:dovidmichel
ID: 17257600
As for picking from multiple jobs all ready to go in the queue, there is no order. It is totally random and just depends on which one the Job Engine happens to hit next. Having a priority order is already on the suggestion request list.

In general, multiplexing will increase the total throughput. However, finding the best setting takes some experimentation with how many targets are used at the same time; too many and throughput goes down. Also, from what I have seen, multiplexing has its own overhead, which will take up around 10% and sometimes as much as 20% of the space on tape.
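To get a feel for what that overhead means in tapes, here is a small Python sketch. It assumes the 1.3TB full backup and 200GB native LTO2 tapes mentioned earlier in the thread, ignores compression, and treats the overhead as a fraction of each tape's capacity that is lost to multiplexing; that interpretation is an assumption.

```python
import math

# Estimate tapes needed for a full backup with multiplexing overhead.
# Assumptions: 1300 GB of data, 200 GB native capacity per tape, no
# compression, overhead modelled as lost capacity on every tape.

def tapes_needed(data_gb: float, overhead: float, tape_gb: float = 200) -> int:
    """Number of tapes required when `overhead` (0.0-1.0) of each tape is lost."""
    usable_per_tape = tape_gb * (1 - overhead)
    return math.ceil(data_gb / usable_per_tape)

for overhead in (0.0, 0.10, 0.20):
    print(f"{overhead:.0%} overhead: {tapes_needed(1300, overhead)} tapes")
```

Under these assumptions the overhead costs one or two extra tapes per full backup, which is worth weighing against the time saved.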
 
LVL 12

Author Comment

by:jahboite
ID: 17258406
It does seem random, but the last two weekends, the jobs have started in the same (random) order. So maybe it isn't quite random....
 
LVL 12

Author Comment

by:jahboite
ID: 17451952
This thread seems to have two streams, one of which slowed to a trickle just as another trickle was swelling magnificently!

So to the original question (which technically is two questions):

"Faced with multiple jobs that have passed their scheduled start time, in what order will ARCserve start the waiting jobs, and why will it sometimes give up trying to start some jobs altogether?"

This stream has run dry because I switched to using a multiplex job in place of the multiple jobs when I opened this thread.  Hence I never managed to get much data to see if a pattern emerged.  I also stopped seeing jobs that decided not to run; maybe because they knew I was on to them or maybe because I was fiddling too much...
So this stream ends and the questions remain unanswered.
But the points will not die.

For there is that other stream:

markthomasrosenecker dropped a bombshell and my eyes widened to the possibilities of multiplexing.  And what a difference it made!
The first week I tested it and was overjoyed to see the 1.3TB rocket on to tape in 31 hours (down from 80-odd!). In fact it was better than that, because all but one of the machines finished inside 24 hours (with only four multiplexing streams at one time), with one machine taking just over 30 hours.
During the week after that we ditched Symantec AV and rolled out eTrust which seems to be much less of a resource hog!
The following weekend; 1.3TB in 18 hours!

And that, my friends, is what I consider a right royal result.

Which means markthomasrosenecker is a worthy winner of the points, and thank you sir!
Thankee to all who participated!

Question has a verified solution.