Running parallel robocopy jobs

Hello, I'm a novice PowerShell user. I've put together a script that's an amalgam of several I was able to find, each of which had parts and pieces of what I need it to do. I know it's not elegant :)

So far it:
1) imports a CSV file that contains the name of each share, its source, and its destination
2) calls robocopy to back up the files for each line in the CSV, writing a log file per share into a timestamped log directory
3) evaluates the log and, if anything was copied or errored, runs robocopy one more time to confirm that everything was copied
4) evaluates the results of the second run and, if any files still weren't copied, sends an email alert to our help desk software
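
As an aside to step 3: robocopy also reports much of this through its exit code, which is a bit mask (1 = files copied, 2 = extra files or directories found, 4 = mismatches, 8 = some copies failed, 16 = fatal error). The sketch below is not what the script does, since the script parses the log instead; it only decodes such a code. The function name ConvertFrom-RoboExitCode is made up for illustration.

```powershell
# Hypothetical helper: decode a robocopy exit code into named flags.
function ConvertFrom-RoboExitCode {
    param([int]$ExitCode)

    [pscustomobject]@{
        FilesCopied = [bool]($ExitCode -band 1)   # one or more files were copied
        ExtraFiles  = [bool]($ExitCode -band 2)   # extra files/dirs exist at the destination
        Mismatched  = [bool]($ExitCode -band 4)   # mismatched files/dirs were detected
        SomeFailed  = [bool]($ExitCode -band 8)   # some files/dirs could not be copied
        FatalError  = [bool]($ExitCode -band 16)  # robocopy copied nothing (serious error)
    }
}

# Example: 3 = 1 + 2, i.e. files were copied and extras were found.
$status = ConvertFrom-RoboExitCode -ExitCode 3
$status
```

To obtain the code from Start-Process, the call would need -PassThru in addition to -Wait, after which the returned process object exposes an ExitCode property.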

I have a working copy of that script right now but I need it to run faster since it'll take too long to have all the jobs run in series. I have been struggling to find a way to run the jobs in parallel, though, and that's where I need some help.

The CSV is a file that just has headers named ShareName, Source, and Backup with values like this:
ShareName,Source,Backup
Corporate_Maintenance_Secure,\\fws.blah.com\corporate\Maintenance Secure,\\FWG-Commvault\Share Backups\Corporate Maintenance Secure
Corporate_Safety,\\fws.blah.com\corporate\Safety,\\fwg-commvault\Share Backups\Corporate Safety
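
A minimal sketch of how those rows surface in PowerShell, assuming the header row shown above. It uses ConvertFrom-Csv on an inline string so it can run without the real file (Import-Csv yields the same kind of objects); the variable names and the column check are illustrative additions, not part of the script.

```powershell
# Inline copy of the sample CSV so the sketch runs without shares.csv.
$csvText = @'
ShareName,Source,Backup
Corporate_Maintenance_Secure,\\fws.blah.com\corporate\Maintenance Secure,\\FWG-Commvault\Share Backups\Corporate Maintenance Secure
Corporate_Safety,\\fws.blah.com\corporate\Safety,\\fwg-commvault\Share Backups\Corporate Safety
'@

$shares = @($csvText | ConvertFrom-Csv)

# Fail early if a column the script relies on is missing or misspelled.
$required = 'ShareName', 'Source', 'Backup'
$present  = $shares[0].PSObject.Properties.Name
$missing  = @($required | Where-Object { $present -notcontains $_ })
if ($missing.Count -gt 0) { throw "CSV is missing column(s): $($missing -join ', ')" }

# Each row becomes an object whose properties are named after the headers.
foreach ($share in $shares) {
    '{0}: {1} -> {2}' -f $share.ShareName, $share.Source, $share.Backup
}
```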

The working script (that needs to be run in parallel rather than in serial) is:

Clear-Host
$ErrorActionPreference = "SilentlyContinue"
$DebugPreference = "Continue"
$VerbosePreference = "Continue"



## robocopy_shares.ps1 ########################################################
## Purpose:      Run robocopy to sync network shares to CommVault server
## History:      2015-04-01 - Created
##                         Robocopies, alerts if any files are moved.
##                         Double-checks if files are copied. Not parallel
###############################################################################



## User Supplied Variables
$Shares = import-csv C:\temp\sharemonitor\shares.csv
$BaseLogLocation = "C:\Temp\ShareMonitor"
$CurrentLogLocation = New-Item "$BaseLogLocation\$(get-date -f yyyy-MM-dd-HH)" -ItemType Directory -Force
$PSEmailServer = "mailhost.fws.farweststeel.com"

ForEach ($Share in $Shares) {

$Source = ( '"' + $Share.Source + '"')
$Dest = ( '"' + $Share.Backup + '"')
$ShareName = $Share.ShareName
$LogFile = "$CurrentLogLocation\$($ShareName)_$(get-date -f yyyy-MM-dd-HH-mm).txt"
$RoboSwitches = "/MIR /R:1 /W:1 /Log+:$LogFile"


## Robocopy Run

$robo_Test = Start-Process -ArgumentList "$Source $Dest $RoboSwitches" robocopy.exe -WAIT


## Use Regular Expression to grab the following Table
#               Total    Copied   Skipped  Mismatch    FAILED    Extras
#    Dirs :         1         0         1         0         0         0
#   Files :         1         0         1         0         0         0
$robo_results = Get-Content $LogFile
$robo_fin = $robo_results -match '^(?= *?\b(Total|Dirs|Files)\b)((?!    Files).)*$'

## Convert Table above into an array
$robo_arr = @()
foreach ($line in $robo_fin){
     $robo_arr += $line
}

## Create Powershell object to tally Robocopy results
$row = "" |select COPIED, MISMATCH, FAILED, EXTRAS
$row.COPIED = [int](($robo_arr[1] -split "\s+")[4]) + [int](($robo_arr[2] -split "\s+")[4])
$row.MISMATCH = [int](($robo_arr[1] -split "\s+")[6]) + [int](($robo_arr[2] -split "\s+")[6])
$row.FAILED = [int](($robo_arr[1] -split "\s+")[7]) + [int](($robo_arr[2] -split "\s+")[7])
$row.EXTRAS = [int](($robo_arr[1] -split "\s+")[8]) + [int](($robo_arr[2] -split "\s+")[8])

$Copied = $row.COPIED
$Mismatch = $row.MISMATCH
$Failed = $row.FAILED
$Extras = $row.EXTRAS

## If there are differences, lets run Robocopy one more time
if ( ($COPIED + $MISMATCH + $FAILED + $EXTRAS) -gt 0 ){

    ## Robocopy Run

    $robo_Test = Start-Process -ArgumentList "$Source $Dest $RoboSwitches" robocopy.exe -WAIT

    ## Use Regular Expression to grab the following Table
    #               Total    Copied   Skipped  Mismatch    FAILED    Extras
    #    Dirs :         1         0         1         0         0         0
    #   Files :         1         0         1         0         0         0
    ## $robo_results = $robo_test -match '^(?= *?\b(Total|Dirs|Files)\b)((?!    Files).)*$'
    $robo_results = Get-Content $LogFile
    $pattern = "-match '^(?= *?\b(Total|Dirs|Files)\b)((?!    Files).)*$'"
    $robo_fin = $robo_results -match '^(?= *?\b(Total|Dirs|Files)\b)((?!    Files).)*$'

    ## Convert Table above into an array
    $robo_arr = @()
    foreach ($line in $robo_fin){
         $robo_arr += $line
    }

    ## Create Powershell object to tally Robocopy results
    $row = "" |select COPIED, MISMATCH, FAILED, EXTRAS
    $row.COPIED = [int](($robo_arr[5] -split "\s+")[4]) + [int](($robo_arr[6] -split "\s+")[4])
    $row.MISMATCH = [int](($robo_arr[5] -split "\s+")[6]) + [int](($robo_arr[6] -split "\s+")[6])
    $row.FAILED = [int](($robo_arr[5] -split "\s+")[7]) + [int](($robo_arr[6] -split "\s+")[7])
    $row.EXTRAS = [int](($robo_arr[5] -split "\s+")[8]) + [int](($robo_arr[6] -split "\s+")[8])

    $Copied = $row.COPIED
    $Mismatch = $row.MISMATCH
    $Failed = $row.FAILED
    $Extras = $row.EXTRAS
   
    if ( ($COPIED + $MISMATCH + $FAILED + $EXTRAS) -gt 0 ){
        "Not all data is replicated." >> $Logfile
        $anonUser = "anonymous"
        $anonPass = ConvertTo-SecureString "anonymous" -AsPlainText -Force
        $anonCred = New-Object System.Management.Automation.PSCredential($anonUser, $anonPass)
        $MessageSubject = ("Verify backups for "+$Share.Sharename+"")
        $MessageBody = ("Backups for "+$Share.Sharename+" are not the same size as the data on the share. Please reference the robocopy logs and correct the situation.")
        Send-MailMessage -to "blah@blah.com" -from "Backup Verification <do-not-reply@blah.com>" -subject $MessageSubject -body $MessageBody -Credential $anonCred
    }
    else {
        "All data is replicated." >> $Logfile
        $anonUser = "anonymous"
        $anonPass = ConvertTo-SecureString "anonymous" -AsPlainText -Force
        $anonCred = New-Object System.Management.Automation.PSCredential($anonUser, $anonPass)
        $MessageSubject = ("Verify backups for "+$Share.Sharename+"")
        $MessageBody = ("Backups for "+$Share.Sharename+" are the same size as the data on the share. No actions are needed.")
        Send-MailMessage -to "blah@blah.com" -from "Backup Verification <do-not-reply@blah.com>" -subject $MessageSubject -body $MessageBody -Credential $anonCred
    }

}

}

Please lend a hand with some code that would allow me to run this thing in parallel. Thanks!
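
The two tally blocks in the script differ only in which matched lines they index ([1]/[2] after the first run, [5]/[6] after the second, because /Log+ appends to the same file). A rough sketch of one way to fold them into a single helper follows. It assumes the Dirs and Files summary rows look exactly like the table in the script's comments, with plain integer counts. The function name Get-RoboTally and the sample lines are hypothetical.

```powershell
# Hypothetical helper: tally the Dirs/Files rows of the most recent run in a robocopy log.
function Get-RoboTally {
    param([string[]]$LogLines)

    # Matches e.g. "    Dirs :         1         0         1         0         0         0"
    $pattern = '^\s*(?:Dirs|Files)\s*:\s*(?<Total>\d+)\s+(?<Copied>\d+)\s+(?<Skipped>\d+)' +
               '\s+(?<Mismatch>\d+)\s+(?<Failed>\d+)\s+(?<Extras>\d+)\s*$'

    $rows = @($LogLines | Where-Object { $_ -match $pattern })
    if ($rows.Count -lt 2) { throw 'No robocopy summary rows found.' }

    $tally = [pscustomobject]@{ Copied = 0; Mismatch = 0; Failed = 0; Extras = 0 }

    # With /Log+ the newest run is at the end, so take the last Dirs and Files rows.
    foreach ($row in $rows[-2..-1]) {
        $null = $row -match $pattern
        $tally.Copied   += [int]$Matches['Copied']
        $tally.Mismatch += [int]$Matches['Mismatch']
        $tally.Failed   += [int]$Matches['Failed']
        $tally.Extras   += [int]$Matches['Extras']
    }
    $tally
}

# Made-up log excerpt: two appended runs, the second one copying nothing.
$sampleLog = @(
    '               Total    Copied   Skipped  Mismatch    FAILED    Extras',
    '    Dirs :         1         0         1         0         0         0',
    '   Files :         3         2         1         0         0         0',
    '               Total    Copied   Skipped  Mismatch    FAILED    Extras',
    '    Dirs :         1         0         1         0         0         0',
    '   Files :         3         0         3         0         0         0'
)
$tally = Get-RoboTally -LogLines $sampleLog
$tally
```

In the script this would be called as Get-RoboTally -LogLines (Get-Content $LogFile) after each robocopy run, so the same code serves both passes.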
Asked by: cdummy

jmcg (Owner) commented:
How frequently is the second-chance run of Robocopy both needed and successful?

Do you have some measurement that indicates parallelization would help? You could try splitting your list in half and run two copies of the script simultaneously, one copy against each half of the list (you might have to make sure that the file names used are distinct so the two copies don't interfere with each other). I would not be surprised to hear that the run time of the single job is about the same as the run time when you split it into two parallel jobs. You'll be limited by file system read/write speed or by network transfer bandwidth and those are bottlenecks outside the control of your script.
cdummy (Author) commented:
How frequently? It depends on how active our night shift is at the time. The second run is essentially a verification check to make sure that whatever was copied on the first pass was just new and not something that was refusing to be moved over. In testing it's always successful, but it's not in production right now, either.

Parallelization would definitely help because if the script has to run in serial it takes way too long. Ideally the robocopy kicks off for each of our remote branches and syncs up their data quickly. Our aim is to do that 4 times a day which isn't possible if all the jobs run one after the other.

As far as the other bits and pieces you mentioned, I've got those under control, never fear! They're not a concern. If you have some advice regarding code I'd be very interested in hearing that, though.
jmcg (Owner) commented:
I looked at this website for ideas to try to accomplish what you say you wanted:

http://newsqlblog.com/2012/04/16/concurrency-in-powershell-background-jobs-2/

The blogger, Jon Boulineau, has more posts there that go into more refined mechanisms for PowerShell parallelism.

In the code posted here, I've extracted the main loop of your current script as $ScriptBlock and run it as an asynchronous background job. You can control the number of simultaneously running jobs with $MaxJobs to achieve the degree of parallelism that you want without becoming an overwhelming resource hog.

I don't have the capacity to test this, so it's just a best guess at this point. I tried to keep the scope correct for variables used within the ScriptBlock, so they are either passed as params or, in the case of the mail server name, moved inside the block. In looking at your script, it appeared to me that the logfile separation by sharename and timestamp should prevent collisions and that the background jobs could operate independently, but there may be entanglements I did not spot. Also, this version of the script is fire-and-forget: each job is responsible for its own notification and cleanup.

Clear-Host
$ErrorActionPreference = "SilentlyContinue"
$DebugPreference = "Continue"
$VerbosePreference = "Continue"



## robocopy_shares.ps1 ########################################################
## Purpose:      Run robocopy to sync network shares to CommVault server
## History:      2015-04-01 - Created
##                         Robocopies, alerts if any files are moved.
##                         Double-checks if files are copied. Not parallel
###############################################################################



## User Supplied Variables
$Shares = import-csv C:\temp\sharemonitor\shares.csv
$BaseLogLocation = "C:\Temp\ShareMonitor"
$CurrentLogLocation = New-Item "$BaseLogLocation\$(get-date -f yyyy-MM-dd-HH)" -ItemType Directory -Force

### Control number of simultaneous jobs
$MaxJobCount = 3

### This loop will occur later...
### ForEach ($Share in $Shares) {

### Encapsulate part of script used for separate job
$ScriptBlock = `
{
    Param($Share, $CurrentLogLocation)

#Constant - moved to keep it in scope
$PSEmailServer = "mailhost.fws.farweststeel.com"

$Source = ( '"' + $Share.Source + '"')
$Dest = ( '"' + $Share.Backup + '"')
$ShareName = $Share.ShareName
$LogFile = "$CurrentLogLocation\$($ShareName)_$(get-date -f yyyy-MM-dd-HH-mm).txt"
$RoboSwitches = "/MIR /R:1 /W:1 /Log+:$LogFile"


## Robocopy Run

$robo_Test = Start-Process -ArgumentList "$Source $Dest $RoboSwitches" robocopy.exe -WAIT


## Use Regular Expression to grab the following Table
#               Total    Copied   Skipped  Mismatch    FAILED    Extras
#    Dirs :         1         0         1         0         0         0
#   Files :         1         0         1         0         0         0
$robo_results = Get-Content $LogFile 
$robo_fin = $robo_results -match '^(?= *?\b(Total|Dirs|Files)\b)((?!    Files).)*$'

## Convert Table above into an array
$robo_arr = @()
foreach ($line in $robo_fin){
     $robo_arr += $line
}

## Create Powershell object to tally Robocopy results
$row = "" |select COPIED, MISMATCH, FAILED, EXTRAS
$row.COPIED = [int](($robo_arr[1] -split "\s+")[4]) + [int](($robo_arr[2] -split "\s+")[4])
$row.MISMATCH = [int](($robo_arr[1] -split "\s+")[6]) + [int](($robo_arr[2] -split "\s+")[6])
$row.FAILED = [int](($robo_arr[1] -split "\s+")[7]) + [int](($robo_arr[2] -split "\s+")[7])
$row.EXTRAS = [int](($robo_arr[1] -split "\s+")[8]) + [int](($robo_arr[2] -split "\s+")[8])

$Copied = $row.COPIED
$Mismatch = $row.MISMATCH
$Failed = $row.FAILED
$Extras = $row.EXTRAS

## If there are differences, lets run Robocopy one more time
if ( ($COPIED + $MISMATCH + $FAILED + $EXTRAS) -gt 0 ){

    ## Robocopy Run

    $robo_Test = Start-Process -ArgumentList "$Source $Dest $RoboSwitches" robocopy.exe -WAIT

    ## Use Regular Expression to grab the following Table
    #               Total    Copied   Skipped  Mismatch    FAILED    Extras
    #    Dirs :         1         0         1         0         0         0
    #   Files :         1         0         1         0         0         0
    ## $robo_results = $robo_test -match '^(?= *?\b(Total|Dirs|Files)\b)((?!    Files).)*$'
    $robo_results = Get-Content $LogFile 
    $pattern = "-match '^(?= *?\b(Total|Dirs|Files)\b)((?!    Files).)*$'"
    $robo_fin = $robo_results -match '^(?= *?\b(Total|Dirs|Files)\b)((?!    Files).)*$'

    ## Convert Table above into an array
    $robo_arr = @()
    foreach ($line in $robo_fin){
         $robo_arr += $line
    }

    ## Create Powershell object to tally Robocopy results
    $row = "" |select COPIED, MISMATCH, FAILED, EXTRAS
    $row.COPIED = [int](($robo_arr[5] -split "\s+")[4]) + [int](($robo_arr[6] -split "\s+")[4])
    $row.MISMATCH = [int](($robo_arr[5] -split "\s+")[6]) + [int](($robo_arr[6] -split "\s+")[6])
    $row.FAILED = [int](($robo_arr[5] -split "\s+")[7]) + [int](($robo_arr[6] -split "\s+")[7])
    $row.EXTRAS = [int](($robo_arr[5] -split "\s+")[8]) + [int](($robo_arr[6] -split "\s+")[8])

    $Copied = $row.COPIED
    $Mismatch = $row.MISMATCH
    $Failed = $row.FAILED
    $Extras = $row.EXTRAS
    
    if ( ($COPIED + $MISMATCH + $FAILED + $EXTRAS) -gt 0 ){
        "Not all data is replicated." >> $Logfile
        $anonUser = "anonymous"
        $anonPass = ConvertTo-SecureString "anonymous" -AsPlainText -Force
        $anonCred = New-Object System.Management.Automation.PSCredential($anonUser, $anonPass)
        $MessageSubject = ("Verify backups for "+$Share.Sharename+"")
        $MessageBody = ("Backups for "+$Share.Sharename+" are not the same size as the data on the share. Please reference the robocopy logs and correct the situation.")
        Send-MailMessage -to "blah@blah.com" -from "Backup Verification <do-not-reply@blah.com>" -subject $MessageSubject -body $MessageBody -Credential $anonCred
    }
    else {
        "All data is replicated." >> $Logfile
        $anonUser = "anonymous"
        $anonPass = ConvertTo-SecureString "anonymous" -AsPlainText -Force
        $anonCred = New-Object System.Management.Automation.PSCredential($anonUser, $anonPass)
        $MessageSubject = ("Verify backups for "+$Share.Sharename+"")
        $MessageBody = ("Backups for "+$Share.Sharename+" are the same size as the data on the share. No actions are needed.")
        Send-MailMessage -to "blah@blah.com" -from "Backup Verification <do-not-reply@blah.com>" -subject $MessageSubject -body $MessageBody -Credential $anonCred
    }

}

}
#### end of ScriptBlock

Foreach($Share in $Shares)
{
    Start-Job -ScriptBlock $ScriptBlock -ArgumentList $Share, $CurrentLogLocation

    While( $(Get-Job -State Running | Measure-Object).count -ge $MaxJobs)
    {
        Start-Sleep -Milliseconds 100
    }
} 
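
One caveat with fire-and-forget: jobs created by Start-Job belong to the PowerShell session that started them, so if the script is launched non-interactively (a scheduled task, say) and exits right after the loop, any jobs still running are likely to be ended with it. A minimal sketch of the same throttling pattern with a wait at the end is below. The scriptblock only sleeps as a stand-in for the robocopy work, and the names $work and $jobs are illustrative.

```powershell
$MaxJobs = 2

# Stand-in for the real per-share scriptblock: sleeps briefly, then reports back.
$work = {
    param($Name)
    Start-Sleep -Milliseconds 200
    "finished $Name"
}

$jobs = foreach ($name in 'A', 'B', 'C') {
    # Hold off while the number of running jobs is at the limit.
    while (@(Get-Job -State Running).Count -ge $MaxJobs) {
        Start-Sleep -Milliseconds 100
    }
    Start-Job -ScriptBlock $work -ArgumentList $name
}

# Block until every job is done, collect whatever each one wrote, then clean up.
$results = @($jobs | Wait-Job | Receive-Job)
$jobs | Remove-Job
$results
```

In the script above, the equivalent would be a Get-Job | Wait-Job after the Foreach loop, optionally followed by Receive-Job and Remove-Job.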


cdummy (Author) commented:
Thanks jmcg, but it's not quite 100% yet. It appears to start the first job and then hangs without proceeding to any others. It looks like the first robocopy job does complete successfully. Thanks a ton for assisting on this script. Any ideas what is hanging things up?
jmcg (Owner) commented:
Try it with line 23 (the $MaxJobCount = 3 assignment) changed to:

$MaxJobs = 3

(This is just the sort of thing that happens when one writes code without being in a position to test it.)
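
For anyone wondering why the misnamed variable produced a hang rather than an error: a variable that was never assigned evaluates to $null, and as far as I understand PowerShell's comparison rules, any non-negative number counts as -ge $null, so the While condition stays true even with zero jobs running. The sketch below illustrates that and shows Set-StrictMode turning the silent $null into a catchable error. $NotAssigned is a deliberately unset, made-up name standing in for $MaxJobs.

```powershell
# Without strict mode, a never-assigned variable silently evaluates to $null.
Set-StrictMode -Off
$runningJobs = 0                                   # even with no jobs running...
$loopNeverExits = ($runningJobs -ge $NotAssigned)  # ...the loop condition still holds

# With strict mode on, the same reference raises an error instead.
Set-StrictMode -Version Latest
$caught = $false
try {
    $null = ($runningJobs -ge $NotAssigned)
}
catch {
    $caught = $true
}

"Condition with the unset variable: $loopNeverExits"
"Strict mode flagged the unset variable: $caught"
```

Note that the script also sets $ErrorActionPreference to "SilentlyContinue", which hides many errors, so it may be worth relaxing that while debugging.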
cdummy (Author) commented:
Yes, that's totally it! Woohoo, that's just what I was hoping for when I started pulling this thing together. Thanks so much!