Help optimizing a PowerShell script

Hello Guys,

I have written the following script to generate a report on SSRS, Excel Services, and PerformancePoint usage by analyzing IIS logs:

Write-Host "`r"
if(!(Test-Path E:\BI_ToolUsage)){New-Item E:\BI_ToolUsage -type directory -force}
$today=(get-date).ToString("dd_MM_yyyy")

Write-Host "Content Farm Selected" -f green
$ConPaths="\\Server_logs\IC1-PS502\IIS-exports\*.log"
"date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status time-taken" |Out-file E:\BI_ToolUsage\Content_SSRS_$today.csv
"date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status time-taken" |Out-file E:\BI_ToolUsage\Content_ExcelService_$today.csv
"date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status time-taken" |Out-file E:\BI_ToolUsage\Content_PerformancePoint_$today.csv
Foreach($path in $ConPaths)
{
$path
gc $path |?{
    ($_ | Select-String "RSViewerPage.aspx" | Out-file E:\BI_ToolUsage\Content_SSRS_$today.csv -Width 30000 -append);
    ($_ | Select-String "xlviewer.aspx" | Out-file E:\BI_ToolUsage\Content_ExcelService_$today.csv -append -Width 30000);
    ($_ | Select-String "PPSWebParts/ppsDashboard.css" | Out-file E:\BI_ToolUsage\Content_PerformancePoint_$today.csv -Width 30000 -append)
}
}
$files=Get-ChildItem E:\BI_ToolUsage\Content*
foreach($file in $files)
{
    (Get-Content $file.FullName) | Where-Object { $_ -match '\S' } | % { $_ -replace " ", "," } | Out-File $file.FullName
}
Write-Host "Completed !"

The folder \\Server_logs\IC1-PS502\IIS-exports is 19.3 GB in size and contains 671 files.
When I run this script, it usually takes around 3-4 days to generate the reports. Is it possible to optimize the script to reduce how long it takes to complete? As far as I can tell, changing the piece of code below would help the most:

gc $path |?{
    ($_ | Select-String "RSViewerPage.aspx" | Out-file E:\BI_ToolUsage\Content_SSRS_$today.csv -Width 30000 -append);
    ($_ | Select-String "xlviewer.aspx" | Out-file E:\BI_ToolUsage\Content_ExcelService_$today.csv -append -Width 30000);
    ($_ | Select-String "PPSWebParts/ppsDashboard.css" | Out-file E:\BI_ToolUsage\Content_PerformancePoint_$today.csv -Width 30000 -append)
}

Kindly look into this and share your views. Thanks
Anisha Singh asked:

footech commented:
You've got some odd/unnecessary syntax in that script.  For example:
 - the first foreach loop.  There is only one path in $ConPaths (though it includes a wildcard)
 - having the bulk of your actions performed within a Where-Object scriptblock.  Where-Object is used to filter objects - the usage doesn't match here (see the sketch below).
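
For illustration, a quick sketch (not code from the thread) of the two roles:

# Where-Object filters - it keeps only the lines you want:
Get-Content $path | Where-Object { $_ -like "*RSViewerPage.aspx*" }

# ForEach-Object acts - it does something with every line:
Get-Content $path | ForEach-Object { $_ -replace " ", "," }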

I think the bulk of your optimization can come from not using append operations.  It's expensive to repeatedly open, write to, and close a file.  There's also some optimization to be had in how you search for matches.
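
If you want to see the cost yourself, a rough Measure-Command comparison along these lines (hypothetical paths, not from the thread) will show the difference on one of your logs:

# Per-line appends: one open/write/close cycle for every matching line.
Measure-Command {
    Get-Content "E:\temp\sample.log" | ForEach-Object {
        if ($_ -like "*RSViewerPage.aspx*") { $_ | Out-File "E:\temp\perline.txt" -Append }
    }
}

# Single write: the pipeline collects matches and the file is opened once.
Measure-Command {
    Get-Content "E:\temp\sample.log" | Where-Object { $_ -like "*RSViewerPage.aspx*" } |
        Out-File "E:\temp\once.txt"
}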

One question - can any single line match more than one of the strings you're searching for ("RSViewerPage.aspx", "xlviewer.aspx", "PPSWebParts/ppsDashboard.css")?

Also, what are the file sizes involved here (both the input and output files)?  Can you give me a range?  If it's possible to load some things into memory and keep them there for a bit it could help, but it would increase memory usage of the script - if the usage became too great for the system then you lose the advantage.
footech commented:
Unless I've missed something, you should be able to combine the operation of replacing spaces with commas with the operation of writing to the .CSVs - so you don't have to read the .CSVs back in again.

Give this a try.  It cuts down on the number of append operations.
Write-Host "`r"
if ( !(Test-Path E:\BI_ToolUsage) )
{ New-Item E:\BI_ToolUsage -type directory -force }

$today = (Get-Date).ToString("dd_MM_yyyy")

Write-Host "Content Farm Selected" -f green

$header = "date,time,s-ip,cs-method,cs-uri-stem,cs-uri-query,s-port,cs-username,c-ip,cs(User-Agent),sc-status,sc-substatus,sc-win32-status,time-taken"
$header | Out-file E:\BI_ToolUsage\Content_SSRS_$today.csv
$header | Out-file E:\BI_ToolUsage\Content_ExcelService_$today.csv
$header | Out-file E:\BI_ToolUsage\Content_PerformancePoint_$today.csv

$ConPaths = Get-Item "\\Server_logs\IC1-PS502\IIS-exports\*.log"
foreach ($path in $ConPaths)
{
    # Read each log once into memory, then filter it three times below.
    $fileContents = Get-Content $path
    
    # One append per pattern per file, instead of one per matching line.
    $fileContents | Where { $_ -like "*RSViewerPage.aspx*" } | % {$_ -replace " " , ","} | Out-file E:\BI_ToolUsage\Content_SSRS_$today.csv -Width 30000 -append
    $fileContents | Where { $_ -like "*xlviewer.aspx*" } | % {$_ -replace " " , ","} | Out-file E:\BI_ToolUsage\Content_ExcelService_$today.csv -append -Width 30000
    $fileContents | Where { $_ -like "*PPSWebParts/ppsDashboard.css*" } | % {$_ -replace " " , ","} | Out-file E:\BI_ToolUsage\Content_PerformancePoint_$today.csv -Width 30000 -append
}

Write-Host "Completed !"

Depending on what the answers are to some of the questions I asked before, you might be able to do even more within memory, but I think this gets you most of the way.
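
One sketch of the "more within memory" idea - this wasn't posted in the thread, just an illustration that assumes the same $ConPaths, $today, and header-writing lines as the script above. PowerShell's switch -Regex -File statement streams each log line by line and runs every matching branch (so a line containing all three strings still lands in all three reports), and the writes can be deferred to a single append per output file:

$ssrs  = New-Object System.Collections.Generic.List[string]
$excel = New-Object System.Collections.Generic.List[string]
$pps   = New-Object System.Collections.Generic.List[string]

foreach ($path in $ConPaths)
{
    # switch -File reads the log line by line; $_ is the current line.
    switch -Regex -File $path
    {
        'RSViewerPage\.aspx'            { $ssrs.Add($_ -replace ' ', ',') }
        'xlviewer\.aspx'                { $excel.Add($_ -replace ' ', ',') }
        'PPSWebParts/ppsDashboard\.css' { $pps.Add($_ -replace ' ', ',') }
    }
}

# One append per report for the whole run.
$ssrs  | Out-File E:\BI_ToolUsage\Content_SSRS_$today.csv -Width 30000 -Append
$excel | Out-File E:\BI_ToolUsage\Content_ExcelService_$today.csv -Width 30000 -Append
$pps   | Out-File E:\BI_ToolUsage\Content_PerformancePoint_$today.csv -Width 30000 -Append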
Anisha Singh (Author) commented:
Footech, thanks for your reply.

Yes, it is possible that a single line may contain all three strings; that's why the script is built that way.

As for the input, it is the location mentioned in the script, which is exactly 19.3 GB. I ran the script a week ago; it took 5 days to complete, and the whole output is around 6 MB.

footech commented:
OK, the script as I rewrote it will still handle that.

So all the files in that location total 19.3 GB, but what are the smallest and largest sizes of those files?
Anisha Singh (Author) commented:
Footech, the smallest file in the location is 1 KB and the largest is 523 MB.
footech commented:
Most configurations these days shouldn't have a problem with loading the 523 MB file, but you'd have to verify with your setup.
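
One rough way to check on your setup (a sketch; the file name is hypothetical) is to watch the process working set grow as a large log is loaded:

$before = (Get-Process -Id $PID).WorkingSet64
# Substitute one of your real large log files for this hypothetical name.
$big = Get-Content "\\Server_logs\IC1-PS502\IIS-exports\sample.log"
"{0:N0} MB" -f (((Get-Process -Id $PID).WorkingSet64 - $before) / 1MB)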

I made one change in the way a variable is populated.
Write-Host "`r"
if ( !(Test-Path E:\BI_ToolUsage) )
{ New-Item E:\BI_ToolUsage -type directory -force }

$today = (Get-Date).ToString("dd_MM_yyyy")

Write-Host "Content Farm Selected" -f green

$header = "date,time,s-ip,cs-method,cs-uri-stem,cs-uri-query,s-port,cs-username,c-ip,cs(User-Agent),sc-status,sc-substatus,sc-win32-status,time-taken"
$header | Out-file E:\BI_ToolUsage\Content_SSRS_$today.csv
$header | Out-file E:\BI_ToolUsage\Content_ExcelService_$today.csv
$header | Out-file E:\BI_ToolUsage\Content_PerformancePoint_$today.csv

$ConPaths = Get-Item "\\Server_logs\IC1-PS502\IIS-exports\*.log"
foreach ($path in $ConPaths)
{    
    # Tee-Object captures the lines into $fileContents as the first filter
    # runs, so each log is read from disk only once.
    Get-Content $path | Tee-Object -Variable fileContents | Where { $_ -like "*RSViewerPage.aspx*" } | % {$_ -replace " " , ","} | Out-file E:\BI_ToolUsage\Content_SSRS_$today.csv -Width 30000 -append
    $fileContents | Where { $_ -like "*xlviewer.aspx*" } | % {$_ -replace " " , ","} | Out-file E:\BI_ToolUsage\Content_ExcelService_$today.csv -append -Width 30000
    $fileContents | Where { $_ -like "*PPSWebParts/ppsDashboard.css*" } | % {$_ -replace " " , ","} | Out-file E:\BI_ToolUsage\Content_PerformancePoint_$today.csv -Width 30000 -append
}

Write-Host "Completed !"

I find myself wondering if there would be anything to be gained by copying all the log files to a local drive first (some methods are better at transferring data across a network than others).  If you want, you can try comparing the network utilization during the following (a rough timing sketch follows the list):
 - running Get-Content "\\Server_logs\IC1-PS502\IIS-exports\<somelargefile>.log"
 - copying "\\Server_logs\IC1-PS502\IIS-exports\<somelargefile>.log" to a local drive via SMB
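
A rough way to time the two (sketch only - the file name is hypothetical and E:\temp must exist):

$remote = "\\Server_logs\IC1-PS502\IIS-exports\sample.log"

Measure-Command { Get-Content $remote | Out-Null }   # read across the network
Measure-Command { Copy-Item $remote E:\temp\ }       # SMB copy to a local drive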

younghv commented:
I've requested that this question be deleted for the following reason:

Not enough information to confirm an answer.
footech commented:
All of my recommendations for optimization are valid.  The code posted in http:#a40925219 implements a number of optimizations that reduce how often files are read and appended to: instead of one append operation for every matching line found, there is at most one per pattern per file read, and the reformatting of the output files is consolidated into the same pass so it doesn't have to happen separately.  How much of an improvement in runtime you get depends entirely on the files being read, but there would be an improvement.

It's possible more could be done, but without input from the author, the above stands.