Anisha Singh

asked on

Help in optimizing PowerShell script

Hello Guys,

I have written the following script to generate a report on SSRS, Excel Services, and PerformancePoint usage by analyzing IIS logs:

Write-Host "`r"
if(!(Test-Path E:\BI_ToolUsage)){New-Item E:\BI_ToolUsage -type directory -force}
$today=(get-date).ToString("dd_MM_yyyy")

Write-Host "Content Farm Selected" -f green
$ConPaths="\\Server_logs\IC1-PS502\IIS-exports\*.log"
"date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status time-taken" |Out-file E:\BI_ToolUsage\Content_SSRS_$today.csv
"date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status time-taken" |Out-file E:\BI_ToolUsage\Content_ExcelService_$today.csv
"date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status time-taken" |Out-file E:\BI_ToolUsage\Content_PerformancePoint_$today.csv
Foreach($path in $ConPaths)
{
$path
gc $path |?{($_ | Select-String "RSViewerPage.aspx" | Out-file E:\BI_ToolUsage\Content_SSRS_$today.csv -Width 30000 -append);($_ | Select-String "xlviewer.aspx" | Out-file E:\BI_ToolUsage\Content_ExcelService_$today.csv -append -Width 30000);($_ | Select-String "PPSWebParts/ppsDashboard.css" |Out-file E:\BI_ToolUsage\Content_PerformancePoint_$today.csv -Width 30000 -append)}
}
$files=Get-ChildItem E:\BI_ToolUsage\Content*
foreach($file in $files){(Get-Content $file.FullName)| Where-Object {$_ -match '\S'}|%{$_ -replace " " , ","}|Out-File $file.FullName}
Write-Host "Completed !"

The folder at \\Server_logs\IC1-PS502\IIS-exports is 19.3 GB in size and contains 671 files.
When I run this script, it usually takes around 3-4 days to generate the reports. Is it possible to optimize the script so it completes faster? As far as I can tell, the piece of code below is where a change would help the most:

gc $path |?{($_ | Select-String "RSViewerPage.aspx" | Out-file E:\BI_ToolUsage\Content_SSRS_$today.csv -Width 30000 -append);($_ | Select-String "xlviewer.aspx" | Out-file E:\BI_ToolUsage\Content_ExcelService_$today.csv -append -Width 30000);($_ | Select-String "PPSWebParts/ppsDashboard.css" |Out-file E:\BI_ToolUsage\Content_PerformancePoint_$today.csv -Width 30000 -append)}

Kindly look into this and share your views. Thanks
footech

You've got some odd/unnecessary syntax in that script. For example:
 - the first foreach loop. There is only one path in $ConPaths (though it includes a wildcard), so the loop body only runs once.
 - having the bulk of your actions performed within a Where-Object scriptblock. Where-Object is meant to filter objects; that usage doesn't match here.

I think the bulk of your optimization can come from not using append operations.  It's expensive to repeatedly open, write to, and close a file.  There's also some optimization that can come from searching for matches.

One question - can any single line match more than one of the strings you're searching for ("RSViewerPage.aspx", "xlviewer.aspx", "PPSWebParts/ppsDashboard.css")?

Also, what are the file sizes involved here (both the input and output files)? Can you give me a range? If it's possible to load some things into memory and keep them there for a bit it could help, but it would increase the memory usage of the script - if that usage became too great for the system you would lose the advantage.
Unless I've missed something, you should be able to combine replacing spaces with commas with the operation of writing to the .CSVs - so you don't have to read the .CSVs back in again.

Give this a try.  It cuts down on the number of append operations.
Write-Host "`r"
if ( !(Test-Path E:\BI_ToolUsage) )
{ New-Item E:\BI_ToolUsage -type directory -force }

$today = (Get-Date).ToString("dd_MM_yyyy")

Write-Host "Content Farm Selected" -f green

$header = "date,time,s-ip,cs-method,cs-uri-stem,cs-uri-query,s-port,cs-username,c-ip,cs(User-Agent),sc-status,sc-substatus,sc-win32-status,time-taken"
$header | Out-file E:\BI_ToolUsage\Content_SSRS_$today.csv
$header | Out-file E:\BI_ToolUsage\Content_ExcelService_$today.csv
$header | Out-file E:\BI_ToolUsage\Content_PerformancePoint_$today.csv

# Get-Item expands the wildcard into one FileInfo object per log file.
$ConPaths = Get-Item "\\Server_logs\IC1-PS502\IIS-exports\*.log"
foreach ($path in $ConPaths)
{
    # Read each log once, then filter the in-memory copy three times.
    $fileContents = Get-Content $path

    # Replacing spaces with commas here means the .CSVs never have to be read back in,
    # and each report is appended to at most once per input log instead of once per matching line.
    $fileContents | Where { $_ -like "*RSViewerPage.aspx*" } | % {$_ -replace " " , ","} | Out-File E:\BI_ToolUsage\Content_SSRS_$today.csv -Width 30000 -Append
    $fileContents | Where { $_ -like "*xlviewer.aspx*" } | % {$_ -replace " " , ","} | Out-File E:\BI_ToolUsage\Content_ExcelService_$today.csv -Append -Width 30000
    $fileContents | Where { $_ -like "*PPSWebParts/ppsDashboard.css*" } | % {$_ -replace " " , ","} | Out-File E:\BI_ToolUsage\Content_PerformancePoint_$today.csv -Width 30000 -Append
}

Write-Host "Completed !"

Depending on what the answers are to some of the questions I asked before, you might be able to do even more within memory, but I think this gets you most of the way.
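
To make the "do even more within memory" idea concrete, here is one possible sketch - an illustration only, not footech's posted code. It assumes the $today and $header variables from the script above, collects the matching lines in lists while the logs are read, and then writes each report in just two operations (header plus one bulk write) for the entire run:

$ssrs  = New-Object System.Collections.Generic.List[string]
$excel = New-Object System.Collections.Generic.List[string]
$pps   = New-Object System.Collections.Generic.List[string]

foreach ($path in Get-Item "\\Server_logs\IC1-PS502\IIS-exports\*.log")
{
    # Only matching lines are kept, so the lists stay small even though the logs are large.
    foreach ($line in (Get-Content $path.FullName))
    {
        if ($line -like "*RSViewerPage.aspx*")            { $ssrs.Add($line -replace " " , ",") }
        if ($line -like "*xlviewer.aspx*")                { $excel.Add($line -replace " " , ",") }
        if ($line -like "*PPSWebParts/ppsDashboard.css*") { $pps.Add($line -replace " " , ",") }
    }
}

# Two writes per report for the whole run, instead of one append per log file.
$header | Out-File E:\BI_ToolUsage\Content_SSRS_$today.csv
$ssrs   | Out-File E:\BI_ToolUsage\Content_SSRS_$today.csv -Append -Width 30000
$header | Out-File E:\BI_ToolUsage\Content_ExcelService_$today.csv
$excel  | Out-File E:\BI_ToolUsage\Content_ExcelService_$today.csv -Append -Width 30000
$header | Out-File E:\BI_ToolUsage\Content_PerformancePoint_$today.csv
$pps    | Out-File E:\BI_ToolUsage\Content_PerformancePoint_$today.csv -Append -Width 30000

Note that Get-Content still loads each whole log into memory here, so very large input files could remain a concern.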
Anisha Singh

ASKER

Footech, thanks for your reply.

Yes, it is possible that a single line may contain all three strings; that's why the script is built that way.

As for the input, it is the location mentioned in the script, which is exactly 19.3 GB. I ran the script a week ago; it took 5 days to complete, and the whole output is around 6 MB.
OK, the script as I rewrote it will still handle that.

So all the files in that location total 19.3 GB, but what are the smallest and largest sizes of those files?
Footech, smallest file size in the location is 1 KB and the largest is 523 MB.
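
Given that range - input logs up to 523 MB but only about 6 MB of total output - a streaming variant may help as well. The following is a hypothetical sketch (not necessarily the accepted solution) that reuses $today and $header from the script above: switch -Regex -File reads each log one line at a time, so even the 523 MB files never have to fit in memory, and one StreamWriter per report stays open for the whole run.

$writers = @{
    SSRS             = New-Object System.IO.StreamWriter("E:\BI_ToolUsage\Content_SSRS_$today.csv")
    ExcelService     = New-Object System.IO.StreamWriter("E:\BI_ToolUsage\Content_ExcelService_$today.csv")
    PerformancePoint = New-Object System.IO.StreamWriter("E:\BI_ToolUsage\Content_PerformancePoint_$today.csv")
}
$writers.Values | ForEach-Object { $_.WriteLine($header) }

foreach ($log in Get-ChildItem "\\Server_logs\IC1-PS502\IIS-exports\*.log")
{
    # switch -File reads line by line; every matching case runs, so a line containing
    # more than one of the strings still lands in every relevant report.
    switch -Regex -File $log.FullName
    {
        'RSViewerPage\.aspx'            { $writers.SSRS.WriteLine($_ -replace ' ', ',') }
        'xlviewer\.aspx'                { $writers.ExcelService.WriteLine($_ -replace ' ', ',') }
        'PPSWebParts/ppsDashboard\.css' { $writers.PerformancePoint.WriteLine($_ -replace ' ', ',') }
    }
}

$writers.Values | ForEach-Object { $_.Close() }

As in the original script, the E:\BI_ToolUsage folder has to exist before the StreamWriters are created.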
ASKER CERTIFIED SOLUTION
footech
I've requested that this question be deleted for the following reason:

Not enough information to confirm an answer.
All of my recommendations for optimization are valid. The code posted in http:#a40925219 implements several optimizations that reduce how often files are read and appended to: instead of one append operation for every matching line found, there is at most one append per report for each file read, and the reformatting of the output is consolidated so the .CSVs don't have to be read back in and rewritten in a separate pass. How much the runtime improves depends entirely on the files being read, but there would be an improvement.

It's possible more could be done, but without input from the author, the above stands.
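
For reference, one quick way to quantify the improvement on a subset of the logs is Measure-Command; the script paths below are placeholders, not files from this thread.

$old = Measure-Command { & 'E:\Scripts\BI_ToolUsage_original.ps1' }   # placeholder path
$new = Measure-Command { & 'E:\Scripts\BI_ToolUsage_optimized.ps1' }  # placeholder path
Write-Host ("Original : {0}" -f $old)
Write-Host ("Optimized: {0}" -f $new)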