WeTi

Debug stats table of coronavirus in PowerShell

Dear experts

I have been working on coronavirus statistics, and the script below is no longer working. Whenever I run it, it complains about this line:
$out[$header[$i++]] = $_.InnerText.Trim()
saying it cannot be null. I did not change the script, so I think the problem is that the website changed something. I noticed that the site now shows both yesterday's data and today's data, which might break the index counting. Does anyone know how to fix this?

Thanks

$credential = Import-CliXml -Path 'C:\test\cred.xml'
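# A trailing -bor already continues the statement, so no backticks are needed
# (a backtick followed by trailing spaces does not act as a line continuation).
[System.Net.ServicePointManager]::SecurityProtocol = [System.Net.SecurityProtocolType]::Tls11 -bor
    [System.Net.SecurityProtocolType]::Tls12 -bor
    [System.Net.SecurityProtocolType]::Tls -bor
    [System.Net.SecurityProtocolType]::Ssl3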

$title = 'Corona Virus spread'
$time = (Get-Date).ToString("yyyy-MM-dd HH:mm")
$attachment = 'C:\test\corona.html'
$url = "https://www.worldometers.info/coronavirus/#countries"
$countries = 'Sweden', 'China', 'Iran','USA', 'Italy', 'Finland', 'Norway', 'Denmark', 'Russia', 'UK', 'Spain', 'Thailand', 'Canada', 'India', 'France', 'Germany'
# Available properties: 'Country,Other', 'TotalCases', 'NewCases', 'TotalDeaths', 'NewDeaths', 'TotalRecovered', 'ActiveCases', 'Serious,Critical', 'Tot Cases/1M pop'
$properties = 'Country,Other', 'TotalCases', 'Tot Cases/1M pop','ActiveCases','Serious,Critical','TotalDeaths' , 'TotalRecovered' 

$htmlHead = @'
<style>
body {background-color:white; font-family:verdana; font-size:32px;}
table {border-width:1px; border-style:solid; border-color:black; border-collapse:collapse;table-layout: auto;width: 100%;}
th {border-width:1px; padding:0px; border-style:solid; border-color:black; padding:3px; background-color:white}
td {border-width:1px; padding:0px; border-style:solid; border-color:black; padding:3px}
th[Col="TotalRecovered"] {background-color:green; color:white;}
    th[Col2="TotalDeaths"] {background-color:Red; color:white;}
    th[Col3="ActiveCases"] {background-color:lightYellow; color:Black;}
th[Col4="Tot Cases/1M pop"] {background-color:Yellow; color:Black;}
th[Col5="Serious,Critical"] {background-color:pink; color:Black;}
    td[Recovered="True"] {background-color:lightgreen;}
    td[TotCases="True"] {background-color:Yellow;}
    td[Deaths="True"] {background-color:red; color:white}
    td[ActiveCases="True"] {background-color:lightyellow;} 
    td[Soon="True"] {background-color:lightpink;}
</style>
'@

#tr:nth-child(odd) {background-color:#d3d3d3;}
#tr:nth-child(even) {background-color:white;}

Add-Type -Path C:\Test\HtmlAgilityPack.dll
$web = New-Object -TypeName HtmlAgilityPack.HtmlWeb
$doc = $web.Load($url)
$header = $doc.DocumentNode.SelectNodes('//th') | Select-Object -ExpandProperty InnerText | ForEach-Object {($_ -replace '&nbsp;', ' ').Trim()}
$table = $doc.DocumentNode.SelectNodes('//tr') | Select-Object -Skip 1 | ForEach-Object {
    $i = 0
    $out = [ordered]@{}
    $_.SelectNodes('td') | ForEach-Object {
        $out[$header[$i++]] = $_.InnerText.Trim()
    }
    [PSCustomObject]$out
} | Where-Object {$countries -contains $_.'Country,Other'} |
    Select-Object -Property $properties |
    Sort-Object -Property 'Country,Other'



$i = 0
$totalRecoveredIndex = $properties.IndexOf('TotalRecovered') +1
$totalDeathsIndex = $properties.IndexOf('TotalDeaths') +1
$totalCasesIndex = $properties.IndexOf('ActiveCases') +1
$TotalsoonIndex = $properties.IndexOf('Serious,Critical') +1
$Totalcases2Index = $properties.IndexOf('Tot Cases/1M pop') +1



$table |
    ConvertTo-Html -Head $htmlhead -PreContent "<H1>$($title)</H1><H2>$($time)</H2><BR>" |
    ForEach-Object {
        If ($totalRecoveredIndex -gt 0) {
            If ($_ -match '\A<tr><th>') {
                $xml = [xml]$_
                For ($h = 0; $h -lt $properties.Count; $h++) {
                    $xml.SelectSingleNode("/tr/th[$($h + 1)]").SetAttribute('Col', $properties[$h])
                }
                For ($h = 0; $h -lt $properties.Count; $h++) {
                    $xml.SelectSingleNode("/tr/th[$($h + 1)]").SetAttribute('Col2', $properties[$h])
                }
                For ($h = 0; $h -lt $properties.Count; $h++) {
                    $xml.SelectSingleNode("/tr/th[$($h + 1)]").SetAttribute('Col3', $properties[$h])
                }
                For ($h = 0; $h -lt $properties.Count; $h++) {
                    $xml.SelectSingleNode("/tr/th[$($h + 1)]").SetAttribute('Col4', $properties[$h])
                }
                For ($h = 0; $h -lt $properties.Count; $h++) {
                    $xml.SelectSingleNode("/tr/th[$($h + 1)]").SetAttribute('Col5', $properties[$h])
                }
                $xml.InnerXml
            } ElseIf ($_ -match '\A<tr><td>') {
                $xml = [xml]$_
                $xml.SelectSingleNode("/tr/td[$($totalRecoveredIndex)]").SetAttribute('Recovered', (-not [String]::IsNullOrEmpty($table[$i].'TotalRecovered')).ToString())
                $xml.SelectSingleNode("/tr/td[$($totalDeathsIndex)]").SetAttribute('Deaths', (-not [String]::IsNullOrEmpty($table[$i].'TotalDeaths')).ToString())
                $xml.SelectSingleNode("/tr/td[$($totalsoonIndex)]").SetAttribute('Soon', (-not [String]::IsNullOrEmpty($table[$i].'Serious,Critical')).ToString())
                $xml.SelectSingleNode("/tr/td[$($totalCasesIndex)]").SetAttribute('ActiveCases', (-not [String]::IsNullOrEmpty($table[$i].'ActiveCases')).ToString())
                $xml.SelectSingleNode("/tr/td[$($totalCases2Index)]").SetAttribute('TotCases', (-not [String]::IsNullOrEmpty($table[$i].'Tot Cases/1M pop')).ToString())
                $xml.InnerXml
                $i++
            } Else {
                $_
            }
        } Else {
            $_
        }
    } | Set-Content -Path $attachment

WeTi

ASKER

After checking the output, it still works, but each country's row now appears twice: one for today and one for yesterday. I am not interested in yesterday's data, though. Can anyone help? Thanks

WeTi

ASKER

I found that the page uses <div class="tab-pane " id="nav-yesterday" as the tag for yesterday's data. Is there a way to ignore this line and the next 16 lines?

                                                    <div class="tab-pane " id="nav-yesterday" role="tabpanel" aria-labelledby="nav-yesterday-tab">
                                                    <div class="main_table_countries_div">
                                                        <table id="main_table_countries_yesterday" class="table table-bordered table-hover main_table_countries" style="width:100%">
                                                            <thead>
                                                                <tr>
                                                                    <th width="100">Country,<br />Other</th>
                                                                    <th width="20">Total<br />Cases</th>
                                                                    <th width="30">New<br />Cases</th>
                                                                    <th width="30">Total<br />Deaths</th>
                                                                    <th width="30">New<br />Deaths</th>
                                                                    <th width="30">Total<br />Recovered</th>
                                                                    <th width="30">Active<br />Cases</th>
                                                                    <th width="30">Serious,<br />Critical</th>
                                                                    <th width="30">Tot&nbsp;Cases/<br />1M pop</th>
                                                                </tr>
                                                            </thead>


WeTi

ASKER

Could I use -Context 0,15?

Shaun Vermaak

That seems like a lot of work to parse the data like that. Why not use an API from one of these?
https://www.programmableweb.com/news/apis-to-track-coronavirus-covid-19/review/2020/03/18 

WeTi

ASKER

Well, the reason for doing this is also to keep learning PowerShell. So yes, I could use another API for this, but I would still like to know how to solve this.

WeTi

ASKER

I noticed that -Context can only be used with Select-String, and the output is an array, so I could only use Select-Object with '<div class="tab-pane " id="nav-yesterday"'. Right now I don't know how to select that line and the next 15 rows using Select-Object.
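
For the literal "skip this line and the next N lines" idea, a plain loop over the array of lines is probably the simplest route. The following is only a sketch: the sample lines, `$marker`, and `$skipAfter` are made up for illustration, and filtering HTML line by line is fragile because it depends on how the page happens to be formatted.

```powershell
# Hypothetical sample input; in the real case this would be the page split into lines.
$lines = @(
    'keep 1'
    '<div class="tab-pane " id="nav-yesterday" role="tabpanel">'
    'drop 1'
    'drop 2'
    'keep 2'
)

$marker    = 'id="nav-yesterday"'  # text that identifies the line to drop
$skipAfter = 2                     # how many following lines to drop as well (15 or 16 in the thread)
$remaining = 0

$kept = @(foreach ($line in $lines) {
    # When the marker is seen, start a countdown covering the marker line plus the next $skipAfter lines.
    if ($line.Contains($marker)) { $remaining = $skipAfter + 1 }
    if ($remaining -gt 0) { $remaining--; continue }
    $line
})

$kept
```

With the sample above, only the two "keep" lines are emitted.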
ASKER CERTIFIED SOLUTION
oBdA

WeTi

ASKER

What does this godlike line mean?
$tableToday = $doc.DocumentNode.SelectNodes('//table')[0]
Previously, there was only one table in the HTML, so you could just select the elements wherever they were.
But now there are two tables on the same page.
This line selects all <table> nodes (two, at the moment), and since today's table is the first one, it takes element 0.
From then on, the script works with the child nodes of this table only.
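
To make the scoping idea concrete, here is a minimal sketch. It is not the accepted solution. It uses PowerShell's built-in `[xml]` type on a small, made-up, well-formed page so it runs without HtmlAgilityPack.dll, and the table contents and the `main_table_countries_today` id are assumptions for illustration only. The point is that the header and row queries start from `$tableToday` (note the leading `.` in the XPath) instead of from the whole document.

```powershell
# Hypothetical stand-in for the page: two tables with identical columns.
[xml]$page = @'
<html><body>
<table id="main_table_countries_today">
<thead><tr><th>Country,Other</th><th>TotalCases</th></tr></thead>
<tbody><tr><td>Sweden</td><td>100</td></tr><tr><td>Norway</td><td>50</td></tr></tbody>
</table>
<table id="main_table_countries_yesterday">
<thead><tr><th>Country,Other</th><th>TotalCases</th></tr></thead>
<tbody><tr><td>Sweden</td><td>90</td></tr><tr><td>Norway</td><td>45</td></tr></tbody>
</table>
</body></html>
'@

# Take the first <table> only; every later query is relative to this node.
$tableToday = $page.SelectNodes('//table')[0]

# './/th' searches below $tableToday, so only one set of headers comes back.
$header = @($tableToday.SelectNodes('.//th') | ForEach-Object { $_.InnerText.Trim() })

# Build one object per body row, pairing each cell with its header by position.
$rows = @($tableToday.SelectNodes('.//tbody/tr') | ForEach-Object {
    $i = 0
    $out = [ordered]@{}
    $_.SelectNodes('td') | ForEach-Object { $out[$header[$i++]] = $_.InnerText.Trim() }
    [PSCustomObject]$out
})

$rows | Format-Table
```

In this sample, `$rows` holds only the two rows from the first table, so each country appears once.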

WeTi

ASKER

Thanks for the answer. The thing is, if the site changes the table order, the first table could become yesterday's instead. But that would be later, and if it happens I will open a new question. Thanks a lot.
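
One possible way to reduce the dependence on table order is to select by id instead of by position. The sketch below is an assumption-laden illustration, not part of the accepted solution. It again uses `[xml]` on a made-up page so it runs without HtmlAgilityPack. The only id taken from the thread is `main_table_countries_yesterday`; the `main_table_countries_today` id in the sample is a guess. The XPath excludes the yesterday table rather than naming the today table, so it only relies on the id that was actually posted.

```powershell
# Hypothetical page where yesterday's table comes first, to show that order no longer matters.
[xml]$page = @'
<html><body>
<table id="main_table_countries_yesterday"><tbody><tr><td>Sweden</td><td>90</td></tr></tbody></table>
<table id="main_table_countries_today"><tbody><tr><td>Sweden</td><td>100</td></tr></tbody></table>
</body></html>
'@

# Pick the first table whose id is not the yesterday id seen in the page source.
$tableToday = $page.SelectSingleNode("//table[not(@id='main_table_countries_yesterday')]")
if ($null -eq $tableToday) { throw 'No table other than the yesterday table was found.' }

# Read the cells of the first row below the selected table.
$firstRowCells = @($tableToday.SelectNodes('.//tr[1]/td') | ForEach-Object { $_.InnerText.Trim() })
$firstRowCells
```

If the site renames the yesterday id, this selection would need updating too, so checking for `$null` (as above) helps the script fail loudly instead of silently mixing tables.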