Solved

PowerShell parse

Posted on 2014-11-28
Last Modified: 2015-01-03
Hello,
I was finally able to complete a PowerShell parse, but my code is very elementary. The code below works; my goal is to learn more PowerShell. Can anyone write this differently, so that the code still won't break if the starting index into the string changes a little? My concern is that this may be too specific to this page and may be prone to break easily if the string changes by one space. (I do understand that if there are major code changes to the website, this won't work anyway.)


Add-Type -path C:\PStemp\HtmlAgilityPack\Net40\htmlagilitypack.dll
CLS

	$Website = "http://scores.espn.go.com/mlb/scoreboard?date=20140415"
	$wc = New-Object System.Net.WebClient;
	$doc = New-Object HtmlAgilityPack.HtmlDocument
	$doc.LoadHtml($wc.DownloadString($Website))

	$game = $doc.DocumentNode.SelectNodes('.//table["league-title"]')

	ForEach ($innerHTML in $game.InnerHTML){
	
	$Teams = $innerHTML -split "`"><a href=`""
	
	$Team1 = $Teams[1].Substring($Teams[1].IndexOf("http://espn.go.com") + 38, $Teams[1].IndexOf("</a>") - $Teams[1].IndexOf("http://espn.go.com") - 42).Replace("/", "").Replace("`"", "")
	$Team2 = $Teams[2].Substring($Teams[2].IndexOf("http://espn.go.com") + 38, $Teams[2].IndexOf("</a>") - $Teams[2].IndexOf("http://espn.go.com") - 42).Replace("/", "").Replace("`"", "")
	
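	# Reverse both names so that IndexOf("-") finds what was originally the LAST hyphen
	$Team1 = $Team1.toCharArray()
	[Array]::Reverse($Team1)
	$Team1 = -join $Team1
	
	$Team2 = $Team2.toCharArray()
	[Array]::Reverse($Team2)
	$Team2 = -join $Team2
	
	# Keep only the text before that hyphen (the last word of the name), upper-cased
	$Team1 = $Team1.Substring(0, $Team1.IndexOf("-")).ToUpper()
	$Team2 = $Team2.Substring(0, $Team2.IndexOf("-")).ToUpper()
	
	# Reverse again to restore the normal reading order
	$Team1 = $Team1.toCharArray()
	[Array]::Reverse($Team1)
	$Team1 = -join $Team1
	
	$Team2 = $Team2.toCharArray()
	[Array]::Reverse($Team2)
	$Team2 = -join $Team2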

			
	$Score1 = $Teams[1].Substring($Teams[1].IndexOf("-alsT`">") + 7, 2).Replace("<","").Replace("/", "-1")
	$Score2 = $Teams[2].Substring($Teams[2].IndexOf("-hlsT`">") + 7, 2).Replace("<", "").Replace("/", "-1")
	
	Write-Host $Team1 $Score1', '$Team2 $Score2
	
}
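
Editor's sketch: one way the fixed offsets (+38, -42) and the reverse/substring/reverse steps above could be made less sensitive to small changes is to capture the link target with a regular expression and take the text after the last hyphen with LastIndexOf. The anchor fragment and URL below are hypothetical stand-ins, not the real ESPN markup, so treat this as an illustration of the idea rather than a drop-in replacement.

```powershell
# Hypothetical anchor fragment; the real ESPN markup may differ.
$fragment = '<a href="http://espn.go.com/mlb/team/_/name/atl/atlanta-braves">Braves</a>'

$team = $null
# Capture the href value, whatever its length, instead of counting characters.
if ($fragment -match 'href="([^"]+)"') {
    $url = $Matches[1]
    # The last path segment is the slug, e.g. "atlanta-braves".
    $slug = ($url -split '/')[-1]
    # Keep only the text after the last hyphen; if there is no hyphen,
    # LastIndexOf returns -1 and the whole slug is kept.
    $team = $slug.Substring($slug.LastIndexOf('-') + 1).ToUpper()
}
$team
```

Because nothing here depends on a character count, an extra space or attribute elsewhere in the tag should not shift the result, although a change to the URL scheme itself would still break it.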

Question by:Leo Torres
11 Comments
 
LVL 80

Expert Comment

by:David Johnson, CD, MVP
ID: 40471234
Which version of the HtmlAgilityPack are you using? I get this error:
New-Object : Cannot find type [HtmlAgilityPack.HtmlDocument]: verify that the assembly containing this type is loaded.
At line:5 char:9
+     $doc = New-Object HtmlAgilityPack.HtmlDocument
+            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidType: (:) [New-Object], PSArgumentException
    + FullyQualifiedErrorId : TypeNotFound,Microsoft.PowerShell.Commands.NewObjectCommand
 
LVL 8

Author Comment

by:Leo Torres
ID: 40471377
 
LVL 8

Author Comment

by:Leo Torres
ID: 40471777
Neglected? Why Moderator?

 
LVL 10

Expert Comment

by:JoeKlimis
ID: 40472881
Hi Leo,

I use split to divide up the text, as the basic HTML does not usually change that often.

Here is an example:

$Website = "http://scores.espn.go.com/nhl/scoreboard?date=20141125"
$Request = Invoke-WebRequest -URI $webSite
$h = $request.ParsedHtml.getElementsByTagName("div")
$h | where classname -eq 'team-name' | select InnerText
$a = $h | where classname -eq 'span-2' | select innerhtml
$teama = ($a.innerHTML -split "</A>")[0].split(">")[11]
$scorea =  ($a.innerHTML -split "</A>")[1].split("<")[4].split(">")[1]
$teamb = (($a.innerHTML -split "</A>")[1] -split ">")[17]
$scoreb = ($a.innerHTML -split "</A>")[2].split(">")[4].split("<")[0]   # [0] keeps only the text before the next tag

write-output $teama , $scorea , $teamb , $scoreb



Let me know if you require a specific example or further help.

Joe
 
LVL 69

Expert Comment

by:Qlemo
ID: 40472913
Oh, that rings a bell. August 2013, College Football results, then NFL results ... A pity each league has a different page layout on ESPN, and it is never as straightforward as it should be.
 
LVL 69

Expert Comment

by:Qlemo
ID: 40472954
As it seems, one of the reliable ways is to get each game's HTML ID, extract the unique ID number (e.g. 340415104), and use that in an XPath like //*[@id="340415104-aTeamName"]/a (for "away" team name).
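
A minimal sketch of that idea, run against a small made-up XML fragment with the built-in [xml] type so it needs no extra assembly. The "-aTeamName" suffix comes from the XPath above; the "-hTeamName" suffix for the home team, the second game ID, and the sample markup are assumptions for illustration only. The same XPath strings should carry over to HtmlAgilityPack's SelectNodes and SelectSingleNode.

```powershell
# Hypothetical, heavily simplified markup; the real page is larger and may differ.
[xml]$page = @'
<div>
  <div id="340415104-gameContainer">
    <p id="340415104-aTeamName"><a href="#">Braves</a></p>
    <p id="340415104-hTeamName"><a href="#">Phillies</a></p>
  </div>
  <div id="340415105-gameContainer">
    <p id="340415105-aTeamName"><a href="#">Cubs</a></p>
    <p id="340415105-hTeamName"><a href="#">Yankees</a></p>
  </div>
</div>
'@

# Collect the numeric game ID from every id of the form "<digits>-aTeamName".
$gameIds = foreach ($node in $page.SelectNodes('//*[@id]')) {
    if ($node.GetAttribute('id') -match '^(\d+)-aTeamName$') { $Matches[1] }
}

# Build one XPath per game and team from the ID instead of relying on string offsets.
$results = foreach ($id in $gameIds) {
    $awayTeam = $page.SelectSingleNode("//*[@id='${id}-aTeamName']/a").InnerText
    $homeTeam = $page.SelectSingleNode("//*[@id='${id}-hTeamName']/a").InnerText   # assumed suffix
    "$awayTeam at $homeTeam"
}
$results
```

Scores could be looked up the same way once the matching id suffix on the real page is known.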
 
LVL 8

Author Comment

by:Leo Torres
ID: 40472991
Qlemo: as a programmer, I realized this page matches teams to an ID at the beginning, then references those IDs to represent each team. So your suggestion crossed my mind, but I had no idea how to do it. Remember, I need to get results for days that have already passed.

Thanks again for your help on this.
 
LVL 10

Expert Comment

by:JoeKlimis
ID: 40475648
Hi Leo,

Is this similar to the other ticket you raised? Would you like some further help?

Regards
Joe
 
LVL 69

Accepted Solution

by:
Qlemo earned 500 total points
ID: 40481096
Found a non-programmer solution. It is not as elegant as I would like it to be, but it is more "generic" and allows easy adaptation to other scenarios:
Add-Type -path C:\temp\HtmlAgilityPack\Net40\htmlagilitypack.dll
CLS

$Website = "http://scores.espn.go.com/mlb/scoreboard?date=20140415"
$wc = New-Object System.Net.WebClient;
$doc = New-Object HtmlAgilityPack.HtmlDocument
$doc.LoadHtml($wc.DownloadString($Website))

$games = $doc.DocumentNode.SelectNodes('//*[@class="team-name"]|//*[@class="score"]/*[@class="finalScore"]') | select -Expand InnerText

while ($games)
{
  $Team1, $Score1, $Team2, $Score2, $games = $games
  Write-Host $Team1 $Score1', '$Team2 $Score2
}


That strange assignment in line 13 assigns the first four array elements to variables and keeps the remainder in $games, so the array gets shorter by four elements on each pass and the loop ends once nothing is left. This procedure is necessary because the XPath expression returns four successive values per game. We could have been more specific, e.g. with one array holding only the "Team Away" names, another holding the "Team Away" scores, and so on, but that doesn't make things easier or clearer.
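
The multiple assignment can be tried in isolation. This is a small sketch with made-up values standing in for the InnerText results, so no download or HtmlAgilityPack is needed; the $lines variable is only there to collect the output for inspection.

```powershell
# Made-up sample values in the order the XPath returns them:
# away team, away score, home team, home score, repeated per game.
$games = 'Braves', '9', 'Phillies', '6', 'Cubs', '2', 'Yankees', '3'

$lines = @()
while ($games) {
    # The first four elements go to the four variables;
    # whatever is left (or $null when nothing is left) goes back into $games.
    $Team1, $Score1, $Team2, $Score2, $games = $games
    $lines += "$Team1 $Score1, $Team2 $Score2"
}
$lines
```

When exactly four elements remain, $games receives $null and the while condition becomes false, which is why no explicit counter is needed.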
 
LVL 8

Author Closing Comment

by:Leo Torres
ID: 40528867
Sorry, Qlemo. I've been in vacation mode for the past few weeks.

Question has a verified solution.
