Leo Torres
asked on
powershell parse
Hello,
I was finally able to complete a PowerShell parse, but my code is very elementary. The code below works; my goal is to learn more PowerShell. Can anyone write this differently, so that the code still won't break if the array index starting points change a little? My concern is that this may be too specific to this page and may be prone to break easily if a string changes by one space. (I do understand that if there are major code changes to the website, this won't work anyway.)
Add-Type -path C:\PStemp\HtmlAgilityPack\Net40\htmlagilitypack.dll
CLS
$Website = "http://scores.espn.go.com/mlb/scoreboard?date=20140415"
$wc = New-Object System.Net.WebClient;
$doc = New-Object HtmlAgilityPack.HtmlDocument
$doc.LoadHtml($wc.DownloadString($Website))
$game = $doc.DocumentNode.SelectNodes('.//table["league-title"]')
ForEach ($innerHTML in $game.InnerHTML){
    # Split each game table into one chunk per team link
    $Teams = $innerHTML -split "`"><a href=`""
    # Cut the team part out of each link using fixed offsets
    $Team1 = $Teams[1].Substring($Teams[1].IndexOf("http://espn.go.com") + 38, $Teams[1].IndexOf("</a>") - $Teams[1].IndexOf("http://espn.go.com") - 42).Replace("/", "").Replace("`"", "")
    $Team2 = $Teams[2].Substring($Teams[2].IndexOf("http://espn.go.com") + 38, $Teams[2].IndexOf("</a>") - $Teams[2].IndexOf("http://espn.go.com") - 42).Replace("/", "").Replace("`"", "")
    # Reverse, keep everything up to the first "-", reverse again:
    # this leaves the text after the last "-" in upper case
    $Team1 = $Team1.toCharArray()
    [Array]::Reverse($Team1)
    $Team1 = -join $Team1
    $Team2 = $Team2.toCharArray()
    [Array]::Reverse($Team2)
    $Team2 = -join $Team2
    $Team1 = $Team1.Substring(0,$Team1.IndexOf("-")).ToUpper()
    $Team2 = $Team2.Substring(0, $Team2.IndexOf("-")).ToUpper()
    $Team1 = $Team1.toCharArray()
    [Array]::Reverse($Team1)
    $Team1 = -join $Team1
    $Team2 = $Team2.toCharArray()
    [Array]::Reverse($Team2)
    $Team2 = -join $Team2
    # Scores follow the away ("-alsT") and home ("-hlsT") markers
    $Score1 = $Teams[1].Substring($Teams[1].IndexOf("-alsT`">") + 7, 2).Replace("<","").Replace("/", "-1")
    $Score2 = $Teams[2].Substring($Teams[2].IndexOf("-hlsT`">") + 7, 2).Replace("<", "").Replace("/", "-1")
    Write-Host $Team1 $Score1', '$Team2 $Score2
}
ASKER
This version
http://htmlagilitypack.codeplex.com/
ASKER
Neglected? Why Moderator?
Hi leo
I use split to divide up the text as the basic html does not usually change that often
Here is an example
$Website = "http://scores.espn.go.com/nhl/scoreboard?date=20141125"
$Request = Invoke-WebRequest -URI $webSite
$h = $request.ParsedHtml.getElementsByTagName("div")
$h | where classname -eq 'team-name' | select InnerText
$a = $h | where classname -eq 'span-2' | select innerhtml
$teama = ($a.innerHTML -split "</A>")[0].split(">")[11]
$scorea = ($a.innerHTML -split "</A>")[1].split("<")[4].split(">")[1]
$teamb = (($a.innerHTML -split "</A>")[1] -split ">")[17]
$scoreb = ($a.innerHTML -split "</A>")[2].split(">")[4].split("<")[0]
write-output $teama , $scorea , $teamb , $scoreb
Let me know if you require a specific example or further help.
Joe
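As a complement to the split-based approach, the fixed offsets and the reverse / IndexOf("-") / reverse sequence in the original script could probably be replaced by a pattern match that tolerates extra spaces. The sketch below is only an illustration: the `$sample` fragment, the team URL, and the variable names are assumptions, not the real ESPN markup.

```powershell
# Hypothetical sample of one team cell; the real ESPN markup may differ.
$sample = '<td class="team-name"><a  href = "http://espn.go.com/mlb/team/_/name/bos/boston-red-sox" > Red Sox </a></td>'

# Capture the href and the link text, allowing optional whitespace around them.
$pattern = '<a\s+href\s*=\s*"(?<url>[^"]+)"\s*>\s*(?<name>[^<]+?)\s*</a>'

$short = $null
if ($sample -match $pattern) {
    $url  = $Matches['url']
    $name = $Matches['name']

    # Last path segment of the URL, e.g. "boston-red-sox"
    $slug = ($url.TrimEnd('/') -split '/')[-1]

    # Text after the last "-", upper-cased. Unlike Substring/IndexOf,
    # this does not throw when the slug contains no "-" at all.
    $short = ($slug -split '-')[-1].ToUpper()

    Write-Output "$name ($short)"
}
```

Because the pattern keys on the tag structure rather than on character positions, an extra space or attribute-spacing change should not shift the result, although a real redesign of the page would still break it.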
Oh, that rings a bell. August 2013, College Football results, then NFL results ... A pity each league has a different page layout on ESPN, and it is never as straightforward as it should be.
It seems that one of the reliable ways is to get each game's HTML ID, extract the unique ID number (e.g. 340415104), and use that in an XPath like //*[@id="340415104-aTeamName"]/a (for "away" team name).
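A rough sketch of that idea, assuming the ids really follow the `<number>-aTeamName` pattern quoted above. The `$html` fragment, the team names in it, and the `hTeamName` suffix for the home side are made up for illustration and would need checking against the actual page source.

```powershell
# Hypothetical fragment; only the "<gameId>-aTeamName" id pattern comes from the thread.
$html = @'
<div id="340415104-aTeamName"><a href="#">Braves</a></div>
<div id="340415104-hTeamName"><a href="#">Phillies</a></div>
<div id="340415105-aTeamName"><a href="#">Cubs</a></div>
<div id="340415105-hTeamName"><a href="#">Yankees</a></div>
'@

# Collect the distinct numeric game ids from the away-team elements.
$gameIds = @(
    [regex]::Matches($html, 'id="(\d+)-aTeamName"') |
        ForEach-Object { $_.Groups[1].Value } |
        Select-Object -Unique
)

# Build one XPath per team and game; the -f operator avoids quoting problems.
$xpaths = foreach ($id in $gameIds) {
    [pscustomobject]@{
        GameId    = $id
        AwayXPath = '//*[@id="{0}-aTeamName"]/a' -f $id
        HomeXPath = '//*[@id="{0}-hTeamName"]/a' -f $id   # assumed suffix
    }
}

$xpaths | Format-Table -AutoSize
```

With an HtmlAgilityPack document already loaded as in the original script, each XPath could then be passed to `$doc.DocumentNode.SelectSingleNode(...)` and the team name read from `InnerText`. Since the ids are collected from whatever page was downloaded, the same loop should work for any date in the URL.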
ASKER
Qlemo: as a programmer I realized this page matches teams to an ID at the beginning and then references those IDs to represent each team. So your suggestion crossed my mind, but I had no idea how to do it. Remember, I need to get results for a day that has passed.
Thanks again for your help on this.
Hi Leo
Is this similar to the other ticket you raised? Would you like some further help?
Regards
Joe
ASKER CERTIFIED SOLUTION
ASKER
Sorry Qlemo.. been in vacation mode for past few weeks.
New-Object : Cannot find type [HtmlAgilityPack.HtmlDocum
assembly containing this type is loaded.
At line:5 char:9
+ $doc = New-Object HtmlAgilityPack.HtmlDocume
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidType: (:) [New-Object], PSArgumentException
+ FullyQualifiedErrorId : TypeNotFound,Microsoft.Pow
jectCommand
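A TypeNotFound error from New-Object usually means the assembly was never loaded into the session, for example because the DLL is not at the path given to Add-Type. A minimal check, assuming the DLL location from the original script (the path may well be different on another machine):

```powershell
# Path taken from the original script; adjust to wherever the DLL actually lives.
$dllPath = 'C:\PStemp\HtmlAgilityPack\Net40\htmlagilitypack.dll'

if (Test-Path -LiteralPath $dllPath) {
    # Load the assembly into the current session.
    Add-Type -Path $dllPath
}
else {
    Write-Warning "HtmlAgilityPack DLL not found at $dllPath"
}

# -as returns $null instead of throwing when the type cannot be resolved.
$loaded = $null -ne ('HtmlAgilityPack.HtmlDocument' -as [type])
Write-Output "HtmlAgilityPack loaded: $loaded"
```

If this prints False, New-Object HtmlAgilityPack.HtmlDocument will fail in the same way as above until the Add-Type call succeeds.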