
Parse data from website

I am a PowerShell newbie.... I have been struggling on how to get a date value from a website.

I need to get a date from a certain file on a webpage. The webpage is in index format, so it looks like this:

8/23/2014 12:00 AM     422508   file120.abc
8/23/2014 12:00 AM     440964   file121.abc
8/23/2014 12:00 AM     332636   file122.abc
...
11/26/2017 2:30 PM     3823     thefile.ini
...
11/25/2017 1:01 AM     88044    file309.abc

I need to find the line that contains 'thefile.ini' and get the date (11/26/2017) from it.

When I view the page source I see:

<br>11/26/2017  2:30 PM         3823 <A HREF="/folder/thefile.ini">thefile.ini</A>
Asked by: edrz01

Jose Gabriel Ortega CEE Solution Guide - CEO Faru Bonon IT Commented:
Can we have an example of the webpage (I mean something like the page saved as an HTML file)? The question is too vague to be answered as it stands.

edrz01 Author Commented:
Thanks Jose. [Attached screenshot: Untitled.png]
So I am trying to get the date for this particular file...

Jose Gabriel Ortega CEE Solution Guide - CEO Faru Bonon IT Commented:
No problem :) Is that an FTP view, edrz01?
 
edrz01 Author Commented:
No, it is a web listing. The URL is something like this (I can't paste the real link here):

http://mcafee.xxx.xxxx.xxx/css_content/current/VSCANDAT1000/DAT/0000/

Jose Gabriel Ortega CEE Solution Guide - CEO Faru Bonon IT Commented:
Well, let's start with something!

Source: https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.utility/invoke-webrequest?view=powershell-5.1

$R = Invoke-WebRequest -URI http://putTheRealUrlHere.com
then do
$R.AllElements | gm
and post it here.

or
$R.AllElements | where {$_.innerhtml -like "11/26/2017"}

It's hard to tell the exact query because I don't have the link. Sorry.

edrz01 Author Commented:
First one:

PS C:\WINDOWS\system32> $R = Invoke-WebRequest -URI "http://mcafee.xxx.xxx.xxx/css_content/current/VSCANDAT1000/DAT/0000/"
 then do
 $R.AllElements | gm
then : The term 'then' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path
was included, verify that the path is correct and try again.
At line:2 char:2
+  then do
+  ~~~~
    + CategoryInfo          : ObjectNotFound: (then:String) [], CommandNotFoundException
    + FullyQualifiedErrorId : CommandNotFoundException
 


   TypeName: System.Management.Automation.PSCustomObject

Name        MemberType   Definition                                                                                                                          
----        ----------   ----------                                                                                                                          
Equals      Method       bool Equals(System.Object obj)                                                                                                      
GetHashCode Method       int GetHashCode()                                                                                                                  
GetType     Method       type GetType()                                                                                                                      
ToString    Method       string ToString()                                                                                                                  
innerHTML   NoteProperty string innerHTML=<HEAD><TITLE>mcafee.xxx.xxxx.xxx - /css_content/current/VSCANDAT1000/DAT/0000/</TITLE></HEAD>...                  
innerText   NoteProperty string innerText=mcafee.xxx.xxxx.xxx - /css_content/current/VSCANDAT1000/DAT/0000/mcafee.xxx.xxxx.xxx - /css_content/current/VSCA...
outerHTML   NoteProperty string outerHTML=<HTML><HEAD><TITLE>mcafee.xxx.xxxx.xxx - /css_content/current/VSCANDAT1000/DAT/0000/</TITLE></HEAD>...            
outerText   NoteProperty string outerText=mcafee.xxx.xxxx.xxx - /css_content/current/VSCANDAT1000/DAT/0000/mcafee.xxx.xxxx.xxx - /css_content/current/VSCA...
tagName     NoteProperty string tagName=HTML

Jose Gabriel Ortega CEE Solution Guide - CEO Faru Bonon IT Commented:
That works on PowerShell version 5.1.
To get the version, open a PS console and run: $PSVersionTable
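
If only the version number is of interest, the PSVersion entry can be read directly. A small sketch using the automatic $PSVersionTable variable; the variable names are illustrative, and the 5.1 check simply mirrors the version mentioned above:

```powershell
# $PSVersionTable is an automatic variable; PSVersion holds the engine version.
$version = $PSVersionTable.PSVersion
"Running PowerShell $($version.Major).$($version.Minor)"

# True when the engine is version 5.1 or newer.
$isAtLeast51 = ($version.Major -gt 5) -or ($version.Major -eq 5 -and $version.Minor -ge 1)
$isAtLeast51
```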

edrz01 Author Commented:
Name                           Value                                                                                                                        
----                           -----                                                                                                                        
PSVersion                      5.1.14393.1770                                                                                                                
PSEdition                      Desktop                                                                                                                      
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0...}                                                                                                      
BuildVersion                   10.0.14393.1770                                                                                                              
CLRVersion                     4.0.30319.42000                                                                                                              
WSManStackVersion              3.0                                                                                                                          
PSRemotingProtocolVersion      2.3                                                                                                                          
SerializationVersion           1.1.0.1

edrz01 Author Commented:
If I delete the 'then do' it displays the information (if this is what you are expecting)

PS C:\WINDOWS\system32> $R = Invoke-WebRequest -URI "http://mcafee.xxx.xxxx.xxx/css_content/current/VSCANDAT1000/DAT/0000/"
 $R.AllElements | gm


   TypeName: System.Management.Automation.PSCustomObject

Name        MemberType   Definition                                                                                                                          
----        ----------   ----------                                                                                                                          
Equals      Method       bool Equals(System.Object obj)                                                                                                      
GetHashCode Method       int GetHashCode()                                                                                                                  
GetType     Method       type GetType()                                                                                                                      
ToString    Method       string ToString()                                                                                                                  
innerHTML   NoteProperty string innerHTML=<HEAD><TITLE>mcafee.xxx.xxxx.xxx - /css_content/current/VSCANDAT1000/DAT/0000/</TITLE></HEAD>...                  
innerText   NoteProperty string innerText=mcafee.xxx.xxxx.xxx - /css_content/current/VSCANDAT1000/DAT/0000/mcafee.xxx.xxxx.xxx - /css_content/current/VSCA...
outerHTML   NoteProperty string outerHTML=<HTML><HEAD><TITLE>mcafee.xxx.xxxx.xxx - /css_content/current/VSCANDAT1000/DAT/0000/</TITLE></HEAD>...            
outerText   NoteProperty string outerText=mcafee.xxx.xxxx.xxx - /css_content/current/VSCANDAT1000/DAT/0000/mcafee.xxx.xxxx.xxx - /css_content/current/VSCA...
tagName     NoteProperty string tagName=HTML

edrz01 Author Commented:
What's frustrating about this is that when I view the source, the date field I need comes before the actual file name, which makes it harder. [Attached screenshot: Untitled1.png]
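
The date coming before the link is not necessarily a problem for a regular expression, since a pattern can capture the text in front of the anchor tag. A minimal sketch, using the source line quoted in the question as sample input; the variable names and the pattern are illustrative assumptions and have not been run against the real page:

```powershell
# Sample line copied from the page source shown in the question.
$line = '<br>11/26/2017  2:30 PM         3823 <A HREF="/folder/thefile.ini">thefile.ini</A>'

# Capture the leading date (M/d/yyyy) on the line whose link text is thefile.ini.
$pattern = '(?<Date>\d{1,2}/\d{1,2}/\d{4})\s+.*>thefile\.ini</A>'

if ($line -match $pattern) {
    # -match fills the automatic $Matches table with the named groups.
    $fileDate = $Matches['Date']
    $fileDate
}
```

Here $fileDate is still a plain string; it would need to be parsed if a real date value is wanted.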

Jose Gabriel Ortega CEE Solution Guide - CEO Faru Bonon IT Commented:
Well, it's a web view; there's nothing you can do about it.

OK, run this:

$R.AllElements | where {$_.innerhtml -like "11/26/2017"}

edrz01 Author Commented:
Looks like nothing returned:

PS C:\WINDOWS\system32>
$R = Invoke-WebRequest -URI "http://mcafee.xxx.xxxx.xxx/css_content/current/VSCANDAT1000/DAT/0000/"
$R.AllElements | where {$_.innerhtml -like "11/26/2017"}

PS C:\WINDOWS\system32>
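
One likely reason nothing comes back: -like without wildcard characters only matches when the whole string equals the pattern, and innerHTML holds far more than just the date. A small sketch on the sample line from the question (not the real page):

```powershell
# Sample line copied from the page source shown in the question.
$line = '<br>11/26/2017  2:30 PM         3823 <A HREF="/folder/thefile.ini">thefile.ini</A>'

# No wildcards: the entire string would have to equal '11/26/2017'.
$exact = $line -like '11/26/2017'

# Wildcards on both sides: matches when the date appears anywhere in the string.
$wildcard = $line -like '*11/26/2017*'

"Exact: $exact, Wildcard: $wildcard"
```

Even with wildcards, the filter would return whole elements whose HTML contains the date, so the matching line would still have to be picked out of that text.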

Jose Gabriel Ortega CEE Solution Guide - CEO Faru Bonon IT Commented:
Haha, just send me the actual link in a PM.

edrz01 Author Commented:
Sent you a PM - can't send link, sorry

Jose Gabriel Ortega CEE Solution Guide - CEO Faru Bonon IT Commented:
Well, the question can't be answered because of the lack of details. You can keep exploring the innerHTML and look for the properties you need, or, if you can get the information as CSV or XML, that would be helpful for your purposes.

jose

edrz01 Author Commented:
Jose, I appreciate you trying to find a solution; however, I'm not sure what you mean by 'lack of details'. I provided everything I could.

The URL I am accessing looks like this (I can't provide the actual link since it is restricted):

[Attached screenshot: Untitled1.png]

On that listing is a file called 'avvdat.ini'.

I am trying to get the date to the left of that filename.

[Attached screenshot: Untitled2.png]

When I hit F12 on the webpage it shows the data as below. I put blocks around the date and the file.

[Attached screenshot: Untitled3.png]

Jose Gabriel Ortega CEE Solution Guide - CEO Faru Bonon IT Commented:
I don't think anyone can give you an answer with the information you have provided. What I suggest is to save the HTML file to your desktop, remove the information that is not relevant to the question (like the enterprise name, the McAfee account number, or whatever), and leave the HTML structure intact so we can finish the answer. What I mean by incomplete, or that it can't be answered, is that we're not wizards. We need the full HTML structure so we can provide an answer or build the query according to your SPECIFIC NEEDS. Each web page is different.

oBdA Commented:
Adjust the URL in the last line, save it as Whatever.ps1, and run it.
If it gives you the date, remove the "-Debug" switch from the last line.
If it does not give you the date, open the file "McAfeeContent.html" that should be in the current folder, replace any sensitive information with placeholders, and upload it here (no screenshot - either as file attachment or inside a [code][/code] block!).

The function will parse the complete content into custom PS objects which you can either filter yourself, or you can directly pass it a filter like in the code below, so you can get information about any file listed.
The Date property returned will be a full DateTime object, not just a string.
Function Get-McAfeeItem {
Param(
	[String]$Url,
	[String]$Filter,
	[Switch]$Debug
)
	# Match on the Name property when a filter is given; otherwise let every item through.
	$NameFilter = If ($Filter) {{$_.Name -like $Filter}} Else {{$true}}
	# Parse the dates as en-US regardless of the local culture settings.
	$DTProvider = New-Object -TypeName System.Globalization.CultureInfo -ArgumentList 'en-US'
	$DTFormat = 'M/d/yyyy h:mm tt'
	$Content = Invoke-WebRequest -Uri $Url | Select-Object -ExpandProperty Content
	# With -Debug, save the raw page to the current folder for troubleshooting.
	If ($Debug) {$Content | Set-Content -Path "$((Get-Location -PSProvider FileSystem).Path)\McAfeeContent.html"}
	# Collapse whitespace, split the listing at each <br>, and keep only the lines that look like file entries.
	$Content.Replace("`r`n", ' ') -replace '\s+', ' ' -split '<br>' |
		Where-Object {$_ -match '\s*(?<Date>.*?)\s*(?<Size>\d+?)\s*<A\s+HREF\s*=\s*"(?<Path>.*?)"\s*>\s*(?<Name>.*?)</A>'} |
		Select-Object -Property `
			@{n='Name'; e={$Matches['Name']}},
			@{n='Size'; e={[int64]$Matches['Size']}},
			@{n='Date'; e={[DateTime]::ParseExact($Matches['Date'], $DTFormat, $DTProvider)}},
			@{n='Path'; e={$Matches['Path']}} |
		Where-Object $NameFilter
}

Get-McAfeeItem -Debug -Url "mcafee.acme.com/css_content/current/VSCANDAT1000/DAT/0000/" -Filter avvdat.ini | Select-Object -ExpandProperty Date
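
The parsing step can be tried without network access by feeding the same split-and-match idea a small sample. The sketch below is a simplified variant, not the function above: the sample HTML is made up from the lines quoted in the question, and the shortened pattern and the variable names are assumptions.

```powershell
# Made-up sample resembling the listing source shown in the question.
$sample = @'
<br> 8/23/2014 12:00 AM       422508 <A HREF="/folder/file120.abc">file120.abc</A>
<br>11/26/2017  2:30 PM         3823 <A HREF="/folder/thefile.ini">thefile.ini</A>
'@

$culture = [System.Globalization.CultureInfo]::GetCultureInfo('en-US')
$pattern = '^\s*(?<Date>.+?[AP]M)\s+(?<Size>\d+)\s*<A\s+HREF="(?<Path>[^"]*)">(?<Name>[^<]*)</A>'

# Collapse whitespace, split on <br>, and turn each matching line into an object.
$items = ($sample -replace '\s+', ' ') -split '<br>' | ForEach-Object {
    if ($_ -match $pattern) {
        [PSCustomObject]@{
            Name = $Matches['Name']
            Size = [int64]$Matches['Size']
            Date = [DateTime]::ParseExact($Matches['Date'], 'M/d/yyyy h:mm tt', $culture)
        }
    }
}

# Date is a DateTime, so it can be formatted as just the date part.
$target = $items | Where-Object { $_.Name -eq 'thefile.ini' }
$dateOnly = $target.Date.ToString('MM/dd/yyyy', [System.Globalization.CultureInfo]::InvariantCulture)
$dateOnly
```

Because the Date property is a real DateTime rather than a string, it can also be compared or sorted directly, for example to check whether a file is older than a given day.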

edrz01 Author Commented:
oBdA - worked like a champ! Thank you!!!!!!

edrz01 Author Commented:
Worked like a champ! Thanks!
Question has a verified solution.
