edrz01

Parse data from website

I am a PowerShell newbie and have been struggling to get a date value from a website.

I need to get the date of a certain file listed on a webpage. The webpage is a directory-style index, so it looks like this:

8/23/2014 12:00 AM     422508   file120.abc
8/23/2014 12:00 AM     440964   file121.abc
8/23/2014 12:00 AM     332636   file122.abc
...
11/26/2017 2:30 PM     3823     thefile.ini
...
11/25/2017 1:01 AM     88044    file309.abc

I need to find the line that contains 'thefile.ini' and get the date (11/26/2017) from it.

When I view the page source, I see:

<br>11/26/2017  2:30 PM         3823 <A HREF="/folder/thefile.ini">thefile.ini</A>
Powershell

Last Comment
edrz01

8/22/2022 - Mon
Jose Gabriel Ortega Castro

Can we have an example of the webpage (I mean something like the page saved as an HTML file)? The question is too vague to be answered as it stands.
edrz01

ASKER
Thanks Jose. [attachment: Untitled.png]
So I am trying to get the date for this particular file...
Jose Gabriel Ortega Castro

No problem :) Is that an FTP view, edrz01?
edrz01

ASKER
No, it is a web listing. The URL is something like this (I can't paste the real link here):

http://mcafee.xxx.xxxx.xxx/css_content/current/VSCANDAT1000/DAT/0000/
Jose Gabriel Ortega Castro

Well, let's start with something!

Source: https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.utility/invoke-webrequest?view=powershell-5.1

$R = Invoke-WebRequest -URI http://putTheRealUrlHere.com
then do
$R.AllElements | gm
and post it here.

or
$R.AllElements | where {$_.innerhtml -like "11/26/2017"}

It's hard to give the exact query because I don't have the link, sorry.
edrz01

ASKER
First one:

PS C:\WINDOWS\system32> $R = Invoke-WebRequest -URI "http://mcafee.xxx.xxx.xxx/css_content/current/VSCANDAT1000/DAT/0000/"
 then do
 $R.AllElements | gm
then : The term 'then' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path
was included, verify that the path is correct and try again.
At line:2 char:2
+  then do
+  ~~~~
    + CategoryInfo          : ObjectNotFound: (then:String) [], CommandNotFoundException
    + FullyQualifiedErrorId : CommandNotFoundException
 


   TypeName: System.Management.Automation.PSCustomObject

Name        MemberType   Definition                                                                                                                          
----        ----------   ----------                                                                                                                          
Equals      Method       bool Equals(System.Object obj)                                                                                                      
GetHashCode Method       int GetHashCode()                                                                                                                  
GetType     Method       type GetType()                                                                                                                      
ToString    Method       string ToString()                                                                                                                  
innerHTML   NoteProperty string innerHTML=<HEAD><TITLE>mcafee.xxx.xxxx.xxx - /css_content/current/VSCANDAT1000/DAT/0000/</TITLE></HEAD>...                  
innerText   NoteProperty string innerText=mcafee.xxx.xxxx.xxx - /css_content/current/VSCANDAT1000/DAT/0000/mcafee.xxx.xxxx.xxx - /css_content/current/VSCA...
outerHTML   NoteProperty string outerHTML=<HTML><HEAD><TITLE>mcafee.xxx.xxxx.xxx - /css_content/current/VSCANDAT1000/DAT/0000/</TITLE></HEAD>...            
outerText   NoteProperty string outerText=mcafee.xxx.xxxx.xxx - /css_content/current/VSCANDAT1000/DAT/0000/mcafee.xxx.xxxx.xxx - /css_content/current/VSCA...
tagName     NoteProperty string tagName=HTML
Jose Gabriel Ortega Castro

That works on PowerShell version 5.1.
To get the version, open a PowerShell console and run: $PSVersionTable
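
To print only the version number rather than the whole table, the `PSVersion` entry can be read directly. The check below is just a small sketch. The messages reflect that `AllElements` belongs to the HTML-parsing response object of Windows PowerShell, while PowerShell 6 and later return a basic response object without that property.

```powershell
# Major version of the running engine (5 for Windows PowerShell 5.1).
$major = $PSVersionTable.PSVersion.Major

if ($major -le 5) {
    # Single quotes keep $R literal in the message text.
    'Windows PowerShell: $R.AllElements should be available (without -UseBasicParsing).'
} else {
    'PowerShell 6+: no AllElements on the response; work with $R.Content instead.'
}
```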
edrz01

ASKER
Name                           Value                                                                                                                        
----                           -----                                                                                                                        
PSVersion                      5.1.14393.1770                                                                                                                
PSEdition                      Desktop                                                                                                                      
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0...}                                                                                                      
BuildVersion                   10.0.14393.1770                                                                                                              
CLRVersion                     4.0.30319.42000                                                                                                              
WSManStackVersion              3.0                                                                                                                          
PSRemotingProtocolVersion      2.3                                                                                                                          
SerializationVersion           1.1.0.1
edrz01

ASKER
If I delete the 'then do', it displays the information (if this is what you are expecting):

PS C:\WINDOWS\system32> $R = Invoke-WebRequest -URI "http://mcafee.xxx.xxxx.xxx/css_content/current/VSCANDAT1000/DAT/0000/"
 $R.AllElements | gm


   TypeName: System.Management.Automation.PSCustomObject

Name        MemberType   Definition                                                                                                                          
----        ----------   ----------                                                                                                                          
Equals      Method       bool Equals(System.Object obj)                                                                                                      
GetHashCode Method       int GetHashCode()                                                                                                                  
GetType     Method       type GetType()                                                                                                                      
ToString    Method       string ToString()                                                                                                                  
innerHTML   NoteProperty string innerHTML=<HEAD><TITLE>mcafee.xxx.xxxx.xxx - /css_content/current/VSCANDAT1000/DAT/0000/</TITLE></HEAD>...                  
innerText   NoteProperty string innerText=mcafee.xxx.xxxx.xxx - /css_content/current/VSCANDAT1000/DAT/0000/mcafee.xxx.xxxx.xxx - /css_content/current/VSCA...
outerHTML   NoteProperty string outerHTML=<HTML><HEAD><TITLE>mcafee.xxx.xxxx.xxx - /css_content/current/VSCANDAT1000/DAT/0000/</TITLE></HEAD>...            
outerText   NoteProperty string outerText=mcafee.xxx.xxxx.xxx - /css_content/current/VSCANDAT1000/DAT/0000/mcafee.xxx.xxxx.xxx - /css_content/current/VSCA...
tagName     NoteProperty string tagName=HTML
edrz01

ASKER
What's frustrating is that when I view the source, the date field I need comes before the actual file name, which makes it harder. [attachment: Untitled1.png]
Jose Gabriel Ortega Castro

Well, it's a web view; there's nothing you can do about it.

OK, run this:

$R.AllElements | where {$_.innerhtml -like "11/26/2017"}
edrz01

ASKER
Looks like nothing returned:

PS C:\WINDOWS\system32>
$R = Invoke-WebRequest -URI "http://mcafee.xxx.xxxx.xxx/css_content/current/VSCANDAT1000/DAT/0000/"
$R.AllElements | where {$_.innerhtml -like "11/26/2017"}

PS C:\WINDOWS\system32>
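
A likely reason nothing came back: `-like` only does substring matching when the pattern contains wildcards. Without a `*`, it compares against the whole string, so an element whose innerHTML is longer than the date itself cannot match. A small sketch, using the view-source line from the question as sample data rather than the real page:

```powershell
# Sample line taken from the view-source output in the question.
$line = '<br>11/26/2017  2:30 PM         3823 <A HREF="/folder/thefile.ini">thefile.ini</A>'

# No wildcards: -like compares the whole string, so this is False.
$exact = $line -like '11/26/2017'

# Wildcards on both sides turn it into a "contains" test, so this is True.
$contains = $line -like '*11/26/2017*'

"$exact / $contains"
```

Applied to the command above, the filter would become `where {$_.innerhtml -like "*11/26/2017*"}`. Even then it would probably return only large container elements, since the date appears as loose text after a `<br>` rather than inside an element of its own.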
Jose Gabriel Ortega Castro

Haha, just send me the actual link in a PM.
edrz01

ASKER
Sent you a PM. I can't send the link, sorry.
Jose Gabriel Ortega Castro

Well, the question can't be answered because of the lack of details. You can keep exploring the innerHTML and look for the properties you need, or, if you can get the information as CSV or XML, that would be helpful for your purposes.

jose
edrz01

ASKER
Jose, I appreciate you trying to find a solution, but I'm not sure what you mean by 'lack of details'. I provided everything I could.

The URL I am accessing looks like this (I can't provide the actual link since it is restricted):

Untitled1.png
On that listing is a filename called 'avvdat.ini'

I am trying to get the date to the left of that filename.

Untitled2.png
When I hit F12 on the webpage it shows the data as below. I put blocks around the date and the file.

Untitled3.png
Jose Gabriel Ortega Castro

I don't think anyone can give you an answer with the information you have provided. What I suggest is to save the HTML file to your desktop, remove the information that is not relevant to the question (such as the company name, the McAfee account number, or whatever), and leave the HTML structure intact so we can finish the answer. What I mean by incomplete, or that it can't be answered, is that we're not wizards. We need the full HTML structure so we can provide an answer or build the query according to your SPECIFIC NEEDS. Each web page is different.
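
One way to produce such a sanitized copy is sketched below, assuming the page has already been fetched into `$R` as earlier in the thread. The host name, the replacement text, and the output path are placeholders. A literal sample string stands in for `$R.Content` so the snippet runs on its own.

```powershell
# Stand-in for the page source; on the real page use: $html = $R.Content
$html = '<TITLE>mcafee.xxx.xxxx.xxx - /css_content/current/</TITLE>'

# Host name to hide (placeholder value, adjust to the real one).
$privateHost = 'mcafee.xxx.xxxx.xxx'

# Replace every occurrence of the host name, leaving the HTML structure intact.
$clean = $html -replace [regex]::Escape($privateHost), 'server.example'

# Write the sanitized copy to a file that can be attached to the question.
$outFile = Join-Path ([System.IO.Path]::GetTempPath()) 'listing-sanitized.html'
Set-Content -Path $outFile -Value $clean
```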
ASKER CERTIFIED SOLUTION
oBdA

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
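
The accepted answer is not visible here, so the following is not oBdA's solution. It is only a minimal sketch of one way to approach the problem, assuming the page source looks like the line quoted in the question (date, time, size, then an `<A>` link to the file). The function name `Get-ListingDate` and its parameters are made up for illustration. On the live page, the `-Html` value would presumably come from `(Invoke-WebRequest -Uri $url).Content`.

```powershell
# Hypothetical helper: returns the date text that precedes a file's link.
function Get-ListingDate {
    param(
        [Parameter(Mandatory)] [string] $Html,      # raw page source
        [Parameter(Mandatory)] [string] $FileName   # e.g. 'thefile.ini'
    )
    # Date, then time, then size, then the link whose text is the file name.
    $pattern = '(?<date>\d{1,2}/\d{1,2}/\d{4})\s+\d{1,2}:\d{2}\s+[AP]M\s+\d+\s+<A\s[^>]*>' +
               [regex]::Escape($FileName) + '</A>'
    $m = [regex]::Match($Html, $pattern, 'IgnoreCase')
    if ($m.Success) { $m.Groups['date'].Value }
}

# Sample source modeled on the listing in the question (not the real page).
$sample = @'
<br> 8/23/2014 12:00 AM       422508 <A HREF="/folder/file120.abc">file120.abc</A>
<br>11/26/2017  2:30 PM         3823 <A HREF="/folder/thefile.ini">thefile.ini</A>
<br>11/25/2017  1:01 AM        88044 <A HREF="/folder/file309.abc">file309.abc</A>
'@

$date = Get-ListingDate -Html $sample -FileName 'thefile.ini'
$date   # 11/26/2017
```

If a date object is needed rather than a string, `[datetime]::ParseExact($date, 'M/d/yyyy', [cultureinfo]::InvariantCulture)` should convert it. Matching on the raw source avoids depending on `AllElements`, at the cost of being tied to this particular listing layout.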
edrz01

ASKER
OBDA - worked like a champ! Thank you!!!!!!
edrz01

ASKER
Worked like a champ! Thanks!