Solved

Grab some text from an HTML file via Windows scripting

Posted on 2014-07-14
Last Modified: 2014-09-08
Hello experts,

I would like to grab a specific line from an intranet web page (HTML) and write it to a text file via a Windows batch script. The HTML is a simple one with the usual table, href, etc. elements.

Is there a way to do this? Could you please provide a simple example?

Thank you in advance
Question by:bozer

Expert Comment

by:Scott Fell, EE MVE
ID: 40196082
What you will do is make an XMLHTTP POST request to the page to grab its source. From there, you just parse out what you need.

http://support.microsoft.com/kb/290591

The code below is my modified version. We don't need DataToSend unless you need to pass form variables or a querystring. I also changed Response.Write xmlhttp.responsexml.xml to theHTML = xmlhttp.responsetext.

   
    ' No form data is being sent, so the request body is empty
    DataToSend = ""
    Dim xmlhttp
    Set xmlhttp = Server.CreateObject("MSXML2.ServerXMLHTTP")
    ' False makes the call synchronous
    xmlhttp.Open "POST", "http://www.somesite/somepage.html", False
    xmlhttp.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
    xmlhttp.send DataToSend
    Response.ContentType = "text/xml"
    ' responseText holds the raw HTML of the page
    theHTML = xmlhttp.responseText
    Set xmlhttp = Nothing


At this point, all of the HTML (what you would see with view source) is in the variable theHTML. Now you need to parse out what you want.

Let's say our variable now looks like this:
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>some site</title>
</head>
<body>
  <h1>This is the title</h1>
  <h2>This is a sub title</h2>
  <p>about our sub title about our sub title about our sub title about our sub title</p>
  <h2>This is a sub title</h2>
  <p>about our sub title about our sub title about our sub title about our sub title</p>
  <h2>This is a sub title</h2>
  <p>about our sub title about our sub title about our sub title about our sub title</p>
</body>
</html>


You only want what is in the first paragraph. To do this, we look in the variable theHTML for the first <p> tag and grab everything up to the next </p> tag.

Step 1 is finding the position just after the first <p> tag and the position of the next </p> tag.
startPos = InStr(theHTML, "<p>") + Len("<p>")
endPos = InStr(startPos, theHTML, "</p>")

Step 2 is getting just that paragraph. The third argument of Mid is a length, not an end position.
theParagraph = Mid(theHTML, startPos, endPos - startPos)


Putting it all together:

DataToSend = ""
Dim xmlhttp
Set xmlhttp = Server.CreateObject("MSXML2.ServerXMLHTTP")
xmlhttp.Open "POST", "http://www.somesite/somepage.html", False
xmlhttp.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
xmlhttp.send DataToSend
Response.ContentType = "text/xml"
theHTML = xmlhttp.responseText
Set xmlhttp = Nothing

' Position just after the first <p>, then the next </p> after it
startPos = InStr(theHTML, "<p>") + Len("<p>")
endPos = InStr(startPos, theHTML, "</p>")

' Mid takes a start position and a length
theParagraph = Mid(theHTML, startPos, endPos - startPos)
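The InStr/Mid steps assume both tags are actually present in the page. A minimal sketch of a reusable helper that guards against a missing tag might look like the following. The name TextBetween is hypothetical (it is not a built-in function), and the case-insensitive comparison is an assumption about what is wanted.

```vbscript
' Hypothetical helper: returns the text between the first openTag and the
' next closeTag after it, or "" when either tag is missing.
Function TextBetween(ByVal source, ByVal openTag, ByVal closeTag)
    Dim startPos, endPos
    TextBetween = ""

    ' vbTextCompare makes the search case-insensitive (<P> matches <p>).
    startPos = InStr(1, source, openTag, vbTextCompare)
    If startPos = 0 Then Exit Function

    ' Move past the opening tag so it is not part of the result.
    startPos = startPos + Len(openTag)

    endPos = InStr(startPos, source, closeTag, vbTextCompare)
    If endPos = 0 Then Exit Function

    ' Mid takes a start position and a length, so pass the difference.
    TextBetween = Mid(source, startPos, endPos - startPos)
End Function
```

It would be called as theParagraph = TextBetween(theHTML, "<p>", "</p>"). It only matches the literal tag text, so a tag with attributes such as <p class="x"> would not be found with "<p>".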


Author Comment

by:bozer
ID: 40196099
Hello Scott,

Thank you for the detailed reply. But this looks exactly like how I was doing similar operations with Classic ASP. I want to do everything in a Windows batch script so I don't have to worry about web servers, libraries, etc.

I think a plain batch script can read from a text file, so what I basically want is to do the same for the static HTML page. Alternatively, the batch could fetch the HTML, save it as a text file, and then get the text from that file.

Accepted Solution

by:
Scott Fell,  EE MVE earned 1000 total points
ID: 40196176
That is vbs and you can run vbs from a batch file.  I run some scheduled tasks in the very same way.
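
One caveat when moving the earlier snippet out of ASP: Server.CreateObject and the Response object belong to ASP and are not available under Windows Script Host. Below is a minimal sketch of a standalone version, assuming a plain GET is enough for a static page (the POST only matters when sending form data). The URL is the same placeholder as before.

```vbscript
' Standalone .vbs sketch: fetch a page without ASP's Server/Response objects.
Dim xmlhttp, theHTML

' Under Windows Script Host, use the global CreateObject function.
Set xmlhttp = CreateObject("MSXML2.ServerXMLHTTP")

' GET is an assumption here because nothing is being posted.
' False makes the call synchronous.
xmlhttp.Open "GET", "http://www.somesite/somepage.html", False
xmlhttp.send

If xmlhttp.status = 200 Then
    ' responseText holds the raw HTML of the page.
    theHTML = xmlhttp.responseText
    WScript.Echo theHTML
Else
    WScript.Echo "Request failed with HTTP status " & xmlhttp.status
End If

Set xmlhttp = Nothing
```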

If you have the HTML file on your own server, you can use the FileSystemObject (FSO) to read the file.
http://msdn.microsoft.com/en-us/library/aa711216(v=vs.71).aspx
http://www.vb6.us/tutorials/using-fso-file-system-object-vb6
http://technet.microsoft.com/en-us/library/ee198716.aspx

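Const ForReading = 1
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.OpenTextFile("C:\folder\file.html", ForReading)
' InStr and Mid work on strings, so read the whole file into one first
theHTML = objFile.ReadAll
objFile.Close

startPos = InStr(theHTML, "<p>") + Len("<p>")
endPos = InStr(startPos, theHTML, "</p>")

' Mid takes a start position and a length
theParagraph = Mid(theHTML, startPos, endPos - startPos)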


VBScript is the language used in Classic ASP. I don't use VB6 or VB.NET, but I believe this part is identical. Instead of Response.Write, you use WScript.Echo.

Save this as a .vbs file and call it from your batch file.
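
Since the goal is a text file, here is a rough sketch of what the complete .vbs file could look like. The script name grabtext.vbs, the sub name SaveFirstParagraph, and the two command-line arguments are all assumptions made for illustration. It only handles a plain <p> tag without attributes.

```vbscript
' grabtext.vbs (hypothetical name)
' Usage from a batch file:
'   cscript //nologo grabtext.vbs "C:\folder\file.html" "C:\folder\out.txt"

Const ForReading = 1

' Reads inputPath, extracts the text of the first <p>...</p>, and writes it
' to outputPath (creating or overwriting that file).
Sub SaveFirstParagraph(ByVal inputPath, ByVal outputPath)
    Dim fso, inFile, outFile, html, startPos, endPos, paragraph

    Set fso = CreateObject("Scripting.FileSystemObject")

    ' Load the whole HTML file into a string.
    Set inFile = fso.OpenTextFile(inputPath, ForReading)
    html = inFile.ReadAll
    inFile.Close

    ' Stay with an empty result if either tag is missing.
    paragraph = ""
    startPos = InStr(1, html, "<p>", vbTextCompare)
    If startPos > 0 Then
        startPos = startPos + Len("<p>")
        endPos = InStr(startPos, html, "</p>", vbTextCompare)
        If endPos > 0 Then paragraph = Mid(html, startPos, endPos - startPos)
    End If

    ' True = overwrite an existing output file.
    Set outFile = fso.CreateTextFile(outputPath, True)
    outFile.Write paragraph
    outFile.Close
End Sub

If WScript.Arguments.Count = 2 Then
    SaveFirstParagraph WScript.Arguments(0), WScript.Arguments(1)
Else
    WScript.Echo "Usage: cscript //nologo grabtext.vbs <input.html> <output.txt>"
End If
```

Running it through cscript rather than wscript keeps any WScript.Echo output in the console window of the batch file instead of a popup dialog.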

Author Comment

by:bozer
ID: 40225974
I apologize, all. I had to attend to some urgent issues, and I will test your recommendations as soon as possible.

Author Closing Comment

by:bozer
ID: 40309348
Thanks Scott, playing around with the file operations code you provided helped me solve my problem.

Question has a verified solution.