Link to home
Start Free TrialLog in
Avatar of OAC Technology
OAC TechnologyFlag for United States of America

asked on

Save website information to XML file with VBScript.

Hi,

We are using a vbscript file to check a website at http://mspairport.com/msp/parking/realtimemobile.aspx  and save the information to an XML file on the local computer.  This script was made here: https://www.experts-exchange.com/questions/23406852/Automatically-parse-website-information-to-text-or-XML-file.html.  

This has been working fine, but stopped working for us yesterday.  It crashes with the error "Invalid XML content:  Line: [2] The server did not understand the request, or the request was invalid.  Error processing resource 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd'"

Any help on how to fix this would be greatly appreciated.

Thank you.
Option Explicit
 
Dim strURL1, strFile1, strContent
 
strURL1 = "http://mspairport.com/msp/parking/realtimemobile.aspx"
strFile1 =  "c:\parking.xml"
 
strContent = sendHttpRequest(strURL1, strFile1)
Call ParseDocument(strContent, strFile1)
 
Function sendHttpRequest(strURL, strLocalFile)
 
	' Declare our vars
	Dim objWinHttp, strStatus
	
	' Request URL from 1st Command Line Argument.  This is
	'strURL = WScript.Arguments(0)
	Set objWinHttp = CreateObject("WinHttp.WinHttpRequest.5.1")
	objWinHttp.Open "GET", strURL
	objWinHttp.Send
	
	'THIS DOESN'T SEEM TO WORK
	strStatus = objWinHttp.StatusText
	'WScript.Echo "Status: " & strStatus
	' Get the Status and compare it to the expected 200
	'If objWinHttp.Status <> 200 Then
	 '     ' If it's not 200 we throw an error
	 '     Err.Raise 1, "HttpRequester", "Invalid HTTP Response Code"
	'End If
	
	'WScript.Echo "Response: " & objWinHttp.ResponseText
	sendHttpRequest = objWinHttp.ResponseText
	
	Set objWinHttp = Nothing
 
End Function
 
Function ParseDocument(strContent, strLocalFile)
	Dim objDom, objDomOut, loadStatus, value, xmlParseErr, strRefresh
	Dim objFSO
	
	Set objDOM = CreateObject("MSXML2.DOMDocument.4.0")
	Set objDomOut = CreateObject("MSXML2.DOMDocument.4.0")
	Set objFSO = CreateObject("Scripting.FileSystemObject")
	
	objDom.validateOnParse = False
	objDom.async = False
	loadStatus = objDOM.LoadXML(strContent)
	
	Set xmlParseErr = objDOM.parseError
	If xmlParseErr.errorCode <> 0 Then
		WScript.Echo("Invalid XML content: " & vbCrLf & vbCrLf & vbTab & "Line: [" & xmlParseErr.Line & "] " _
					& xmlParseErr.reason & vbCrLf & "Found: " & xmlParseErr.srcText)
		WScript.Quit(0)
	End If
 
	'WScript.Echo "Lindbergh Terminal: " _
	'			& objDOM.nodeFromID("Refreshparkingdata1_lblLindberghStatus").Text _
	'			& " " & objDOM.nodeFromID("Refreshparkingdata1_lblAvailableLindbergh").Text
	'WScript.Echo "Humphrey Terminal: " _
	'			& objDOM.nodeFromID("Refreshparkingdata1_lblHumphreyStatus").Text _
	'			& " " & objDOM.nodeFromID("Refreshparkingdata1_lblAvailableHumphrey").Text
	
	' Check if the Output file already exists, if it does, just append the loaded elements
	' otherwise create the root element and append the terminal elements
	If objFSO.FileExists(strLocalFile) Then
		objDomOut.validateOnParse = False
		objDomOut.async = False
		loadStatus = objDomOut.Load(strLocalFile)
	Else
		objDomOut.AppendChild( objDomOut.CreateElement("Terminals") )
	End if
	
	strRefresh = getSafeElement( objDOM, "Refreshparkingdata1_Label1" )
	
	objDomOut.documentElement.AppendChild( AppendTerminal(objDomOut, "Lindbergh", getSafeElement( objDOM, "Refreshparkingdata1_lblLindberghStatus" ), getSafeElement( objDOM, "Refreshparkingdata1_lblAvailableLindbergh" ), strRefresh))
	objDomOut.documentElement.AppendChild( AppendTerminal(objDomOut, "Humphrey", getSafeElement( objDOM, "Refreshparkingdata1_lblHumphreyStatus" ), getSafeElement( objDOM, "Refreshparkingdata1_lblAvailableHumphrey" ), strRefresh))
	
	objDomOut.Save (strLocalFile)
End Function
 
Function getSafeElement(objDomDoc, strID)
	Dim elem
	Set elem = objDomDoc.nodeFromID(strID)
	If Not elem is Nothing Then
		getSafeElement = elem.Text
	Else
		getSafeElement = ""
	End If
End Function
 
Function AppendTerminal(objOutPutDOM, strName, strPercent, strDescription, strRefreshDesc)
	Dim elem, child
	Set elem = objOutPutDOM.CreateElement("Terminal")
	Set child = objOutPutDOM.CreateElement("Name")
	child.Text = strName
	elem.appendChild(child)
	Set child = objOutPutDOM.CreateElement("Percent")
	child.Text = strPercent
	elem.appendChild(child)
	Set child = objOutPutDOM.CreateElement("Description")
	child.Text = strDescription
	elem.appendChild(child)
	Set child = objOutPutDOM.CreateElement("RefreshDesc")
	child.Text = strRefreshDesc
	elem.appendChild(child)
	Set AppendTerminal = elem
End Function

Open in new window

Avatar of rejoinder
rejoinder
Flag of Canada image

I'm guessing that there is more code not being show??  The error  "Error processing resource 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd'" sounds to me like the code is looking to another web site to validate something.  The link comes up dead.  It could be that the dtd file is no longer available from www.w3.org.  Can you check all the code for a reference to that web site?

Avatar of OAC Technology

ASKER

Nope, that is all of the code and it's worked fine for months.  If you do a "view source" on the parking website, the doctype on the first line points to that file.  I think the file is damaged or now missing on the w3.org site and that's when this started happening.  I've noticed the .dtd file will not open in IE, but does in Firefox.  Odd.
Step 1
Save the attached file and drop the .txt so the file name = "xhtml1-transitional.dtd"
Step 2
Edit the code below (file://C:\some-folder-with-the-dtd-file\xhtml1-transitional.dtd)
to point to the folder containing the dtd file from the step above.
What will happen is the web page will be read, the troublesome line will be replaced so that it points to the dtd file on your system, then any line with &nbsp; will be removed.
Option Explicit
 
Dim strURL1, strFile1, strContent
 
strURL1 = "http://mspairport.com/msp/parking/realtimemobile.aspx"
strFile1 =  "c:\parking.xml"
 
strContent = sendHttpRequest(strURL1, strFile1)
strContent = replace(strContent,"<!DOCTYPE html PUBLIC ""-//W3C//DTD XHTML 1.0 Transitional//EN"" ""http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"">","<!DOCTYPE html PUBLIC ""-//W3C//DTD XHTML 1.0 Transitional//EN"" ""file://C:\some-folder-with-the-dtd-file\xhtml1-transitional.dtd"">")
strContent = replace(strContent,"&nbsp;","")
Call ParseDocument(strContent, strFile1)
 
Function sendHttpRequest(strURL, strLocalFile)
 
	' Declare our vars
	Dim objWinHttp, strStatus
	
	' Request URL from 1st Command Line Argument.  This is
	'strURL = WScript.Arguments(0)
	Set objWinHttp = CreateObject("WinHttp.WinHttpRequest.5.1")
	objWinHttp.Open "GET", strURL
	objWinHttp.Send
	
	'THIS DOESN'T SEEM TO WORK
	strStatus = objWinHttp.StatusText
	'WScript.Echo "Status: " & strStatus
	' Get the Status and compare it to the expected 200
	'If objWinHttp.Status <> 200 Then
	 '     ' If it's not 200 we throw an error
	 '     Err.Raise 1, "HttpRequester", "Invalid HTTP Response Code"
	'End If
	
	'WScript.Echo "Response: " & objWinHttp.ResponseText
	sendHttpRequest = objWinHttp.ResponseText
	
	Set objWinHttp = Nothing
 
End Function
 
Function ParseDocument(strContent, strLocalFile)
	Dim objDom, objDomOut, loadStatus, value, xmlParseErr, strRefresh
	Dim objFSO
	
	Set objDOM = CreateObject("MSXML2.DOMDocument.4.0")
	Set objDomOut = CreateObject("MSXML2.DOMDocument.4.0")
	Set objFSO = CreateObject("Scripting.FileSystemObject")
	
	objDom.validateOnParse = False
	objDom.async = False
	loadStatus = objDOM.LoadXML(strContent)
	
	Set xmlParseErr = objDOM.parseError
	If xmlParseErr.errorCode <> 0 Then
		WScript.Echo("Invalid XML content: " & vbCrLf & vbCrLf & vbTab & "Line: [" & xmlParseErr.Line & "] " _
					& xmlParseErr.reason & vbCrLf & "Found: " & xmlParseErr.srcText)
		WScript.Quit(0)
	End If
 
	'WScript.Echo "Lindbergh Terminal: " _
	'			& objDOM.nodeFromID("Refreshparkingdata1_lblLindberghStatus").Text _
	'			& " " & objDOM.nodeFromID("Refreshparkingdata1_lblAvailableLindbergh").Text
	'WScript.Echo "Humphrey Terminal: " _
	'			& objDOM.nodeFromID("Refreshparkingdata1_lblHumphreyStatus").Text _
	'			& " " & objDOM.nodeFromID("Refreshparkingdata1_lblAvailableHumphrey").Text
	
	' Check if the Output file already exists, if it does, just append the loaded elements
	' otherwise create the root element and append the terminal elements
	If objFSO.FileExists(strLocalFile) Then
		objDomOut.validateOnParse = False
		objDomOut.async = False
		loadStatus = objDomOut.Load(strLocalFile)
	Else
		objDomOut.AppendChild( objDomOut.CreateElement("Terminals") )
	End if
	
	strRefresh = getSafeElement( objDOM, "Refreshparkingdata1_Label1" )
	
	objDomOut.documentElement.AppendChild( AppendTerminal(objDomOut, "Lindbergh", getSafeElement( objDOM, "Refreshparkingdata1_lblLindberghStatus" ), getSafeElement( objDOM, "Refreshparkingdata1_lblAvailableLindbergh" ), strRefresh))
	objDomOut.documentElement.AppendChild( AppendTerminal(objDomOut, "Humphrey", getSafeElement( objDOM, "Refreshparkingdata1_lblHumphreyStatus" ), getSafeElement( objDOM, "Refreshparkingdata1_lblAvailableHumphrey" ), strRefresh))
	
	objDomOut.Save (strLocalFile)
End Function
 
Function getSafeElement(objDomDoc, strID)
	Dim elem
	Set elem = objDomDoc.nodeFromID(strID)
	If Not elem is Nothing Then
		getSafeElement = elem.Text
	Else
		getSafeElement = ""
	End If
End Function
 
Function AppendTerminal(objOutPutDOM, strName, strPercent, strDescription, strRefreshDesc)
	Dim elem, child
	Set elem = objOutPutDOM.CreateElement("Terminal")
	Set child = objOutPutDOM.CreateElement("Name")
	child.Text = strName
	elem.appendChild(child)
	Set child = objOutPutDOM.CreateElement("Percent")
	child.Text = strPercent
	elem.appendChild(child)
	Set child = objOutPutDOM.CreateElement("Description")
	child.Text = strDescription
	elem.appendChild(child)
	Set child = objOutPutDOM.CreateElement("RefreshDesc")
	child.Text = strRefreshDesc
	elem.appendChild(child)
	Set AppendTerminal = elem
End Function

Open in new window

xhtml1-transitional.dtd.txt
When I run the scrip tit says "Invalid XML content:  Line: [2] Declaration has an invalid name." "Found: <!DOCTYPEhtmlPUBLIC"-//W3C//DTDXHTML1.0Transitional//EN""file://C:\xhtml1-transitional.dtd">"

Thanks
Can you check the line in the code for me...
Over write line 9 with this:
strContent = replace(strContent,"<!DOCTYPE html PUBLIC ""-//W3C//DTD XHTML 1.0 Transitional//EN"" ""http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"">","<!DOCTYPE html PUBLIC ""-//W3C//DTD XHTML 1.0 Transitional//EN"" ""file://C:\xhtml1-transitional.dtd"">")
Also, please check that the dtd is appropriatly named.
Gives me an "expected end of statement" error for line 9.
ASKER CERTIFIED SOLUTION
Avatar of rejoinder
rejoinder
Flag of Canada image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
One more thing - if you are running the script on Vista, please use a different folder to save to and do not try and save to the root of C.  This will require you to go in and edit the script a little to point to another location.
Thank you!  Some of the formatting may have been lost in the previous post.  The vbs file you sent me works great and updates the xml file properly.  Thanks again
I'm glad that worked :-)