quest_capital

Parse XML with xmldom

I'm using xmldom to parse an XML page.
These are the first few lines of the XML doc.

xml doc:
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN" "http://my.netscape.com/publish/formats/rss-0.91.dtd">
<rss version="0.91">

My problem is that when I load the XML doc with the <!DOCTYPE> line present, as shown above, I get an error (msxml3.dll error '8000000a') and I can't read the rest of the document.

However, if I take the <!DOCTYPE> line out, everything works fine.

Is there a way to delete this line before I parse the information?
I would prefer not to replace the line, because the line might change.
chipple

You're using ASP?
I would use a regular expression to remove the DOCTYPE.

Dim xml
' Your XML text is in this variable

Dim rx
Set rx = New RegExp
rx.Pattern = "<!DOCTYPE[^>]*>"
xml = rx.Replace(xml,"")
Set rx = Nothing

' DOCTYPE line has been removed

(Code provided not tested.)

Good luck!
If you also want to get rid of the line break after the doctype line, use this pattern instead:
rx.Pattern = "<!DOCTYPE[^>]*>[" & vbCrLf & "]+"
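The pattern above could be wrapped in a small function so it can be reused wherever the feed text is handled. This is only a sketch: `StripDoctype` is a made-up name, not part of any library, and it assumes the DOCTYPE has no internal subset (a `[ ... ]` block that itself contains `>`), which holds for the RSS 0.91 declaration quoted in the question.

```vbscript
' Hypothetical helper (the name is illustrative): returns xmlText without
' its DOCTYPE declaration and without the whitespace that follows it.
Function StripDoctype(xmlText)
    Dim rx
    Set rx = New RegExp
    rx.IgnoreCase = True          ' also match <!doctype ...>
    rx.Global = False             ' a document has at most one DOCTYPE
    rx.Pattern = "<!DOCTYPE[^>]*>\s*"
    StripDoctype = rx.Replace(xmlText, "")
    Set rx = Nothing
End Function

Dim sample
sample = "<?xml version=""1.0""?>" & vbCrLf & _
         "<!DOCTYPE rss PUBLIC ""x"" ""y"">" & vbCrLf & _
         "<rss version=""0.91""></rss>"
' After this call only the XML declaration and the <rss> element remain.
sample = StripDoctype(sample)
```

Because the pattern does not depend on the exact text inside the declaration, it should keep working if the DOCTYPE line changes.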
quest_capital

ASKER

chipple

I tried the code and it did not work. I got a type mismatch. Here is the code:

<%
  Response.Buffer = True
  Dim objXMLHTTP, xml

  ' Create an xmlhttp object:
  Set xml = Server.CreateObject("Microsoft.XMLHTTP")
  ' Or, for version 3.0 of XMLHTTP, use:
  ' Set xml = Server.CreateObject("MSXML2.ServerXMLHTTP")

  ' Opens the connection to the remote server.
  xml.Open "GET", "http://www.feedroom.com/rssout/national_rss_9a50890a7af8c2150f8bb6af03d6264a8fa88134.xml", False
 ' xml.Open "GET", "http://www.questentertainmentgroup.com/uploads/moreover.xmlhouston_news.xml", False

  ' Actually Sends the request and returns the data:
  xml.Send
 
'Dim xml
' Your XML text is in this variable
Dim rx
Set rx = New RegExp
rx.Pattern = "<!DOCTYPE[^>]*>"
xml = rx.Replace(xml,"") '............................Type Mismatch Here................................
Set rx = Nothing
' DOCTYPE line has been removed

Set objFSO = Server.CreateObject ("Scripting.FileSystemObject")
Path = Server.MapPath("/uploads/")
FullPath = Path & "\Feed.xml"

Set objTF = objFSO.CreateTextFile(FullPath,True)
objTF.Write xml.ResponseText

  'Display the HTML both as HTML and as text
  Response.Write "<xmp>"
  Response.Write xml.responseText
 
  Set xml = Nothing
%>
The code above brings the XML down to the local server.
The code below parses the XML.

Maybe I have the code in the wrong place?

<%
Dim objXML, elements, i, url, Headline, Place, tTime, title
'Set objXML = Server.CreateObject("Msxml2.DomDocument")
Set objXML = Server.CreateObject("Microsoft.XMLDOM")
objXML.validateOnParse="false"
objXML.async="false"
objXML.Load (Server.MapPath("\uploads\Feed.xml"))
'objXML.Load (Server.MapPath("\uploads\moreover.xmlhouston_news.xml"))
'objXML.removeChild(pubDate)

'4) Capture of number of links
IF Request.QueryString("num")= "" Then
NumberOfLinks = 2
Else
NumberOfLinks = 15
End IF

'if there are no errors
If objXML.parseError.errorcode = 0 Then

'Set Root = objXML.documentElement
Set elements = objXML.getElementsByTagName("item")  
'Set elements = objXML.getElementsByTagName("article")

For i = 1 to NumberOfLinks
'for each elem in elements
'For i = 0 to (elements.length - 25)
'For i = 0 to (elements.+4)

title = elements.item(i).childNodes(0).Text
'Headline = elements.item(i).childNodes(1).Text
'Place = elements.item(i).childNodes(2).Text
'tTime = elements.item(i).childNodes(7).Text
Response.Write("" & title & "")
'Response.Write("" & url & "")
'Response.Write("<a href=" & URL & " target=_blank ><strong>"& Headline &"</strong></a>"&"<br>")
'Response.Write(Place & "  "& tTime & "<br>"& "<br>")
Next
Set objXML = Nothing

Else
Response.Write ("There was an error")
End If
%>
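As a side note on the parsing script above, here is a hedged variant of the same loop. It sets `async` and `validateOnParse` with real Booleans, also sets `resolveExternals` to False so the parser does not try to fetch the external DTD, keeps the loop inside the bounds of the node list, and reports `parseError` details. Whether these settings alone avoid the '8000000a' error on this server is an assumption; the variable names are illustrative.

```vbscript
<%
Dim doc, items, i, maxLinks, titleNode

Set doc = Server.CreateObject("Msxml2.DOMDocument")
doc.async = False              ' Boolean, not the string "false"
doc.validateOnParse = False    ' do not validate against the DTD
doc.resolveExternals = False   ' do not fetch the external DTD

If doc.Load(Server.MapPath("/uploads/Feed.xml")) Then
    Set items = doc.getElementsByTagName("item")

    ' Same rule as the original script: 2 links by default, 15 if "num" is set.
    maxLinks = 2
    If Request.QueryString("num") <> "" Then maxLinks = 15
    If maxLinks > items.length Then maxLinks = items.length

    For i = 0 To maxLinks - 1  ' node lists are zero-based
        Set titleNode = items.item(i).selectSingleNode("title")
        If Not titleNode Is Nothing Then
            Response.Write Server.HTMLEncode(titleNode.text) & "<br>"
        End If
    Next
Else
    ' Show what the parser objected to instead of a generic message.
    Response.Write "Parse error on line " & doc.parseError.line & ": " & _
                   Server.HTMLEncode(doc.parseError.reason)
End If
Set doc = Nothing
%>
```

Looking up `title` by name rather than by `childNodes(0)` avoids depending on the order of elements inside each `item`.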
Try this. The Replace method needs text but you were giving it your XMLHTTP object.

<%
  Response.Buffer = True
  Dim objXMLHTTP, xml

  ' Create an xmlhttp object:
  Set xml = Server.CreateObject("Microsoft.XMLHTTP")
  ' Or, for version 3.0 of XMLHTTP, use:
  ' Set xml = Server.CreateObject("MSXML2.ServerXMLHTTP")

  ' Opens the connection to the remote server.
  xml.Open "GET", "http://www.feedroom.com/rssout/national_rss_9a50890a7af8c2150f8bb6af03d6264a8fa88134.xml", False
 ' xml.Open "GET", "http://www.questentertainmentgroup.com/uploads/moreover.xmlhouston_news.xml", False

  ' Actually Sends the request and returns the data:
  xml.Send
 
'Dim xml
' Your XML text is in this variable
Dim rx
Set rx = New RegExp
rx.Pattern = "<!DOCTYPE[^>]*>"
xml = rx.Replace(xml.xml,"")
Set rx = Nothing
' DOCTYPE line has been removed

Set objFSO = Server.CreateObject ("Scripting.FileSystemObject")
Path = Server.MapPath("/uploads/")
FullPath = Path & "\Feed.xml"

Set objTF = objFSO.CreateTextFile(FullPath,True)
objTF.Write xml.ResponseText

  'Display the HTML both as HTML and as text
  Response.Write "<xmp>"
  Response.Write xml.responseText
 
  Set xml = Nothing
%>
chipple

I got an "Object doesn't support this property or method: 'xml'" error
on the same line.
Sorry about that, please try xml.responseText instead of xml.xml. I was confused with something else.

<%
  Response.Buffer = True
  Dim objXMLHTTP, xml

  ' Create an xmlhttp object:
  Set xml = Server.CreateObject("Microsoft.XMLHTTP")
  ' Or, for version 3.0 of XMLHTTP, use:
  ' Set xml = Server.CreateObject("MSXML2.ServerXMLHTTP")

  ' Opens the connection to the remote server.
  xml.Open "GET", "http://www.feedroom.com/rssout/national_rss_9a50890a7af8c2150f8bb6af03d6264a8fa88134.xml", False
 ' xml.Open "GET", "http://www.questentertainmentgroup.com/uploads/moreover.xmlhouston_news.xml", False

  ' Actually Sends the request and returns the data:
  xml.Send
 
'Dim xml
' Your XML text is in this variable
Dim rx
Set rx = New RegExp
rx.Pattern = "<!DOCTYPE[^>]*>"
xml = rx.Replace(xml.responseText,"")
Set rx = Nothing
' DOCTYPE line has been removed

Set objFSO = Server.CreateObject ("Scripting.FileSystemObject")
Path = Server.MapPath("/uploads/")
FullPath = Path & "\Feed.xml"

Set objTF = objFSO.CreateTextFile(FullPath,True)
objTF.Write xml.ResponseText

  'Display the HTML both as HTML and as text
  Response.Write "<xmp>"
  Response.Write xml.responseText
 
  Set xml = Nothing
%>
Well, I would love to try the code, but the server won't update to the newest page. That happens from time to time; I think I will add another post for it.
chipple

I get a blank XML document and the error below:

Microsoft VBScript runtime error '800a01a8'
Object required: 'objTF'
/uploads/maptest2.asp, line 35

on this code:
Set objTF = objFSO.CreateTextFile(FullPath,True)
objTF.Write xml.ResponseText
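
One thing that stands out in the script being tested is that `xml` is reassigned to the string returned by `rx.Replace`, so the later `xml.ResponseText` calls no longer have an XMLHTTP object to work on. A minimal sketch that keeps the request object and the cleaned text in separate variables might look like this. The names `http` and `xmlText` are illustrative, and this is an assumption about the cause rather than a confirmed fix.

```vbscript
<%
Response.Buffer = True
Dim http, xmlText, rx, objFSO, objTF, fullPath

Set http = Server.CreateObject("MSXML2.ServerXMLHTTP")
http.Open "GET", "http://www.feedroom.com/rssout/national_rss_9a50890a7af8c2150f8bb6af03d6264a8fa88134.xml", False
http.Send

If http.status = 200 Then
    ' Copy the response into a plain string and strip the DOCTYPE from that.
    Set rx = New RegExp
    rx.IgnoreCase = True
    rx.Pattern = "<!DOCTYPE[^>]*>\s*"
    xmlText = rx.Replace(http.responseText, "")
    Set rx = Nothing

    ' Save the cleaned text, not http.responseText.
    Set objFSO = Server.CreateObject("Scripting.FileSystemObject")
    fullPath = Server.MapPath("/uploads/Feed.xml")
    Set objTF = objFSO.CreateTextFile(fullPath, True)
    objTF.Write xmlText
    objTF.Close
    Set objTF = Nothing
    Set objFSO = Nothing

    ' Show the cleaned XML as text in the browser.
    Response.Write "<pre>" & Server.HTMLEncode(xmlText) & "</pre>"
Else
    Response.Write "Request failed with HTTP status " & http.status
End If
Set http = Nothing
%>
```

Writing `xmlText` to the file matters as much as the variable names: writing `responseText` would put the DOCTYPE line straight back into Feed.xml.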

ASKER CERTIFIED SOLUTION
chipple

This solution is only available to members.
Thanks, chipple.

That worked out great. Please check for my next post.
I want to automatically download this xml page every hour.
https://www.experts-exchange.com/questions/21121423/Automatically-download-xml-page-every-hour.html