nealg2
asked on
Using XMLHTTP to retrieve XML document and parsing the result
I am trying to remotely retrieve an XML document using XMLHTTP (in ASP), and then parse out a particular tag. The problem I am running into is that Internet Explorer seems to want to parse the XML document on its own.
For example, the following code:
Set HttpReq = Server.CreateObject("Microsoft.XMLHTTP")
url = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=PubMed&report=medline&mode=xml&id=15088295"
HttpReq.Open "GET", url, False, username, password
HttpReq.Send
buffer = HttpReq.responseText
Set HttpReq = Nothing
I then simply want to print out what is in the buffer:
Response.Write buffer
However, Internet Explorer interprets it as an XML document and processes it on its own. Is there any way to have the browser just treat it as a regular HTML document?
The reason I want to do it this way is that I get an error message back from Internet Explorer: "An invalid character was found in text content. Error processing resource "
This is interesting because typing the URL above directly into the browser displays the document fine, but the code above generates this error message. (It trips specifically on the Affiliation tag, when it comes to the letter 'e' with the accent over it.)
Any idea what is going on? Or how to instruct Internet Explorer to handle the data as pure text, not XML??
Thanks very much.
ASKER
The question that still stands is how to get Internet Explorer, or any other browser, to treat the data I print out as an HTML document and not an XML document. I tried setting the ContentType but that did not seem to work.
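One way to have a browser show fetched markup as literal text inside an HTML page is to HTML-encode it before writing it out. A minimal sketch, assuming buffer still holds the string read from HttpReq.responseText in the code above; the pre wrapper is only there for readability:

```vbscript
' Sketch only: assumes buffer was filled from HttpReq.responseText earlier.
Response.ContentType = "text/html"

' Server.HTMLEncode escapes characters such as < > and & as entities,
' so the browser displays the tags as text instead of parsing them.
Response.Write "<pre>" & Server.HTMLEncode(buffer) & "</pre>"
```

This only changes how the output is displayed. If the accented character was already decoded incorrectly when the response was turned into a string, encoding the string for HTML will not repair it.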
ASKER
Thanks Mike.
Some followup questions:
After I issue the command "HttpReq.responseXML.save Response" then what happens to the XML data I retrieved? Because the next step in the code that I did not show was:
'Microsoft XML parser
dim objXML
Set objXML = Server.CreateObject("Microsoft.XMLDOM")
objXML.async = False
objXML.validateOnParse = False
objXML.LoadXML(****what goes here now??****)
And from there I am going to parse out just the fields I want using the command "objXML.documentElement.selectSingleNode(tag).text", and save that into a string variable.
I also tried using the command Response.ContentType = "text/xml", but that seems to mess up the HTML document that is printed out. Internet Explorer thinks that I want to output an XML document, when in fact I am outputting an HTML document from the ASP (with the data stored in the selected XML tags as part of the HTML).
I hope this makes sense.
ASKER
Once again I posted too hastily. Turns out the problem I was running into was the MSXML component did not work on my localhost, but worked once uploaded to my host provider.
Use
objXML.Load HttpReq.responseBody
Avoid using strings.
But to reiterate, do not use Microsoft.XMLDOM as your progID; it is not server-safe. Use an MSXML 3 or 4 progID instead.
Regards,
Mike Sharp
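Put together, the advice above might look like the following sketch. The version-specific progIDs (MSXML2.ServerXMLHTTP.3.0 and MSXML2.DOMDocument.3.0) are an assumption about what "MSXML 3" would mean in practice, the "//Affiliation" XPath is a guess based on the tag named in the question, and the username and password arguments from the original code are left out:

```vbscript
' Sketch only: the progIDs and the XPath are assumptions, not from the thread.
Dim HttpReq, objXML, node, url
url = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=PubMed&report=medline&mode=xml&id=15088295"

' Fetch the document synchronously with the server-side HTTP component.
Set HttpReq = Server.CreateObject("MSXML2.ServerXMLHTTP.3.0")
HttpReq.open "GET", url, False
HttpReq.send

Set objXML = Server.CreateObject("MSXML2.DOMDocument.3.0")
objXML.async = False
objXML.validateOnParse = False
objXML.setProperty "SelectionLanguage", "XPath"

' Load the raw bytes so the parser applies the encoding declared in the
' XML itself, instead of working from an already-decoded string.
If objXML.load(HttpReq.responseBody) Then
    Set node = objXML.selectSingleNode("//Affiliation")
    If Not node Is Nothing Then
        ' Encode the extracted text before placing it in the HTML page.
        Response.Write Server.HTMLEncode(node.text)
    End If
Else
    Response.Write "Parse error: " & Server.HTMLEncode(objXML.parseError.reason)
End If

Set objXML = Nothing
Set HttpReq = Nothing
```

Because only the selected text is written out, the page stays an ordinary HTML response and Response.ContentType does not need to be set to "text/xml".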
ASKER
When doing the parse with XMLDOM, I set the validateOnParse parameter to false, and then it doesn't complain about invalid characters.
This seems to be a work-around for my original problem, so I would still appreciate any suggestions on a better way to do this.