German "Umlaute" in MSXML

If I load the XML from http://www.consolewars.de/headlines.rss into the MSXML DOM, I get an "invalid character" error for this line/item:

<title>CapcomStore eröffnet am Donnerstag</title>

The problem is the German umlaut ö within the title. How can I prevent the MSXML parser from throwing this error, or what can I do so that the parser loads the XML correctly?
Asked by Sven, Tech Lead Web-Development

Sven, Tech Lead Web-Development (author), commented:
Hi Olaf, the only way I have got it to work so far is with the code below. I still have to check whether all characters are correct when reading the items, but for the moment I have the XML in the MSXML DOMDocument and can parse it.
<%
Dim objXMLHTTP, objXML
 
Set objXMLHTTP = Server.CreateObject("MSXML2.ServerXMLHTTP.3.0")
 
objXMLHTTP.open "GET", "http://www.consolewars.de/headlines.rss", True
objXMLHTTP.send 
objXMLHTTP.waitForResponse 3
 
Set objXML = Server.CreateObject("MSXML2.DOMDocument.6.0")
 
objXML.async = False
objXML.resolveExternals = False
objXML.validateOnParse = False
 
If objXML.load(objXMLHTTP.responseStream) Then
	
	If objXML.validate Then
 
		Response.ContentType = "text/xml"
		Response.Write objXML.xml
	
	Else
 
		Response.Write objXML.parseError.reason & "[Line: " & objXML.parseError.line & ", Pos: " & objXML.parseError.linepos & "]"
	
	End If	
		
 
Else
 
	Response.Write objXML.parseError.reason & "[Line: " & objXML.parseError.line & ", Pos: " & objXML.parseError.linepos & "]"
	
End If
 
Set objXML = Nothing
Set objXMLHTTP = Nothing
%>

 
Olaf Doschke, Software Developer, commented:
A codepage may be declared at the beginning of the XML. It should be a codepage that contains German characters. I don't know which parser you use; it might take UTF-8 as the default, in which case the ö character would be invalid.
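
To illustrate the byte-level point, here is a minimal sketch in Python (used only because it is easy to run; it says nothing about MSXML internals). It assumes the title is encoded as ISO-8859-1, where ö is the single byte 0xF6, and checks whether those bytes are also valid UTF-8. The helper name `decodes_as_utf8` is made up for this example.

```python
# The title from the feed; \u00f6 is the umlaut "o" (written as an escape
# so the example does not depend on the source file's own encoding).
title = "CapcomStore er\u00f6ffnet am Donnerstag"

# Assumption: the feed bytes are ISO-8859-1 (Latin-1).
latin1_bytes = title.encode("iso-8859-1")


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if the bytes form valid UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# In Latin-1 the umlaut is the single byte 0xF6, which cannot start a UTF-8 sequence.
has_f6 = b"\xf6" in latin1_bytes
utf8_ok = decodes_as_utf8(latin1_bytes)

# Decoding with the matching codepage restores the original text.
roundtrip = latin1_bytes.decode("iso-8859-1")
```

A reader that assumes UTF-8 would reject these bytes at the ö, while one that uses the matching codepage gets the title back unchanged.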
 
Sven, Tech Lead Web-Development (author), commented:
Here is my code so far. XMLRequest is a function of my own that works without error. It only loads the XML source via MSXML2.ServerXMLHTTP.
Set objXML = CreateObject("MSXML2.DOMDocument.6.0")
 
With objXML
 
        .async = False
        .resolveExternals = False
        .validateOnParse = False
        .setProperty "ProhibitDTD", False	
         
         strXML = XMLRequest("http://www.consolewars.de/headlines.rss")
   
        .loadXML strXML 
 
End With
 
Set objXML = Nothing

Sven, Tech Lead Web-Development (author), commented:
@Olaf: As stated, I use the MSXML parser (MSXML2.DOMDocument.6.0). The codepage within Windows is UTF-16, I think.
 
Olaf Doschke, Software Developer, commented:
Sorry, you already said you are using the MSXML DOM. I see the XML is declared with encoding="ISO-8859-1" (Latin-1), which is sufficient for German characters and those of several other languages. The MSXML parser does not seem to respect that.
Bye, Olaf.
 
 
Olaf Doschke, Software Developer, commented:
I think the "error" is that loading the XML into a variable with strXML = XMLRequest("http://www.consolewars.de/headlines.rss") turns it into a UTF-16 string, so loadXML will not accept the now invalid ö.
See here: http://msdn.microsoft.com/en-us/library/aa468560.aspx
Quote:
The bottom line is that you cannot switch between a multibyte character set like UTF-8, Shift-JIS, or Windows-1250 and Unicode character encodings such as UTF-16, UCS-2, or UCS-4 using the encoding attribute on an XML declaration, because the declaration itself has to use the same number of bytes per character as the rest of the document.
So, as strXML is UTF-16, the parser cannot switch to ISO-8859-1. Loading the XML into the DOM object directly might help.
Bye, Olaf.
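
The general idea of handing the parser the raw bytes, so that it can read the encoding declaration itself, can be sketched as follows. This is Python's `xml.etree.ElementTree`, not MSXML, so it only illustrates the principle; the sample document and the helper `try_parse` are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-element document with the title from the feed,
# encoded as ISO-8859-1 (the umlaut becomes the single byte 0xF6).
body = "<title>CapcomStore er\u00f6ffnet am Donnerstag</title>".encode("iso-8859-1")

# The same bytes, preceded by an XML declaration that names the encoding.
declared = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + body


def try_parse(data: bytes):
    """Parse raw bytes; return the root element's text, or None on a parse error."""
    try:
        return ET.fromstring(data).text
    except ET.ParseError:
        return None


# With the declaration, the parser decodes the bytes as Latin-1.
with_decl = try_parse(declared)

# Without it, the parser falls back to UTF-8 and rejects the 0xF6 byte.
without_decl = try_parse(body)
```

Here `with_decl` holds the title with the ö intact, while `without_decl` is None because the undeclared Latin-1 bytes are not well-formed UTF-8.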
 
 
Olaf Doschke, Software Developer, commented:
that is of course...
...
.loadXML XMLRequest("http://www.consolewars.de/headlines.rss")
...

 
Olaf Doschke, Software Developer, commented:
Or how about:
' Object references need Set in VBScript
Set objXML = CreateObject("MSXML2.DOMDocument.6.0")
objXML.load "http://www.consolewars.de/headlines.rss"
strXML = objXML.xml

 
Sven, Tech Lead Web-Development (author), commented:
Hmm... I already tried that without success (code below). The problem is that the resource does not seem to be loaded, yet the parser returns True from the load method!?
Dim objXML
 
Set objXML = CreateObject("MSXML2.DOMDocument.6.0")
 
If objXML.load("http://www.consolewars.de/headlines.rss") Then
 
	Response.ContentType = "text/xml"
	Response.Write objXML.xml
 
Else
 
	Response.Write objXML.parseError.reason & "[Line: " & objXML.parseError.line & ", Pos: " & objXML.parseError.linepos & "]"
	
End If
 
Set objXML = Nothing

 
Olaf Doschke, Software Developer, commented:
So you are saying this runs the If branch, because load returns True, but you don't get anything out of objXML.xml? You should again set objXML.async = False before load, as in your first sample. Otherwise you write objXML.xml before anything is loaded.

Bye, Olaf.
 
Olaf Doschke, Software Developer, commented:
Hm, as you set validateOnParse = False, checking objXML.validate seems rather useless; it would always be true, wouldn't it? Anyway, glad you got it working.

Bye, Olaf.
Question has a verified solution.
