Solved

XML Trouble - An invalid character was found in text content. Error processing resource

Posted on 2004-09-12
Last Modified: 2013-12-03
The actual XML file
http://www.thevfamily.com/news/yh_xml/iraq.xml

ASP Link
http://www.thevfamily.com/news/?cat=iraq

I am just toying with XML and RSS feeds.  Most of this ASP comes from a Moreover RSS example I found a while back.  It works fine for the Moreover feeds, but I now want to make it work with Yahoo.

The ASP link above works for every category EXCEPT Iraq.  The XML is fetched and stored locally like the rest, but I get a blank screen when I try to view the news via the ASP page, and an error if I try to view the XML directly:

-----XML ERROR-----
The XML page cannot be displayed
Cannot view XML input using XSL style sheet. Please correct the error and then click the Refresh button, or try again later.
An invalid character was found in text content. Error processing resource 'http://www.thevfamily.com/news/yh_xml/iraq.xml'....
<description>OneWorld.net - BAGHDAD, Sep 7 (IPS)
-----XML ERROR-----

If I open iraq.xml in txtpad, immediately following
<description>OneWorld.net - BAGHDAD, Sep 7 (IPS)
is a space, then a -, then another space = " - "

There are other -'s all through the other XML files.  If I manually delete the -, the Iraq news displays.

What on earth could be wrong with that ONE character???

Here is the code for the ASP and the XSL

-----ASP-----
<% Option Explicit
Dim sXMLDataDir, nHoursToRefresh, sCategory, pgStartTime
pgStartTime = Timer()
sXMLDataDir = Server.MapPath(".") & "/yh_xml/"
nHoursToRefresh = 1
sCategory = "topstories"

Function GetMoreoverXML(ByVal sCategory, ByVal sXMLDataFile)
      Dim sChoice, sSource, objHTTP, sArticleList, sOut, oTS, oFSO, i
      sSource= "http://rss.news.yahoo.com/rss/" & sCategory
      Set objHTTP = Server.CreateObject("Microsoft.XMLHTTP")
      Set oFSO = Server.CreateObject("Scripting.FileSystemObject")
      objHTTP.Open "GET", sSource, False
      objHTTP.Send
      sArticleList = objHTTP.ResponseBody
      Set objHTTP = Nothing
      sOut = ""
      For i = 0 To UBound(sArticleList)
          sOut = sOut & ChrW(AscW(Chr(AscB(MidB(sArticleList, i + 1, 1)))))
      Next
      sOut = Replace (sOut, " encoding=""iso-8859-1"" ", "")
      sOut = Replace (sOut, "<!-- generated by static php_rss_category -->", "")
      sOut = Replace (sOut, "&quot;", """")
      sOut = Replace (sOut, "&amp;#36;", "$")
      sOut = Replace (sOut, "&amp;#151;", "-")
      sOut = Replace (sOut, "&amp;#8212;", "-")
      sOut = Replace (sOut, "&amp;#x2014;", "-")
      
      Set oTS = oFSO.CreateTextFile(sXMLDataFile, True)
      oTS.Write sOut
      oTS.Close
      Set oTS = Nothing
      Set oFSO = Nothing
      GetMoreoverXML = True
End Function

Function sShowNews(ByVal sXMLDataFile)
    Dim objXML, objXSL
      Set objXML = Server.CreateObject("Microsoft.XMLDOM")
      Set objXSL = Server.CreateObject("Microsoft.XMLDOM")
      objXSL.Async = False
      objXML.Load(sXMLDataFile)
      objXSL.Load(Server.MapPath(".") & "/yahoo.xsl")
      If (objXSL.ParseError.ErrorCode = 0) Then
            sShowNews = objXML.TransformNode(objXSL)
      Else
            sShowNews = "Error: " & objXSL.ParseError.Reason & "<br /> URL:" & objXSL.URL
      End If
      Set objXSL = Nothing
      Set objXML = Nothing
End Function

Dim objFSO, fNewsFile, dLastSync, bGotFile, sXMLFile
If Request.QueryString("cat") <> "" Then
    sCategory = Request.QueryString("cat")
End If
bGotFile = False
Set objFSO = Server.CreateObject("Scripting.FileSystemObject")
sXMLFile = sXMLDataDir & LCase(sCategory) & ".xml"
If Not objFSO.FileExists(sXMLFile) Then
    bGotFile = GetMoreoverXML(sCategory, sXMLFile)
    dLastSync = Now()
Else
    Set fNewsFile = objFSO.GetFile(sXMLFile)
    dLastSync = fNewsFile.DateLastModified
    Set fNewsFile = Nothing
    If DateDiff("h", dLastSync, Now()) >= nHoursToRefresh Then
        bGotFile = GetMoreoverXML(sCategory, sXMLFile)
        dLastSync = Now()
    Else
        bGotFile = True
    End If
End If
Set objFSO = Nothing %>

<table align="center" border="0" width="660" cellpadding="10">
<tr><td valign="top"><h2><%=UCase(sCategory) %></h2>
<% If bGotFile = True Then
    Response.Write "<p><small>Current: " & FormatDateTime(NOW(), vbLongDate) & " @ " & FormatDateTime(NOW(), vbLongTime) & "</small>" & vbNewLine
    Response.Write "<br><small>Cached: " & FormatDateTime(dLastSync, vbLongDate) & " @ " & FormatDateTime(dLastSync, vbLongTime) & "</small></p>" & vbNewLine
    Response.Write sShowNews(sXMLFile)
Else
    Response.Write "<p><strong>There was an error retrieving the news feed.</strong></p>"
End If %>
</td><td valign="top" width="100" bgcolor="#F7F7F7">
<font size="2"><b>NEWS VIEW</b><BR>
<a href="?cat=topstories">Top Stories</a><br>
<a href="?cat=world">World</a><br>
<a href="?cat=us">US</a><br>
<a href="?cat=politics">Politics</a><br>
<a href="?cat=mideast">Mid East</a><br>
<a href="?cat=iraq">Iraq</a><br>
<a href="?cat=sept11">Sept 11</a><br>
<a href="?cat=oped">Op Ed</a><br>
<p><font size="2"><b>XML VIEW</b><BR>
<a href="yh_xml/topstories.xml">Top Stories</a><br>
<a href="yh_xml/world.xml">World</a><br>
<a href="yh_xml/us.xml">US</a><br>
<a href="yh_xml/politics.xml">Politics</a><br>
<a href="yh_xml/mideast.xml">Mid East</a><br>
<a href="yh_xml/iraq.xml">Iraq</a><br>
<a href="yh_xml/sept11.xml">Sept 11</a><br>
<a href="yh_xml/oped.xml">Op Ed</a><br>
</font><%Response.Write "<p><font size=1 face=Verdana>Generated in: " &_
FormatNumber(Timer() - pgStartTime, 4) & "Sec</font><br>"%>
</td></tr></table>
-----ASP-----

-----XSL-----
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
<xsl:template match="/">
<xsl:for-each select="rss/channel/item">
<xsl:choose>
<xsl:when expr="childNumber(this) > 1000"></xsl:when>
<xsl:otherwise>
    <p><font face="Arial"><strong><xsl:value-of select="title"/> </strong>
      <br /><font color="#660000" size="1"><xsl:value-of select="pubDate"/></font></font><br />
    <font face=""><xsl:value-of select="description"/> ... </font>
      <a><xsl:attribute name="href"><xsl:value-of select="link"/></xsl:attribute>
      <xsl:attribute name="target">_blank</xsl:attribute>more</a><br /></p>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
-----XSL-----

I just don't understand what is wrong with that one character.

Ideas?
Question by:Mike_V

Expert Comment

by:drichards
ID: 12041153
If you look at the iraq.xml file as binary, there is a 0x96 byte at position 50 in line 1322, as reported.  That is not a valid UTF-8 character, or more accurately, not a valid first byte of a UTF-8 sequence (0x96 can only appear as a continuation byte).  UTF-8 is the default encoding for an XML file that does not declare one.
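
To see that line and position from the ASP page itself rather than from the browser, something like the following might help.  It is only a sketch: sDescribeParseError is a made-up helper name, and it assumes the same Microsoft.XMLDOM object the question already uses.  Note that sShowNews in the question only checks the stylesheet's ParseError, not the data file's, which may be why the page comes back blank instead of showing an error.

```vbscript
' Hypothetical helper (not part of the original page): load an XML file and
' return a description of the first parse error, or "" if it loaded cleanly.
Function sDescribeParseError(ByVal sXMLDataFile)
    Dim objXML
    Set objXML = Server.CreateObject("Microsoft.XMLDOM")
    objXML.Async = False              ' finish loading before inspecting the result
    objXML.Load sXMLDataFile
    If objXML.ParseError.ErrorCode <> 0 Then
        sDescribeParseError = "Line " & objXML.ParseError.Line & _
            ", position " & objXML.ParseError.LinePos & ": " & _
            Server.HTMLEncode(objXML.ParseError.Reason)
    Else
        sDescribeParseError = ""
    End If
    Set objXML = Nothing
End Function
```

It could be called as Response.Write sDescribeParseError(sXMLFile) just before the call to sShowNews.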

Expert Comment

by:dualsoul
ID: 12041508
That's right. You should change the encoding of the XML document or, if you can't, download it to the local filesystem and filter out all of these characters.
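
One way to read "change the encoding", as a sketch only: the Replace in the question that strips encoding="iso-8859-1" leaves the saved file with no declared encoding, so the parser falls back to UTF-8.  Relabelling the declaration instead of deleting it should keep the byte legal.  sRelabelEncoding is a hypothetical name, and this assumes the feed really declares iso-8859-1, that the parser accepts windows-1252 (where 0x96 is an en dash), and that the file is written in the server's ANSI code page as in the question.

```vbscript
' Hypothetical helper: keep an encoding declaration in the saved file instead
' of deleting it. Assumes the feed declares encoding="iso-8859-1".
Function sRelabelEncoding(ByVal sXml)
    ' Arguments: start at character 1, replace at most 1 match, ignore case.
    sRelabelEncoding = Replace(sXml, "encoding=""iso-8859-1""", _
        "encoding=""windows-1252""", 1, 1, vbTextCompare)
End Function
```

It would take the place of the Replace call that removes the declaration, e.g. sOut = sRelabelEncoding(sOut).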

Author Comment

by:Mike_V
ID: 12059266
drichards

Can you give me an idea of how to address it?  I have tried to use Replace() to take it out when I download the RSS to local files, but have not been able to do so.


Accepted Solution

by:drichards (earned 500 total points)
ID: 12066579
How did you try to replace it?  Should be something like:

    xmlString = Replace(xmlString, Chr(150), Chr(32))

This will replace the 0x96 with a space.  I think 'xmlString' will be 'sOut' for you.  You can add this Replace to the list of Replace calls you are already doing.
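
Since the feed text may contain other Windows-1252 punctuation besides 0x96, the same idea can be extended.  This is just a sketch: sAsciiPunctuation is a made-up name, and the Chr() codes assume the server's ANSI code page is Windows-1252.

```vbscript
' Hypothetical helper: swap common Windows-1252 punctuation for plain ASCII.
' Assumes Chr(128..255) maps through the Windows-1252 code page.
Function sAsciiPunctuation(ByVal s)
    s = Replace(s, Chr(133), "...")   ' 0x85 horizontal ellipsis
    s = Replace(s, Chr(145), "'")     ' 0x91 left single quote
    s = Replace(s, Chr(146), "'")     ' 0x92 right single quote
    s = Replace(s, Chr(147), """")    ' 0x93 left double quote
    s = Replace(s, Chr(148), """")    ' 0x94 right double quote
    s = Replace(s, Chr(150), "-")     ' 0x96 en dash (the byte found in iraq.xml)
    s = Replace(s, Chr(151), "-")     ' 0x97 em dash
    sAsciiPunctuation = s
End Function
```

A call such as sOut = sAsciiPunctuation(sOut) would go just before oTS.Write sOut.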


Author Comment

by:Mike_V
ID: 12067595
THANK YOU drichards


Expert Comment

by:hellomehta
ID: 21432906
The XML page cannot be displayed
Cannot view XML input using style sheet. Please correct the error and then click the Refresh button, or try again later.


--------------------------------------------------------------------------------

An invalid character was found in text content. Error processing resource 'https://claimsldap.aig.com/LdapAdminWeb/web/Logi...


