Solved

XML Exception: Invalid Character(s)

Posted on 2009-05-12
Last Modified: 2013-11-11
I am working on a small project that receives XML data in string form from a long-running application. I am trying to load this string into an XDocument (System.Xml.Linq.XDocument), and from there do some XML magic and create an xlsx file for a report on the data.

On occasion, I receive data that contains invalid XML characters, and when I try to parse the string into an XDocument, I get this error:

[System.Xml.XmlException]
Message: '?', hexadecimal value 0x1C, is an invalid character.

Since I have no control over the remote application, ANY kind of character can show up in the data.

I am well aware that XML has a way to represent characters as numeric references, such as &#x1C; or something like that.

If at all possible, I would SERIOUSLY like to keep ALL the data. If not, then so be it.


---

I have thought about editing the response string programmatically and then re-parsing it whenever an exception is thrown, but I have tried a few methods and none of them seem to work.

Thank you for your thoughts.
string response; // XML string received from the server
XDocument doc = null;

...

try
{
    // The using block disposes the reader even if parsing throws
    using (TextReader tr = new StringReader(response))
    {
        doc = XDocument.Load(tr);
    }
}
catch (XmlException e)
{
    // handle here?
}

Question by:Meiscooldude
4 Comments

Accepted Solution

by:
ViceroyFizzlebottom earned 500 total points
ID: 24367911
Here is something I did a while ago when faced with the same issue. Basically, read the data in as plain text, clean it up however you need, then load the result into your XML document.
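using (StreamReader reader = _xmlCatalogFile.OpenText())
{
    // Read the whole file as plain text; the using block closes the reader
    string strRawData = reader.ReadToEnd();

    // Escape any '&' that does not start an entity such as &amp; or &#38;
    // (hex references such as &#x1C; are not recognised by this pattern)
    Regex badAmpersand = new Regex("&(?![a-zA-Z]{2,6};|#[0-9]{2,4};)");
    const string goodAmpersand = "&amp;";
    strRawData = badAmpersand.Replace(strRawData, goodAmpersand);

    _xmlDocument.LoadXml(strRawData);
}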

Expert Comment

by:ViceroyFizzlebottom
ID: 24367916
The regular expression above only escapes stray '&' characters, but the same general idea can be used for anything.
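
As a rough illustration of applying the same idea to control characters, here is a minimal sketch that strips the characters XML 1.0 forbids before parsing. The class and method names (XmlCleaner, StripInvalidChars) are made up for this example and are not part of the original answer. It discards data rather than preserving it, and it does not look for unpaired surrogates.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Xml.Linq;

// Hypothetical helper, not from the original answer.
public static class XmlCleaner
{
    // Characters below 0x20 other than tab, line feed and carriage return,
    // plus U+FFFE and U+FFFF, are not allowed in XML 1.0 documents.
    private static readonly Regex InvalidChars =
        new Regex(@"[\u0000-\u0008\u000B\u000C\u000E-\u001F\uFFFE\uFFFF]");

    // Returns the input with every forbidden character removed.
    public static string StripInvalidChars(string raw)
    {
        return InvalidChars.Replace(raw, string.Empty);
    }
}

public static class StripDemo
{
    public static void Main()
    {
        // 0x1C is the character from the error message in the question.
        string response = "<r>a\u001Cb</r>";
        XDocument doc = XDocument.Parse(XmlCleaner.StripInvalidChars(response));
        Console.WriteLine(doc.Root.Value); // prints "ab"
    }
}
```

The trade-off is that the offending characters are lost, so this only fits cases where dropping them is acceptable.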

Author Comment

by:Meiscooldude
ID: 24368086
Thank you for the quick reply.

From what I can see, that will only replace an ampersand if it is in front of a hex char or something like 'gt;'.

I am looking for a way to replace ALL invalid characters, such as 'G', with their corresponding &#hexvalue; reference, or to simply remove them altogether (preferably keeping them).
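
One possible way to keep the data is sketched below, under some assumptions. XML 1.0 does not allow a character such as 0x1C even when written as a numeric reference, so the default reader rejects &#x1C; too. The sketch therefore rewrites each forbidden character as a reference and parses with XmlReaderSettings.CheckCharacters set to false, which relaxes that check. XmlEscaper, EscapeInvalidChars and LoadLenient are hypothetical names. The code assumes the bad characters occur only in text or attribute values (not in tag names), and it replaces unpaired surrogates with U+FFFD instead of preserving them.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper, not from the thread.
public static class XmlEscaper
{
    // True for single UTF-16 code units that XML 1.0 allows directly.
    private static bool IsValidXmlChar(char c)
    {
        return c == '\t' || c == '\n' || c == '\r'
            || (c >= '\u0020' && c <= '\uD7FF')
            || (c >= '\uE000' && c <= '\uFFFD');
    }

    // Rewrites forbidden characters as numeric references such as &#x1C;.
    public static string EscapeInvalidChars(string raw)
    {
        var sb = new StringBuilder(raw.Length);
        for (int i = 0; i < raw.Length; i++)
        {
            char c = raw[i];
            if (char.IsHighSurrogate(c) && i + 1 < raw.Length
                && char.IsLowSurrogate(raw[i + 1]))
            {
                // A well-formed surrogate pair is copied through unchanged.
                sb.Append(c).Append(raw[++i]);
            }
            else if (char.IsSurrogate(c))
            {
                // Unpaired surrogate: substitute the replacement character.
                sb.Append('\uFFFD');
            }
            else if (IsValidXmlChar(c))
            {
                sb.Append(c);
            }
            else
            {
                sb.Append("&#x").Append(((int)c).ToString("X")).Append(';');
            }
        }
        return sb.ToString();
    }

    // Parses the escaped text with character checking turned off.
    public static XDocument LoadLenient(string raw)
    {
        var settings = new XmlReaderSettings { CheckCharacters = false };
        using (var text = new StringReader(EscapeInvalidChars(raw)))
        using (var reader = XmlReader.Create(text, settings))
        {
            return XDocument.Load(reader);
        }
    }
}

public static class EscapeDemo
{
    public static void Main()
    {
        XDocument doc = XmlEscaper.LoadLenient("<r>a\u001Cb</r>");
        // The intent is that the 0x1C character is still present in the value.
        Console.WriteLine(doc.Root.Value.Length);
    }
}
```

Writing such a document back out may run into the same check on the writer side (XmlWriterSettings.CheckCharacters), so that step is worth testing separately.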

Author Closing Comment

by:Meiscooldude
ID: 31580658
I used a method like this. Thank you very much.