Meiscooldude
asked on
XML Exception: Invalid Character(s)
I am working on a small project that receives XML data in string form from a long-running application. I am trying to load this string into an XDocument (System.Xml.Linq.XDocument), do some XML magic from there, and create an xlsx file for a report on the data.
On occasion, I receive data that contains invalid XML characters, and when I try to parse the string into an XDocument, I get this error:
[System.Xml.XmlException]
Message: '?', hexadecimal value 0x1C, is an invalid character.
Since I have no control over the remote application, ANY kind of character could show up.
I am well aware that XML has a way to write characters as an escape, such as &#hexvalue; or something like that.
If at all possible, I would SERIOUSLY like to keep ALL the data. If not, then so be it.
---
I have thought about editing the response string programmatically and then re-parsing it whenever an exception is thrown, but I have tried a few methods and none of them were successful.
Thank you for your thoughts.
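using System.IO;
using System.Xml;
using System.Xml.Linq;

TextReader tr;
XDocument doc;
string response; // XML string received from the server.
...
tr = new StringReader(response);
try
{
    doc = XDocument.Load(tr);
}
catch (XmlException e)
{
    // Thrown when the string contains a character XML does not allow (e.g. 0x1C).
    // handle here?
}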
ASKER CERTIFIED SOLUTION
The regular expression above was only meant to format '&', but the general idea can be used for anything.
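For anyone reading without the regular expression in front of them, here is a minimal sketch of how the same general idea might be applied to invalid characters instead of '&'. This is not the expression from the solution. The class name XmlCharFilter and the method StripInvalidXmlChars are made up for illustration, and the character ranges are the ones XML 1.0 allows (tab, LF, CR, U+0020 to U+D7FF, U+E000 to U+FFFD, plus well-formed surrogate pairs). Note that this version drops the offending characters rather than keeping them.

```csharp
using System;
using System.Text.RegularExpressions;

public static class XmlCharFilter
{
    // Matches either a well-formed surrogate pair (which is kept) or any single
    // UTF-16 code unit outside the XML 1.0 character ranges (which is dropped).
    // The pair alternative comes first so valid pairs are never split.
    private static readonly Regex Candidate = new Regex(
        @"[\uD800-\uDBFF][\uDC00-\uDFFF]|[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD]",
        RegexOptions.Compiled);

    // Hypothetical helper: removes characters that XML 1.0 does not allow.
    public static string StripInvalidXmlChars(string input)
    {
        // A match of length 2 is a valid surrogate pair; anything else is invalid.
        return Candidate.Replace(input, m => m.Value.Length == 2 ? m.Value : string.Empty);
    }
}
```

It could be called on the response before the string is handed to XDocument.Load, for example `tr = new StringReader(XmlCharFilter.StripInvalidXmlChars(response));`.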
ASKER
Thank you for the quick reply.
From what I can see, that will only replace an ampersand if it is in front of a hex character code or something like 'gt;'.
I am looking for a way to replace ALL invalid characters, such as 'G', with their corresponding &#hexvalue, or to simply remove them altogether (preferably keeping them).
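A rough sketch of the "keep everything" variant, under a few assumptions. XML 1.0 does not actually allow a reference like &#x1C; either, so the reader has to be told not to check them: XmlReaderSettings.CheckCharacters = false turns off character checking for character references, while raw invalid characters in the text are still rejected. The idea below is therefore to rewrite each raw invalid character as a numeric reference first and then load with that setting. The names LenientXmlLoader, EscapeInvalidXmlChars and Load are made up for illustration. Only 0x1C from the error above has been considered here; other code points may behave differently.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

public static class LenientXmlLoader
{
    // Hypothetical helper: rewrites each raw character that XML 1.0 forbids
    // as a numeric character reference such as &#x1C;.
    public static string EscapeInvalidXmlChars(string input)
    {
        var sb = new StringBuilder(input.Length);
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (char.IsSurrogatePair(input, i))
            {
                // Well-formed pair: copy both code units unchanged.
                sb.Append(c).Append(input[++i]);
            }
            else if (char.IsSurrogate(c))
            {
                // Lone surrogate: cannot be represented, so substitute U+FFFD.
                sb.Append('\uFFFD');
            }
            else if (XmlConvert.IsXmlChar(c))
            {
                sb.Append(c);
            }
            else
            {
                sb.Append("&#x").Append(((int)c).ToString("X")).Append(';');
            }
        }
        return sb.ToString();
    }

    // Loads the escaped string with character-reference checking turned off.
    public static XDocument Load(string response)
    {
        var settings = new XmlReaderSettings { CheckCharacters = false };
        using (var text = new StringReader(EscapeInvalidXmlChars(response)))
        using (var reader = XmlReader.Create(text, settings))
        {
            return XDocument.Load(reader);
        }
    }
}
```

Two caveats with this approach. A reference written inside a CDATA section or a comment is not expanded, so it would show up there as literal text. And because the loaded document then holds characters XML does not allow, writing it back out may need the matching XmlWriterSettings.CheckCharacters = false.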
ASKER
I used a method like this. Thank you very much.