ziorrinfotech

Getting "'.', hexadecimal value 0x00, is an invalid character" error while reading an XML file using XmlReader

I am reading one big XML file that contains articles along with other nodes.
The file contains four nodes for the same article, one per language: "en", "de_CH", "fr_CH", "it_CH".

When I call a function to read one particular node of the XML file, I get the above error.
The function itself seems correct, as I have tested it on other XML files that contain different articles.
The error is reported at this line:
&lt;/p&gt;"></document_table></docume.

Could anyone help me here, please?
ShazbotOK
With XmlReader you can choose to disable XML validation:
MyXmlReader.Settings.ValidationType = System.Xml.ValidationType.None;
I would also suggest what ShazbotOK says in his comment.


You need to set the XmlReader settings.
You can also use the unescape function.
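To make the settings suggestion concrete, here is a minimal sketch, assuming the settings are supplied when the reader is created through XmlReader.Create. The class and method names (ReaderSettingsSketch, ContainsElement) are made up for illustration and are not from the original code. Note that turning validation off does not relax well-formedness checks, so an illegal character such as 0x00 may still be reported.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper, for illustration only.
public static class ReaderSettingsSketch
{
    // Returns true if an element with the given name occurs in the XML text.
    public static bool ContainsElement(string xml, string elementName)
    {
        // Configure the settings first, then hand them to XmlReader.Create.
        var settings = new XmlReaderSettings
        {
            IgnoreComments = true,
            IgnoreWhitespace = true,
            ValidationType = ValidationType.None // no DTD/schema validation
        };

        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            // ReadToFollowing returns false when no matching element is found.
            return reader.ReadToFollowing(elementName);
        }
    }
}
```

Calling ReaderSettingsSketch.ContainsElement(xmlText, "topic") on well-formed input reports whether a "topic" element is present.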
ziorrinfotech

ASKER

Hi skvikram,

Could you please tell me more about this unescape function?
I am still receiving the same error, but now on a different line.

Could anyone help me out here?

I am receiving the error on this line:

        <br />
         <strong>Z

Here is my function:
 
 public static List<string> GetXMLFiles(int sourceID)  {
        List<string> returnList = null;
        string folder_Name = null;
        FileInfo fileInfo = null;
        string completeFilePath = null;
        string[] strFileNames = null;
 
        XmlReaderSettings readerSettings = new XmlReaderSettings();
        readerSettings.IgnoreComments = true;
        readerSettings.IgnoreWhitespace = true;
        readerSettings.ValidationType = ValidationType.None;
        
 
        returnList = new List<string>();
        folder_Name = HttpContext.Current.Server.MapPath("XML_File");
        strFileNames = Directory.GetFiles(folder_Name);
 
        foreach (string fileName in strFileNames)  {
            fileInfo = new FileInfo(fileName);
            if (fileInfo.Extension.ToLower() == ".xml") {
                completeFilePath = HttpContext.Current.Server.MapPath("XML_File/" + fileInfo.Name);
 
                if (sourceID == (int)enumSource.MC) {
                    using (XmlReader reader = XmlReader.Create(fileName, readerSettings)) {
                        reader.ReadToFollowing("topic");
                        if (reader.ReadState != ReadState.EndOfFile) {
                            returnList.Add(fileInfo.Name.Replace(fileInfo.Extension.ToString(), ""));
                        }
                    }
                }
                else {
                    returnList.Add(fileInfo.Name.Replace(fileInfo.Extension.ToString(), ""));
                }
            }
        }
 
        readerSettings = null;
        fileInfo = null;
        folder_Name = null;      
        completeFilePath = null;
        strFileNames = null;
        return returnList;
    }


Does the <strong>Z have a closing tag?
The problem, from what I can see, is that you are attempting to parse HTML with the XML DOM. Yes, that can be done, but only if the HTML is XHTML compliant; if it is not, the parser will fail.
You may want to check out the HTML Agility Pack project on CodePlex (open source): http://www.codeplex.com/htmlagilitypack
It is a very powerful API that lets you parse HTML even if it is not XHTML compliant.
Actually, one of the nodes in the XML file I am reading contains HTML tags and text,
so the error usually occurs around these tags only.
I have downloaded the project from CodePlex, but I am not sure how it will help with my issue.
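Since the error message names a 0x00 character rather than an unclosed tag, one way to check where the offending characters are is to scan the raw text before handing it to the reader. A minimal sketch, assuming .NET 4.0 or later (for XmlConvert.IsXmlChar) and that the file content has already been read into a string; the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical diagnostic helper, for illustration only.
public static class InvalidXmlCharScan
{
    // Returns the zero-based offsets of characters that are not legal XML characters.
    public static List<int> FindInvalidOffsets(string text)
    {
        var offsets = new List<int>();
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (XmlConvert.IsXmlChar(c))
            {
                continue; // ordinary legal character
            }

            // A well-formed surrogate pair encodes one legal character; skip both halves.
            if (char.IsHighSurrogate(c) && i + 1 < text.Length && char.IsLowSurrogate(text[i + 1]))
            {
                i++;
                continue;
            }

            offsets.Add(i); // e.g. '\0' (0x00) ends up here
        }
        return offsets;
    }
}
```

If the returned list is not empty, the offsets show where the illegal characters sit, which may help to tell a data or encoding problem apart from malformed HTML markup.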
ASKER CERTIFIED SOLUTION
ShazbotOK