Getting "'.', hexadecimal value 0x00, is an invalid character" error while reading an XML file using XmlReader

I am reading one big XML file which contains articles along with other nodes.
This XML file contains four nodes for the same article in four languages: "en", "de_CH", "fr_CH", and "it_CH".

When I call a function to read one particular node of the XML file, I get the above error.
I believe the function is correct, as I have tested it on other XML files that contain different articles.
The error is reported at this line:
</p>"></document_table></docume.

Could anyone help me here, please?

ShazbotOK

With XmlReader you can choose to disable XML validation:
MyXmlReader.Settings.ValidationType = System.Xml.ValidationType.None;

thiyaguk

I would also suggest what ShazbotOK describes in the comment above.

You need to set the XmlReader settings.

skvikram

You can use an unescape function.

ziorrinfotech

ASKER

Hi skvikram,

Could you please tell me more about this unescape function?

ziorrinfotech

ASKER

I am still receiving the same error, but now on a different line.

Could anyone help me out here?

I am receiving the error on this line:

<br />
<strong>Z

Here is my function:

    // Requires: System.Collections.Generic, System.IO, System.Web, System.Xml
    // Returns the names (without extension) of the .xml files in the XML_File folder.
    // For the MC source, a file is listed only if it contains a "topic" element.
    public static List<string> GetXMLFiles(int sourceID)
    {
        List<string> returnList = new List<string>();

        XmlReaderSettings readerSettings = new XmlReaderSettings();
        readerSettings.IgnoreComments = true;
        readerSettings.IgnoreWhitespace = true;
        readerSettings.ValidationType = ValidationType.None;

        string folderName = HttpContext.Current.Server.MapPath("XML_File");

        foreach (string fileName in Directory.GetFiles(folderName))
        {
            FileInfo fileInfo = new FileInfo(fileName);
            if (fileInfo.Extension.ToLower() != ".xml")
            {
                continue; // skip non-XML files
            }

            string nameWithoutExtension = fileInfo.Name.Replace(fileInfo.Extension, "");

            if (sourceID == (int)enumSource.MC)
            {
                using (XmlReader reader = XmlReader.Create(fileName, readerSettings))
                {
                    // ReadToFollowing parses forward through the file, so an invalid
                    // character anywhere before the match surfaces in this call.
                    if (reader.ReadToFollowing("topic"))
                    {
                        returnList.Add(nameWithoutExtension);
                    }
                }
            }
            else
            {
                returnList.Add(nameWithoutExtension);
            }
        }

        return returnList;
    }
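
Because the message names the character 0x00 rather than a tag, it may help to check whether the file really contains zero bytes before changing the parsing code. The following is a minimal sketch, not part of the original function; NulByteScanner and FindNulOffsets are hypothetical names chosen for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class NulByteScanner
{
    // Returns the byte offsets of up to maxHits 0x00 bytes found in the stream.
    // (Hypothetical helper for diagnosis only.)
    public static List<long> FindNulOffsets(Stream stream, int maxHits)
    {
        List<long> offsets = new List<long>();
        long position = 0;
        int current;

        // ReadByte returns -1 at the end of the stream.
        while (offsets.Count < maxHits && (current = stream.ReadByte()) != -1)
        {
            if (current == 0)
            {
                offsets.Add(position);
            }
            position++;
        }

        return offsets;
    }
}
```

For a file on disk, pass File.OpenRead(fileName) inside a using block. Zero bytes at every other offset would suggest UTF-16 text being decoded under a different encoding, while a few isolated offsets would suggest stray NUL bytes inside the content.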


ShazbotOK

Does the <strong>Z have a closing tag?
The problem, from what I can see, is that you are attempting to parse HTML with the XML DOM. That can be done, but only if the HTML is XHTML compliant; if it is not, the parser will fail.
You may want to check out the open-source HTML Agility Pack project on CodePlex: http://www.codeplex.com/htmlagilitypack
It is a very powerful API that allows you to parse HTML even if it is not XHTML compliant.

ziorrinfotech

ASKER

Actually, one of the nodes in the XML file that I am reading contains HTML tags and text,
so the error usually occurs around these tags.
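
If the embedded HTML turns out to carry stray control characters such as 0x00, one possible workaround is to drop every character outside the XML 1.0 character range before the text reaches the reader. This is a minimal sketch under stated assumptions: the file fits in memory, File.ReadAllText picks the right encoding for it, and XmlSanitizer, StripInvalidXmlChars, and CreateCleanReader are hypothetical names, not part of .NET. It does not repair unclosed or badly nested tags.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class XmlSanitizer
{
    // True for single UTF-16 code units allowed by the XML 1.0 Char production.
    private static bool IsAllowedXmlChar(char c)
    {
        return c == '\t' || c == '\n' || c == '\r'
            || (c >= 0x20 && c <= 0xD7FF)
            || (c >= 0xE000 && c <= 0xFFFD);
    }

    // Returns a copy of text without characters that XML 1.0 forbids (0x00 included).
    public static string StripInvalidXmlChars(string text)
    {
        StringBuilder cleaned = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            if (char.IsSurrogatePair(text, i))
            {
                // Valid surrogate pair (a character above U+FFFF): keep both halves.
                cleaned.Append(text[i]).Append(text[i + 1]);
                i++;
            }
            else if (IsAllowedXmlChar(text[i]))
            {
                cleaned.Append(text[i]);
            }
            // Anything else (control characters, lone surrogates) is dropped.
        }
        return cleaned.ToString();
    }

    // Creates a reader over the cleaned text instead of over the raw file.
    public static XmlReader CreateCleanReader(string path, XmlReaderSettings settings)
    {
        string cleaned = StripInvalidXmlChars(File.ReadAllText(path));
        return XmlReader.Create(new StringReader(cleaned), settings);
    }
}
```

In GetXMLFiles, the call XmlReader.Create(fileName, readerSettings) could then be swapped for XmlSanitizer.CreateCleanReader(fileName, readerSettings). Silently dropping characters hides the underlying data problem, so it is worth finding out where the 0x00 characters come from as well.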

ziorrinfotech

ASKER

I have downloaded the project from CodePlex, but I am not sure how it will help with my issue.

ASKER CERTIFIED SOLUTION

ShazbotOK
