How can I parse an invalid XML file with the DOM API?

Posted on 2003-02-18
Medium Priority
Last Modified: 2011-10-03
[Fatal Error] StatisticsLog_MIEP_20030129000057760.xml:111:741426: An invalid XML character (Unicode: 0x2) was found in the element content of the document.
     at com.MASP.report.wappie.MIEPlogs_Parser.findMatchedEDPID(MIEPlogs_Parser.java:57)
     at com.MASP.report.wappie.MIEPlogs_Parser.main(MIEPlogs_Parser.java:101)
Exception in thread "main"

This invalid XML file is too big for me to find the offending location by hand. I use the standard DOM API to parse it. I want to know whether my program can get past this error when it runs.
If it can, could you tell me what I should change in my program? My program is as follows.

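  import javax.xml.parsers.DocumentBuilder;
  import javax.xml.parsers.DocumentBuilderFactory;
  import javax.xml.parsers.FactoryConfigurationError;
  import javax.xml.parsers.ParserConfigurationException;
  import org.w3c.dom.Document;

  protected DocumentBuilderFactory factory;
  protected DocumentBuilder builder;
  protected Document document;

  try {
      factory = DocumentBuilderFactory.newInstance();
      builder = factory.newDocumentBuilder();
  } catch (FactoryConfigurationError e) {
      // unable to get a document builder factory
  } catch (ParserConfigurationException e) {
      // parser was unable to be configured
  }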

Question by: wuchunzhong

Accepted Solution

BigRat earned 800 total points
ID: 7981099
This is an annoyance, since XML is very strict about what constitutes a character. In XML 1.0, any character below hex 20 other than TAB (hex 09), LF (hex 0A), or CR (hex 0D) is invalid, which is why the parser rejects the 0x2 in your file.
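To make the rule concrete, here is a minimal sketch of the XML 1.0 character check in Java. The class and method names (XmlChars, isValidXmlChar, indexOfFirstInvalid) are made up for illustration and are not part of any standard API; the second method may help locate the offending spot in a large file once its text has been read into a string.

```java
/** Hypothetical helper for illustration; not part of the JDK or any XML library. */
class XmlChars {

    /** Returns true if the code point is allowed by the XML 1.0 "Char" production. */
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the index of the first invalid code point in s, or -1 if there is none. */
    static int indexOfFirstInvalid(String s) {
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!isValidXmlChar(cp)) {
                return i;
            }
            // Advance by one or two chars depending on whether cp is supplementary.
            i += Character.charCount(cp);
        }
        return -1;
    }
}
```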

I can only suggest that you clean up the file by substituting such characters with, say, hex BF (an inverted question mark in Latin-1), perhaps with a short script (awk, Perl, or a regex), depending on the encoding of the file.
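If you would rather not rewrite the file on disk, the same substitution can probably be done on the fly in Java. What follows is only a sketch under some assumptions: XmlCleaningReader and LenientDomParser are hypothetical names, the replacement character is the hex BF suggested above, and surrogate halves are passed through untouched on the assumption that they form valid pairs.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical helper: swaps characters that XML 1.0 forbids before the parser sees them. */
class XmlCleaningReader extends FilterReader {
    private static final char REPLACEMENT = '\u00BF'; // inverted question mark

    XmlCleaningReader(Reader in) {
        super(in);
    }

    private static boolean isAllowed(char c) {
        // Assumption: surrogate halves (0xD800-0xDFFF) are well-formed pairs and pass through.
        return c == 0x9 || c == 0xA || c == 0xD || (c >= 0x20 && c <= 0xFFFD);
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        return (c == -1 || isAllowed((char) c)) ? c : REPLACEMENT;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        // When n is -1 (end of stream) the loop body never runs.
        for (int i = off; i < off + n; i++) {
            if (!isAllowed(buf[i])) {
                buf[i] = REPLACEMENT;
            }
        }
        return n;
    }
}

/** Hypothetical wrapper that parses a character stream through the cleaning reader. */
class LenientDomParser {
    static Document parse(Reader raw) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new XmlCleaningReader(raw)));
    }
}
```

To use it on a file you would open the file with a Reader in the file's real encoding (for example an InputStreamReader with an explicit charset) and pass that to LenientDomParser.parse. Note that when an InputSource carries a character stream, the parser ignores the encoding named in the XML declaration, so choosing the right charset for the Reader is up to you.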
