Exception while parsing XML file

Posted on 2006-05-02
Last Modified: 2012-06-27
I am getting a String which is a document in XML format.

String d = .... //variable d now contains an XML

What I want to do is navigate through this XML and print some node values. To do this, my code is:

      byte b[] = d.getBytes();
      InputStream is = new ByteArrayInputStream(b);
      org.w3c.dom.Document doc12 = null;
      try {
          DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
          DocumentBuilder docb = dbf.newDocumentBuilder();
          doc12 = docb.parse(is);    //Exception is thrown here
          Element elmt = doc12.getDocumentElement();
          System.out.println("Root Node Name : " + elmt.getNodeName());
      } catch (Exception e) {
The problem is that when it reaches the line "doc12 = ..." it throws an exception: Invalid byte 1 of 1-byte UTF-8 sequence.
      at Source)
      at Source)
      at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
      at org.apache.xerces.impl.XMLEntityScanner.scanContent(Unknown Source)
      at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanContent(Unknown Source)
      at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
      at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
      at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
      at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
      at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
      at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
      at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
      at javax.xml.parsers.DocumentBuilder.parse(Unknown Source)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(
      at java.lang.reflect.Method.invoke(
      at com.wk.atlas.tracker.EmailManager.sendMail(
      at com.wk.atlas.tracker.TrackerEmailBean.onMessage(
      at weblogic.ejb20.internal.MDListener.execute(
      at weblogic.ejb20.internal.MDListener.transactionalOnMessage(
      at weblogic.ejb20.internal.MDListener.onMessage(
      at weblogic.jms.client.JMSSession.onMessage(
      at weblogic.jms.client.JMSSession.execute(
      at weblogic.kernel.ExecuteThread.execute(

Can someone please tell me how I can get rid of this? The XML that is fetched into the string looks something like this:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE wkdic
  PUBLIC "-//D//DTD DIC Atlas compliant R1//DE" "dic.dtd">


Question by:thomas908

    Accepted Solution

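    Your encoding is not UTF-8: d.getBytes() encodes the string with the platform default charset, while the XML declaration tells the parser to expect UTF-8. Parse from the string directly instead: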


    builder.parse(new InputSource(new StringReader(d)));
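For context, here is a minimal, self-contained sketch of the StringReader approach. The class name `StringXmlParser` and the sample document are assumptions made for illustration and are not from the original post. Because a Reader hands the parser characters that are already decoded, the parser does not have to decode any bytes itself.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class; the name is not from the original thread.
final class StringXmlParser {

    // Parses XML held in a String. The StringReader supplies characters,
    // not bytes, so no byte-to-character decoding happens in the parser.
    static Document parse(String xml) {
        try {
            DocumentBuilder docb = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return docb.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Sample document (an assumption); it contains a non-ASCII character.
        String d = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                 + "<wkdic><title>caf\u00e9</title></wkdic>";
        Document doc12 = parse(d);
        System.out.println("Root Node Name : " + doc12.getDocumentElement().getNodeName());
    }
}
```

The sketch wraps the checked exceptions in an IllegalArgumentException only to keep the example short; real code may prefer to handle them separately.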

    Expert Comment

    'builder' in your case, of course, is 'docb'

    Assisted Solution

    I believe it is not clear to you how parsing works.
    See here for examples:

    Bye, Giant.

    Expert Comment

    I think you've got to set the encoding specifically. This might work:

    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder docb = dbf.newDocumentBuilder();
    InputSource inpSrc = new InputSource(is);
    doc12 = docb.parse(inpSrc);
    Element elmt = doc12.getDocumentElement();
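Setting the encoding explicitly could be sketched with `InputSource.setEncoding`, which tells the parser how to decode a byte stream. The class name `EncodedBytesParser`, the sample bytes, and the choice of ISO-8859-1 are assumptions for illustration; the actual encoding of the poster's data is not known from the thread.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class; the name is not from the original thread.
final class EncodedBytesParser {

    // Parses XML bytes, telling the parser which encoding they are in.
    // Passing null leaves the parser to auto-detect the encoding.
    static Document parse(byte[] bytes, String encoding) {
        try {
            DocumentBuilder docb = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            InputSource inpSrc = new InputSource(new ByteArrayInputStream(bytes));
            if (encoding != null) {
                inpSrc.setEncoding(encoding);
            }
            return docb.parse(inpSrc);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Sample bytes (an assumption): ISO-8859-1 encoded, no encoding declaration.
        byte[] b = "<?xml version=\"1.0\"?><note>caf\u00e9</note>"
                .getBytes(StandardCharsets.ISO_8859_1);
        Document doc12 = parse(b, "ISO-8859-1");
        System.out.println("Root Node Name : " + doc12.getDocumentElement().getNodeName());
    }
}
```

In this sketch, passing null instead of "ISO-8859-1" for the same bytes makes the parser fall back to auto-detection, and the non-ASCII byte is then rejected as malformed UTF-8.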



    Assisted Solution

    >      byte b[] = d.getBytes();

    Try specifying the charset to use when extracting the bytes:

         byte b[] = d.getBytes("UTF8");
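A short sketch of this suggestion, keeping the byte-stream approach from the question but encoding the string as UTF-8 so that the bytes match the XML declaration. The class name `Utf8BytesParser` and the sample document are assumptions for illustration; `StandardCharsets.UTF_8` is used here in place of the "UTF8" name, which avoids the checked UnsupportedEncodingException.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper class; the name is not from the original thread.
final class Utf8BytesParser {

    // Encodes the string as UTF-8 (instead of the platform default charset)
    // and parses the resulting byte stream.
    static Document parse(String xml) {
        try {
            byte[] b = xml.getBytes(StandardCharsets.UTF_8);
            InputStream is = new ByteArrayInputStream(b);
            DocumentBuilder docb = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return docb.parse(is);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Sample document (an assumption); it declares UTF-8, matching the bytes.
        String d = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                 + "<wkdic><title>caf\u00e9</title></wkdic>";
        Document doc12 = parse(d);
        System.out.println("Root Node Name : " + doc12.getDocumentElement().getNodeName());
    }
}
```

This only works when the declared encoding in the document is also UTF-8; if the declaration names a different encoding, the bytes and the declaration would disagree again.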

    Expert Comment

    I prefer builder.parse() over a StringReader; perhaps it gives better performance.

    Expert Comment

    Starting with the contents *already* decoded is preferable for obvious reasons, quite apart from its being a one-liner

    Author Comment

    Thank you all for helping