Solved

Nested XMLReader Only Seeing Root

Posted on 2002-03-23
Last Modified: 2008-02-26
I have a large number of XML files that refer to each other in a nested fashion. Each time a reference element (ref.element) is encountered, I need to follow its pointer attribute (refid) to the corresponding file and include the contents of that file with the current file's contents. It's like a depth-first traversal of a tree.

I've coded a Walker class that extends XMLFilterImpl that recursively follows these links by creating a new instance of itself. It correctly fires all of the SAX events that I have coded for the class, but the other XMLReaders that I have chained to the "root instance" only get the events from the root file processing.

For my application, I'm attaching an XSLT processor to the reader and it is only processing results from the root file.

I've included a simplified source to illustrate what I want to do. I'll post some of my code if anyone wants to see it.

Thanks,
GG

Four input files...
     <root id='0'>
          <out>Zero</out>
          <ref.element refid='1' />
          <ref.element refid='2' />
     </root>
     
     <element id='1'>
          <out>One</out>
          <ref.element refid='3' />
     </element>
     
     <element id='2'>
          <out>Two</out>
     </element>
     
     <element id='3'>
          <out>Three</out>
     </element>

Desired output...
     Zero
     One
     Three
     Two

Received output...
     Zero
Question by:GleasonGuy
LVL 14

Expert Comment

by:avner
ID: 6891928
You can use two methods:
Most XSL parsers have specific methods (like MSXML's selectNodes(str String) method) to reference between node names and strings.

The other method is to use the ID and IDREF of XML, which will force you to use a DTD or the like.
http://www.xml.com/pub/a/2000/10/04/linking/index.html

or just search for  "ID/IDREF" on the net.
 
LVL 14

Expert Comment

by:avner
ID: 6891949
GleasonGuy,

Please ignore my response, I misread your comment.

-Avner.
 
LVL 23

Expert Comment

by:b1xml2
ID: 6892067
Please post your code including the chaining portion. Thanx
 
LVL 2

Author Comment

by:GleasonGuy
ID: 6892095
The following is a working, simplified version of my actual code. I ran it against the four source files I described in my original post and it produces the same undesired results. My real code has a lot more println() calls to track progress; I removed them here to save space.

Thanks,
GG
---
import java.io.File;
import java.io.FileInputStream;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;
import org.xml.sax.helpers.XMLReaderFactory;

public class RefWalker extends XMLFilterImpl  {

  private static final String DEFAULT_SAX_PARSER = "org.apache.xerces.parsers.SAXParser";

  public RefWalker(XMLReader xmlReader) {
    super(xmlReader);
  }

  public void startElement(String uri, String localName, String qName, Attributes attributes)
    throws SAXException {
    try {
      if( localName.equals("ref.element") ) {

        String sFile = attributes.getValue( "refid" );

        XMLReader myReader = XMLReaderFactory.createXMLReader(DEFAULT_SAX_PARSER);

        // Setup another "walker" to process this ref
        RefWalker refFilter = new RefWalker(myReader);

        // Set up the input source for the referenced file (<refid>.xml)
        InputSource inputSource = new InputSource(
          new FileInputStream( new File(sFile + ".xml") ) );

        // Parse the new ref document
        refFilter.parse( inputSource );
      } else {
        super.startElement(uri, localName, qName, attributes);
      }
    }
    catch (Exception e) {
      throw new SAXException(e);
    }
  }

  public void endElement(String uri, String localName, String qName)
    throws SAXException {

    try {
      if( localName.equals("ref.element") ) {
        // Do Nothing
      } else {
        super.endElement(uri, localName, qName);
      }
    }
    catch (Exception e) {
      throw new SAXException(e);
    }
  }

  /**
  * FOR TESTING ONLY
  */
  public static void main(String[] args) {
    try {
      if(args.length < 1) {
        System.out.println("Proper usage: RefWalker <xml-file> [<xsl-file>]\n");
      }
      else {
        String sXml = args[0];
        String sXsl = (args.length > 1) ? args[1] : null;

        TransformerFactory transFact = TransformerFactory.newInstance( );

        if (transFact.getFeature(SAXTransformerFactory.FEATURE)) {

          SAXTransformerFactory saxTransFact = (SAXTransformerFactory) transFact;

          TransformerHandler transHand = null;

          // Use the identity XSL transformation or the one provided
          if( sXsl == null ) {
            transHand = saxTransFact.newTransformerHandler( );
          } else {
            transHand = saxTransFact.newTransformerHandler(
              new StreamSource(new File(sXsl)));
          }

          // Set the destination for the XSLT transformation
          transHand.setResult(new StreamResult(System.out));

          // Setup the reader
          XMLReader myReader = XMLReaderFactory.createXMLReader(DEFAULT_SAX_PARSER);

          // Wrap the reader in the root "walker"
          RefWalker refFilter = new RefWalker( myReader );

          // Setup the input source
          InputSource inputSource = new InputSource(
            new FileInputStream( new File(sXml)));

          // Attach the XSLT processor to the reader
          refFilter.setContentHandler(transHand);

          // Parse the root document
          refFilter.parse( inputSource );
        } else {
          System.err.println("SAXTransformerFactory is not supported.");
          System.exit(1);
        }
      }
    }
    catch (Exception e) {
      System.err.println(e);
      e.printStackTrace();
    }
  }
}
 
LVL 27

Accepted Solution

by:
BigRat earned 200 total points
ID: 6894324
The XSLT transHand is attached to the first created instance of RefWalker, i.e. the root instance. When you create a new reader and filter for the "included" file, you need to attach something to that as well (since the attachment is instance related and not static), and this needs to be done in startElement.
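
To illustrate the "instance related" point, here is a minimal sketch. The class name HandlerPerInstance and the two variable names are made up for this example and are not part of the posted code. It only shows that a content handler set on one XMLFilterImpl is not visible from another instance; a filter without a handler silently drops its events, which matches the symptom described in the question.

```java
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical demo class: shows that handler attachment is per filter instance.
class HandlerPerInstance {

  // Returns { rootHasHandler, nestedHasHandler }.
  static boolean[] check() {
    XMLFilterImpl rootFilter = new XMLFilterImpl();
    XMLFilterImpl nestedFilter = new XMLFilterImpl();

    // Attach a handler to the root filter only, as main() does in the posted code.
    rootFilter.setContentHandler(new DefaultHandler());

    return new boolean[] {
      rootFilter.getContentHandler() != null,
      nestedFilter.getContentHandler() != null
    };
  }

  public static void main(String[] args) {
    boolean[] result = check();
    System.out.println("root has handler: " + result[0]);     // true
    System.out.println("nested has handler: " + result[1]);   // false
  }
}
```

So each filter created inside startElement needs its own setContentHandler call before parse is invoked.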
 
LVL 2

Author Comment

by:GleasonGuy
ID: 6894353
Thanks BigRat. You're really close to the actual solution I implemented yesterday morning. I wound up passing the TransformerHandler to the constructor. To sum up the changes...

--- snip ---
private TransformerHandler m_transHand;

public RefWalker(XMLReader xmlReader, TransformerHandler transHand) {
   super(xmlReader);
   m_transHand = transHand;
}
--- snip ---
public void startElement(...) {
    ...
    RefWalker refFilter = new RefWalker(myReader,m_transHand);
    ...
    refFilter.setContentHandler(m_transHand);
    refFilter.parse( inputSource );
}
--- snip ---
public static void main(...) {
    ...
    RefWalker refFilter = new RefWalker(myReader, transHand);
    ...
}
--- snip ---
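
For anyone who wants something that runs on its own, here is a fuller sketch of the same idea. It is not the production code. The class name RefWalkerSketch, the in-memory DOCS map standing in for the files 0.xml to 3.xml, the "nested" flag, and the use of the JDK's default parser instead of Xerces are all assumptions made for this example. The sketch also forwards startDocument()/endDocument() only from the root instance. Whether a given TransformerHandler tolerates repeated document events is implementation dependent, so treat that part as a precaution rather than a requirement.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

class RefWalkerSketch extends XMLFilterImpl {

  // Hypothetical in-memory stand-ins for the four input files.
  static final Map<String, String> DOCS = Map.of(
      "0", "<root id='0'><out>Zero</out><ref.element refid='1'/><ref.element refid='2'/></root>",
      "1", "<element id='1'><out>One</out><ref.element refid='3'/></element>",
      "2", "<element id='2'><out>Two</out></element>",
      "3", "<element id='3'><out>Three</out></element>");

  private final TransformerHandler transHand;
  private final boolean nested; // true for walkers created while following a ref

  RefWalkerSketch(XMLReader reader, TransformerHandler transHand, boolean nested) {
    super(reader);
    this.transHand = transHand;
    this.nested = nested;
    // Every instance, root or nested, sends its events to the same handler.
    setContentHandler(transHand);
  }

  // Assumption: the JDK's default namespace-aware parser replaces Xerces here.
  static XMLReader newReader() throws Exception {
    SAXParserFactory factory = SAXParserFactory.newInstance();
    factory.setNamespaceAware(true);
    return factory.newSAXParser().getXMLReader();
  }

  @Override
  public void startDocument() throws SAXException {
    if (!nested) super.startDocument();
  }

  @Override
  public void endDocument() throws SAXException {
    if (!nested) super.endDocument();
  }

  @Override
  public void startElement(String uri, String localName, String qName, Attributes atts)
      throws SAXException {
    if (!"ref.element".equals(localName)) {
      super.startElement(uri, localName, qName, atts);
      return;
    }
    try {
      // Follow the reference with a fresh reader and a walker sharing the handler.
      RefWalkerSketch child = new RefWalkerSketch(newReader(), transHand, true);
      child.parse(new InputSource(new StringReader(DOCS.get(atts.getValue("refid")))));
    } catch (SAXException e) {
      throw e;
    } catch (Exception e) {
      throw new SAXException(e);
    }
  }

  @Override
  public void endElement(String uri, String localName, String qName) throws SAXException {
    if (!"ref.element".equals(localName)) super.endElement(uri, localName, qName);
  }

  // Runs an identity transformation over the merged event stream.
  static String transform(String rootId) throws Exception {
    // Assumes the default factory supports SAX, as the JDK's built-in one does.
    SAXTransformerFactory factory = (SAXTransformerFactory) TransformerFactory.newInstance();
    TransformerHandler handler = factory.newTransformerHandler();
    StringWriter out = new StringWriter();
    handler.setResult(new StreamResult(out));
    new RefWalkerSketch(newReader(), handler, false)
        .parse(new InputSource(new StringReader(DOCS.get(rootId))));
    return out.toString();
  }

  public static void main(String[] args) throws Exception {
    System.out.println(transform("0"));
  }
}
```

With the identity transformation, the serialized result should contain Zero, One, Three and Two in that order, nested the way the references are nested.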

Thanks,
GG
 
LVL 2

Author Comment

by:GleasonGuy
ID: 6894359
You're really more than close. It's actually right on to the solution I came up with. Thanks for the post. I hope others benefit from this as a PAQ.

Regards,
GG
 
LVL 27

Expert Comment

by:BigRat
ID: 6896055
Either a second constructor or a class static which gets set on startup. I don't like class statics so a second constructor is fine.
 
LVL 2

Author Comment

by:GleasonGuy
ID: 6897164
That's pretty funny. Are you looking at my production code? ;-)

I actually have both a private constructor that gets called internally for all the nested parses AND a static class field to hold the common TransformerHandler. I'm not concerned about class fields in my case since the class is a very specialized process that will (should) never get instantiated more than once per run of the application.

Thanks,
GG
