mohan21_kumar

How to read all the tags of an XML file

Hi,

How can I read all the tag values of an XML file in Java?

Example:

<par dur="50ms">    
<std_time>50</std_time>    
<repeat_factor>0</repeat_factor>      
<text src="A.txt" region="text"/>    
</par>    
<par dur="50ms">    
<std_time>50</std_time>    
<repeat_factor>0</repeat_factor>      
<img src="B.jpg" region="Image" start="0" end="50" frame="0" posX="85" posY="0" sizeX="137" sizeY="152" fit="fit no distortion"/>      
<text src="C.txt" region="text"/>    
</par>  

I want all the values of the parent tag par, including its attributes and child elements.
Please suggest how to read all the values of the XML file, or give me some sample code that would help.

Thanks
vippx

Hi

Try looking up more info on SAX (Simple API for XML). Here is something to get you started:

http://www-106.ibm.com/developerworks/java/library/x-tipsaxp.html?open&ca=dgr-jw766x-tipsaxp
Wim ten Brink

What you showed is NOT a well-formed XML file: the XML declaration is missing (it is optional, but recommended) and, more importantly, the root tag par occurs more than once. ;-)

In Java you have two main options for parsing XML files. You can use SAX, which runs through your XML file and triggers an event whenever it encounters a tag. In your event handlers you simply ignore some tags and respond to others.
The other option is the DOM model (the W3C DOM that ships with the JDK, or a library such as JDOM), which reads the whole XML file into memory and then lets you access the values and attributes as if it were just a huge, multi-dimensional, dynamic array.
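
To make the SAX option concrete, here is a minimal sketch that lists every start tag with its attributes, plus any non-blank text. The class name SaxTagLister and the output format are made up for illustration, and the XML is passed in as a string only to keep the example self-contained; for a real file you would hand the parser a File instead.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not from this thread): lists elements and attributes via SAX events.
final class SaxTagLister {

    // Returns one entry per start tag ("par dur=50ms") and one per non-blank text value.
    static List<String> list(String xml) throws Exception {
        final List<String> lines = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                flushText();
                StringBuilder sb = new StringBuilder(qName);
                for (int i = 0; i < atts.getLength(); i++) {
                    sb.append(' ').append(atts.getQName(i)).append('=').append(atts.getValue(i));
                }
                lines.add(sb.toString());
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Text may arrive in several chunks, so buffer it until the next tag.
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                flushText();
            }

            private void flushText() {
                String t = text.toString().trim();
                if (!t.isEmpty()) {
                    lines.add("  text: " + t);
                }
                text.setLength(0);
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return lines;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<body><par dur=\"50ms\"><std_time>50</std_time>"
                   + "<text src=\"A.txt\" region=\"text\"/></par></body>";
        for (String line : list(xml)) {
            System.out.println(line);
        }
    }
}
```

Because SAX never builds a tree, memory use stays small even for large files, but you have to keep track yourself of which par you are currently inside.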

There is also a third option that can convert a complex XML file to a simpler XML format: XSLT (XSL transformations). In that case you create an XSL file, itself written in XML, that describes how an incoming XML file must be transformed into some other format. In your case, all you would have to do is flatten the original XML file to a simpler format.
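
As a rough illustration of the XSLT route, the sketch below runs a small stylesheet from Java through the javax.xml.transform API that ships with the JDK. The stylesheet itself is only an assumption about what "flattening" could mean here (one name=src line per child of par that has a src attribute), and the class name SmilFlattener is made up.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class: applies an inline XSLT stylesheet to an XML string.
final class SmilFlattener {

    // Illustrative stylesheet (an assumption, not from the thread):
    // for each child of <par> with a src attribute, output "name=src" on its own line.
    static final String XSL =
          "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/\">"
        + "<xsl:for-each select=\"//par/*[@src]\">"
        + "<xsl:value-of select=\"name()\"/><xsl:text>=</xsl:text>"
        + "<xsl:value-of select=\"@src\"/><xsl:text>&#10;</xsl:text>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String flatten(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<smil><body><par dur=\"1\"><std_time>5</std_time>"
                   + "<img src=\"B.jpg\"/><text src=\"C.txt\"/></par></body></smil>";
        System.out.print(flatten(xml));
    }
}
```

In a real project the stylesheet would normally live in its own .xsl file and be loaded with new StreamSource(new File(...)).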

Links:
http://www.xml.com/
http://xmlspy.com/
http://www.stylusstudio.com/index.html

And I further want to note that publisher O'Reilly has published a book with CD-Rom called "The XML CD Bookshelf" which is a digital collection of 7 XML books including "Java & XML, 2nd Edition" By Brett McLaughlin and "Java and XSLT" By Eric M. Burke. If you want to do more with XML in Java, BUY IT! It's really worth the money.
mohan21_kumar

ASKER

My XML file is as follows:

<smil>  
<head>  
  <meta name="title" content="my mms"/>  
 <meta name="copyright" content="InterNexium 2002"/>    
<layout>    
 <root-layout background-color="white" width="160px" height="120px"/>    
 <region id="Image" width="160px" height="120px" left="0px" top="0px"/>
      <region id="Image" width="160px" height="120px" left="0px" top="0px"/>    
 <region id="text" width="160px" height="120px" left="0px" top="0px"/>  
   <region id="Image" width="160px" height="120px" left="0px" top="0px"/>  
    <region id="Image" width="160px" height="120px" left="0px" top="0px"/>    
  <region id="Image" width="160px" height="120px" left="0px" top="0px"/>    
  <region id="Image" width="160px" height="120px" left="0px" top="0px"/>    
  <region id="Image" width="160px" height="120px" left="0px" top="0px"/>    
  <region id="Image" width="160px" height="120px" left="0px" top="0px"/>    
  <region id="Image" width="160px" height="120px" left="0px" top="0px"/>    
  <region id="Image" width="160px" height="120px" left="0px" top="0px"/>  
   <region id="Image" width="160px" height="120px" left="0px" top="0px"/>  
   <region id="Image" width="160px" height="120px" left="0px" top="0px"/>    
  <region id="Image" width="160px" height="120px" left="0px" top="0px"/>    
  <region id="Image" width="160px" height="120px" left="0px" top="0px"/>    
 <region id="Image" width="160px" height="120px" left="0px" top="0px"/>    
 <region id="Image" width="160px" height="120px" left="0px" top="0px"/>    
 <region id="Image" width="160px" height="120px" left="0px" top="0px"/>    
  <region id="Image" width="160px" height="120px" left="0px" top="0px"/>  
   <region id="Image" width="160px" height="120px" left="0px" top="0px"/>  
   <region id="Image" width="160px" height="120px" left="0px" top="0px"/>  
   <region id="Image" width="160px" height="120px" left="0px" top="0px"/>    
  <region id="Image" width="160px" height="120px" left="0px" top="0px"/>    
 <region id="Image" width="160px" height="120px" left="0px" top="0px"/>      
<region id="Image" width="160px" height="120px" left="0px" top="0px"/>    
 <region id="Image" width="160px" height="120px" left="0px" top="0px"/>      
<region id="Image" width="160px" height="120px" left="0px" top="0px"/>    
  <region id="Image" width="160px" height="120px" left="0px" top="0px"/>    
 <region id="Image" width="160px" height="120px" left="0px" top="0px"/>      
<region id="Image" width="160px" height="120px" left="0px" top="0px"/>    
 <region id="Image" width="160px" height="120px" left="0px" top="0px"/>      
<region id="Image" width="160px" height="120px" left="0px" top="0px"/>    
 <region id="Image" width="160px" height="120px" left="0px" top="0px"/>      
<region id="Image" width="160px" height="120px" left="0px" top="0px"/>      
<region id="Image" width="160px" height="120px" left="0px" top="0px"/>    
 <region id="Image" width="160px" height="120px" left="0px" top="0px"/>    
 <region id="Image" width="160px" height="120px" left="0px" top="0px"/>      
<region id="Image" width="160px" height="120px" left="0px" top="0px"/>      
<region id="Image" width="160px" height="120px" left="0px" top="0px"/>    
</layout>  
</head>  
<body>    
<par dur="4900ms">  
 <std_time>2000</std_time>  
 <repeat_factor>0</repeat_factor>    
 <text src="A.txt" region="text"/>    
</par>    <par dur="9950ms">    
<std_time>2000</std_time>    
<repeat_factor>0</repeat_factor>      
<img src="B.jpg" region="Image" start="0" end="2000" frame="0" posX="85" posY="0" sizeX="137" sizeY="152" fit="fit no distortion"/>    
 <text src="C.txt" region="text"/>    
</par>    <par dur="3000ms">  
 <std_time>2000</std_time>  
  <repeat_factor>0</repeat_factor>    
  <img src="D.gif" region="Image" start="0" end="2000" frame="0" sizeX="275" sizeY="195"/>    
</par>  
 <par dur="5450ms">  
 <std_time>2000</std_time>  
 <repeat_factor>0</repeat_factor>    
 <text src="E.txt" region="text"/>    
</par>    <par dur="2000ms">    
<std_time>2000</std_time>    
<repeat_factor>0</repeat_factor>      
<img src="F.gif" region="Image" start="0" end="2000" frame="0" sizeX="200" sizeY="254" fit="fit"/>    
</par>  
</body>
</smil>


I have used the following JSP code to display the elements and attributes of the XML tags,
but I am not able to get the attributes of the img tag and the text tag. Please tell me where I have gone wrong.

The JSP code is as follows:

<html>
<body>
<%@ page import="java.io.File,org.w3c.dom.Document,org.w3c.dom.*,org.w3c.dom.*,javax.xml.parsers.DocumentBuilderFactory" %>

<%@ page import ="javax.xml.parsers.DocumentBuilder,org.xml.sax.SAXException,org.xml.sax.SAXParseException" %>
<%

      String path = "c:/java/bin/xml/smil.xml";      
      String retrnvalues = "";
      try
      {
            DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
            Document doc = docBuilder.parse (new File(path));
            
            doc.getDocumentElement ().normalize ();
      
            NodeList listOfPersons = doc.getElementsByTagName("par");
            int totalPersons = listOfPersons.getLength();
            for(int s=0; s<listOfPersons.getLength() ; s++)
            {
      
                  Node firstPersonNode = listOfPersons.item(s);
      
                  if(firstPersonNode.getNodeType() == Node.ELEMENT_NODE)
                  {
      
                        Element firstPersonElement = (Element)firstPersonNode;
      
                        NodeList firstNameList = firstPersonElement.getElementsByTagName("std_time");
      
                        Element firstNameElement = (Element)firstNameList.item(0);
      
                        NodeList textFNList = firstNameElement.getChildNodes();
                              
                        out.println(((Node)textFNList.item(0)).getNodeValue().trim() + "<br>");
      
                        retrnvalues = retrnvalues + "&&" + ((Node)textFNList.item(0)).getNodeValue().trim();
      
                        NodeList firstNameList1 = firstPersonElement.getElementsByTagName("repeat_factor");
      
                        Element firstNameElement1 = (Element)firstNameList1.item(0);
      
                        NodeList textFNList1 = firstNameElement1.getChildNodes();
      
                        out.println(((Node)textFNList1.item(0)).getNodeValue().trim() + "<br>");
      
                        retrnvalues = retrnvalues + "&&" + ((Node)textFNList1.item(0)).getNodeValue().trim();
      
                        Element src = (Element)firstPersonNode;
      
                        NodeList srctxt = src.getElementsByTagName("text");
      
                        Element srcname = (Element)srctxt.item(0);
                        
                        out.println(srcname + "<br>");
      
                        retrnvalues = retrnvalues + "&&" + srcname;
      
                        Element imgsrc = (Element)firstPersonNode;
      
                        NodeList imgsrctxt = imgsrc.getElementsByTagName("img");
      
                        Element imgsrcname = (Element)imgsrctxt.item(0);
                        
                        out.println(imgsrcname + "<br>");
      
                        retrnvalues = retrnvalues + "&&" + imgsrcname;
      
                        Element audiosrc = (Element)firstPersonNode;
      
                        NodeList audiosrctxt = audiosrc.getElementsByTagName("audio");
      
                        Element audiosrcname = (Element)audiosrctxt.item(0);
                        
                        out.println(audiosrcname + "<br>");
      
                        retrnvalues = retrnvalues + "&&" + audiosrcname;
      
                  }
      
            }
      }
      catch(SAXParseException err)
      {
            System.out.println ("** Parsing error" + ", line " + err.getLineNumber () + ", uri " + err.getSystemId ());
            System.out.println(" " + err.getMessage ());
      }
      catch(SAXException e)
      {
            Exception x = e.getException ();
            ((x == null) ? e : x).printStackTrace ();
      }
      catch(Throwable t)
      {
            t.printStackTrace ();
      }
      

%>
</body>
</html>
Use JAXP (comes with JDK 1.4 and can be downloaded from Sun). The easiest method is probably to load the entire document into a DOM Document and walk the tree in memory to extract the data.

DocumentBuilder build = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document doc = build.parse(new File(xmlFileName));
...
NodeList parNodes = doc.getElementsByTagName("par");
for (int i = 0; i < parNodes.getLength(); i++) {
   Element parElement = (Element) parNodes.item(i);
   // ParBean stands for a bean class of your own that holds the values of one <par>
   ParBean par = new ParBean(parElement.getAttribute("dur"));

   par.setStdTime(parElement.getElementsByTagName("std_time").item(0).getFirstChild().getNodeValue());
   par.setImg(parElement.getElementsByTagName("img").item(0));   // null when the <par> has no <img>
   ...extract the rest of the nodes...
}

Also look into JDOM (http://www.jdom.org/). It is a simpler DOM API than the W3C DOM API I just showed you.
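
Since the question asks for all attributes and elements, it may help to know that Element.getAttributes() returns a NamedNodeMap you can loop over, so you do not have to name each attribute. Below is a minimal sketch of a recursive walk over the W3C DOM; the class name DomDump and the output format are made up for illustration, and the XML is parsed from a string only to keep it self-contained. The order of attributes in a NamedNodeMap is not guaranteed to match the file.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: prints every element, all of its attributes, and its own text.
final class DomDump {

    static String dump(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        StringBuilder sb = new StringBuilder();
        walk(doc.getDocumentElement(), 0, sb);
        return sb.toString();
    }

    private static void walk(Element el, int depth, StringBuilder sb) {
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        sb.append(el.getTagName());

        // All attributes of this element, whatever their names are.
        NamedNodeMap atts = el.getAttributes();
        for (int i = 0; i < atts.getLength(); i++) {
            Node a = atts.item(i);
            sb.append(' ').append(a.getNodeName())
              .append("=\"").append(a.getNodeValue()).append('"');
        }

        // Text that sits directly inside this element (e.g. the 50 in <std_time>50</std_time>).
        StringBuilder text = new StringBuilder();
        for (Node c = el.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.TEXT_NODE) {
                text.append(c.getNodeValue());
            }
        }
        String t = text.toString().trim();
        if (!t.isEmpty()) {
            sb.append(" -> ").append(t);
        }
        sb.append('\n');

        // Recurse into child elements in document order.
        for (Node c = el.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                walk((Element) c, depth + 1, sb);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.print(dump("<par dur=\"50ms\"><std_time>50</std_time>"
                + "<text src=\"A.txt\"/></par>"));
    }
}
```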
Some of the <par> nodes do not have the <img> tag, so you need to check for its existence. Then you need to get the src attribute once you have the <img> node. For example:

   NodeList imgsrctxt = imgsrc.getElementsByTagName("img");
   if (imgsrctxt.getLength()>0) {
     Element imgNode=(Element)imgsrctxt.item(0);
   
     String imgsrcname = imgNode.getAttribute("src");
     out.println(imgsrcname + "<br>");
     retrnvalues = retrnvalues + "&&" + imgsrcname;
  }
Also consider wrapping the W3C DOM API in a utility class that hides the complexity and does the repeated tasks. The API would be something like this:

class XMLUtils
{
   static public Document loadDocument(String fileName) {...}
   static public Element selectSingleNode(String nodeName) {...}
   static public String getNodeValue(Element node) {...}
...Some more functions that help you parse the XML DOM...
}

This would make the JSP pages much simpler to write.
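
One possible shape for such a utility class is sketched below. The names XmlUtilsSketch, parse, firstChild, text and attr are my own choices rather than the method names listed above, and getTextContent() requires DOM Level 3 (Java 5 or later). The point is that the null checks live in one place, so an absent <img> or <audio> simply yields an empty string.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical utility class: small null-safe wrappers around the W3C DOM API.
final class XmlUtilsSketch {

    // Parses XML held in a string; a file-based variant would pass a File to parse().
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the first direct child element with the given tag name, or null if there is none.
    static Element firstChild(Element parent, String tagName) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && tagName.equals(n.getNodeName())) {
                return (Element) n;
            }
        }
        return null;
    }

    // Returns the trimmed text of an element, or "" when the element is null.
    static String text(Element el) {
        return el == null ? "" : el.getTextContent().trim();
    }

    // Returns an attribute value, or "" when the element is null or lacks the attribute.
    static String attr(Element el, String name) {
        return el == null ? "" : el.getAttribute(name);
    }

    public static void main(String[] args) throws Exception {
        Element par = parse("<par dur=\"3000ms\"><std_time>2000</std_time>"
                + "<img src=\"D.gif\"/></par>").getDocumentElement();
        System.out.println(attr(par, "dur"));
        System.out.println(text(firstChild(par, "std_time")));
        System.out.println(attr(firstChild(par, "img"), "src"));
        System.out.println(attr(firstChild(par, "audio"), "src"));  // no <audio>: empty string
    }
}
```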
The XML header is still missing. The correct layout for a SMIL file is:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE smil PUBLIC "-//W3C//DTD SMIL 2.0//EN" "http://www.w3.org/2001/SMIL20/SMIL20.dtd">
<smil xmlns="">
</smil>

Doesn't matter much, though. The file is also invalid because the region tags have duplicate id values, which seems to be illegal in SMIL 2.0. It could well be that whatever solution you're using tries to validate the file and discovers that it isn't valid.
It seems to me, though, that you're trying to convert part of the SMIL file to HTML output. Am I correct? In that case, just use an XSLT transformation and invoke it from Java. That's a lot easier.
Now I have used the following code, but the output does not follow the order of the XML file.

The code is as follows:

<html>
<body>
<%@ page import="java.io.File,org.w3c.dom.Document,org.w3c.dom.*,org.w3c.dom.*,javax.xml.parsers.DocumentBuilderFactory" %>

<%@ page import ="javax.xml.parsers.DocumentBuilder,org.xml.sax.SAXException,org.xml.sax.SAXParseException" %>
<%

      String path = "c:/java/bin/xml/smil.xml";      
      String retrnvalues = "";
      try
      {
            DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
            Document doc = docBuilder.parse (new File(path));
            
            doc.getDocumentElement ().normalize ();
      
            NodeList listOfPersons = doc.getElementsByTagName("par");
            int totalPersons = listOfPersons.getLength();
            for(int s=0; s<listOfPersons.getLength() ; s++)
            {
      
                  Node firstPersonNode = listOfPersons.item(s);
      
                  if(firstPersonNode.getNodeType() == Node.ELEMENT_NODE)
                  {
                  
                        Element firstPersonElement = (Element)firstPersonNode;
                        
                        NodeList firstNameList = firstPersonElement.getElementsByTagName("std_time");
      
                        Element firstNameElement = (Element)firstNameList.item(0);
      
                        NodeList textFNList = firstNameElement.getChildNodes();
                        
                        retrnvalues = retrnvalues + "&&" + ((Node)textFNList.item(0)).getNodeValue().trim();
                        
                        NodeList firstNameList1 = firstPersonElement.getElementsByTagName("repeat_factor");
      
                        Element firstNameElement1 = (Element)firstNameList1.item(0);
      
                        NodeList textFNList1 = firstNameElement1.getChildNodes();
      
                        retrnvalues = retrnvalues + "&&" + ((Node)textFNList1.item(0)).getNodeValue().trim();
                              
                        NodeList txtsrctxt = firstPersonElement.getElementsByTagName("text");
                                                
                        Element txtsrcname = (Element)txtsrctxt.item(0);
                                                
                        if (txtsrctxt.getLength()>0)
                        {
                              Element txtNode=(Element)txtsrctxt.item(0);
                                                   
                              String txtsrcname1 = txtNode.getAttribute("src");
                                                     
                              out.println(txtsrcname1 + "<br>");
                                                           
                              retrnvalues = retrnvalues + "&&" + txtsrcname1;
                        }
                        
                        
                        NodeList imgsrctxt = firstPersonElement.getElementsByTagName("img");
                        
                        Element imgsrcname = (Element)imgsrctxt.item(0);
                        
                        
                        if (imgsrctxt.getLength()>0)
                        {
                        
                              Element imgNode=(Element)imgsrctxt.item(0);
                           
                                   String imgsrcname1 = imgNode.getAttribute("src");
                             
                                   out.println(imgsrcname1 + "<br>");
                                   
                                   retrnvalues = retrnvalues + "&&" + imgsrcname1;
                        }
                        
                        NodeList audiosrctxt = firstPersonElement.getElementsByTagName("audio");
                                                                        
                        Element audiosrcname = (Element)audiosrctxt.item(0);
                                                                        
                        if (audiosrctxt.getLength()>0)
                        {
                              Element audioNode=(Element)audiosrctxt.item(0);
                                                                           
                              String audiosrcname1 = audioNode.getAttribute("src");
                                                                             
                              out.println(audiosrcname1 + "<br>");
                                                                                   
                              retrnvalues = retrnvalues + "&&" + audiosrcname1;
                        }
                        

                        
                  }
      
            }
      }
      catch(SAXParseException err)
      {
            System.out.println ("** Parsing error" + ", line " + err.getLineNumber () + ", uri " + err.getSystemId ());
            System.out.println(" " + err.getMessage ());
      }
      catch(SAXException e)
      {
            Exception x = e.getException ();
            ((x == null) ? e : x).printStackTrace ();
      }
      catch(Throwable t)
      {
            t.printStackTrace ();
      }
      
      //out.println("<br>");
      out.println(retrnvalues);

%>
</body>
</html>
Do you mean that the nodes aren't displaying in the correct order?
Yes, the nodes are not displaying in the correct order.
javadoc quote:
getElementsByTagName(java.lang.String tagname)
          Returns a NodeList of all the Elements with a given tag name in the order in which they would be encountered in a preorder traversal of the Document tree.

The par nodes are being processed in order.  The output I got was:
A.txt<br>
C.txt<br>
B.jpg<br>
D.gif<br>
E.txt<br>
F.gif<br>
&&2000&&0&&A.txt&&2000&&0&&C.txt&&B.jpg&&2000&&0&&D.gif&&2000&&0&&E.txt&&2000&&0&&F.gif


If you want to process the child nodes of <par> in the same order as the document, then you will have to do it in a loop of firstPersonElement.getChildNodes();

Maybe something like this:
      NodeList personNodes = firstPersonElement.getChildNodes();
      for (int i = 0; i < personNodes.getLength(); i++) {
            Node eachChild = personNodes.item(i);
            // skip the whitespace text nodes between the elements
            if (eachChild.getNodeType() != Node.ELEMENT_NODE)
                  continue;
            Element child = (Element) eachChild;
            String nodeName = child.getNodeName();
            if (nodeName.equals("text")) {
                  String txtsrcname1 = child.getAttribute("src");
                  System.out.println(txtsrcname1 + "<br>");   // use out.println inside the JSP
                  retrnvalues = retrnvalues + "&&" + txtsrcname1;
            } else if (nodeName.equals("img")) {
                  String imgsrcname1 = child.getAttribute("src");
                  System.out.println(imgsrcname1 + "<br>");
                  retrnvalues = retrnvalues + "&&" + imgsrcname1;
            } else if (nodeName.equals("audio")) {
                  String audiosrcname1 = child.getAttribute("src");
                  System.out.println(audiosrcname1 + "<br>");
                  retrnvalues = retrnvalues + "&&" + audiosrcname1;
            } else if (child.getFirstChild() != null) {
                  // std_time, repeat_factor: take the text inside the element
                  retrnvalues += "&&" + child.getFirstChild().getNodeValue();
            }
      }
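
For anyone who wants to try this outside a JSP, here is a self-contained sketch of the same idea. It builds the "&&"-separated string by visiting the children of each par in document order. The class name ParOrderDemo is made up, the XML is passed as a string for convenience, and getTextContent() (DOM Level 3) is used in place of getFirstChild().getNodeValue().

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class: collects the values of each <par> in document order.
final class ParOrderDemo {

    static String collect(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        StringBuilder values = new StringBuilder();

        NodeList pars = doc.getElementsByTagName("par");
        for (int p = 0; p < pars.getLength(); p++) {
            Element par = (Element) pars.item(p);
            // Walk the direct children so that img/text/audio keep their order in the file.
            for (Node n = par.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() != Node.ELEMENT_NODE) {
                    continue;  // whitespace between elements
                }
                Element child = (Element) n;
                String name = child.getTagName();
                if (name.equals("text") || name.equals("img") || name.equals("audio")) {
                    values.append("&&").append(child.getAttribute("src"));
                } else {
                    // std_time, repeat_factor and similar: use the element's text
                    values.append("&&").append(child.getTextContent().trim());
                }
            }
        }
        return values.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<body><par dur=\"9950ms\"><std_time>2000</std_time>"
                   + "<repeat_factor>0</repeat_factor>"
                   + "<img src=\"B.jpg\" region=\"Image\"/>"
                   + "<text src=\"C.txt\" region=\"text\"/></par></body>";
        System.out.println(collect(xml));
    }
}
```

With the second par of the file above, B.jpg now comes before C.txt, matching the order in the XML.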

ASKER CERTIFIED SOLUTION
MogalManic
Thank you for the help. I have made use of a class file to read the XML file.