Parsing an XML file using Java

Sreejith22
Hello experts,

http://gdata.youtube.com/schemas/2007/categories.cat

I need to parse the .cat file at the link above and fetch the following data from it, without saving the file anywhere.

1) label
2) browsable regions

As this does not look like a standard XML format to me, I do not know how to parse it and get the data listed above.

I would be obliged if any of the experts here could provide a solution.

Any help in this regard will be much appreciated, with points.

Looking forward,
Regards,
Sreejith
Mick Barry, Java Developer
Top Expert 2010

Commented:

Author

Commented:
Hi objects,
Thanks for the link. Unfortunately, it covers only a small part of my requirement.
<<As this is not a standard xml format, I do not know how to parse and get the above given data.>>
I am not able to open the link here. Can you please post or attach its content?

Author

Commented:
Hi,
It comes as a file by name categories.cat when you click that link.
<?xml version='1.0' encoding='UTF-8'?><app:categories xmlns:app='http://www.w3.org/2007/app' xmlns:atom='http://www.w3.org/2005/Atom' xmlns:yt='http://gdata.youtube.com/schemas/2007' fixed='yes' scheme='http://gdata.youtube.com/schemas/2007/categories.cat'><atom:category term='Film' label='Film &amp; Animation' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Autos' label='Autos &amp; Vehicles' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Music' label='Music' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Animals' label='Pets &amp; Animals' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Sports' label='Sports' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Travel' label='Travel &amp; Events' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Shortmov' label='Short Movies' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Videoblog' label='Videoblogging' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Games' label='Gaming' 
xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Comedy' label='Comedy' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='People' label='People &amp; Blogs' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='News' label='News &amp; Politics' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Entertainment' label='Entertainment' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Education' label='Education' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Howto' label='Howto &amp; Style' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Nonprofit' label='Nonprofits &amp; Activism' xml:lang='en-US'><yt:assignable/><yt:browsable regions='US'/></atom:category><atom:category term='Tech' label='Science &amp; Technology' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP 
KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Movies_Anime_animation' label='Movies - Anime/Animation' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies' label='Movies' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Comedy' label='Movies - Comedy' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Documentary' label='Movies - Documentary' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Action_adventure' label='Movies - Action/Adventure' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Classics' label='Movies - Classics' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Foreign' label='Movies - Foreign' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Horror' label='Movies - Horror' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Drama' label='Movies - Drama' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Family' label='Movies - Family' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Shorts' label='Movies - Shorts' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Shows' label='Shows' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Sci_fi_fantasy' label='Movies - Sci-Fi/Fantasy' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Thriller' label='Movies - Thriller' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Trailers' label='Trailers' xml:lang='en-US'><yt:deprecated/></atom:category></app:categories>


Author

Commented:
Hmm, well... I have not parsed a file in this XML format before, which is why I asked.
<<in this xml format>>
What difference do you see between this XML file and the ones you have parsed before?

Author

Commented:
First, parsing without saving the file. Second, it comes with prefixes such as yt:, which I have not seen in any XML so far, in my limited experience.

Anyway, I am glad to see that you respond quickly, unlike in http://www.experts-exchange.com/Programming/Languages/Java/Q_26545830.html , where you asked for confirmation and then kept mum.
:) I kept mum because I had suggested a link that does the auto-complete and wanted confirmation of whether you were working with the Android SDK. If you were not, my first reply would have been wrong. In any case, I have not worked on that functionality myself, nor am I going to have this requirement in future, so I only did some searching and suggested that link.
But I am not disagreeing with you if you say that I could not help you; it is fair enough on your part to say that.

Coming back to the current question, you can parse XML from a string source using an InputSource wrapped around a StringReader;
see this example:
http://www.rgagnon.com/javadetails/java-0573.html

Also, 'yt' is not a problem; it is just a prefix used for namespace resolution.
Check the 'xmlns:yt' attribute in the 'app:categories' tag.
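
As a minimal sketch of both points (parsing from a string held in memory, and what a prefix such as yt stands for), the following assumes a shortened, hypothetical stand-in for the categories.cat content. The class and method names are illustrative only. With namespace awareness switched on, each element reports its prefix, its local name, and the URI that the prefix is bound to by the xmlns attributes on the root tag.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class PrefixDemo {

    // Shortened, hypothetical stand-in for the categories.cat content.
    static final String XML =
        "<app:categories xmlns:app='http://www.w3.org/2007/app'"
        + " xmlns:atom='http://www.w3.org/2005/Atom'"
        + " xmlns:yt='http://gdata.youtube.com/schemas/2007'>"
        + "<atom:category term='Music' label='Music'>"
        + "<yt:browsable regions='US GB'/>"
        + "</atom:category></app:categories>";

    // Parses XML held in memory; nothing is written to disk.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // namespace awareness is off by default
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Describes an element as "prefix | local name | namespace URI".
    static String describe(Element e) {
        return e.getPrefix() + " | " + e.getLocalName() + " | " + e.getNamespaceURI();
    }

    public static void main(String[] args) throws Exception {
        Element root = parse(XML).getDocumentElement();
        // The sample XML has no whitespace between tags, so the first child is an element.
        Element category = (Element) root.getFirstChild();
        Element browsable = (Element) category.getFirstChild();
        System.out.println(describe(root));
        System.out.println(describe(category));
        System.out.println(describe(browsable));
    }
}
```

The prefix itself carries no meaning; two documents could bind different prefixes to the same URI and still describe the same elements.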



Author

Commented:
Hi,
Why does this not work? It throws a NullPointerException.
import javax.xml.parsers.*;
import org.xml.sax.InputSource;
import org.w3c.dom.*;
import java.io.*;

public class ParseXMLString {

  public static void main(String arg[]) {
     String xmlRecords =
      "<?xml version='1.0' encoding='UTF-8'?><app:categories xmlns:app='http://www.w3.org/2007/app' xmlns:atom='http://www.w3.org/2005/Atom' xmlns:yt='http://gdata.youtube.com/schemas/2007' fixed='yes' scheme='http://gdata.youtube.com/schemas/2007/categories.cat'><atom:category term='Film' label='Film &amp; Animation' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Autos' label='Autos &amp; Vehicles' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Music' label='Music' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Animals' label='Pets &amp; Animals' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Sports' label='Sports' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Travel' label='Travel &amp; Events' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Shortmov' label='Short Movies' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Videoblog' label='Videoblogging' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Games' label='Gaming' 
xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Comedy' label='Comedy' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='People' label='People &amp; Blogs' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='News' label='News &amp; Politics' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Entertainment' label='Entertainment' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Education' label='Education' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Howto' label='Howto &amp; Style' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Nonprofit' label='Nonprofits &amp; Activism' xml:lang='en-US'><yt:assignable/><yt:browsable regions='US'/></atom:category><atom:category term='Tech' label='Science &amp; Technology' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP 
KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Movies_Anime_animation' label='Movies - Anime/Animation' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies' label='Movies' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Comedy' label='Movies - Comedy' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Documentary' label='Movies - Documentary' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Action_adventure' label='Movies - Action/Adventure' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Classics' label='Movies - Classics' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Foreign' label='Movies - Foreign' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Horror' label='Movies - Horror' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Drama' label='Movies - Drama' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Family' label='Movies - Family' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Shorts' label='Movies - Shorts' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Shows' label='Shows' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Sci_fi_fantasy' label='Movies - Sci-Fi/Fantasy' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Thriller' label='Movies - Thriller' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Trailers' label='Trailers' xml:lang='en-US'><yt:deprecated/></atom:category></app:categories>";

    try {
        DocumentBuilderFactory dbf =
            DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        InputSource is = new InputSource();
        is.setCharacterStream(new StringReader(xmlRecords));

        Document doc = db.parse(is);
        NodeList nodes = doc.getElementsByTagName("atom:category");

        // iterate over the categories
        for (int i = 0; i < nodes.getLength(); i++) {
           Element element = (Element) nodes.item(i);

           NodeList name = element.getElementsByTagName("term");
           Element line = (Element) name.item(0);
           System.out.println("Term:"+  getCharacterDataFromElement(line));

           /*NodeList title = element.getElementsByTagName("title");
           line = (Element) title.item(0);
           System.out.println("Title:"+  getCharacterDataFromElement(line));*/
        }
    }
    catch (Exception e) {
        e.printStackTrace();
    }
    /*
    output :
        Name: John
        Title: Manager
        Name: Sara
        Title: Clerk
    */    
    
  }

  public static String getCharacterDataFromElement(Element e) {
    Node child = e.getFirstChild();
    if (child instanceof CharacterData) {
       CharacterData cd = (CharacterData) child;
       return cd.getData();
    }
    return "?";
  }
}


mccarl, IT Business Systems Analyst / Software Developer
Top Expert 2015

Commented:
Try using XPath, which makes this a lot easier. (Note: this is with Java 1.6; there may be differences if you are using an earlier version.)

import java.net.URL;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class XMLParser
{

    public static void main(String[] args) throws Exception
    {
        URL url = new URL("http://gdata.youtube.com/schemas/2007/categories.cat");

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document document = builder.parse(url.openStream());
        // To parse from an in-memory string instead, pass the categories.cat text posted earlier
        // in this thread (here a hypothetical variable named xml); this also needs imports for
        // org.xml.sax.InputSource and java.io.StringReader:
        //Document document = builder.parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList result = (NodeList) xpath.evaluate("/categories/category", document, XPathConstants.NODESET);

        for(int i = 0; i < result.getLength(); i++)
        {
            System.out.println(xpath.evaluate("@label", result.item(i)));
            System.out.println(xpath.evaluate("browsable/@regions", result.item(i)));
        }
    }
}
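
The XPath above relies on the factory's default, non-namespace-aware parsing. If the factory is made namespace-aware, the prefixes in the expression have to be resolved explicitly. The following is a sketch of one way to do that with a NamespaceContext, assuming Java 8 or later and a shortened, hypothetical sample of the feed; the class name, method name, and sample data are illustrative and not part of the original answer.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NamespaceAwareXPath {

    // Shortened, hypothetical sample: one browsable and one deprecated category.
    static final String XML =
        "<app:categories xmlns:app='http://www.w3.org/2007/app'"
        + " xmlns:atom='http://www.w3.org/2005/Atom'"
        + " xmlns:yt='http://gdata.youtube.com/schemas/2007'>"
        + "<atom:category term='Film' label='Film &amp; Animation'>"
        + "<yt:assignable/><yt:browsable regions='US GB'/></atom:category>"
        + "<atom:category term='Shortmov' label='Short Movies'>"
        + "<yt:deprecated/></atom:category></app:categories>";

    // Returns "label -> regions" for every category that has a yt:browsable child.
    static List<String> browsableCategories(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Prefixes used in the XPath expressions below, mapped to their URIs.
        final Map<String, String> prefixes = new HashMap<String, String>();
        prefixes.put("app", "http://www.w3.org/2007/app");
        prefixes.put("atom", "http://www.w3.org/2005/Atom");
        prefixes.put("yt", "http://gdata.youtube.com/schemas/2007");

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                if (prefix == null) {
                    throw new IllegalArgumentException("prefix must not be null");
                }
                String uri = prefixes.get(prefix);
                return uri != null ? uri : XMLConstants.NULL_NS_URI;
            }

            // Reverse lookups are not needed by this sketch.
            @Override
            public String getPrefix(String namespaceURI) {
                return null;
            }

            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                return Collections.emptyIterator();
            }
        });

        NodeList nodes = (NodeList) xpath.evaluate(
                "/app:categories/atom:category[yt:browsable]", doc, XPathConstants.NODESET);

        List<String> result = new ArrayList<String>();
        for (int i = 0; i < nodes.getLength(); i++) {
            String label = xpath.evaluate("@label", nodes.item(i));
            String regions = xpath.evaluate("yt:browsable/@regions", nodes.item(i));
            result.add(label + " -> " + regions);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        for (String line : browsableCategories(XML)) {
            System.out.println(line);
        }
    }
}
```

The predicate [yt:browsable] skips the deprecated categories, which have no regions attribute to report.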


Change line 20 (the getElementsByTagName call) to
        NodeList nodes = doc.getElementsByTagName("app:categories/atom:category");

Author

Commented:
Hi mccarl,
Thanks a lot. Your code was perfect: it gave me what I wanted and helped me understand something new.

Hi Gurvinder,
I tried your suggestion to change line 20 to
        NodeList nodes = doc.getElementsByTagName("app:categories/atom:category");

But, for reasons I do not understand, it does not even get inside my for loop -> for (int i = 0; i < nodes.getLength(); i++)

Please advise. I would love to close this ASAP with points awarded.

Looking forward,
Sree
mccarl, IT Business Systems Analyst / Software Developer
Top Expert 2015

Commented:
@Sree

In response to your question about the updated line 20 from the code that you posted, there are probably two options that may work; see how you go.

   NodeList nodes = doc.getElementsByTagName("category");

i.e. find me any tag that is called "category" in doc (regardless of its namespace)

or

   NodeList nodes = doc.getElementsByTagNameNS("http://www.w3.org/2005/Atom", "category");

i.e. find me any tag that is called "category" in the "http://www.w3.org/2005/Atom" namespace.


I don't think there is any way to use the namespace alias (e.g. atom or app) in these methods. And what Gurvinder was suggesting won't work: you can't use XPath expressions in the getElementsByTagName methods.
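
A point worth checking with the second option: DocumentBuilderFactory is not namespace-aware by default, and a namespace-based lookup only has namespace information to match against when the document was parsed with setNamespaceAware(true). The sketch below, which assumes a shortened, hypothetical sample of the feed and illustrative class and method names, lets the two lookups be compared under both parser settings.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class TagNameNsDemo {

    static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    // Shortened, hypothetical sample with two atom:category elements.
    static final String XML =
        "<app:categories xmlns:app='http://www.w3.org/2007/app'"
        + " xmlns:atom='http://www.w3.org/2005/Atom'"
        + " xmlns:yt='http://gdata.youtube.com/schemas/2007'>"
        + "<atom:category term='Music' label='Music'><yt:assignable/></atom:category>"
        + "<atom:category term='Sports' label='Sports'><yt:assignable/></atom:category>"
        + "</app:categories>";

    // Parses the string with namespace awareness switched on or off.
    static Document parse(String xml, boolean namespaceAware) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(namespaceAware);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Lookup by namespace URI plus local name.
    static int countByNamespace(Document doc) {
        return doc.getElementsByTagNameNS(ATOM_NS, "category").getLength();
    }

    // Lookup by the qualified name exactly as written in the file.
    static int countByQualifiedName(Document doc) {
        return doc.getElementsByTagName("atom:category").getLength();
    }

    public static void main(String[] args) throws Exception {
        for (boolean aware : new boolean[] {false, true}) {
            Document doc = parse(XML, aware);
            System.out.println("namespaceAware=" + aware
                    + " byNamespace=" + countByNamespace(doc)
                    + " byQualifiedName=" + countByQualifiedName(doc));
        }
    }
}
```

With namespace awareness on, the lookup by URI and local name finds both categories; the lookup by qualified name matches the literal tag text "atom:category" in either mode.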

Author

Commented:
Hi mccarl,

Still, it does not work. It does not even get inside the for loop.

Regards,
Sree
// NodeList nodes = doc.getElementsByTagName("category");
        
        NodeList nodes = doc.getElementsByTagNameNS("http://www.w3.org/2005/Atom", "category");


        // iterate over the categories
        for (int i = 0; i < nodes.getLength(); i++) {
            System.out.println("in here");
           Element element = (Element) nodes.item(i);

           NodeList name = element.getElementsByTagName("term");
           Element line = (Element) name.item(0);
           System.out.println("Term:"+  getCharacterDataFromElement(line));
        }


mccarl, IT Business Systems Analyst / Software Developer
Top Expert 2015
Commented:
Sorry, I was a bit misled by looking too much at what Gurvinder wrote and not testing it myself. You had the correct line 20 (i.e. the correct getElementsByTagName call); the problem was with retrieving the attribute value. You were trying to get it by also using getElementsByTagName, but "term" is an attribute, not a tag/element. That lookup finds nothing, so name.item(0) returns null, and passing that null to getCharacterDataFromElement is what causes the NullPointerException.

Try the following...

import javax.xml.parsers.*;
import org.xml.sax.InputSource;
import org.w3c.dom.*;
import java.io.*;

public class ParseXMLString {

  public static void main(String arg[]) {
     String xmlRecords =
      "<?xml version='1.0' encoding='UTF-8'?><app:categories xmlns:app='http://www.w3.org/2007/app' xmlns:atom='http://www.w3.org/2005/Atom' xmlns:yt='http://gdata.youtube.com/schemas/2007' fixed='yes' scheme='http://gdata.youtube.com/schemas/2007/categories.cat'><atom:category term='Film' label='Film &amp; Animation' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Autos' label='Autos &amp; Vehicles' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Music' label='Music' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Animals' label='Pets &amp; Animals' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Sports' label='Sports' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Travel' label='Travel &amp; Events' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Shortmov' label='Short Movies' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Videoblog' label='Videoblogging' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Games' label='Gaming' 
xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Comedy' label='Comedy' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='People' label='People &amp; Blogs' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='News' label='News &amp; Politics' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Entertainment' label='Entertainment' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Education' label='Education' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Howto' label='Howto &amp; Style' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Nonprofit' label='Nonprofits &amp; Activism' xml:lang='en-US'><yt:assignable/><yt:browsable regions='US'/></atom:category><atom:category term='Tech' label='Science &amp; Technology' xml:lang='en-US'><yt:assignable/><yt:browsable regions='AM AR AU BG BR CA CZ DE DK ES FI FR GB GR HK HR HU IE IL IN IT JP 
KR LT MX NL NO NZ PL PT RO RU SE SK SL SR TW US VI ZA'/></atom:category><atom:category term='Movies_Anime_animation' label='Movies - Anime/Animation' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies' label='Movies' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Comedy' label='Movies - Comedy' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Documentary' label='Movies - Documentary' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Action_adventure' label='Movies - Action/Adventure' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Classics' label='Movies - Classics' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Foreign' label='Movies - Foreign' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Horror' label='Movies - Horror' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Drama' label='Movies - Drama' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Family' label='Movies - Family' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Shorts' label='Movies - Shorts' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Shows' label='Shows' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Sci_fi_fantasy' label='Movies - Sci-Fi/Fantasy' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Movies_Thriller' label='Movies - Thriller' xml:lang='en-US'><yt:deprecated/></atom:category><atom:category term='Trailers' label='Trailers' xml:lang='en-US'><yt:deprecated/></atom:category></app:categories>";

    try {
        DocumentBuilderFactory dbf =
            DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        InputSource is = new InputSource();
        is.setCharacterStream(new StringReader(xmlRecords));

        Document doc = db.parse(is);
        NodeList nodes = doc.getElementsByTagName("atom:category");

        // iterate over the categories
        for (int i = 0; i < nodes.getLength(); i++) {
           Element element = (Element) nodes.item(i);

           //NodeList name = element.getElementsByTagName("term");
           //Element line = (Element) name.item(0);
           System.out.println("Term:"+  element.getAttribute("term"));

           /*NodeList title = element.getElementsByTagName("title");
           line = (Element) title.item(0);
           System.out.println("Title:"+  getCharacterDataFromElement(line));*/
        }
    }
    catch (Exception e) {
        e.printStackTrace();
    }
    /*
    output (first lines):
        Term:Film
        Term:Autos
        Term:Music
    */

  }

  public static String getCharacterDataFromElement(Element e) {
    Node child = e.getFirstChild();
    if (child instanceof CharacterData) {
       CharacterData cd = (CharacterData) child;
       return cd.getData();
    }
    return "?";
  }
}
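
The fix above prints only the term attribute. The original question asked for the label and the browsable regions, so here is a sketch extending the same DOM approach to those two values. It assumes a shortened, hypothetical sample of the feed, and the class and method names are illustrative. The label is an attribute of atom:category, while regions is an attribute of the yt:browsable child element, which deprecated categories do not have.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LabelAndRegions {

    // Shortened, hypothetical sample: one browsable and one deprecated category.
    static final String XML =
        "<app:categories xmlns:app='http://www.w3.org/2007/app'"
        + " xmlns:atom='http://www.w3.org/2005/Atom'"
        + " xmlns:yt='http://gdata.youtube.com/schemas/2007'>"
        + "<atom:category term='Film' label='Film &amp; Animation' xml:lang='en-US'>"
        + "<yt:assignable/><yt:browsable regions='US GB'/></atom:category>"
        + "<atom:category term='Shortmov' label='Short Movies' xml:lang='en-US'>"
        + "<yt:deprecated/></atom:category></app:categories>";

    // Maps each category label to its regions string ("" when there is no yt:browsable child).
    static Map<String, String> labelsToRegions(String xml) throws Exception {
        // Default (non-namespace-aware) factory, as in the thread's code,
        // so elements are looked up by their qualified names.
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader(xml)));

        Map<String, String> result = new LinkedHashMap<String, String>();
        NodeList categories = doc.getElementsByTagName("atom:category");
        for (int i = 0; i < categories.getLength(); i++) {
            Element category = (Element) categories.item(i);

            // yt:browsable is a child element; regions is an attribute on it.
            NodeList browsable = category.getElementsByTagName("yt:browsable");
            String regions = browsable.getLength() > 0
                    ? ((Element) browsable.item(0)).getAttribute("regions")
                    : "";

            result.put(category.getAttribute("label"), regions);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        for (Map.Entry<String, String> entry : labelsToRegions(XML).entrySet()) {
            System.out.println("Label: " + entry.getKey() + " | Regions: " + entry.getValue());
        }
    }
}
```

Checking getLength() before calling item(0) avoids the same kind of null result that caused the original NullPointerException.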


Author

Commented:
Simply excellent
