  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1102

Need to indent an XML file I have which is fully left justified to reflect tag hierarchy

Dear fellow Java/XML developers:

I have a very large XML file that is fully left-justified, and I want to write a small Java program that rewrites it as a properly indented XML file reflecting the tag hierarchy.

At present my XML file looks like this:

<A a="" b="" c="">
<B>
<C a="" b="" c="">blah blah blah</C>
</B>
</A>

I want it to look like this:

<A a="" b="" c="">
     <B>
          <C a="" b="" c="">blah blah blah</C>
     </B>
</A>

The above example is exactly that, an example.  The actual xml file I have is a bit more complicated, however the concept is still the same.  I would like this program to simply read in the xml file, and produce a properly structured one.  

Any help would be greatly appreciated.

Thanks in advance.
Asked by: fsyed
 
objects Commented:
first load the xml into a dom

http://helpdesk.objects.com.au/java/how-do-i-create-a-dom-document-from-an-xml-file

then export the dom to a file

http://helpdesk.objects.com.au/java/how-to-save-a-xml-dom-document-to-a-file

set the following properties on the transformer to specify indentation

transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
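
For reference, the two steps above (parse into a DOM, then write it back out through an indenting Transformer) can be sketched in one self-contained class. This is only a sketch: the class name TransformerIndentSketch and the in-memory String input are made up for illustration, and the sample input is deliberately written with no whitespace between tags, since how the indenter treats existing whitespace can differ between JDK versions.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not taken from the thread.
class TransformerIndentSketch {

    // Parses the given XML text and returns it re-serialized with indentation.
    static String indent(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        // Leave out the XML declaration so the output starts at the root element.
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        // Implementation-specific property; implementations that do not know it may ignore it.
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");

        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(indent("<A a=\"\"><B><C>blah blah blah</C></B></A>"));
    }
}
```

To write to a file instead, pass new StreamResult(new File(...)) as in the linked example.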

fsyed (Author) Commented:
Thanks so much for your quick reply.  In the code listed below, have I placed the lines you are recommending in the right place?

Just wondering.
TransformerFactory factory = TransformerFactory.newInstance();
Transformer transformer = factory.newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
Result result = new StreamResult(new File(xmlOutputFilePath));
Source source = new DOMSource(document);
transformer.transform(source, result);

objects Commented:
yep, that looks good

fsyed (Author) Commented:
Is it just me, or do text editors in general NOT display indentation?  I ask this because I wrote the program as you described to perform the indentation, but unfortunately when I try to view my new XML file in notepad/wordpad/textpad, etc., I do not see the indentation of my tags.  I have listed my Java code below.

Thanks again for your time and patience.
package sample;
 
import java.io.File;
import java.io.IOException;
 
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
 
import org.w3c.dom.Document;
import org.xml.sax.SAXException;
 
public class Indent {
 
	
	
	
	public static void main(String[] args) throws ParserConfigurationException, TransformerConfigurationException, TransformerException, SAXException, IOException {
		// TODO Auto-generated method stub
		File xmlDoc = new File("C:\\xmltest\\SahihAlBukhariComplete.xml");
		File xmlNewDoc = new File("C:\\xmltest\\SahihAlBukhariCompleteNew.xml");
		Indent test = new Indent();
		Document doc = test.createDOMDoc(xmlDoc);
		test.xmlDocToFile(doc, xmlNewDoc);
	}
	
	public Document createDOMDoc(File file) throws ParserConfigurationException, SAXException, IOException{
		
		DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
		DocumentBuilder builder = factory.newDocumentBuilder();
		Document document = builder.parse(file);
		return document;
	}
	
	public void xmlDocToFile(Document document, File xmlOutputFilePath) throws TransformerConfigurationException, TransformerException{
		
		TransformerFactory factory = TransformerFactory.newInstance();
		Transformer transformer = factory.newTransformer();
		transformer.setOutputProperty(OutputKeys.INDENT, "yes");
		transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "5");
		Result result = new StreamResult(xmlOutputFilePath);
		Source source = new DOMSource(document);
		transformer.transform(source, result);
	}
 
}

objects Commented:
What version of Java are you using?  There was a bug in Java 5 that affected indentation.
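
The thread does not say which bug is meant. If it is the well-known Java 5 problem where the indent amount was ignored, the workaround usually quoted is to set an "indent-number" attribute on the TransformerFactory and to write through a Writer rather than a raw stream. A rough sketch under that assumption follows; the class name Java5IndentWorkaround is made up, and "indent-number" is an implementation-specific attribute of the JDK's built-in factory that other implementations may reject.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not taken from the thread.
class Java5IndentWorkaround {

    static String indent(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        TransformerFactory factory = TransformerFactory.newInstance();
        try {
            // Assumption: the factory is the JDK's built-in one, which understands this attribute.
            factory.setAttribute("indent-number", Integer.valueOf(4));
        } catch (IllegalArgumentException e) {
            // Other TransformerFactory implementations may not recognise the attribute.
        }

        Transformer transformer = factory.newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        // The result is backed by a Writer, which is part of the usual workaround.
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(indent("<A><B><C>blah blah blah</C></B></A>"));
    }
}
```

For file output, the Writer would typically be an OutputStreamWriter wrapped around a FileOutputStream with an explicit encoding.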

fsyed (Author) Commented:
I am using jdk6 in eclipse.  I also notice in my output file that extra spaces are inserted within the content of some of the elements, i.e. some words will have two or more spaces in between.  Is this expected?

Thanks again for your help.  If you are unable to figure out this last hurdle, don't worry about it.  You've done more than enough to deserve full points.  :-)

CEHJ Commented:
>>but unfortunately when I try to view my new xml file in notepad/wordpad/textpad, etc., I do not see the indentation of my tags

That's nothing to do with your editor - it's because the code won't work. Try the below:
import org.w3c.dom.*;
import org.w3c.dom.ls.*;
 
import java.io.*;
 
import javax.xml.parsers.*;
 
 
class Serializer {
    public boolean serialize(String inFile, String outFile) {
        boolean result = false;
        Document doc = null;
        DOMImplementationLS DOMiLS = null;
        FileOutputStream fos = null;
 
        try {
            try {
                //Create a DocumentBuilderFactory object, a DocumentBuilder object and get the DOM tree
                doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                                            .parse(new File(inFile));
            } catch (javax.xml.parsers.ParserConfigurationException e) {
                throw new RuntimeException(e);
            } catch (org.xml.sax.SAXException e) {
                throw new RuntimeException(e);
            } catch (java.io.IOException e) {
                throw new RuntimeException(e);
            }
 
            //testing the support for DOM Load and Save
            if ((doc.getFeature("Core", "3.0") != null) &&
                    (doc.getFeature("LS", "3.0") != null)) {
                DOMiLS = (DOMImplementationLS) (doc.getImplementation()).getFeature("LS",
                        "3.0");
            } else {
                throw new RuntimeException("DOM Load and Save unsupported");
            }
 
            //get a LSOutput object
            LSOutput lso = DOMiLS.createLSOutput();
 
            //setting the location for storing the result of serialization
            try {
                fos = new FileOutputStream(outFile);
                lso.setByteStream((OutputStream) fos);
            } catch (java.io.FileNotFoundException e) {
                throw new RuntimeException(e);
            }
 
            //get a LSSerializer object
            LSSerializer lss = DOMiLS.createLSSerializer();
            lss.getDomConfig().setParameter("format-pretty-print", Boolean.TRUE);
 
            //do the serialization and collect the result
            result = lss.write(doc, lso);
        } finally {
            try {
                if (fos != null) {
                    fos.close();
                }
            } catch (java.io.IOException e) {
                throw new RuntimeException(e);
            }
        }
        return result;
    }
 
    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("Usage: java Serializer <infile> <outfile>");
            System.exit(1);
        }
        Serializer t = new Serializer();
        boolean result = t.serialize(args[0], args[1]);
        System.exit(result ? 0 : 1);
    }
}
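
The thread never says why the Transformer version produced no indentation. One possible explanation, which is an assumption and not something confirmed here, is that a left-justified file still has a newline between every pair of tags, so the DOM contains whitespace-only text nodes that can get in the way of the Transformer's indenter. If that is the cause, removing those nodes before transforming may be enough. The sketch below uses a made-up class name, WhitespaceStripper, and works on Strings to stay self-contained.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not taken from the thread.
class WhitespaceStripper {

    // Recursively removes text nodes that contain nothing but whitespace.
    static void strip(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember the sibling before any removal
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                node.removeChild(child);
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                strip(child);
            }
            child = next;
        }
    }

    static String indent(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        strip(doc.getDocumentElement());

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");

        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(indent("<A>\n<B>\n<C>blah blah blah</C>\n</B>\n</A>"));
    }
}
```

Note that this also drops whitespace-only text that might be significant, for example inside mixed content or under xml:space="preserve", so it is only suitable when such whitespace does not matter.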

fsyed (Author) Commented:
Thanks very much CEHJ for your solution.  It works!  One follow up question I have is that within some of the elements of my input xml file, I have large amounts of text.  I noticed in my output xml file that there are additional spaces between the words.  Is there any reason for this?  I would like to preserve the text the way it is, if possible.  

Just wondering.

CEHJ Commented:
Could you please attach the input and output files?

fsyed (Author) Commented:
My mistake CEHJ, I didn't realize this, but the spaces are actually in the original xml document file also, so your code works exactly as I need it to.  Thanks so much for your solution, and thanks very much to Savant for your efforts also.
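
A comparison like this can also be done in code rather than by eye. The sketch below collects every non-blank text value from a document so that the lists from the input and the output can be compared; the class name TextContentCheck and the String-based input are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not taken from the thread.
class TextContentCheck {

    // Returns the values of all text and CDATA nodes that are not whitespace-only, in document order.
    static List<String> textValues(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        List<String> values = new ArrayList<String>();
        collect(doc.getDocumentElement(), values);
        return values;
    }

    private static void collect(Node node, List<String> values) {
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            short type = child.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                String value = child.getNodeValue();
                if (!value.trim().isEmpty()) {
                    values.add(value); // kept verbatim, so inner spacing is compared exactly
                }
            } else if (type == Node.ELEMENT_NODE) {
                collect(child, values);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        String before = "<A>\n<B>\n<C>blah  blah</C>\n</B>\n</A>";
        String after = "<A>\n    <B>\n        <C>blah  blah</C>\n    </B>\n</A>";
        System.out.println(textValues(before).equals(textValues(after)));
    }
}
```

If the two lists are equal, the pretty-printer has not changed the spacing inside any text; any extra spaces were already present in the input.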

CEHJ Commented:
:-)

No problem

fsyed (Author) Commented:
Thanks CEHJ for your time and patience in providing me an excellent solution, you truly are a genius :-)  Thank you Savant for your efforts, I do appreciate it!