fsyed

Need to indent an XML file I have which is fully left justified to reflect tag hierarchy

Dear fellow Java/XML developers:

I have a very large XML file that is fully left-justified, and I want to write a small Java program that rewrites it with indentation that reflects the tag hierarchy.

At present my XML file looks like this:

<A a="" b="" c="">
<B>
<C a="" b="" c="">blah blah blah</C>
</B>
</A>

I want it to look like this:

<A a="" b="" c="">
     <B>
          <C a="" b="" c="">blah blah blah</C>
     </B>
</A>

The above is only an example. The actual XML file I have is a bit more complicated, but the concept is the same. I would like this program to simply read in the XML file and produce a properly indented one.

Any help would be greatly appreciated.

Thanks in advance.
SOLUTION
Mick Barry
fsyed

ASKER

Thanks so much for your quick reply.  In the code listed below, have I placed the lines you are recommending in the right place?

Just wondering.
TransformerFactory factory = TransformerFactory.newInstance();
Transformer transformer = factory.newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
Result result = new StreamResult(new File(xmlOutputFilePath));
Source source = new DOMSource(document);
transformer.transform(source, result);


yep, that looks good
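
For readers who want to try the snippet in isolation, here is a minimal, self-contained sketch of the same Transformer settings. The class name IndentSketch and the helper indent(...) are made up for illustration and are not from the thread. It works on strings rather than files so it can be run without any input file, and the exact output layout may vary between JDK versions and Transformer implementations.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class IndentSketch {

    // Hypothetical helper: parses an XML string and returns it re-serialized
    // with indentation requested from the Transformer.
    public static String indent(String xml, int amount) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        // Implementation-specific property, honored by Xalan-based transformers
        // such as the one bundled with the JDK.
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount",
                String.valueOf(amount));

        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Input has no whitespace between tags, so indentation is entirely
        // up to the serializer.
        System.out.println(indent("<A><B><C>blah blah blah</C></B></A>", 4));
    }
}
```

Note that this input has no line breaks between the tags. Input that already contains line breaks between tags can behave differently, because those line breaks are parsed as text nodes.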


ASKER

Is it just me, or do text editors in general NOT display indentation? I ask because I wrote the program as you described, but when I view the new XML file in Notepad, WordPad, TextPad, etc., I do not see any indentation of my tags. My Java code is listed below.

Thanks again for your time and patience.
package sample;
 
import java.io.File;
import java.io.IOException;
 
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
 
import org.w3c.dom.Document;
import org.xml.sax.SAXException;
 
public class Indent {

	// Reads the left-justified input file and writes an indented copy next to it.
	public static void main(String[] args) throws ParserConfigurationException, TransformerConfigurationException, TransformerException, SAXException, IOException {
		File xmlDoc = new File("C:\\xmltest\\SahihAlBukhariComplete.xml");
		File xmlNewDoc = new File("C:\\xmltest\\SahihAlBukhariCompleteNew.xml");
		Indent test = new Indent();
		Document doc = test.createDOMDoc(xmlDoc);
		test.xmlDocToFile(doc, xmlNewDoc);
	}
	
	public Document createDOMDoc(File file) throws ParserConfigurationException, SAXException, IOException{
		
		DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
		DocumentBuilder builder = factory.newDocumentBuilder();
		Document document = builder.parse(file);
		return document;
	}
	
	public void xmlDocToFile(Document document, File xmlOutputFilePath) throws TransformerConfigurationException, TransformerException{
		
		TransformerFactory factory = TransformerFactory.newInstance();
		Transformer transformer = factory.newTransformer();
		transformer.setOutputProperty(OutputKeys.INDENT, "yes");
		transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "5");
		Result result = new StreamResult(xmlOutputFilePath);
		Source source = new DOMSource(document);
		transformer.transform(source, result);
	}
 
}


What version of Java are you using? There was a bug in Java 5.
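
The reply does not say which bug is meant. Assuming it is the widely reported Java 5 problem where the indent-amount output property was ignored, the commonly cited workaround was to set an "indent-number" attribute on the TransformerFactory and to write through a Writer rather than directly to a file or stream. The sketch below shows that pattern. The class name Java5IndentSketch and the helper indent(...) are hypothetical, and "indent-number" is an implementation-specific attribute that other Transformer implementations may reject, hence the try/catch.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class Java5IndentSketch {

    // Hypothetical helper illustrating the factory-level workaround.
    public static String indent(String xml, int amount) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        TransformerFactory factory = TransformerFactory.newInstance();
        try {
            // Implementation-specific attribute understood by the JDK's
            // bundled TransformerFactory; not part of the JAXP standard.
            factory.setAttribute("indent-number", Integer.valueOf(amount));
        } catch (IllegalArgumentException e) {
            // A different implementation is on the classpath and does not
            // know this attribute; continue with INDENT alone.
        }

        Transformer transformer = factory.newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");

        // Second half of the workaround: give the transformer a Writer
        // instead of a File or OutputStream.
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(indent("<A><B><C>blah blah blah</C></B></A>", 4));
    }
}
```

When writing to a file, the same idea would apply with a FileWriter or an OutputStreamWriter wrapped around the output stream in place of the StringWriter.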


ASKER

I am using JDK 6 in Eclipse. I also notice in my output file that extra spaces are inserted within the content of some of the elements, i.e. some words have two or more spaces between them. Is this expected?

Thanks again for your help.  If you are unable to figure out this last hurdle, don't worry about it.  You've done more than enough to deserve full points.  :-)
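
One frequently reported reason for missing indentation on JDK 6, independent of the Java 5 bug, is that a left-justified file already contains whitespace-only text nodes (the line breaks between tags), and the bundled serializer does not indent an element that directly follows text. Whether that was the cause here is an assumption, since the accepted answer is not shown. The sketch below removes those whitespace-only text nodes with XPath before transforming. The class name StripWhitespaceSketch and its methods are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class StripWhitespaceSketch {

    // Removes every text node that consists only of whitespace.
    // Text nodes that contain any non-whitespace character are left untouched,
    // so spacing inside real content is preserved.
    public static void stripWhitespaceOnlyText(Document document) throws Exception {
        NodeList blanks = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//text()[normalize-space(.) = '']", document,
                        XPathConstants.NODESET);
        for (int i = 0; i < blanks.getLength(); i++) {
            Node blank = blanks.item(i);
            blank.getParentNode().removeChild(blank);
        }
    }

    // Hypothetical helper: parse, strip whitespace-only nodes, then indent.
    public static String indent(String xml, int amount) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        stripWhitespaceOnlyText(document);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount",
                String.valueOf(amount));

        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Left-justified input, with line breaks between the tags.
        String leftJustified = "<A>\n<B>\n<C>blah  blah</C>\n</B>\n</A>";
        System.out.println(indent(leftJustified, 5));
    }
}
```

A caveat on the design: this also removes whitespace-only text between inline elements in mixed content (for example a single space between two adjacent child elements), so it suits data-style XML better than document-style markup.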
ASKER CERTIFIED SOLUTION

ASKER

Thanks very much, CEHJ, for your solution. It works! One follow-up question: some of the elements in my input XML file contain large amounts of text, and I noticed that in my output XML file there are additional spaces between the words. Is there any reason for this? I would like to preserve the text the way it is, if possible.

Just wondering.
Could you please attach the input and output files?

ASKER

My mistake, CEHJ. I didn't realize this, but the spaces are also in the original XML file, so your code works exactly as I need it to. Thanks so much for your solution, and thanks very much to Savant for your efforts also.
:-)

No problem

ASKER

Thanks CEHJ for your time and patience in providing me an excellent solution, you truly are a genius :-)  Thank you Savant for your efforts, I do appreciate it!