fsyed
asked on
Need to indent an XML file I have which is fully left justified to reflect tag hierarchy
Dear fellow Java/XML developers:
I have a very large XML file that is fully left-justified, and I want to write a small Java program that converts it into a properly indented XML file that reflects the tag hierarchy.
At present my XML file looks like this:
<A a="" b="" c="">
<B>
<C a="" b="" c="">blah blah blah</C>
</B>
</A>
I want it to look like this:
<A a="" b="" c="">
    <B>
        <C a="" b="" c="">blah blah blah</C>
    </B>
</A>
The above is only an example. The actual XML file I have is a bit more complicated, but the concept is the same. I would like this program to simply read in the XML file and produce a properly indented one.
Any help would be greatly appreciated.
Thanks in advance.
yep, that looks good
ASKER
Is it just me, or do text editors in general NOT display indentation? I ask because I wrote the program as you described to perform the indentation, but when I view my new XML file in Notepad, WordPad, TextPad, etc., I do not see any indentation of my tags. I have listed my Java code below.
Thanks again for your time and patience.
package sample;

import java.io.File;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class Indent {

    public static void main(String[] args) throws ParserConfigurationException, TransformerConfigurationException, TransformerException, SAXException, IOException {
        // TODO Auto-generated method stub
        File xmlDoc = new File("C:\\xmltest\\SahihAlBukhariComplete.xml");
        File xmlNewDoc = new File("C:\\xmltest\\SahihAlBukhariCompleteNew.xml");
        Indent test = new Indent();
        Document doc = test.createDOMDoc(xmlDoc);
        test.xmlDocToFile(doc, xmlNewDoc);
    }

    public Document createDOMDoc(File file) throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document document = builder.parse(file);
        return document;
    }

    public void xmlDocToFile(Document document, File xmlOutputFilePath) throws TransformerConfigurationException, TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer = factory.newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "5");
        Result result = new StreamResult(xmlOutputFilePath);
        Source source = new DOMSource(document);
        transformer.transform(source, result);
    }
}
What version of Java are you using? There was a bug in Java 5.
ASKER
I am using JDK 6 in Eclipse. I also notice in my output file that extra spaces are inserted within the content of some of the elements, i.e. some words have two or more spaces between them. Is this expected?
Thanks again for your help. If you are unable to figure out this last hurdle, don't worry about it. You've done more than enough to deserve full points. :-)
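For the missing indentation described above, here is a minimal sketch of one commonly used workaround. It is not necessarily the accepted solution in this thread, which is not shown here. It assumes the cause is that the parser keeps the line breaks between tags as whitespace-only text nodes, which can stop the JDK's built-in serializer from indenting. The class name XmlIndenter and the method indent are hypothetical names chosen for illustration, and the sketch assumes the JDK's default TransformerFactory is in use, since the indent-amount property is implementation-specific.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XmlIndenter {

    // Hypothetical helper: returns an indented copy of the given XML text.
    static String indent(String xml, int indentAmount) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Assumption: whitespace-only text nodes carry no meaning in this
            // document, so they can be dropped before serializing.
            NodeList blanks = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("//text()[normalize-space(.) = '']", doc, XPathConstants.NODESET);
            for (int i = 0; i < blanks.getLength(); i++) {
                Node blank = blanks.item(i);
                blank.getParentNode().removeChild(blank);
            }

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            // Implementation-specific property understood by the JDK's built-in transformer.
            transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount",
                    String.valueOf(indentAmount));

            // Serialize through a Writer rather than directly to a File.
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not indent XML", e);
        }
    }

    public static void main(String[] args) {
        String flat = "<A a=\"\" b=\"\" c=\"\">\n<B>\n"
                + "<C a=\"\" b=\"\" c=\"\">blah blah blah</C>\n</B>\n</A>";
        System.out.println(indent(flat, 4));
    }
}
```

Note that removing whitespace-only text nodes also removes them inside elements whose entire content is whitespace, so this is only suitable if such content does not matter in your file. For a file on disk, the same steps apply with builder.parse(file) and a FileWriter in place of the string-based reader and writer.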
ASKER
Thanks very much CEHJ for your solution. It works! One follow up question I have is that within some of the elements of my input xml file, I have large amounts of text. I noticed in my output xml file that there are additional spaces between the words. Is there any reason for this? I would like to preserve the text the way it is, if possible.
Just wondering.
Could you please attach the input and output files?
ASKER
My mistake CEHJ, I didn't realize this, but the spaces are actually in the original xml document file also, so your code works exactly as I need it to. Thanks so much for your solution, and thanks very much to Savant for your efforts also.
:-)
No problem
ASKER
Thanks CEHJ for your time and patience in providing me an excellent solution, you truly are a genius :-) Thank you Savant for your efforts, I do appreciate it!