vlad_oz
asked on
Using Java SAX XML writer to construct an XML file with elements containing HTML tags
Hi Experts!
I'm using a Java SAX XML writer to write an XML file from a map object containing some data. I can get it to create the file just fine. However, some data elements contain HTML syntax which I need to preserve. For some reason the SAX writer re-processes the HTML tags when reading in the data, e.g.
<p>One, two! One, two! And through and through<br />
becomes:
<p>One, two! One, two! And through and through<br />&nbsp;
Basically, I don't want the writer to modify the HTML tags at all; I just want it to include them in the resulting XML file as is. Is there any way to stop this from happening?
Thanks!
You can try setting xmlns="http://www.w3.org/1999/xhtml" to indicate tags without any namespace should be treated as XHTML. You can then prefix all other XML tags with another namespace to distinguish if needed.
ASKER
mwvisa1,
Thanks for the quick response! Could you please provide a small example of what you mean.
Please post what you have so far and I will show what I am talking about in the context of your code, so it is clear.
ASKER
Here is the method responsible for creating the XML elements:
private void writeSimpleProp(ContentHandler handler, String name, String value)
        throws SAXException
{
    String nodeName;
    // SAX expects a non-null Attributes object, even when there are no attributes.
    AttributesImpl attrs = new AttributesImpl();
    if (validName(name)) {
        nodeName = name;
    } else {
        // Fall back to a generic element and keep the original name as an attribute.
        nodeName = "prop";
        attrs.addAttribute(nsURI, localName, "name", "CDATA", name);
    }
    handler.startElement(nsURI, localName, nodeName, attrs);
    // The value is passed as character data, so a serializing handler escapes '<' and '&'.
    char[] str = String.valueOf(value).toCharArray();
    handler.characters(str, 0, str.length);
    handler.endElement(nsURI, localName, nodeName);
}
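For anyone reading along: text passed to characters() is treated as character data, so a serializing handler escapes markup characters on the way out, and a parser turns them back into the original string on the way in. Below is a minimal sketch of that round trip using only the JDK's TransformerHandler and DOM parser. The class name EscapingRoundTrip and its helper methods are made up for illustration and are not part of the code posted above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical demo class: writes one element via SAX events, then reads it back.
class EscapingRoundTrip {

    // Serializes <elementName>value</elementName>, passing the value through characters().
    static String write(String elementName, String value) {
        try {
            SAXTransformerFactory factory =
                    (SAXTransformerFactory) TransformerFactory.newInstance();
            TransformerHandler handler = factory.newTransformerHandler();
            // Output properties are set before the result is attached.
            handler.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            handler.setResult(new StreamResult(out));

            handler.startDocument();
            handler.startElement("", elementName, elementName, new AttributesImpl());
            char[] text = value.toCharArray();
            handler.characters(text, 0, text.length);
            handler.endElement("", elementName, elementName);
            handler.endDocument();
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize element", e);
        }
    }

    // Parses the XML and returns the text content of the root element.
    static String readText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String html = "<p>One, two! One, two! And through and through<br /></p>";
        String xml = write("additionalInformation", html);
        System.out.println(xml);            // the markup characters appear escaped here
        System.out.println(readText(xml));  // the parser hands back the original string
    }
}
```

So the escaped form in the file looks different, but any XML reader should return exactly the string that was written.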
ASKER
Also, I'm open to other Java XML writers; it doesn't have to be SAX.
So your HTML is coming through the Value attribute?
Sorry, I meant to say value parameter of the method.
ASKER
Yes, the HTML is contained within the value parameter; basically it's a String.
I'm experimenting with the DOM4J XML parser at the moment; it seems to be a lot easier to use than SAX. So far I've got the HTML being written into the XML file as is, within CDATA sections. E.g.:
<additionalInformation><![CDATA[<p>One, two! One, two! And through and through<br /> The vorpal blade went snicker-snack!<br />He left it dead, and with its head<br /> He went galumphing back.</p>]]></additionalInformation>
This probably will be acceptable as the XML reader shouldn't care if the element is CDATA or not. Still, would have been good to know if it was possible to avoid using CDATA.
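As a side note, the same CDATA output looks achievable without leaving SAX, because the JDK's TransformerHandler also implements LexicalHandler, which has startCDATA() and endCDATA(). This is only a sketch under that assumption, not the DOM4J code used above; the class name CdataSaxSketch and the method writeCdata are invented for the example.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical demo class: emits an element whose text is wrapped in a CDATA section.
class CdataSaxSketch {

    static String writeCdata(String elementName, String value) {
        try {
            SAXTransformerFactory factory =
                    (SAXTransformerFactory) TransformerFactory.newInstance();
            TransformerHandler handler = factory.newTransformerHandler();
            handler.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            handler.setResult(new StreamResult(out));

            handler.startDocument();
            handler.startElement("", elementName, elementName, new AttributesImpl());
            // Characters reported between startCDATA() and endCDATA() are
            // serialized inside a CDATA section instead of being escaped.
            handler.startCDATA();
            char[] text = value.toCharArray();
            handler.characters(text, 0, text.length);
            handler.endCDATA();
            handler.endElement("", elementName, elementName);
            handler.endDocument();
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize element", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeCdata("additionalInformation",
                "<p>One, two! One, two! And through and through<br /></p>"));
    }
}
```

Keep in mind that a CDATA section cannot contain the literal sequence "]]>", so values that might include it need extra care.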
ASKER CERTIFIED SOLUTION
This solution is only available to members.
ASKER
Hey Kevin,
I think I'll go with DOM4J for now. I know what you meant by text for HTML element nodes, but I'm hoping to avoid doing it.
Thanks for your help!
Cheers,
Vlad