Solved

Using Java SAX XML writer to construct an XML file with elements containing HTML tags

Posted on 2008-09-29
Last Modified: 2012-05-05
Hi Experts!

I'm using the Java SAX XML writer to write an XML file from a map object containing some data. I can get it to create the file just fine. However, some data elements contain HTML syntax which I need to preserve. For some reason the SAX writer escapes the HTML tags when writing out the data, e.g.

<p>One, two! One, two! And through and through<br />&nbsp;

becomes:

&lt;p&gt;One, two! One, two! And through and through&lt;br /&gt;&amp;nbsp;

Basically I don't want the writer to modify the HTML tags at all; I just want it to include them in the resulting XML file as is. Is there any way to stop this from happening?

Thanks!
Question by:vlad_oz
10 Comments
 
LVL 60

Expert Comment

by:Kevin Cross
ID: 22602684
You can try setting xmlns="http://www.w3.org/1999/xhtml" to indicate tags without any namespace should be treated as XHTML.  You can then prefix all other XML tags with another namespace to distinguish if needed.
 

Author Comment

by:vlad_oz
ID: 22602706
mwvisa1,
Thanks for the quick response! Could you please provide a small example of what you mean?
 
LVL 60

Expert Comment

by:Kevin Cross
ID: 22602733
Please post what you have so far, and I will show what I mean in the context of your code so it is clear.

Author Comment

by:vlad_oz
ID: 22602767
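Here is the method responsible for creating XML elements:
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Writes one property as <name>value</name>, or as <prop name="...">value</prop>
// when the name is not a valid XML element name.
private void writeSimpleProp(ContentHandler handler, String name, String value)
		throws SAXException
{
	String nodeName;
	// SAX expects an empty Attributes object, not null, when there are no attributes
	AttributesImpl attrs = new AttributesImpl();

	if (validName(name)) {
		nodeName = name;
	} else {
		nodeName = "prop";
		attrs.addAttribute(nsURI, localName, "name", "CDATA", name);
	}
	handler.startElement(nsURI, localName, nodeName, attrs);
	// characters() carries text content, so <, > and & in the value get escaped on output
	char[] str = String.valueOf(value).toCharArray();
	handler.characters(str, 0, str.length);
	handler.endElement(nsURI, localName, nodeName);
}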
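
For reference, a minimal sketch of one way to keep the markup readable while staying with SAX: wrap the value in a CDATA section through the LexicalHandler interface. It assumes the ContentHandler is a JAXP TransformerHandler (which also implements LexicalHandler); the class name CdataSaxWriter and the write method are hypothetical and not part of the code above.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

class CdataSaxWriter {

    // Hypothetical helper: writes <elementName><![CDATA[html]]></elementName>.
    static String write(String elementName, String html) throws Exception {
        // Assumption: the default JAXP factory supports SAX (true for the JDK's built-in one).
        SAXTransformerFactory factory =
                (SAXTransformerFactory) SAXTransformerFactory.newInstance();
        TransformerHandler handler = factory.newTransformerHandler();
        handler.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        handler.setResult(new StreamResult(out));

        char[] chars = html.toCharArray();
        handler.startDocument();
        handler.startElement("", elementName, elementName, new AttributesImpl());
        handler.startCDATA();                        // LexicalHandler: open the CDATA section
        handler.characters(chars, 0, chars.length);  // text inside CDATA is not escaped
        handler.endCDATA();                          // LexicalHandler: close the CDATA section
        handler.endElement("", elementName, elementName);
        handler.endDocument();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(write("additionalInformation",
                "<p>One, two! One, two!<br />&nbsp;</p>"));
    }
}
```

The value is still plain text as far as XML is concerned; CDATA only changes how it is spelled in the file. A value that itself contains "]]>" would need extra care.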

 

Author Comment

by:vlad_oz
ID: 22602879
Also, I'm open to other Java XML writers; it doesn't have to be SAX.
 
LVL 60

Expert Comment

by:Kevin Cross
ID: 22606240
So your HTML is coming through the Value attribute?
 
LVL 60

Expert Comment

by:Kevin Cross
ID: 22606243
Sorry, I meant to say value parameter of the method.
 

Author Comment

by:vlad_oz
ID: 22610187
Yes, the HTML is contained within the value parameter; basically it's a String.

I'm experimenting with the DOM4J XML library at the moment; it seems to be a lot easier to use than SAX. So far I've got the HTML being written into the XML file as is, within CDATA sections. E.g.:

<additionalInformation><![CDATA[<p>One, two! One, two! And through and through<br />&nbsp; The vorpal blade went snicker-snack!<br />He left it dead, and with its head<br />&nbsp; He went galumphing back.</p>]]></additionalInformation>

This probably will be acceptable as the XML reader shouldn't care if the element is CDATA or not. Still, would have been good to know if it was possible to avoid using CDATA.
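
A quick way to check the point about the reader not caring is to parse both spellings and compare the text that comes back. This is only a sketch using the JDK's DOM parser; the class name CdataVsEscaped and the helper textOf are made up for illustration, and the sample value is shortened.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class CdataVsEscaped {

    // Hypothetical helper: parses the XML and returns the root element's text content.
    static String textOf(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // The same value written as a CDATA section and as escaped text
        String cdata = "<additionalInformation><![CDATA[<p>One, two!<br />&nbsp;</p>]]>"
                + "</additionalInformation>";
        String escaped = "<additionalInformation>&lt;p&gt;One, two!&lt;br /&gt;&amp;nbsp;"
                + "&lt;/p&gt;</additionalInformation>";

        // Both forms decode to the original HTML string
        System.out.println(textOf(cdata));
        System.out.println(textOf(cdata).equals(textOf(escaped)));
    }
}
```

In other words, the escaped output from the SAX writer and the CDATA output carry the same string; the difference is only in how readable the file is.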
 
LVL 60

Accepted Solution

by:
Kevin Cross earned 500 total points
ID: 22610220
I think so, but you would have to parse the string value into its own XML element/node and then add that into your XML.

Since it is a text node right now, it gets treated like any other text. However, you could parse your text, find the nodes (i.e. anything between < and >), add those to your XML as elements, and then add the text and other HTML data to them. Not sure if I am explaining it correctly, but it sounds like you have a good alternative.
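
A rough sketch of that idea with the JDK's DOM API rather than SAX: parse the HTML string as an XML fragment and import the resulting nodes under the target element. It assumes the HTML is well-formed XHTML and that &nbsp; is the only HTML-specific entity in it (XML does not predefine &nbsp;, so it is swapped for &#160; first). The class name HtmlFragmentEmbedder, the embed method and the temporary "wrapper" element are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class HtmlFragmentEmbedder {

    // Hypothetical helper: returns <elementName> with the HTML embedded as real child nodes.
    static String embed(String elementName, String html) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();

        // Assumption: &nbsp; is the only non-XML entity in the input.
        String xhtml = html.replace("&nbsp;", "&#160;");

        // Wrap the fragment in a temporary root so it parses as a single document.
        Document fragment = builder.parse(
                new InputSource(new StringReader("<wrapper>" + xhtml + "</wrapper>")));

        Document target = builder.newDocument();
        Element parent = target.createElement(elementName);
        target.appendChild(parent);

        // Copy each parsed node (elements and text) into the target document.
        for (Node child = fragment.getDocumentElement().getFirstChild();
                child != null; child = child.getNextSibling()) {
            parent.appendChild(target.importNode(child, true));
        }

        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(target), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(embed("additionalInformation",
                "<p>One, two! One, two!<br />&nbsp; He went galumphing back.</p>"));
    }
}
```

The tags come out unescaped because they are now element nodes rather than text. The trade-off is that any HTML that is not well-formed XML makes the parse step fail, which is one reason the CDATA route is simpler.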

Regards,
Kevin
 

Author Comment

by:vlad_oz
ID: 22611905
Hey Kevin,

I think I'll go with DOM4J for now. I know what you meant about building the text and the HTML element nodes separately, but I'm hoping to avoid doing that.
Thanks for your help!

Cheers,
Vlad