html to xhtml

Posted on 2009-07-10
Last Modified: 2012-05-07
How would I convert the HTML string to XHTML before parsing it? The code below works fine unless I take out the quotes around 'white', which makes the markup non-XHTML-compliant. Would Tidy do the job? If so, how would I code that?
<%@ page import="java.io.*,javax.xml.parsers.*,javax.xml.xpath.*,org.w3c.dom.*,org.xml.sax.*" %>
<%
// Sample markup. It must be well-formed XML for DocumentBuilder to accept it.
String htm;
 
htm = "<html>" +
      "<head>" +
      "<title>Hello World</title>" +
      "</head>" +
      "<body bgcolor='white'>" +
      "</body>" +
      "</html>";
 
// Parse the string into a DOM tree without DTD validation.
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setValidating(false);
factory.setIgnoringElementContentWhitespace(true);
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse(new InputSource(new StringReader(htm)));
document.getDocumentElement().normalize();

// Select the text node(s) inside <title>.
XPath xpath = XPathFactory.newInstance().newXPath();
NodeList nodeList = (NodeList) xpath.evaluate("//title/text()", document, XPathConstants.NODESET);
 
if (nodeList.getLength() > 0) {
  for (int i = 0; i < nodeList.getLength(); i++) {
    // getNodeValue() returns the text itself; toString() prints the node's debug form.
    out.print(nodeList.item(i).getNodeValue());
  }
} else {
  out.print("not found");
}
%>
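
A quick way to confirm that the missing quotes are what the XML parser objects to is to run both variants of the markup through the same DocumentBuilder. This is only a sketch under a few assumptions: the class name WellFormedCheck and the helper isWellFormed are made up for illustration, and it runs as a plain Java program rather than inside the JSP.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original question.
class WellFormedCheck {

    // Returns true if the markup parses as well-formed XML, false otherwise.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // Thrown for markup that is not well-formed, e.g. an unquoted attribute.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String quoted = "<html><head><title>Hello World</title></head>"
                + "<body bgcolor='white'></body></html>";
        String unquoted = "<html><head><title>Hello World</title></head>"
                + "<body bgcolor=white></body></html>";
        System.out.println("quoted:   " + isWellFormed(quoted));
        System.out.println("unquoted: " + isWellFormed(unquoted));
    }
}
```

The unquoted variant is rejected before any XPath query runs, so the markup has to be cleaned up (or read with an HTML-aware parser) before it reaches DocumentBuilder.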

Question by:arichexe
Expert Comment

by:CEHJ
ID: 24829670

Author Comment

by:arichexe
ID: 24830927
How would I modify my code to utilize Tidy?
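
Tidy for Java (JTidy) is a separate library and is not sketched here. If the goal is only to read the title out of markup that is not well-formed, one dependency-free stand-in is the lenient HTML parser that ships with the JDK in javax.swing.text.html. This is not Tidy and it does not produce XHTML; it simply tolerates things like unquoted attributes. The class name LenientTitleReader and the method readTitle are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper class, not part of the original thread.
class LenientTitleReader {

    // Collects the text found between <title> and </title>.
    static String readTitle(String html) {
        final StringBuilder title = new StringBuilder();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            private boolean inTitle;

            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.TITLE) {
                    inTitle = true;
                }
            }

            @Override
            public void handleEndTag(HTML.Tag tag, int pos) {
                if (tag == HTML.Tag.TITLE) {
                    inTitle = false;
                }
            }

            @Override
            public void handleText(char[] data, int pos) {
                if (inTitle) {
                    title.append(data);
                }
            }
        };
        try {
            // The third argument tells the parser to ignore charset declarations.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return title.toString().trim();
    }

    public static void main(String[] args) {
        String htm = "<html><head><title>Hello World</title></head>"
                + "<body bgcolor=white></body></html>";
        System.out.println(readTitle(htm));
    }
}
```

This trades XPath for a callback, so it suits simple extractions; for a full DOM that XPath can query, converting the markup with a tool such as Tidy remains the route the question asks about.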

Accepted Solution

by:
CEHJ earned 800 total points
ID: 24831126
Expert Comment

by:CEHJ
ID: 24902022
:-)