Solved

html to xhtml

Posted on 2009-07-10
Last Modified: 2012-05-07
How would I convert an HTML string to XHTML before parsing it? The code below works fine unless I take out the quotes around 'white', which makes it non-XHTML-compliant. Would Tidy do the job? If so, how would I code it?
<%@ page import="java.io.*,java.net.*,java.text.*,java.util.*,javax.xml.parsers.*,javax.xml.xpath.*,org.w3c.dom.*,org.xml.sax.*" %>
<%
String htm;
 
htm = "<html>" +
      "<body bgcolor='white'>" +
      "<head>" +
      "<title>Hello World</title>" +
      "</head>" +
      "</body>" +
      "</html>";
 
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setValidating(false);
factory.setIgnoringElementContentWhitespace(true);
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse(new InputSource(new StringReader(htm)));
document.getDocumentElement().normalize();
XPath xpath = XPathFactory.newInstance().newXPath();
NodeList nodeList = (NodeList) xpath.evaluate("//title/text()",document,XPathConstants.NODESET);
 
if (nodeList.getLength() > 0) {
  for (int i = 0; i < nodeList.getLength(); i++) {
    // getNodeValue() returns the text itself; toString() output is implementation-specific
    out.print(nodeList.item(i).getNodeValue());
  }
}else{
  out.print("not found");
}
%>
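
The failure described above follows from XML well-formedness rules: attribute values must be quoted, so the `DocumentBuilder` rejects `bgcolor=white` before any XPath is evaluated. A minimal sketch to confirm this outside the JSP, where the class name `UnquotedAttributeCheck` and the method `isWellFormed` are hypothetical names chosen for illustration:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class UnquotedAttributeCheck {

    /**
     * Returns true if the markup parses as well-formed XML,
     * false if the XML parser rejects it.
     */
    public static boolean isWellFormed(String markup) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // Thrown for well-formedness errors such as an unquoted attribute value.
            return false;
        }
    }
}
```

Calling `isWellFormed` with the quoted and the unquoted `bgcolor` variants shows which inputs need cleaning up before they reach the XML parser.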

Question by:arichexe
Expert Comment

by:CEHJ
ID: 24829670

Author Comment

by:arichexe
ID: 24830927
How would I modify my code to utilize Tidy?
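
One dependency-free way to cope with unquoted attributes, sketched here as an alternative rather than as a Tidy-based answer, is to skip the XML parser and read the markup with the JDK's lenient Swing HTML parser (`javax.swing.text.html.parser.ParserDelegator`). The class name `LenientTitleExtractor` and the method `extractTitle` are hypothetical, and the sketch assumes the markup places `<head>` before `<body>`, unlike the snippet in the question.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

public class LenientTitleExtractor {

    /** Returns the text inside the title element, or null if no title is seen. */
    public static String extractTitle(String html) throws IOException {
        final StringBuilder title = new StringBuilder();
        final boolean[] inTitle = {false};
        final boolean[] found = {false};

        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.TITLE) {
                    inTitle[0] = true;
                    found[0] = true;
                }
            }

            @Override
            public void handleEndTag(HTML.Tag tag, int pos) {
                if (tag == HTML.Tag.TITLE) {
                    inTitle[0] = false;
                }
            }

            @Override
            public void handleText(char[] data, int pos) {
                // Collect text only while inside the title element.
                if (inTitle[0]) {
                    title.append(data);
                }
            }
        };

        // The third argument tells the parser to ignore any charset directive.
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return found[0] ? title.toString() : null;
    }
}
```

In the JSP this could be called as `out.print(LenientTitleExtractor.extractTitle(htm));`. Because this parser does not build a DOM, the XPath step is replaced by the callback; if XPath over a DOM is required, an HTML-cleaning library such as Tidy would still be needed.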

Accepted Solution

by:
CEHJ earned 200 total points
ID: 24831126
Expert Comment

by:CEHJ
ID: 24902022
:-)