Solved

html to xml

Posted on 2004-09-26
9
237 Views
Last Modified: 2006-11-17
Hi,
I want to convert HTML pages to XML. Is there any way to achieve this in Java?

TIA
0
Question by:kousis
9 Comments
 
LVL 2

Accepted Solution

by:
Breadstick earned 100 total points
ID: 12155455
I'm not sure how you want the data converted, or whether you're thinking about this the right way.
"XML was designed to describe data and to focus on what data is."
"HTML was designed to display data and to focus on how data looks."

http://www.w3schools.com/xml/xml_whatis.asp


Here are some tutorials on how to process XML with Java:
http://www.cafeconleche.org/books/xmljava/
http://www.javaworld.com/jw-03-2000/jw-03-xmlsax.html
http://www.bearcave.com/software/java/xml/
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 12156036
You can convert HTML pages to XHTML (a kind of XML) using JTidy.
0
 
LVL 4

Expert Comment

by:aratani
ID: 12157382
HTML is a form of XML, if you think about it, since it has opening and closing tags. Some HTML tags don't close, though, such as <img> and <br>. To overcome this, there is a newer form of HTML in which everything is well-formed, i.e. XHTML.

Why would you want to do this though?

AJ
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 12158262
>>HTML is a form of XML

No, it isn't actually. XHTML *is*, though.
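
To see the difference in practice, here is a minimal sketch that feeds two snippets to the XML parser that ships with the JDK. The class and method names (WellFormedCheck, isWellFormed) are made up for illustration. An unclosed <br> makes the snippet fail as XML, while the XHTML-style <br/> parses fine.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: reports whether a piece of markup is well-formed XML.
class WellFormedCheck {

    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the input, so it is not well-formed XML.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // HTML style: <br> is never closed.
        System.out.println(isWellFormed("<p>one<br>two</p>"));
        // XHTML style: <br/> closes itself.
        System.out.println(isWellFormed("<p>one<br/>two</p>"));
    }
}
```

This only checks well-formedness. It does not validate the markup against any XHTML DTD or schema.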
0
 
LVL 21

Expert Comment

by:MogalManic
ID: 12158913
Look into Tidy (http://www.w3.org/People/Raggett/tidy/). It is a tool that cleans up HTML files. To convert an HTML file to XHTML, just issue the following command:
   tidy -asxhtml file.html
The tool is available in many forms (including JTidy, which is the Java version). It is not perfect, and you will still have to edit the files manually.

If you want to convert the data contained in the HTML, here is one process that might work (assuming the data is in tabular form).

  1) Load the HTML pages into Excel
  2) Remove unnecessary rows/columns
  3) Save the file as CSV
  4) Write a process to convert the CSV to XML format.
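
Step 4 could be sketched roughly like this, using only the XML classes in the JDK. Everything here is an assumption for illustration: the class name CsvToXml, the <rows>/<row>/<field> layout, and a very simple CSV in which the first line holds the column names and no field contains a comma or quotes. Real CSV exported from Excel can contain quoted fields, which would need a proper CSV parser.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical converter: the first CSV line holds column names, the rest are data rows.
class CsvToXml {

    static String convert(List<String> lines) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("rows");
            doc.appendChild(root);

            if (!lines.isEmpty()) {
                String[] headers = lines.get(0).split(",", -1);
                for (String line : lines.subList(1, lines.size())) {
                    if (line.trim().isEmpty()) {
                        continue; // skip blank lines
                    }
                    String[] cells = line.split(",", -1);
                    Element row = doc.createElement("row");
                    // Column names go into an attribute, so they need not be valid XML names.
                    for (int i = 0; i < headers.length && i < cells.length; i++) {
                        Element field = doc.createElement("field");
                        field.setAttribute("name", headers[i].trim());
                        field.setTextContent(cells[i].trim());
                        row.appendChild(field);
                    }
                    root.appendChild(row);
                }
            }

            // Serialize the DOM tree; the serializer escapes &, < and > for us.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(convert(Arrays.asList(
                "name,price",
                "Salt & Pepper,9.99")));
    }
}
```

Building the document through the DOM API rather than concatenating strings means special characters in the data are escaped by the serializer.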
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 12159281
JTidy has already been mentioned ;-)
0
 

Author Comment

by:kousis
ID: 12161168
My HTML pages change. Is it possible to write code to generate the XML pages?
0
 
LVL 4

Expert Comment

by:aratani
ID: 12161391
You could probably issue the tidy command posted above by MogalManic on the fly to dynamically generate XHTML content.
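
Something along these lines, as a rough sketch only. It assumes the tidy executable is installed and on the PATH, and the class and method names (TidyRunner, tidyCommand, convert) are made up. Tidy writes its result to standard output by default, so the sketch redirects that to the target file.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper around the command-line tidy tool (assumed to be on the PATH).
class TidyRunner {

    // Builds the same command as "tidy -asxhtml file.html".
    static List<String> tidyCommand(Path htmlFile) {
        return Arrays.asList("tidy", "-asxhtml", htmlFile.toString());
    }

    // Runs tidy, writes its standard output to xhtmlFile, and returns the exit status.
    static int convert(Path htmlFile, Path xhtmlFile)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(tidyCommand(htmlFile))
                .redirectOutput(xhtmlFile.toFile())
                .redirectError(ProcessBuilder.Redirect.INHERIT) // show tidy's warnings
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        int status = convert(Paths.get("file.html"), Paths.get("file.xhtml"));
        System.out.println("tidy exited with status " + status);
    }
}
```

As far as I recall from the tidy documentation, a non-zero exit status can mean warnings only (1) rather than hard errors (2), so check the status before discarding the output. Calling JTidy directly from Java would avoid starting an external process at all.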

AJ
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 12163845
>>is it possible to write code to generate xml pages.

Yes, but what have you got in mind?

You could also look at the Neko html parser:

http://www.apache.org/~andyc/neko/doc/html/
0
