Solved

Need help parsing html into JDK1.4.2's HTML DOM

Posted on 2004-04-21
Last Modified: 2013-11-23
I am writing an app that culls information, such as forms and other pertinent content, from a website.  I need to grab the HTML, parse it into a tree structure (JDK 1.4's HTMLDocument or one of my own), have the app generate a GUI from the model, gather user input, update the model, and submit the results back to the website.  I have looked into parsing the site with regular expressions, but it is getting too complex. The parsing itself is manageable when I know what I'm looking for; the hard part is looping through the nested tables and mapping the inputs to Java components.  I recently learned of the org.w3c.dom.html package in JDK 1.4.2, but it does not support the full dom2 specification which seems to be what I am looking for.  On top of that, I can't figure out how to parse the HTML into the HTML DOM, let alone how to update the model with user input and submit the results.  I have no guarantee that the HTML is well formed, and the parsing must be pretty fast.

Any help/examples would be greatly appreciated. Thanks.
Question by:tigress298
11 Comments
 
LVL 92

Expert Comment

by:objects
ID: 10883425
Try HTMLEditorKit
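
For anyone trying this suggestion, here is a minimal sketch of parsing HTML into Swing's HTMLDocument with HTMLEditorKit and walking the resulting element tree to find the form inputs. The class name HtmlDocumentSketch, the method names, and the sample markup are made up for illustration. It is written for a current JDK; on 1.4.2 the generics would have to be dropped. It uses only classes that ship with the JDK.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.swing.text.AttributeSet;
import javax.swing.text.BadLocationException;
import javax.swing.text.Element;
import javax.swing.text.ElementIterator;
import javax.swing.text.StyleConstants;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical helper class, not part of the JDK.
class HtmlDocumentSketch {

    /** Parses HTML text into Swing's HTMLDocument model. */
    static HTMLDocument parse(Reader html) throws IOException, BadLocationException {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        // Skip charset switching if the page declares one in a META tag.
        doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
        kit.read(html, doc, 0);
        return doc;
    }

    /** Collects name -> value for every named INPUT element, in document order. */
    static Map<String, String> inputValues(String html)
            throws IOException, BadLocationException {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        ElementIterator it = new ElementIterator(parse(new StringReader(html)));
        for (Element e = it.next(); e != null; e = it.next()) {
            AttributeSet attrs = e.getAttributes();
            // The element's tag is stored under StyleConstants.NameAttribute.
            boolean isInput =
                attrs.getAttribute(StyleConstants.NameAttribute) == HTML.Tag.INPUT;
            if (isInput && attrs.isDefined(HTML.Attribute.NAME)) {
                Object value = attrs.getAttribute(HTML.Attribute.VALUE);
                fields.put(attrs.getAttribute(HTML.Attribute.NAME).toString(),
                           value == null ? "" : value.toString());
            }
        }
        return fields;
    }

    public static void main(String[] args) throws Exception {
        // Deliberately sloppy markup: unquoted attributes, missing end tags.
        String page = "<html><body><form action=\"/submit\"><table>"
                    + "<tr><td>Name<td><input type=text name=user value=bob>"
                    + "<tr><td><input type=checkbox name=agree>"
                    + "</table></form>";
        System.out.println(inputValues(page));
    }
}
```

The map could then drive the GUI generation, one component per entry. Whether the parser is tolerant and fast enough for the real pages is something to measure rather than assume.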
 
LVL 92

Expert Comment

by:objects
ID: 10883428
 
LVL 92

Expert Comment

by:objects
ID: 10883429
 
LVL 86

Accepted Solution

by:
CEHJ earned 500 total points
ID: 10884016
>>but it does not support the full dom2 specification which seems to be what I am looking for.  

You would probably be better off with http://www.apache.org/~andyc/neko/doc/html/
 

Author Comment

by:tigress298
ID: 10888669
I can't use any third-party software for this task either.
 
LVL 23

Expert Comment

by:rama_krishna580
ID: 10889487
 
LVL 23

Expert Comment

by:rama_krishna580
ID: 10889507
Try this also:

http://www.html2xml.nl/Services/html2xml/version1/Html2Xml.asmx?op=Url2XmlNode

It can parse and verify your HTML document and display the result. You implement the web service call in your app. Have a look.

Best of luck.

R.K.
 

Author Comment

by:tigress298
ID: 10891323
The web services site is really great, but since what I'm working on will eventually go into a classified arena, I can't use anything web-based or third-party.  I really need to get ahold of some open source code or use native APIs to convert poorly formed HTML to well-formed XML, or use a native Java parser to parse potentially poorly formed HTML directly.
 
LVL 86

Expert Comment

by:CEHJ
ID: 10891762
>>.  I really need to get ahold of some open source code

I thought you couldn't use third-party APIs? What you've just described is a perfect description of what lies at the link I posted!
 
LVL 92

Expert Comment

by:objects
ID: 10894456
> it does not support the full dom2 specification which seems to be what I am looking for.

Have you tried HTMLEditorKit? It is worth trying to see how it performs.
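
If parsing speed is the main concern, one option to measure is skipping the document model and using the parser that HTMLEditorKit is built on through its callback interface. The sketch below assumes that only the form field names are needed. FormFieldScanner and its method names are made up for illustration, and it is written for a current JDK (drop the generics on 1.4.2).

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical callback that records form field names as the parser streams by.
class FormFieldScanner extends HTMLEditorKit.ParserCallback {
    private final List<String> fields = new ArrayList<String>();

    // INPUT has no end tag, so the parser reports it as a simple tag.
    public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        if (tag == HTML.Tag.INPUT) {
            record(attrs);
        }
    }

    // SELECT and TEXTAREA have content, so they arrive as start tags.
    public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        if (tag == HTML.Tag.SELECT || tag == HTML.Tag.TEXTAREA) {
            record(attrs);
        }
    }

    // Copy the name immediately; the parser may reuse the attribute set.
    private void record(MutableAttributeSet attrs) {
        Object name = attrs.getAttribute(HTML.Attribute.NAME);
        if (name != null) {
            fields.add(name.toString());
        }
    }

    /** Parses the HTML once and returns the field names in document order. */
    static List<String> scan(String html) throws IOException {
        FormFieldScanner callback = new FormFieldScanner();
        // true = ignore any charset declared in a META tag.
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return callback.fields;
    }

    public static void main(String[] args) throws IOException {
        String page = "<html><body><form action=\"/submit\">"
                    + "<input type=text name=user>"
                    + "<select name=color><option>red<option>blue</select>"
                    + "<input type=submit name=go>";
        System.out.println(scan(page));
    }
}
```

This trades the tree structure for a single pass over the input, so any nesting information (which table cell a field sits in, for example) has to be tracked by the callback itself.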
 
LVL 86

Expert Comment

by:CEHJ
ID: 10922219
8-)