parsing question

Posted on 2004-09-02
Medium Priority
Last Modified: 2010-03-31
Hi, I'm trying to get a simple parser going; my code is below. I need to read all the contents of a URL and store them in a database. The question is: how do I know when I have reached the end of a page, so that I know when to store everything in my database and move on to the next link?

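 // Fields and callbacks of a class that extends HTMLEditorKit.ParserCallback
 // (needs javax.swing.text.MutableAttributeSet and javax.swing.text.html.HTML)
 private String ahreflink;
 private String title;
 private String text = "";
 private boolean inTitle = false;

 public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int p) {
     if (t == HTML.Tag.A) {
         // remember the target of the most recent link
         ahreflink = (String) a.getAttribute(HTML.Attribute.HREF);
     }
     if (t == HTML.Tag.TITLE) {
         inTitle = true;  // the next text chunk is the page title
     }
 }

 public void handleText(char[] data, int pos) {
     String content = new String(data);
     if (inTitle) {
         title = content;
         inTitle = false;
         System.out.println("Title: " + title);
     }
     text = text + " " + content;  // accumulate all text seen on the page
 }//end of handleText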

Question by:HomerrSimpson

Assisted Solution

primusmagestri earned 60 total points
ID: 11962861
Look for the html end tag: </html>. After this tag you can, at most, have some comments.

Accepted Solution

TimYates earned 240 total points
ID: 11962896
public void handleEndTag( HTML.Tag t, int pos )

Author Comment

ID: 11963108
Do you mean something like this?

public void handleEndTag(HTML.Tag t, int pos) {
   if (t == HTML.Tag.HTML) {
       // store "text" in the database here
   }
}


Expert Comment

ID: 11963367
yup...that should do it...
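
For anyone landing here later, this is a minimal sketch of how the pieces in this thread could fit together, assuming the Swing HTML parser (ParserDelegator with an HTMLEditorKit.ParserCallback), which is what the callback signatures above suggest. The class name PageCollector and the helpers parse, isDone, getTitle, getLinks and getText are made up for illustration, and the database write is only marked with a comment because the thread never shows it.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical class: collects the title, links and text of one page.
class PageCollector extends HTMLEditorKit.ParserCallback {
    private final List<String> links = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private String title = "";
    private boolean inTitle = false;
    private boolean done = false;

    @Override
    public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
        if (t == HTML.Tag.A) {
            Object href = a.getAttribute(HTML.Attribute.HREF);
            if (href != null) {
                links.add(href.toString());
            }
        } else if (t == HTML.Tag.TITLE) {
            inTitle = true;
        }
    }

    @Override
    public void handleText(char[] data, int pos) {
        String content = new String(data);
        if (inTitle) {
            title = content;
        } else {
            text.append(' ').append(content);
        }
    }

    @Override
    public void handleEndTag(HTML.Tag t, int pos) {
        if (t == HTML.Tag.TITLE) {
            inTitle = false;
        } else if (t == HTML.Tag.HTML) {
            // End of the page: this is where "text" would be stored
            // in the database before moving on to the next link.
            done = true;
        }
    }

    boolean isDone() { return done; }
    String getTitle() { return title; }
    List<String> getLinks() { return links; }
    String getText() { return text.toString(); }

    // Parses an HTML string; a crawler would pass a Reader on the URL instead.
    static PageCollector parse(String html) {
        PageCollector collector = new PageCollector();
        try (Reader reader = new StringReader(html)) {
            new ParserDelegator().parse(reader, collector, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return collector;
    }

    public static void main(String[] args) {
        PageCollector c = parse("<html><head><title>Demo</title></head>"
                + "<body><a href=\"http://example.com/\">a link</a></body></html>");
        System.out.println("Title: " + c.getTitle());
        System.out.println("Links: " + c.getLinks());
        System.out.println("Reached end of page: " + c.isDone());
    }
}
```

Using a fresh collector per page keeps the accumulated text from leaking into the next link. The end-of-page check sits in handleEndTag, as suggested in the accepted answer.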

Question has a verified solution.
