HomerrSimpson
asked on
parsing question
Hi, I'm trying to get a simple parser going; my code is below. I need to read all the contents of a URL and store them in a database. The question is: how do I know when I've reached the end of a page, so that I know when to store everything in my database and move on to the next link?
public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int p)
{
    if (t == HTML.Tag.A)
    {
        ahreflink = (String) a.getAttribute(HTML.Attribute.HREF);
        searchList.add(ahreflink);
    }
    if (t == HTML.Tag.TITLE)
    {
        titleFlag = true;
    }
}
public void handleText(char[] data, int pos)
{
    try {
        title = new String(data);
        content = new String(data);
        if (titleFlag == false)
        {
            text = text + " " + content;
        }
        if (titleFlag == true)
        {
            System.out.println("Title: " + title);
            titleFlag = false;
        }
    } catch (Exception p) { p.printStackTrace(); }
}//end of handleText
yup...that should do it...
ASKER
public void handleEndTag(HTML.Tag t, int pos)
{
    if (t == HTML.Tag.HTML)
    {
        // The closing html tag marks the end of the page:
        // store "text" in the database here.
    }
}
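For anyone landing here later, here is a minimal sketch of the same idea as a complete callback, assuming the standard Swing ParserDelegator is what drives the parsing. The class name PageCollector, the parsePage helper and the finished flag are made up for illustration, and the database write is only a comment because the thread never shows that code. Note that ParserDelegator.parse is a blocking call, so the page is also done as soon as parse returns.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper class: collects the title, body text and links of one page.
class PageCollector extends HTMLEditorKit.ParserCallback {
    final List<String> links = new ArrayList<>();
    final StringBuilder text = new StringBuilder();
    String title = "";
    boolean finished = false;       // set once the closing html tag is reported
    private boolean inTitle = false;

    @Override
    public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
        if (t == HTML.Tag.A) {
            Object href = a.getAttribute(HTML.Attribute.HREF);
            if (href != null) {     // anchors without href are skipped
                links.add(href.toString());
            }
        } else if (t == HTML.Tag.TITLE) {
            inTitle = true;
        }
    }

    @Override
    public void handleEndTag(HTML.Tag t, int pos) {
        if (t == HTML.Tag.TITLE) {
            inTitle = false;
        } else if (t == HTML.Tag.HTML) {
            finished = true;        // end of the page
        }
    }

    @Override
    public void handleText(char[] data, int pos) {
        if (inTitle) {
            title = new String(data);
        } else {
            text.append(' ').append(data);
        }
    }

    // Parses one page and returns the filled collector.
    // parse() blocks until the whole input has been read.
    static PageCollector parsePage(Reader in) {
        PageCollector page = new PageCollector();
        try {
            new ParserDelegator().parse(in, page, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return page;
    }

    public static void main(String[] args) {
        String html = "<html><head><title>Demo</title></head>"
                + "<body><p>Hello <a href=\"next.html\">next</a></p></body></html>";
        PageCollector page = parsePage(new StringReader(html));
        // The page is complete here: this is where the database write would go.
        System.out.println("Title: " + page.title);
        System.out.println("Text: " + page.text.toString().trim());
        System.out.println("Links: " + page.links);
    }
}
```

With this shape, the crawl loop can call parsePage once per link and do the database write right after it returns, then move on to the next entry in the link list.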