sunshine737

asked on

How do I connect a servlet program to the internet?

Hi,
I have written a servlet program that scans the request, i.e. whatever is typed in the URL address field I am able to receive in my program. Is there any possibility of connecting my servlet program to the internet, like:

for e.g.: http://localhost:8084/pack/main/www.google.com

Once I type www.google.com in the URL after the localhost part and press Enter, is there any way to connect to the internet?

Kindly help me :)
CEHJ

Yes. Open a URL to the site you want.
You'd be better off sending the page address as a form parameter.
You can just do a redirect to the specified URL.
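To illustrate the redirect suggestion: in a servlet you call response.sendRedirect() with an absolute URL. The sketch below is only an outline under assumptions (the page parameter name and the toAbsoluteUrl helper are mine, not from this thread); the servlet-specific lines are shown as comments, and the only real logic is normalizing what the user typed into a form sendRedirect accepts.

```java
// Minimal sketch of the redirect approach. toAbsoluteUrl is a hypothetical
// helper that turns whatever the user typed (e.g. "www.google.com")
// into an absolute URL.
public class RedirectHelper {

    // Prepend a scheme when the user omitted one, so sendRedirect
    // receives an absolute URL instead of a path relative to the servlet
    static String toAbsoluteUrl(String page) {
        if (page.startsWith("http://") || page.startsWith("https://")) {
            return page;
        }
        return "http://" + page;
    }

    public static void main(String[] args) {
        // Inside a servlet's doGet you would write something like:
        //   String page = request.getParameter("page");
        //   response.sendRedirect(toAbsoluteUrl(page));
        System.out.println(toAbsoluteUrl("www.google.com"));
    }
}
```

Sending the address as a form parameter, as suggested above, keeps the servlet URL clean, e.g. http://localhost:8084/pack/main?page=www.google.com.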
sunshine737

ASKER

Hello objects,

How do I redirect the page? Please give me some sample code or a link if possible.
ASKER CERTIFIED SOLUTION
Mick Barry

This solution is only available to members of Experts Exchange.
Hi objects,

Is there any possibility of caching the page completely: the HTML, the images and everything?
not easily, no.
For me it is possible to read the HTML contents, but not the images. Can you help me :)
You need to parse the HTML, find the image tags and download the images.

The following examples show you ways to parse HTML:

http://www.javaalmanac.com/egs/javax.swing.text.html/GetLinks.html
http://www.javaalmanac.com/egs/javax.swing.text.html/GetText.html

The client browser will cache the content anyway
Is there any software to download just the web pages, not the complete website?
> The client browser will cache the content anyway

not really relevant
> Is there any software to download just the web pages, not the complete website?

The links I posted above will download the page contents.
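Saving a page's contents amounts to copying the URL's input stream, the same way the linked examples read it. Here is a minimal sketch with an assumed helper name (slurp); the network call appears only in comments, and the demonstration uses an in-memory reader so it runs without a connection.

```java
import java.io.*;
import java.net.*;

public class PageSaver {

    // Read everything the reader produces into a single string
    static String slurp(Reader rd) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        int n;
        while ((n = rd.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Against a live page you would open the reader from a URLConnection:
        //   URLConnection conn = new URL("http://www.google.com").openConnection();
        //   String html = slurp(new InputStreamReader(conn.getInputStream()));
        // Demonstrated here with an in-memory reader so no network is needed:
        String html = slurp(new StringReader("<html><body>hello</body></html>"));
        System.out.println(html);
    }
}
```

Once you have the string you can write it to a file or store it in your table.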
>>not really relevant

Why not?
I need to have the content of the browsed pages separately, or else some way to read it from the browser's content.
>> I need to have the content of the browsed pages separately, or else some way to read it from the browser's content.

The browser just caches the files that have been requested
No, I am trying to make an algorithm, and for that I need the browsed pages stored separately in my table; that is why I am asking for a way to read or download the pages.
Did you look at the links I posted earlier?
I tried, but I don't understand what the code does; it is very difficult for me to understand.
Hi objects,

Do you have a link from another site, or any different links? If so, please give them to me; that code is difficult for me to understand.
The code downloads and parses an HTML page from a specified URL, allowing you to also download any additional resources referenced by the page, such as images.

Hi objects,

I tried, but I am not able to retrieve the HTML or the images. I am pasting my code below.
Can you please check whether I have made any mistake?

import java.io.*;
import java.net.*;
import javax.swing.text.*;
import javax.swing.text.html.*;

class parsing
{
    public static void main(String args[])
    {
        String s = "http://www.google.com";

        String ss = getText(s);
        System.out.println(ss);
    }

    public static String getText(String uriStr) {
        final StringBuffer buf = new StringBuffer(1000);

        try {
            // Create an HTML document that appends all text to buf
            HTMLDocument doc = new HTMLDocument() {
                public HTMLEditorKit.ParserCallback getReader(int pos) {
                    return new HTMLEditorKit.ParserCallback() {
                        // This method is called whenever text is encountered in the HTML file
                        public void handleText(char[] data, int pos) {
                            buf.append(data);
                            buf.append('\n');
                        }
                    };
                }
            };
            // Without this the parse aborts with a ChangedCharSetException
            // when the page declares its own charset in a <meta> tag
            doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);

            // Create a reader on the HTML content
            URL url = new URI(uriStr).toURL();
            URLConnection conn = url.openConnection();
            System.out.println("entering");
            Reader rd = new InputStreamReader(conn.getInputStream());

            // Parse the HTML
            EditorKit kit = new HTMLEditorKit();
            kit.read(rd, doc, 0);
        } catch (MalformedURLException e) {
            e.printStackTrace();
        } catch (URISyntaxException e) {
            e.printStackTrace();
        } catch (BadLocationException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }

        // Return the text
        return buf.toString();
    }
}

Hang on while I find some code; I did a similar thing for a client recently.
Add the following to your callback class, where base is the URL of the page being parsed (e.g. http://www.google.com in your example above):

        public void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos)
        {
            String src = null;
            try
            {
                if (t.equals(HTML.Tag.IMG))
                {
                    // <img> tag: get the image src

                    src = (String) a.getAttribute(HTML.Attribute.SRC);

                    // Resolve the (possibly relative) src against the page's
                    // base URL, then download the image
                    URL u = new URL(base, src);
                    ImageIcon image = new ImageIcon(u);
                }
            }
            catch (Exception ex)
            {
                // src is declared outside the try block so it is in scope here
                System.out.println(ex + ": " + src);
            }
        }
    }
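To show the pieces working together without a network connection, here is a self-contained sketch (the class and method names are mine, not from the thread) that drives the same kind of callback with ParserDelegator and collects each img src; resolving each src against base and downloading it would follow as in the snippet above.

```java
import java.io.*;
import java.util.*;
import javax.swing.text.*;
import javax.swing.text.html.*;
import javax.swing.text.html.parser.ParserDelegator;

public class ImageSrcCollector {

    // Parse the given HTML text and return the src attribute of every <img> tag
    static List<String> collectImageSrcs(String html) throws IOException {
        final List<String> srcs = new ArrayList<String>();
        HTMLEditorKit.ParserCallback cb = new HTMLEditorKit.ParserCallback() {
            public void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos) {
                if (t.equals(HTML.Tag.IMG)) {
                    String src = (String) a.getAttribute(HTML.Attribute.SRC);
                    if (src != null) {
                        srcs.add(src);
                    }
                }
            }
        };
        new ParserDelegator().parse(new StringReader(html), cb, true);
        return srcs;
    }

    public static void main(String[] args) throws IOException {
        List<String> srcs = collectImageSrcs(
                "<html><body><img src=\"logo.gif\"><img src=\"pics/photo.jpg\"></body></html>");
        // Each src would then be resolved against the page's base URL, e.g.
        //   URL u = new URL(new URL("http://www.google.com/"), src);
        System.out.println(srcs);
    }
}
```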