  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 165

How to download a web page including images, applets, etc.

Hi,

How can I write a routine that downloads a web page and stores it on my hard disk? It also has to download all associated objects, including images, applets, etc.

For example, if I enter "www.yahoo.com", it should download the complete web page, including everything it references.

I tried the following code, but it does not fetch the images:

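    import java.io.BufferedReader;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.net.URL;

    public class PageDump {
        public static void main(String[] args) throws Exception {
            URL url = new URL("http://www.kumudam.com");
            InputStream conn = url.openStream();
            // This reads only the HTML text; images and applets need their own requests.
            try (BufferedReader input = new BufferedReader(new InputStreamReader(conn))) {
                String line;
                while ((line = input.readLine()) != null) {
                    System.out.println(line);
                }
            }
        }
    }
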
Thanks in advance
Asked by: vihar123
2 Solutions
 
objects Commented:
You'll need to parse the HTML. Here's an example showing how to extract all the links:

http://www.javaalmanac.com/egs/javax.swing.text.html/GetLinks.html

You can then download the data pointed to in the link.
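
As a rough illustration of the parsing step (not the code from the linked example), the sketch below uses the JDK's built-in Swing HTML parser to collect the SRC of every IMG tag and the CODE of every APPLET tag. The class name ResourceLinks and the method extract are made up for this example, and the parser only understands older HTML, so treat it as a starting point.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper class, named for this example only.
public class ResourceLinks {

    /** Returns the raw SRC values of IMG tags and CODE values of APPLET tags, in document order. */
    public static List<String> extract(Reader html) throws IOException {
        final List<String> found = new ArrayList<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                collect(tag, attrs); // empty elements such as <img>
            }

            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                collect(tag, attrs); // container elements such as <applet>
            }

            private void collect(HTML.Tag tag, MutableAttributeSet attrs) {
                Object value = null;
                if (tag == HTML.Tag.IMG) {
                    value = attrs.getAttribute(HTML.Attribute.SRC);
                } else if (tag == HTML.Tag.APPLET) {
                    value = attrs.getAttribute(HTML.Attribute.CODE);
                }
                if (value != null) {
                    found.add(value.toString());
                }
            }
        };
        // The third argument (true) tells the parser to ignore charset directives.
        new ParserDelegator().parse(html, callback, true);
        return found;
    }
}
```

The returned values are still relative references exactly as written in the page, so they have to be resolved against the page address before they can be downloaded.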
 
MogalManic Commented:
objects is right.

When a browser accesses a page, it loads the page and all of its associated objects through a series of GETs. The first GET retrieves the HTML, of course. The browser then parses the HTML, and each time it encounters a tag that refers to an external resource, it issues another GET to retrieve and render that resource. Your code would have to do something similar.
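
A minimal sketch of that "series of GETs" might look like the following. It assumes the references have already been pulled out of the HTML as strings; the class name PageFetcher, its method names, and the example.com addresses used in the usage note are all invented for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, named for this example only.
public class PageFetcher {

    /**
     * Resolves each raw reference against the page address, the way a browser does.
     * URI.resolve throws IllegalArgumentException for references that are not
     * valid URIs (for example, ones containing unescaped spaces).
     */
    public static List<URI> resolveAll(URI page, List<String> refs) {
        List<URI> resolved = new ArrayList<>();
        for (String ref : refs) {
            resolved.add(page.resolve(ref.trim()));
        }
        return resolved;
    }

    /** Issues one GET for a single resource and returns the response body. */
    public static byte[] get(URI resource) throws IOException {
        try (InputStream in = resource.toURL().openStream();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        }
    }
}
```

For a page at http://www.example.com/news/index.html, the reference images/logo.gif resolves to http://www.example.com/news/images/logo.gif, while /applets/Ticker.class resolves to http://www.example.com/applets/Ticker.class. Calling get on each resolved URI then mirrors the browser's follow-up requests.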
 
armoghan Commented:
Once you have the image links, extracted as in the example objects linked to, you can save the images to the local file system, as in this thread:

http://forum.java.sun.com/thread.jsp?thread=433352&forum=31&message=1940583

Basically, you would look for HTML.Tag.IMG elements and read their HTML.Attribute.SRC attribute.

You will also need to find the <APPLET> tag (HTML.Tag.APPLET) to download applets.
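
Saving a resource to disk could be sketched roughly as below (this is not the code from the linked thread). The class ResourceSaver and both method names are hypothetical. The applet helper assumes the usual rule that CODE is resolved relative to CODEBASE, and CODEBASE relative to the page; it ignores ARCHIVE jars and package-qualified class names, which a real downloader would need to handle.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class, named for this example only.
public class ResourceSaver {

    /** Copies the response body into targetDir, naming the file after the last path segment. */
    public static Path save(InputStream body, URI resource, Path targetDir) throws IOException {
        Files.createDirectories(targetDir);
        String path = resource.getPath();
        String name = (path == null) ? "" : path.substring(path.lastIndexOf('/') + 1);
        if (name.isEmpty()) {
            name = "index.html"; // assumed fallback name for directory-style URLs
        }
        Path target = targetDir.resolve(name);
        Files.copy(body, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }

    /** Builds the class-file address from an APPLET tag's CODEBASE (may be null) and CODE values. */
    public static URI appletClassUri(URI page, String codebase, String code) {
        URI base = page;
        if (codebase != null && !codebase.isEmpty()) {
            // A trailing slash makes the codebase behave as a directory during resolution.
            base = page.resolve(codebase.endsWith("/") ? codebase : codebase + "/");
        }
        return base.resolve(code);
    }
}
```

Note that files with the same name from different directories overwrite each other in this sketch; keeping the directory structure, or rewriting the links in the saved HTML, is left out.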


 
vihar123 (Author) Commented:
Hi objects,

I found the same code and tried it, but I am getting a javax.swing.text.ChangedCharSetException and have not been able to fix it.

Please help me :)
 
objects Commented:
doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
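
For context, here is a small sketch of where that property goes when a page is loaded through HTMLEditorKit. It assumes the standard document property key "IgnoreCharsetDirective"; the class name CharsetSafeLinks and the method imageSources are invented for this example.

```java
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical helper class, named for this example only.
public class CharsetSafeLinks {

    /** Loads the HTML into an HTMLDocument and returns the SRC of every IMG element. */
    public static List<String> imageSources(Reader html) throws Exception {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        // Without this, a <meta ... charset=...> tag in the page can make
        // kit.read throw ChangedCharSetException.
        doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
        kit.read(html, doc, 0);

        List<String> sources = new ArrayList<>();
        for (HTMLDocument.Iterator it = doc.getIterator(HTML.Tag.IMG); it.isValid(); it.next()) {
            Object src = it.getAttributes().getAttribute(HTML.Attribute.SRC);
            if (src != null) {
                sources.add(src.toString());
            }
        }
        return sources;
    }
}
```

The property has to be set before kit.read is called. Ignoring the directive means the Reader's encoding is used as-is, so pages in a different charset may come out with garbled text even though the tag attributes are still readable.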
 
vihar123 (Author) Commented:
Hi,
This HTML parser does not work for all websites. Please help me out :)
 
objects Commented:
Yes, if the page is not standard HTML, or uses recent features, the parser will have problems.
 
vihar123 (Author) Commented:
Hi objects,
What should I do in this case? Any ideas?
 
objects Commented:
You need to look at a different parser, perhaps a commercial one.