sunshine737
asked on
How to download a web page including images, applets, etc.
Hi,
How can I write a routine that downloads a web page and stores it on my hard disk? It also has to download all associated objects, including images, applets, etc.
For example, if I type "www.yahoo.com", it should download the complete web page, including everything it references.
I have tried the following code, but I am not able to get the images:
// Prints only the HTML text of the page; images and applets are separate resources.
URL url = new URL("http://www.kumudam.com");
InputStream conn = url.openStream();
BufferedReader input = new BufferedReader(new InputStreamReader(conn));
String line;
// The unused byte buffer and the stray input.read() were dropped:
// that read() consumed the first character of the page before the loop.
while ((line = input.readLine()) != null) {
    System.out.println(line);
}
input.close();
Thanks in advance
Once you have the image links, as described in the link given by objects,
you can save the images to the local file system as shown in
http://forum.java.sun.com/thread.jsp?thread=433352&forum=31&message=1940583
Basically, you would be looking for
HTML.Tag.IMG (the image URL is in its HTML.Attribute.SRC attribute)
You will also need to find the <APPLET> tag in order to download applets.
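To make the tag lookup concrete, here is a minimal sketch of collecting image and applet references with the Swing HTML parser. The class name ResourceLinkExtractor and the sample markup are made up for illustration; the sketch assumes the references you want are the SRC of each IMG tag and the CODE and ARCHIVE attributes of each APPLET tag.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper, not part of any library.
public class ResourceLinkExtractor {

    /** Returns the IMG SRC values and APPLET CODE/ARCHIVE values found in the HTML. */
    public static List<String> extract(Reader html) throws IOException {
        final List<String> links = new ArrayList<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                collect(tag, attrs); // tags without content, such as IMG
            }

            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                collect(tag, attrs); // tags with content, such as APPLET
            }

            private void collect(HTML.Tag tag, MutableAttributeSet attrs) {
                if (tag == HTML.Tag.IMG) {
                    add(attrs.getAttribute(HTML.Attribute.SRC));
                } else if (tag == HTML.Tag.APPLET) {
                    add(attrs.getAttribute(HTML.Attribute.CODE));
                    add(attrs.getAttribute(HTML.Attribute.ARCHIVE));
                }
            }

            private void add(Object value) {
                if (value != null) {
                    links.add(value.toString());
                }
            }
        };
        // Third argument: true tells the parser to ignore charset directives.
        new ParserDelegator().parse(html, callback, true);
        return links;
    }

    public static void main(String[] args) throws IOException {
        String html = "<html><body><img src=\"logo.gif\">"
                + "<applet code=\"Clock.class\"></applet></body></html>";
        System.out.println(extract(new StringReader(html)));
    }
}
```

The returned values are whatever the page author wrote, so they are often relative and still need to be resolved against the page URL before they can be fetched.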
ASKER
Hi objects,
I have seen the same code and tried it, but I am getting javax.swing.text.ChangedCharSetException and I am not able to fix it.
Please help me :)
doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
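For context, a minimal sketch of where that property goes when loading a page through HTMLEditorKit. It assumes the intended key is "IgnoreCharsetDirective", which is the property Swing's HTML reader checks before throwing ChangedCharSetException. CharsetSafeLoader and the sample markup are illustrative names only.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.swing.text.BadLocationException;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical helper, not part of any library.
public class CharsetSafeLoader {

    /** Parses HTML into an HTMLDocument while ignoring any <meta> charset directive. */
    public static HTMLDocument load(Reader in) throws IOException, BadLocationException {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        // Must be set before reading; otherwise a charset <meta> tag
        // makes the reader throw ChangedCharSetException.
        doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
        kit.read(in, doc, 0);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        String html = "<html><head><meta http-equiv=\"Content-Type\" "
                + "content=\"text/html; charset=utf-8\"></head>"
                + "<body><p>Hello</p></body></html>";
        HTMLDocument doc = load(new StringReader(html));
        System.out.println(doc.getText(0, doc.getLength()).trim());
    }
}
```

Ignoring the directive means the text is decoded with whatever charset the Reader was created with, so pages in another encoding may come out garbled unless the Reader is built with the right charset.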
ASKER
Hi,
This HTML parser is not working for all websites. Please help me out :)
Yes, if the page is not standard HTML, or uses recent features, the parser will have problems.
ASKER
Hi objects,
What should I do in this case? Any ideas?
You need to look at a different parser, perhaps a commercial one.
When a browser accesses a page, it loads the page and all of its associated objects through a series of GETs. The first GET retrieves the HTML, of course. The browser then parses the HTML, and each time it encounters a tag that refers to an external resource, it issues another GET to retrieve and render that resource. Your code would have to do something similar.
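The series of GETs described above can be sketched roughly as follows. This is only an outline under some assumptions: the list of references has already been extracted from the HTML by a parser, files are named after the last path segment of their URL (so two resources with the same name overwrite each other), and PageSaver and its methods are made-up names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

// Hypothetical helper, not part of any library.
public class PageSaver {

    /** Resolves a possibly relative reference (e.g. "img/logo.gif") against the page URL. */
    public static URI resolve(URI page, String reference) {
        return page.resolve(reference.trim());
    }

    /** Uses the last path segment as the local file name, or index.html if there is none. */
    public static String localName(URI resource) {
        String path = resource.getPath();
        String name = (path == null) ? "" : path.substring(path.lastIndexOf('/') + 1);
        return name.isEmpty() ? "index.html" : name;
    }

    /** Issues one GET for the resource and writes its raw bytes into targetDir. */
    public static Path fetch(URI resource, Path targetDir) throws IOException {
        Files.createDirectories(targetDir);
        Path target = targetDir.resolve(localName(resource));
        try (InputStream in = resource.toURL().openStream()) {
            // Copy bytes, not characters, so images and class files stay intact.
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }

    /** First GET: the page itself. Then one GET per referenced resource. */
    public static void savePage(URI page, List<String> references, Path targetDir)
            throws IOException {
        fetch(page, targetDir);
        for (String reference : references) {
            fetch(resolve(page, reference), targetDir);
        }
    }

    public static void main(String[] args) {
        // No network access here: only shows how a relative link is resolved.
        URI page = URI.create("http://www.example.com/news/index.html");
        System.out.println(resolve(page, "../img/logo.gif"));
    }
}
```

Note that the saved HTML still points at the original URLs; making the local copy browsable offline would also require rewriting the links in the page, which this sketch does not attempt.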