Akindo
asked on
Download webpage
I need to know how to get a Java program to download a webpage given its address. Basically, if I input a web address or HTML file to locate, I want the program to retrieve it.
ASKER CERTIFIED SOLUTION
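// Requires: import java.io.*; import java.net.*;
URL url;
InputStream is = null;
BufferedReader reader;
String s;
try {
    url = new URL("http://test/index.html");
    is = url.openStream();
    // DataInputStream.readLine() is deprecated; BufferedReader decodes characters properly
    reader = new BufferedReader(new InputStreamReader(is));
    while ((s = reader.readLine()) != null) {
        System.out.println(s);
    }
} catch (MalformedURLException mue) {
    // do your error handling
} catch (IOException ioe) {
    // do your error handling
} finally {
    try {
        // is is still null if openStream() failed, so guard against a NullPointerException
        if (is != null) {
            is.close();
        }
    } catch (IOException ioe) {
        // nothing useful to do if close() fails
    }
}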
Also this:
try {
    // Create a URL for the desired page
    URL url = new URL("http://hostname:80/index.html");
    // Read all the text returned by the server
    BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
    String str;
    while ((str = in.readLine()) != null) {
        // str is one line of text; readLine() strips the newline character(s)
    }
    in.close();
} catch (MalformedURLException e) {
} catch (IOException e) {
}
from: http://javaalmanac.com/egs/java.net/ReadFromURL.html
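For anyone adapting the snippet above, here is a minimal sketch of the same read loop wrapped in a method, using try-with-resources so the stream is closed even if reading fails. The class name PageDownloader, the method name download, and the UTF-8 charset are assumptions made for illustration and do not come from the thread; a real page may declare a different encoding.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class PageDownloader {

    // Hypothetical helper: reads everything the URL returns into one String.
    // Assumes the content is UTF-8 text.
    public static String download(URL url) throws IOException {
        StringBuilder page = new StringBuilder();
        // try-with-resources closes the reader (and the underlying stream) automatically
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                // readLine() strips the line terminator, so add one back
                page.append(line).append('\n');
            }
        }
        return page.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PageDownloader <url>");
            return;
        }
        System.out.print(download(new URL(args[0])));
    }
}
```

Because URL.openStream() also handles file: URLs, the same method can be tried against a local HTML file before pointing it at a server.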
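// The URL needs a protocol; new URL("www.msn.com") throws MalformedURLException
URL url = new URL("http://www.msn.com");
URLConnection urlConnection = url.openConnection();
InputStream stream = urlConnection.getInputStream();
InputStreamReader reader = new InputStreamReader(stream);
BufferedReader input = new BufferedReader(reader);
// StringBuilder avoids copying the whole string on every +=
StringBuilder html = new StringBuilder();
String line;
while ((line = input.readLine()) != null) {
    // readLine() strips line terminators, so put one back
    html.append(line).append('\n');
}
input.close();
System.out.println(html);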
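If the goal is to save the page to disk rather than print it, a rough sketch using java.nio.file.Files.copy could look like the following. The class name PageSaver and the method name save are hypothetical. The bytes are copied as-is, so no character encoding has to be guessed.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class PageSaver {

    // Hypothetical helper: copies the raw bytes behind the URL into target,
    // overwriting any existing file, and returns the number of bytes written.
    public static long save(URL url, Path target) throws IOException {
        try (InputStream in = url.openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java PageSaver <url> <output-file>");
            return;
        }
        long bytes = save(new URL(args[0]), Paths.get(args[1]));
        System.out.println("Saved " + bytes + " bytes to " + args[1]);
    }
}
```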
Mine is pretty much the same as girionis's, but it wasn't up there when I hit submit :)
Posting on a cached question also happens to me all the time :)
Hiya,
What do you want to do with the page you are retrieving? To get a handle on the page you could use a URLConnection in the following way:
URL url = new URL("http://website.whatever/page.html");
// open the URL Connection to the resource requested
HttpURLConnection urlcon = (HttpURLConnection) url.openConnection();
Then you can do any number of things with this, including reading from and writing to the resource. Check out the JDK API docs for more information on this!
e.g. writing to the resource:
urlcon.setDoOutput(true);
BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(urlcon.getOutputStream()));
// Write to the output stream
writer.write("Blah");
writer.flush();
writer.close();
urlcon.disconnect();
This should get you started. If you need anything more specific or there's any confusion, give us a shout.
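Since the example above only shows writing, here is a hedged sketch of the reading side with HttpURLConnection. It checks the status code before reading the body, because a 404 page would otherwise be treated as a failure only when getInputStream() throws. The class name HttpFetch, the method name fetch, the 10-second timeouts, and the UTF-8 charset are all assumptions for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class HttpFetch {

    // Hypothetical helper: GETs the address and returns the body as text.
    // Only works for http/https addresses; other schemes fail the cast below.
    public static String fetch(String address) throws IOException {
        HttpURLConnection urlcon = (HttpURLConnection) new URL(address).openConnection();
        urlcon.setRequestMethod("GET");
        urlcon.setConnectTimeout(10_000); // milliseconds, arbitrary choice
        urlcon.setReadTimeout(10_000);    // milliseconds, arbitrary choice
        try {
            int status = urlcon.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status " + status + " for " + address);
            }
            StringBuilder page = new StringBuilder();
            // Assumes the server sends UTF-8 text
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(urlcon.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = in.readLine()) != null) {
                    page.append(line).append('\n');
                }
            }
            return page.toString();
        } finally {
            urlcon.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java HttpFetch <http-url>");
            return;
        }
        System.out.print(fetch(args[0]));
    }
}
```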
Oops, me too. I was doing other things and had the page open for ages (then again, I haven't really said how to read the page in!)
:) So many examples of how to use the URLConnection class :))
It depends on what you plan to use the downloaded HTML page for. If you need to extract some information from it or navigate through its structure, you can use the HtmlUnit or HttpUnit packages. They let you open a page from a URL and return a clean object model, so you can easily get the information you want.
Just pass the URL to this method:
//...................................................
// Assumes txtArea and txtField are GUI text components declared as fields of the enclosing class.
public void fetchURL(String s) {
    try {
        URL u = new URL(s);
        try {
            Object o = u.getContent();
            if (o instanceof InputStream) {
                showText((InputStream) o);
            } else {
                //showText(txtField.getText()); // you don't need this overloaded method so it's gone.
            }
        } catch (IOException e) {
            e.printStackTrace();
            // The showText(String) overload is not part of this post, so set the text directly
            txtArea.setText("Could not connect to " + u.getHost());
        } catch (NullPointerException e) {
            e.printStackTrace();
            txtArea.setText("There was a problem with the content.");
        }
    } catch (MalformedURLException e) {
        e.printStackTrace();
        txtArea.setText(txtField.getText() + " is not a valid URL");
    }
}
//...................................................
public void showText(InputStream is) {
    String nextline = null;
    txtArea.setText("");
    try {
        // DataInputStream.readLine() is deprecated; BufferedReader reads lines of text properly
        BufferedReader reader = new BufferedReader(new InputStreamReader(is));
        while ((nextline = reader.readLine()) != null) {
            txtArea.appendText(nextline + "\n");
        }
    } catch (IOException e) {
        e.printStackTrace();
        txtArea.appendText(e.toString());
    }
}
//...................................................
this should be
while (data != -1)
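The loop this correction refers to is not shown in the thread. As a sketch only, assuming data holds the result of InputStream.read(), which returns -1 at end of stream, a byte-by-byte read loop of that shape looks like this. The class name ReadLoop and the method name readAll are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ReadLoop {

    // Hypothetical helper: reads the stream one byte at a time until read() returns -1.
    public static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int data = in.read();
        while (data != -1) {   // -1 signals end of stream, not a byte value
            out.write(data);
            data = in.read();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // An in-memory stream stands in for url.openStream() here
        InputStream in = new ByteArrayInputStream("<html></html>".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(readAll(in), StandardCharsets.UTF_8));
    }
}
```

Reading one byte at a time is slow on an unbuffered network stream; wrapping the stream in a BufferedInputStream keeps the same loop while reducing the number of underlying reads.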