Solved

Download webpage

Posted on 2003-11-21
Last Modified: 2013-12-16
I need to know how to get a Java program to download a webpage given its address. Basically, if I input a web address or HTML file to locate, I want the program to retrieve it.
Question by:Akindo

Accepted Solution

by:
girionis earned 500 total points
ID: 9795493
 Just open a URLConnection to the site and read from the input stream. *Something* (syntax is off the top of my head) along these lines:

  // HttpURLConnection is abstract, so obtain it from URL.openConnection() rather than with new
  HttpURLConnection urlcon = (HttpURLConnection) new URL("http://www.yahoo.com").openConnection();
  urlcon.connect();
  InputStream is = urlcon.getInputStream();
  int data = is.read();
  while (data != 0)
  {
    data = is.read();
  }

  urlcon.disconnect();

Expert Comment

by:girionis
ID: 9795497
>  while (data != 0)

  this should be

 while (data != -1)

  because read() returns -1, not 0, at the end of the stream.
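
For reference, here is a minimal sketch of that corrected loop as a complete program. The class and method names (RawPageReader, readAll) are made up for illustration, and the default address is only a placeholder. It reads in chunks rather than one byte at a time, but the end-of-stream test is the same comparison against -1.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

// Hypothetical helper class, not part of the JDK.
class RawPageReader {

    // Reads every byte from the stream. read(byte[]) returns -1 at end of
    // stream; a byte whose value is 0 is ordinary data and must not stop the loop.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Placeholder address; pass a real one as the first argument.
        String address = args.length > 0 ? args[0] : "http://www.yahoo.com";
        // try-with-resources closes the stream even if reading fails.
        try (InputStream in = new URL(address).openStream()) {
            byte[] page = readAll(in);
            System.out.println("Downloaded " + page.length + " bytes");
        }
    }
}
```

Comparing against 0 would stop early on binary content containing zero bytes, and on a normal text page it would never stop at all, since read() keeps returning -1 once the stream is exhausted.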

Expert Comment

by:DidierD
ID: 9795502
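// needs: import java.io.*; import java.net.*;
URL url;
InputStream is = null;
BufferedReader reader;
String s;

      try {

         url = new URL("http://test/index.html");
         is = url.openStream();
         // DataInputStream.readLine() is deprecated; BufferedReader decodes bytes to characters properly
         reader = new BufferedReader(new InputStreamReader(is));
         while ((s = reader.readLine()) != null) {
            System.out.println(s);
         }
      } catch (MalformedURLException mue) {
         //do your errorhandling
      } catch (IOException ioe) {
        //do your errorhandling
      } finally {

         try {
            // is stays null if openStream() failed, so guard against a NullPointerException
            if (is != null) {
               is.close();
            }
         } catch (IOException ioe) {
            // nothing more can be done if close() fails
         }

      }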

Expert Comment

by:girionis
ID: 9795503
 Also this:

 try {
        // Create a URL for the desired page
        URL url = new URL("http://hostname:80/index.html");
   
        // Read all the text returned by the server
        BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
        String str;
        while ((str = in.readLine()) != null) {
            // str is one line of text; readLine() strips the newline character(s)
        }
        in.close();
    } catch (MalformedURLException e) {
    } catch (IOException e) {
    }

  from: http://javaalmanac.com/egs/java.net/ReadFromURL.html
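
A variation on the snippet above, sketched under a couple of assumptions: the page is taken to be UTF-8 (InputStreamReader otherwise falls back to the platform default encoding), and the class and method names (PageTextReader, readText) are invented for the example. It uses try-with-resources so the reader is closed even when readLine() throws, and it puts back the newline that readLine() strips.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the JDK.
class PageTextReader {

    // Collects all lines from the reader into one string, re-adding the
    // line terminator that readLine() removes. Closes the reader when done.
    static String readText(Reader source) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                text.append(line).append('\n');
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // Placeholder address taken from the snippet above.
        String address = args.length > 0 ? args[0] : "http://hostname:80/index.html";
        try (InputStream in = new URL(address).openStream()) {
            // Assumption: the server sends UTF-8 text.
            System.out.print(readText(new InputStreamReader(in, StandardCharsets.UTF_8)));
        }
    }
}
```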

Expert Comment

by:lwinkenb
ID: 9795513
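// new URL(...) needs a protocol; "www.msn.com" on its own throws MalformedURLException
URL url = new URL("http://www.msn.com");
URLConnection urlConnection = url.openConnection();
InputStream stream = urlConnection.getInputStream();
InputStreamReader reader = new InputStreamReader(stream);
BufferedReader input = new BufferedReader(reader);

// StringBuilder avoids re-copying the whole page on every +=
StringBuilder html = new StringBuilder();
String line;

// readLine() strips the line terminator, so add it back
while((line = input.readLine()) != null)
  html.append(line).append("\n");
input.close();

System.out.println(html);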
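
If the aim is to keep the page rather than print it, one option is to copy the stream straight to a file. This is only a sketch: the class name PageSaver, the method save, and the default file name page.html are all made up for the example, and it relies on java.nio.file.Files.copy, which needs Java 7 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class, not part of the JDK.
class PageSaver {

    // Copies everything from the stream into target, overwriting any
    // existing file, and returns the number of bytes written.
    static long save(InputStream in, Path target) throws IOException {
        return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // Placeholder address and output file name.
        String address = args.length > 0 ? args[0] : "http://www.msn.com";
        Path target = Paths.get(args.length > 1 ? args[1] : "page.html");
        try (InputStream in = new URL(address).openStream()) {
            System.out.println("Saved " + save(in, target) + " bytes to " + target);
        }
    }
}
```

Copying bytes instead of lines also sidesteps character-encoding questions, since the file ends up byte-for-byte identical to what the server sent.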

Expert Comment

by:lwinkenb
ID: 9795517
mine is pretty much the same as girionis's, but it wasn't up there when I hit submit :)

Expert Comment

by:girionis
ID: 9795520
 Posting on a cached question also happens to me all the time :)

Expert Comment

by:pjgould
ID: 9795575
Hiya,

What do you want to do with the page you are retrieving? To get a handle on the page you could use a URLConnection in the following way:

    URL url = new URL("http://website.whatever/page.html");
                  
    // open the URL Connection to the resource requested
    HttpURLConnection urlcon = (HttpURLConnection) url.openConnection();

Then you can do any number of things with this, including reading and writing to the resource - check out the JDK APIs for more information on this!

e.g. writing to the resource:

    urlcon.setDoOutput(true);
                  
    BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(urlcon.getOutputStream()));
                  
    // Write to the output stream
    writer.write("Blah");
    writer.flush();
    writer.close();

    urlcon.disconnect();

This should get you started, if you need anything more specific or there's any confusion give us a shout.
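
To make the writing case a little more concrete, here is a rough sketch of posting a form field with HttpURLConnection. The field name "message", the class name FormPoster, and the helper encodeForm are invented for illustration, and the address is the placeholder from the post above, so the program will only get a real answer when pointed at a server that accepts POST requests.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, not part of the JDK.
class FormPoster {

    // Encodes the fields as application/x-www-form-urlencoded text,
    // joining name=value pairs with '&'.
    static String encodeForm(Map<String, String> fields) throws UnsupportedEncodingException {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(URLEncoder.encode(field.getKey(), "UTF-8"))
                .append('=')
                .append(URLEncoder.encode(field.getValue(), "UTF-8"));
        }
        return body.toString();
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("message", "Blah"); // hypothetical field name
        byte[] body = encodeForm(fields).getBytes(StandardCharsets.UTF_8);

        // Placeholder address; replace with a server that accepts POST.
        URL url = new URL("http://website.whatever/page.html");
        HttpURLConnection urlcon = (HttpURLConnection) url.openConnection();
        urlcon.setRequestMethod("POST");
        urlcon.setDoOutput(true);
        urlcon.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");

        try (OutputStream out = urlcon.getOutputStream()) {
            out.write(body);
        }

        // Asking for the response code completes the request.
        System.out.println("Server answered " + urlcon.getResponseCode());
        urlcon.disconnect();
    }
}
```

Note that writing to the output stream alone does not guarantee the server has processed anything; the exchange is finished once the response code or the input stream is read.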

Expert Comment

by:pjgould
ID: 9795630
Oops, me too. I was doing other things and had the page open for ages (then again, I've not really said how to read the page in!)

Expert Comment

by:dualsoul
ID: 9795896
:) So many examples of how to use the URLConnection class :))

It depends on what you are planning to use the downloaded HTML page for. If you need to extract some information from it or navigate through its structure, you can use the HtmlUnit or HttpUnit packages. They let you open a page from a URL and return a clean object model, so you can easily get the information you want.

Expert Comment

by:krakatoa
ID: 9798683
Just pass the URL to this method >



//.................................................................................................................
    // Assumes txtArea is a java.awt.TextArea field of the enclosing class.
    public void fetchURL(String s) {

        try {
            URL u = new URL(s);
            try {
                Object o = u.getContent();


                if (o instanceof InputStream) {
                    showText((InputStream) o);
                } else {

                    //showText(txtField.getText()); // you don't need this overloaded method, so it's gone.

                }

            } catch (IOException e) {
                e.printStackTrace();
                // The showText(String) overload is gone, so messages go to the text area directly.
                txtArea.setText("Could not connect to " + u.getHost());
            } catch (NullPointerException e) {
                e.printStackTrace();
                txtArea.setText("There was a problem with the content.");
            }

        } catch (MalformedURLException e) {
            e.printStackTrace();
            // Report the string that was actually passed in.
            txtArea.setText(s + " is not a valid URL");
        }
    }
//.................................................................................................................



//.................................................................................................................
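    public void showText(InputStream is) {

        String nextline = null;

        txtArea.setText("");

        try {
            // DataInputStream.readLine() is deprecated; BufferedReader decodes bytes to characters properly.
            BufferedReader reader = new BufferedReader(new InputStreamReader(is));
            while ((nextline = reader.readLine()) != null) {
                // TextArea.appendText() is deprecated in favour of append().
                txtArea.append(nextline + "\n");
            }

        } catch (IOException e) {
            e.printStackTrace();
            txtArea.append(e.toString());
        }
    }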
//.................................................................................................................


   