Java Link Checker

Hi, I'm looking to create a Java link checker for URLs. I want it to report whether a URL exists or not. If it exists, it should download the page content, examine all the pages it links to, and display any broken links.

I am new to Java. I know I have to use the java.net and java.io packages. Below is the code I have to work with; I would like to improve on it if possible...

import java.io.*;
import java.net.*;

public class JavaGetUrl {

   public static void main (String[] args) {

      URL u;
      InputStream is = null;
      BufferedReader reader;
      String s;

      try {

         u = new URL("http://www.msn.com");

         is = u.openStream();         // throws an IOException

         // DataInputStream.readLine() is deprecated; BufferedReader reads text lines
         reader = new BufferedReader(new InputStreamReader(is));

         while ((s = reader.readLine()) != null) {
            System.out.println(s);
         }

      } catch (MalformedURLException mue) {

         System.out.println("Ouch - a MalformedURLException happened.");
         mue.printStackTrace();
         System.exit(1);

      } catch (IOException ioe) {

         System.out.println("Oops - an IOException happened.");
         ioe.printStackTrace();
         System.exit(1);

      } finally {

         // is is still null if the stream was never opened
         try {
            if (is != null) {
               is.close();
            }
         } catch (IOException ioe) {
            // nothing useful can be done if closing fails
         }

      }

   }

}
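
For the "does the URL exist" part of the question, one possible approach is to ask the server for the status code only instead of downloading the page. The sketch below is an illustration, not a finished checker: the class name UrlStatusCheck, the 5 second timeouts, and the rule "status 400 or above counts as broken" are all assumptions, and some servers refuse HEAD requests, in which case a GET would be needed.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper class, not part of the original post.
class UrlStatusCheck {

    /** Returns the HTTP status code, or -1 if the address is malformed, not HTTP, or unreachable. */
    static int fetchStatus(String address) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
            conn.setRequestMethod("HEAD");   // headers only, no page body
            conn.setConnectTimeout(5000);    // assumed timeout, in milliseconds
            conn.setReadTimeout(5000);
            try {
                return conn.getResponseCode();
            } finally {
                conn.disconnect();
            }
        } catch (IOException | ClassCastException e) {
            // MalformedURLException is an IOException; the cast fails for non-HTTP schemes
            return -1;
        }
    }

    /** Assumed rule: unreachable addresses and 4xx/5xx responses count as broken. */
    static boolean isBroken(int status) {
        return status < 0 || status >= 400;
    }

    public static void main(String[] args) {
        String address = args.length > 0 ? args[0] : "http://www.msn.com";
        int status = fetchStatus(address);
        System.out.println(address + " -> " + status + (isBroken(status) ? " (broken)" : " (ok)"));
    }
}
```

Redirects (3xx) are treated as fine here; whether that is right depends on what you want the checker to report.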

Indie101 Asked:
Venabili Commented:
In doing what?
Write your own one? What for?
If you really want to see source code, just look through the links that DaveBaldwin provided - all of these are open source.

If you really insist on reinventing the wheel:
http://java.sun.com/developer/technicalArticles/ThirdParty/WebCrawler/ - the good old "Writing a Web Crawler in the Java Programming Language" article from Sun from 10 years ago. As valid as ever - even if there are now better ways to write some of the code.

http://www.cs.princeton.edu/introcs/72regular/WebCrawler.java.html is a good start

A few more articles I had bookmarked through the years:
http://www.devarticles.com/c/a/Java/Crawling-the-Web-with-Java/
http://andreas-hess.info/programming/webcrawler/index.html - "How to write a multi-threaded webcrawler"

happy reading...
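
The crawler articles above all need a step that pulls the links out of a downloaded page. A minimal sketch of that step is below. It is not taken from any of the linked articles: the class name LinkExtractor is made up, and using a regular expression on href attributes is a simplification that a real HTML parser would handle more robustly.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
class LinkExtractor {

    // Captures the quoted href value up to any closing quote or '#' fragment.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*[\"']([^\"'#]+)", Pattern.CASE_INSENSITIVE);

    /** Returns the absolute http/https links found in html, resolved against base. */
    static Set<String> extractLinks(String html, URI base) {
        Set<String> links = new LinkedHashSet<>();   // keeps order, drops duplicates
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            try {
                URI resolved = base.resolve(m.group(1).trim());
                String scheme = resolved.getScheme();
                if ("http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme)) {
                    links.add(resolved.toString());
                }
            } catch (IllegalArgumentException e) {
                // href is not a valid URI (for example it contains spaces); skip it
            }
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<a href=\"/about\">About</a> <a href='news.html#top'>News</a>";
        URI base = URI.create("http://example.com/dir/index.html");
        for (String link : extractLinks(html, base)) {
            System.out.println(link);
        }
    }
}
```

Relative links are resolved against the page they were found on, and mailto: or javascript: links are dropped because they cannot be checked over HTTP.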
 
Dave Baldwin (Fixer of Problems) Commented:
Here's a list of open source Java web crawlers: http://java-source.net/open-source/crawlers. I'd look at those before I started writing my own.
 
Indie101 (Author) Commented:
Thanks for the advice. To be honest, I'm looking for any pointers on doing this (with code supplied or more).
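
As one pointer, the "display broken links" step can be sketched separately from the networking. In the example below the status lookup is passed in as a function, so the reporting logic can be tried without a network connection. The class name BrokenLinkReport, the example.com addresses, the stubbed status codes, and the rule that 400 and above (or an unreachable host, shown as -1) means broken are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.function.ToIntFunction;

// Hypothetical helper class for illustration only.
class BrokenLinkReport {

    /**
     * Looks up each link with statusLookup (an HTTP status code, or a negative
     * value when the link could not be reached) and returns one line per broken link.
     */
    static List<String> findBroken(Collection<String> links, ToIntFunction<String> statusLookup) {
        List<String> broken = new ArrayList<>();
        for (String link : links) {
            int status = statusLookup.applyAsInt(link);
            if (status < 0) {
                broken.add(link + " -> unreachable");
            } else if (status >= 400) {
                broken.add(link + " -> HTTP " + status);
            }
        }
        return broken;
    }

    public static void main(String[] args) {
        // Stubbed statuses standing in for real HTTP requests.
        Map<String, Integer> stub = Map.of(
                "http://example.com/ok", 200,
                "http://example.com/missing", 404);
        List<String> links = List.of(
                "http://example.com/ok",
                "http://example.com/missing",
                "http://example.com/down");
        for (String line : findBroken(links, link -> stub.getOrDefault(link, -1))) {
            System.out.println(line);
        }
    }
}
```

In a real checker the lambda would be replaced by a method that performs the HTTP request and returns the response code.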
 
Dave Baldwin (Fixer of Problems) Commented:
That's why I gave you that link.  Open Source means the code is available for you to read and use.  Some may have a discussion about the code too.  At least one was an 'exercise' in doing it.
 
Indie101 (Author) Commented:
Good answer, but it wasn't exactly what I was looking for.
 
Venabili Commented:
> Author Comments: Good answer, wasnt exactly what was looking for

What were you looking for? Unfortunately my crystal ball is still stuck at an airport in Europe, so I cannot guess if you do not specify anything.

Anyway - good luck.
 
Indie101 (Author) Commented:
Well, I posted code and gave some general pointers on what I wanted to do. An A would have been code examples rather than links to read. I appreciate your answer; it's just that, with my noob Java (I work in 3rd level support and do Java by night), I didn't think it was exactly what I was looking for. Best of luck :-)
 
Venabili Commented:
The problem is that fixing your code will require reinventing the wheel and as a newbie Java developer, there is no real point doing it :)

No worries about the grade - but you might want to rethink the idea of doing it all from scratch. I would usually start from open source code (or one of the examples in my comment) and change it here and there if I need to :)

Anyway - good luck with your Java.
Question has a verified solution.
