Solved

project to keep track of webpages

Posted on 2011-09-22
Last Modified: 2012-05-12
I want to develop a project in Java, possibly on Linux, that downloads web pages and keeps track of which of those pages are changing.

First, how do I download the pages?

Next, how do I find out whether a web page has changed?
For example, if I have www.google.com in my database and have subscribed to that page, how do I find out that Google has changed?

Which parameters should I look at to tell whether Google has changed when the server periodically downloads it?

Do I need to calculate the MD5 of the page?

Ideas, please.
Question by:sriramvemaraju2000
4 Comments
 
LVL 86

Accepted Solution

by:
CEHJ earned 500 total points
ID: 36582219
Very few pages these days are static. In the rare cases where they are, and where the web server supports it, you can check whether a page has changed with a conditional request (the If-Modified-Since header):

http://www.feedthebot.com/ifmodified.html
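
A minimal sketch of that check in Java, assuming the server honors If-Modified-Since and answers 304 Not Modified when nothing has changed. The class name `ConditionalGet`, the method names, and the 10-second timeouts are illustrative choices; `HttpURLConnection.setIfModifiedSince` is the standard JDK call.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

class ConditionalGet {

    /**
     * Maps an HTTP status code to "has the page changed?".
     * 304 means not modified, 200 means the server sent (possibly new) content.
     * Anything else (redirects, errors) is left for the caller to handle.
     */
    public static boolean isModified(int status) {
        if (status == HttpURLConnection.HTTP_NOT_MODIFIED) {
            return false;
        }
        if (status == HttpURLConnection.HTTP_OK) {
            return true;
        }
        throw new IllegalStateException("Unexpected HTTP status: " + status);
    }

    /**
     * Asks the server whether pageUrl has changed since lastCheckMillis
     * (milliseconds since the epoch, e.g. the time of the previous download).
     */
    public static boolean modifiedSince(String pageUrl, long lastCheckMillis) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(pageUrl).toURL().openConnection();
        try {
            conn.setRequestMethod("GET");
            conn.setConnectTimeout(10_000);            // assumed timeout
            conn.setReadTimeout(10_000);               // assumed timeout
            conn.setIfModifiedSince(lastCheckMillis);  // sends the If-Modified-Since header
            return isModified(conn.getResponseCode());
        } finally {
            conn.disconnect();
        }
    }
}
```

Calling `ConditionalGet.modifiedSince(url, lastCheckMillis)` with the time of the previous check returns false only when the server replies 304. Servers that ignore the header reply 200 every time, so in that case the result is always "modified" and the content itself has to be compared.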

Otherwise, yes, a checksumming approach would be one way. If the pages are small enough to hold two copies in memory, you can do a string comparison instead; it will be faster.
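
As a rough sketch of the checksumming idea: store a hex digest for each page and compare it with the digest of the next download. `PageFingerprint`, `hexDigest` and `hasChanged` are made-up names for illustration, and SHA-256 is my assumption for the stored digest; MD5 as asked in the question works the same way by passing "MD5" instead.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class PageFingerprint {

    /** Algorithm used for stored digests; an assumption, "MD5" also works. */
    static final String ALGORITHM = "SHA-256";

    /** Returns the lowercase hex digest of content using the named algorithm. */
    public static String hexDigest(String algorithm, byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            byte[] digest = md.digest(content);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b)); // two hex digits per byte
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown digest algorithm: " + algorithm, e);
        }
    }

    /**
     * Compares the digest saved at the previous download with the digest
     * of the freshly downloaded bytes.
     */
    public static boolean hasChanged(String storedDigest, byte[] newContent) {
        return !storedDigest.equals(hexDigest(ALGORITHM, newContent));
    }
}
```

Only the digest string needs to go in the database, not the page itself. For the in-memory comparison mentioned above, `java.util.Arrays.equals(oldBytes, newBytes)` or `String.equals` does the job without any hashing. Either way, a page that embeds a timestamp or rotating content will register as changed on every download.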
 

Author Comment

by:sriramvemaraju2000
ID: 36582333
Can you tell me how to do this in Java? What is the best approach?
In particular, how do I get the web pages, and how do I track the modifications?
I think this will be useful for security purposes: I could tell whether my page has been hacked or not.
 

Expert Comment

by:CEHJ
ID: 36584115
Begin with an open source Java web crawler and take it from there. That's what I'd do.

http://java-source.net/open-source/crawlers
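
For a single page, the download step can also be sketched with the JDK alone, before bringing in a crawler. This is only an illustration: `PageDownloader` and its methods are hypothetical names, the timeouts are arbitrary, and a real crawler adds things this leaves out (link following, robots.txt, retries, politeness delays).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URLConnection;

class PageDownloader {

    /** Reads the stream to the end and returns everything as a byte array. */
    public static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    /** Downloads the raw bytes behind pageUrl (must be an absolute URL). */
    public static byte[] download(String pageUrl) throws IOException {
        URLConnection conn = URI.create(pageUrl).toURL().openConnection();
        conn.setConnectTimeout(10_000); // assumed timeout
        conn.setReadTimeout(10_000);    // assumed timeout
        try (InputStream in = conn.getInputStream()) {
            return readAll(in);
        }
    }
}
```

The bytes returned by `download` can be saved, or reduced to a checksum, and compared with the result of the next periodic download.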
 

Expert Comment

by:CEHJ
ID: 36585698
:)
