sriramvemaraju2000

asked on

project to keep track of webpages

I want to develop a project in Java, possibly on Linux, that downloads web pages and keeps track of which of those pages are changing.

First, how do I download the pages?

Next, how do I find out whether a web page has changed?
For example, if I have www.google.com in my database and I have subscribed to that page, how do I find out that Google has changed?

What parameters should I look at to tell whether Google has changed when the server periodically downloads it?

Do I need to calculate the MD5 of the page?

Ideas, please.
ASKER CERTIFIED SOLUTION
CEHJ
This solution is only available to members.
sriramvemaraju2000

ASKER

Can you tell me how to do this in Java? What is the best approach?
In particular, how do I get the web pages, and how do I track the modifications?
I think this will be useful for security purposes: I could tell whether my page has been hacked or not.
Begin with an open source Java web crawler and take it from there. That's what I'd do.

http://java-source.net/open-source/crawlers
:)
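
For the simple case described in the question (fetch a page periodically and see whether it differs from last time), the digest idea the asker raises may already be enough. Below is a minimal sketch of that approach, not a full crawler. The class name PageChangeTracker and its methods are made up for illustration, it assumes Java 11+ for java.net.http.HttpClient, and it uses SHA-256 where the question mentions MD5 (passing "MD5" to MessageDigest.getInstance would work the same way). The in-memory map stands in for the database mentioned in the question.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: remembers a digest per URL and reports when it changes.
class PageChangeTracker {

    // Last digest seen for each URL. A real project would persist this.
    private final Map<String, String> lastDigest = new HashMap<>();

    // Downloads the raw bytes of a page (requires Java 11+).
    static byte[] download(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<byte[]> response =
                client.send(request, HttpResponse.BodyHandlers.ofByteArray());
        return response.body();
    }

    // Returns the SHA-256 digest of the content as a lowercase hex string.
    static String digestHex(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    // Stores the digest of the new content and returns true only if a
    // previous digest existed for this URL and differs from the new one.
    boolean recordAndCheckChanged(String url, byte[] content) {
        String current = digestHex(content);
        String previous = lastDigest.put(url, current);
        return previous != null && !previous.equals(current);
    }
}
```

A scheduled job could call recordAndCheckChanged(url, PageChangeTracker.download(url)) for each subscribed URL and notify subscribers when it returns true. One caveat: a digest of the raw bytes flags any difference at all, so pages that embed timestamps, ads, or session tokens will probably look changed on every fetch, and some normalisation of the HTML before hashing would likely be needed for those.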