sigma19 asked:

How to browse from command prompt.

I want to write a program in Java
that takes arguments, fetches the output, and extracts some links from it.

Example:
My program should go to google.com,
submit a search string that I supply,
and once I get the results, query the first 3 pages and print the data to the console.

I have used wget on Linux to get a page.
I have used curl on Linux for PUT and GET requests.

Is there an easy way to do all the steps I mentioned in Java?
for_yan

You can use a regular URLConnection to read and parse pages,
or you can use HttpUnit, which has better facilities for reading web pages:

http://httpunit.sourceforge.net/
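
To illustrate the plain URLConnection route, here is a minimal sketch. The class name LinkExtractor, the helper names fetch and extractLinks, and the User-Agent value are my own choices for illustration, not anything from a library. The regular expression only catches simple quoted href attributes, so treat it as a rough starting point rather than a real HTML parser.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class: fetches a page and lists the links found in it.
class LinkExtractor {

    // Matches href="..." or href='...' (case-insensitive); unquoted values are ignored.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*[\"']([^\"']+)[\"']", Pattern.CASE_INSENSITIVE);

    // Downloads the page at the given address and returns its body as text.
    // Assumes the page is UTF-8; a real program should check the Content-Type header.
    static String fetch(String address) throws IOException {
        URLConnection conn = new URL(address).openConnection();
        // Some sites reject requests without a User-Agent; this value is an arbitrary choice.
        conn.setRequestProperty("User-Agent", "Mozilla/5.0");
        StringBuilder page = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                page.append(line).append('\n');
            }
        }
        return page.toString();
    }

    // Returns every quoted href value in the HTML, in document order.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java LinkExtractor <url>");
            return;
        }
        for (String link : extractLinks(fetch(args[0]))) {
            System.out.println(link);
        }
    }
}
```

Run it with a URL as the only argument and it prints one link per line to the console, which is roughly what wget plus grep would give you on Linux.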

One common way to parse is to clean the HTML using HTML Tidy or one of a number of similar packages,
and then parse the cleaned HTML using XML parsing methods.
You may also want to look into this:

http://htmlunit.sourceforge.net/
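
As a rough sketch of the second half of that approach, the code below parses markup that is assumed to be well-formed already (for example, after a cleaner such as HTML Tidy has run) with the standard Java DOM parser and collects the href of every anchor. The cleaning step itself is not shown. The class name CleanedHtmlLinks and the sample markup are made up for illustration, and the sample has no DOCTYPE, so the parser does not try to load a DTD.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: extracts links from HTML that has already been cleaned into well-formed XML.
class CleanedHtmlLinks {

    // Parses the markup as XML and returns the href of each <a> element, in document order.
    // Throws IllegalArgumentException if the input is not well-formed.
    static List<String> links(String xhtml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xhtml)));
            NodeList anchors = doc.getElementsByTagName("a");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < anchors.getLength(); i++) {
                Element a = (Element) anchors.item(i);
                if (a.hasAttribute("href")) {
                    result.add(a.getAttribute("href"));
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("input is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        // Sample input standing in for the output of an HTML cleaner.
        String cleaned = "<html><body>"
                + "<p><a href=\"http://example.com/one\">one</a></p>"
                + "<p><a href=\"http://example.com/two\">two</a></p>"
                + "</body></html>";
        for (String link : links(cleaned)) {
            System.out.println(link);
        }
    }
}
```

Raw pages from the web are rarely well-formed, so feeding them straight into this parser will usually fail; that is the reason for the cleaning step first.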
There's no need to parse HTML pages to get links from a Google search;
there's already an API available:
http://code.google.com/apis/customsearch/v1/overview.html
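
For the "first 3 pages" part of the question, a sketch of how the request URLs for that API could be built is below. It assumes the Custom Search JSON API endpoint and its key, cx, q, and start query parameters, with 10 results per page; check the linked overview for the current details. YOUR_API_KEY and YOUR_ENGINE_ID are placeholders you would replace with your own values, and the class name CustomSearchUrls is hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds Custom Search request URLs for successive result pages.
class CustomSearchUrls {

    // Endpoint assumed from the Custom Search JSON API documentation.
    static final String ENDPOINT = "https://www.googleapis.com/customsearch/v1";

    // Builds the URL for a 1-based result page, assuming 10 results per page,
    // so page 1 starts at result 1, page 2 at result 11, and so on.
    static String pageUrl(String apiKey, String engineId, String query, int page) {
        int start = (page - 1) * 10 + 1;
        return ENDPOINT
                + "?key=" + URLEncoder.encode(apiKey, StandardCharsets.UTF_8)
                + "&cx=" + URLEncoder.encode(engineId, StandardCharsets.UTF_8)
                + "&q=" + URLEncoder.encode(query, StandardCharsets.UTF_8)
                + "&start=" + start;
    }

    public static void main(String[] args) {
        String query = args.length > 0 ? args[0] : "java url connection";
        // Placeholders: substitute your own API key and search engine ID.
        for (int page = 1; page <= 3; page++) {
            System.out.println(pageUrl("YOUR_API_KEY", "YOUR_ENGINE_ID", query, page));
        }
    }
}
```

Each URL can then be fetched with URLConnection like any other page; the response is JSON rather than HTML, so the links can be read from it without any HTML parsing.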