  • Status: Solved

Retrieving information using the wget command in Java

Hello, I am trying to write a Java program, using the wget command, that goes to a website, picks a link on that page and clicks it, then clicks a second link on the resulting page, and finally goes to a third link and grabs a particular string.

An example would be to go to http://www.w3schools.com/, click on the "Learn SQL" link, then click on the "SQL SELECT TOP" link, and grab a particular string from that page. I am not sure how to write it.
Asked by: yescobar2012
 
BAKADY commented:
I don't think wget is your best solution. It isn't available on Macs or Windows by default.
Use Apache's Java libraries to power your application instead.

Use an HTTP client to make the page requests, such as:

http://hc.apache.org/httpcomponents-client-ga/index.html

and an HTML parser to extract the links, such as:

http://tika.apache.org/1.3/parser.html
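
To make the two steps concrete (request the page, then pull the links out of it), here is a minimal sketch. It is an assumption-laden stand-in, not the Apache approach linked above: it uses only the JDK's built-in `java.net.http.HttpClient` (Java 11+) and a naive regular expression instead of a real HTML parser, and the names `LinkLister`, `Link`, `fetch` and `extractLinks` are made up for illustration. It needs Java 16+ because of the record.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LinkLister {

    /** One anchor: the absolute target URL plus the visible link text. */
    record Link(URI target, String text) {}

    // Naive pattern for <a ... href="...">text</a>. This is an assumption made
    // for brevity; a real HTML parser copes with far more markup than this.
    private static final Pattern ANCHOR = Pattern.compile(
            "<a\\b[^>]*?\\bhref\\s*=\\s*[\"']([^\"']*)[\"'][^>]*>(.*?)</a>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Downloads the body of a page as text, roughly what wget would save. */
    static String fetch(URI page) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpRequest request = HttpRequest.newBuilder(page).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    /** Returns the anchors found in the HTML, with hrefs resolved against the page URL. */
    static List<Link> extractLinks(String html, URI base) {
        List<Link> links = new ArrayList<>();
        Matcher m = ANCHOR.matcher(html);
        while (m.find()) {
            // Drop nested tags such as <b> from the link text.
            String text = m.group(2).replaceAll("<[^>]+>", "").trim();
            try {
                links.add(new Link(base.resolve(m.group(1).trim()), text));
            } catch (IllegalArgumentException malformedHref) {
                // Skip hrefs that are not valid URIs (for example, ones containing spaces).
            }
        }
        return links;
    }

    public static void main(String[] args) throws Exception {
        URI start = URI.create("http://www.w3schools.com/");
        for (Link link : extractLinks(fetch(start), start)) {
            System.out.println(link.text() + " -> " + link.target());
        }
    }
}
```

The regular expression is the weak point here; swapping `extractLinks` for a proper parser is the obvious next step once the basic flow works.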

Regards
 
CEHJ commented:
BAKADY is right: wget is not the right tool. HttpClient is not really the right tool either, though; it is too low-level. You'll find you have to write much less code with something like HtmlUnit.
 
yescobar2012 (author) commented:
Oh, I see. How about HTMLParser?
 
CEHJ commented:
"Oh I see... how about HTMLParser"
What's that?
 
yescobar2012 (author) commented:
It is similar to this parser (jsoup). I found some good examples of what I was looking for:

http://www.mkyong.com/java/jsoup-html-parser-hello-world-examples/
 
CEHJ commented:
You can use that, but there's no need to go so low-level.
 
yescobar2012 (author) commented:
You recommended HtmlUnit. If I use HtmlUnit, can I navigate through or traverse a website, as I mentioned in my question?

An example would be to go to http://www.w3schools.com/, click on the "Learn SQL" link, then click on the "SQL SELECT TOP" link, and grab a particular string from that page. I am not sure how to write it.

Would you have any working examples I can look at that navigate through a website (from a page to a child page, then to another child page) and grab a String?
 
CEHJ commented:
"if I use HtmlUnit can i navigate thru a website or traverse the website?"
Yes, or I would not have mentioned it ;)

http://htmlunit.sourceforge.net/gettingStarted.html
 
BAKADY commented:
You either need a framework, which takes some effort to learn, or basic knowledge of HTTP and HTML so that you can build it from scratch.

If you know what you need and what you are doing, it isn't complicated. Something like a small HTTP/HTTPS proxy can work in only 300 lines of code, including comments, with no line longer than 50 characters.
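
As a rough idea of what the from-scratch route could look like for the original question (start page, "click" a link by its text, repeat, then grab a string), here is a small sketch using only the JDK. Everything in it is an assumption for illustration: the class and method names (`LinkWalker`, `findLink`, `follow`, `grab`) are invented, the anchor regex is deliberately naive, and the three pages are an in-memory stand-in under a made-up `example.test` host so the sketch runs without network access.

```java
import java.net.URI;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LinkWalker {

    // Naive pattern for <a ... href="...">text</a>; an assumption made for brevity.
    private static final Pattern ANCHOR = Pattern.compile(
            "<a\\b[^>]*?\\bhref\\s*=\\s*[\"']([^\"']*)[\"'][^>]*>(.*?)</a>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Finds the first anchor whose visible text matches linkText, ignoring case. */
    static URI findLink(String html, URI base, String linkText) {
        Matcher m = ANCHOR.matcher(html);
        while (m.find()) {
            String text = m.group(2)
                    .replaceAll("<[^>]+>", "")   // drop nested tags
                    .replaceAll("\\s+", " ")     // collapse whitespace
                    .trim();
            if (text.equalsIgnoreCase(linkText)) {
                return base.resolve(m.group(1).trim());
            }
        }
        throw new IllegalStateException("No link '" + linkText + "' on " + base);
    }

    /** Starts at 'start', follows each link text in order, returns the last page's HTML. */
    static String follow(URI start, List<String> linkTexts, Function<URI, String> fetch) {
        URI current = start;
        String html = fetch.apply(current);
        for (String linkText : linkTexts) {
            current = findLink(html, current, linkText);
            html = fetch.apply(current);
        }
        return html;
    }

    /** Returns the first capture group of 'regex' in the page, or null if it is absent. */
    static String grab(String html, String regex) {
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical three-page site held in memory instead of fetched over HTTP.
        Map<String, String> site = Map.of(
                "http://example.test/", "<a href=\"sql/\">Learn SQL</a>",
                "http://example.test/sql/", "<a href=\"top.html\">SQL SELECT TOP</a>",
                "http://example.test/sql/top.html", "<h1>The SQL SELECT TOP Clause</h1>");

        String page = follow(URI.create("http://example.test/"),
                List.of("Learn SQL", "SQL SELECT TOP"),
                uri -> site.get(uri.toString()));
        System.out.println(grab(page, "<h1>(.*?)</h1>"));
    }
}
```

The `fetch` parameter is the seam: against a real site it would be replaced by a function that downloads the URL (for example, one wrapping `java.net.http.HttpClient`), while the link-following logic stays the same. Pages that build their links with JavaScript will not work with this approach, which is one reason a browser-like library is less effort.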

Good luck.
 
CEHJ commented:
:)
