Java http client - automated web browsing


I asked this question in the Web Languages section previously, but didn't get any replies.

I decided in any case that I would like to go with a Java implementation if possible, since I am familiar with this language. Does anyone know of any existing libraries or classes that would make something like this easier? In particular, I am looking for classes that handle establishing an HTTP connection, storing cookies, etc. I was thinking about using httpUnit. Is this a good choice?

Original question:

I need to automate browsing of a particular site. That is, I must be able to programmatically download the pages associated with the site for parsing and analysis. Additionally, I need to be able to fill out and submit forms in an automated way, as well as support cookies (the site might require the information in the cookie in order to provide context for certain pages).

There are probably several ways to crack this egg, so where there are several arguably equal solutions, I would prefer the ones that leverage the languages or technologies that I am familiar with: C++, Java, sockets. I have superficial familiarity with HTML, JavaScript, VB etc. This application needs to run on the Win32 platform.

Thanks in advance for your suggestions.

Yes, httpUnit seems like a good match.
travd (author) commented:
Looking at those links, it seems there are three good possibilities: jWebUnit, htmlUnit and httpUnit. Does anyone have any experience with any of these? Is it possible to use these without JUnit? I'm not really running a unit test, but instead trying to have automatic form submission, combined with harvesting and processing the returned data.
I've used httpUnit to do the first part of what you are asking (automatically accessing a web site and then parsing the result to see if I've got the data I want).  It works fine for this.

The WebResponse object that you get back (in response to a WebRequest), allows you to identify Forms, Links, Tables etc. fairly easily.

It can also handle Cookies and you should be able to submit Form data back to the server as necessary.
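
For anyone curious what the cookie and form handling amounts to underneath, here is a minimal JDK-only sketch. To be clear, this is not httpUnit's API: it uses java.net.CookieManager and URLEncoder as stand-ins, and the class name FormSession, the example.com URLs and the field names are all made up for illustration. It assumes Java 10 or later and makes no network calls; it only shows how a cookie from one response is replayed on the next request and how form fields are encoded for a POST body.

```java
import java.io.IOException;
import java.net.CookieManager;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of httpUnit: a cookie jar plus form encoding.
class FormSession {
    // In-memory cookie jar with the JDK's default policy (accept cookies from the original server).
    private final CookieManager cookies = new CookieManager();

    /** Encodes form fields as an application/x-www-form-urlencoded body. */
    static String encodeForm(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            body.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    /** Stores any Set-Cookie headers from a response that came from the given URI. */
    void remember(URI uri, Map<String, List<String>> responseHeaders) throws IOException {
        cookies.put(uri, responseHeaders);
    }

    /** Builds the Cookie header value to send with the next request to the given URI. */
    String cookieHeaderFor(URI uri) throws IOException {
        List<String> values = cookies.get(uri, Map.of()).getOrDefault("Cookie", List.of());
        return String.join("; ", values);
    }

    // Plain entry point: no JUnit runner and no assert methods involved.
    public static void main(String[] args) throws IOException {
        FormSession session = new FormSession();

        // Pretend the login page answered with a session cookie (made-up value).
        session.remember(URI.create("https://example.com/login"),
                Map.of("Set-Cookie", List.of("SESSIONID=abc123; Path=/")));

        // Made-up form fields; LinkedHashMap keeps them in insertion order.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("user", "travd");
        fields.put("query", "open orders");

        System.out.println("POST body:     " + encodeForm(fields));
        System.out.println("Cookie header: "
                + session.cookieHeaderFor(URI.create("https://example.com/account")));
    }
}
```

A library like httpUnit does this bookkeeping for you behind its request and response objects, which is the main reason to prefer it over hand-rolled code like the above.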

httpUnit requires JUnit, since it is effectively an extension to it (but you should look into this too because it's very useful to automatically test your Java applications ;-))

I haven't used jWebUnit or htmlUnit.

travd (author) commented:
OK, so I need to have JUnit, but I just want to run this as a normal application - I don't need to actually run it as a unit test, do I?

> I don't need to actually run it as a unit test

Just a matter of semantics, isn't it?
That's right, just because you have the facilities available to do "testing" doesn't mean you need to use them. The application should just run via a normal "main" method. Just don't use the "assert..." methods ;-)
travd (author) commented:
Thanks for your help, I'll go ahead and use httpUnit. My concern over using the unit test framework was that I didn't want to invoke it using JUnit and have to see the graphical pass/fail GUI or anything like that. I just want this to be a straight Java app.