How to rewrite this code with HttpClient and/or HttpURLConnection instead of HttpUnit?

Hi,
I have the following code, which uses the com.meterware.httpunit package.

How can I rewrite it using HttpClient and/or HttpURLConnection instead of the HttpUnit package?

Note: This code parses the 2nd table in the web page and generates something like:

text1 --> texta
text2 --> textb
text3 --> textc

The source of this page looks like this:

    <tr>
    <td><a href="somepath/check.html#check_del_dir">del_dir</a>
    <td><a href="report.check.log.html#del_dir"><font color=green>pass</font></a>
    <tr>
    <td><a href="somepath/check.html#check_type">type</a>
    <td><a href="report.check.log.html#type"><font color=red>fail</font></a>




    try {
        HttpUnitOptions.setScriptingEnabled(false);
        WebConversation wc = new WebConversation();
        // Check if the URL is valid
        if (checkReportPath.startsWith("http://") && (checkReportPath.indexOf("/check/report.html") > -1)) {
            WebResponse wr = wc.getResponse(checkReportPath);
            // getTables() returns a zero-based array, so [2] is the third top-level table
            WebTable table = wr.getTables()[2];
            String[][] cells = table.asText();

            for (String[] row : cells) {
                // The first cell of each row is followed by " --> "
                boolean column = true;
                for (String cell : row) {
                    if (column) {
                        checkActivity.append(cell).append(" --> ");
                        column = false;
                    } else {
                        checkActivity.append(cell);
                    }
                }
                checkActivity.append("\n");
            }
        }
    }


TolgarAsked:
 
for_yanCommented:
 
for_yanCommented:
I recently did just that.
This is how it was (I now commented it out):

    // HttpUnitOptions.setScriptingEnabled(true);
    // HttpUnitOptions.setExceptionsThrownOnScriptError(false);
    // WebConversation wc = new WebConversation();
    // String urlString = "http://your_target_address.com";
    // WebRequest req = new GetMethodWebRequest(urlString);
    // response = wc.getResponse(req);
    //
    // String result = response.getText();



This is how it is now without HttpUnit:

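    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;
    import java.net.URLConnection;

    String result = "";
    URL my_source = new URL("http://your_target_address.com");
    // Posted as yahoo.openConnection(); corrected in the follow-up comment below
    URLConnection yc = my_source.openConnection();
    BufferedReader in = new BufferedReader(
            new InputStreamReader(yc.getInputStream()));
    String inputLine;

    // Read the response line by line and collect it into one string
    while ((inputLine = in.readLine()) != null) {
        // System.out.println(inputLine);
        result += inputLine + System.getProperty("line.separator");
    }
    in.close();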
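
Since the question asks about HttpURLConnection specifically, here is a minimal sketch of the same fetch with a status check and timeouts. The class name PageFetcher, the method names, the 10-second timeouts and the UTF-8 charset are all assumptions made for illustration, not something from the original code; adjust them to the real page.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the name is illustrative only.
class PageFetcher {

    /** Reads the whole stream into a String, ending every line with '\n'. */
    public static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder result = new StringBuilder();
        // try-with-resources closes the reader (and the stream) even on error
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                result.append(line).append('\n');
            }
        }
        return result.toString();
    }

    /** Fetches a page with a GET request; throws if the status is not 200. */
    public static String fetch(String address) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(10_000); // assumed timeout, in milliseconds
        conn.setReadTimeout(10_000);    // assumed timeout, in milliseconds
        try {
            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status " + status + " for " + address);
            }
            // Assumption: the page is UTF-8 encoded
            return readAll(conn.getInputStream(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }
}
```

Usage would be something like String result = PageFetcher.fetch(checkReportPath); the StringBuilder avoids the repeated string copying that result += does in a loop.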



 
for_yanCommented:
I guess your main concern is how to parse the tables.
HttpUnit does provide some conveniences there, that is true,
but I recently ran into OutOfMemory errors with HttpUnit
after it had been running for some time,
and I had to switch to standard Java classes.

There are a number of HTML parsers around that may help to process
the output and extract the table data; there was recently a question with a list
of such parsers. Let me try to find it.

 
TolgarAuthor Commented:
Sorry, I don't understand why you used yahoo.openConnection in this example.

Thanks,
 
TolgarAuthor Commented:
Yes, I need to parse the second table. If you can find a better parser it would be great.

Thanks,
 
for_yanCommented:
I'm sorry:
yahoo.openConnection()
should of course be changed to
my_source.openConnection()
in this context (I changed the name in one place but not in the other).

I'm still looking for that question with links to many parsers; I should be able to find it.

 
for_yanCommented:


For some reason I cannot find that question (I can never find anything using EE search, I don't know why),
but here are some links to discussions of different parsers:

http://stackoverflow.com/questions/3152138/what-are-the-pros-and-cons-of-the-leading-java-html-parsers

http://www.benmccann.com/dev-blog/java-html-parsing-library-comparison/

Why do you want to move away from HttpUnit? Are you experiencing something similar
to the OutOfMemory issue I had?

 
TolgarAuthor Commented:
No. It is just a design decision. The reason is it may overload the code with some unnecessary stuff. It looks like it was written for unit testing.

 
for_yanCommented:

OK, then you probably don't want to use any parser library and would rather
parse it yourself.
I actually thought the parsing part of HttpUnit was fine, and I would have been happy with it, but I had
this OutOfMemory issue.
 
CEHJCommented:
Why do you want to avoid HttpUnit?
 
TolgarAuthor Commented:
It was just a design decision.

I thought it might be too heavy for just parsing a page's source.

Thanks,
 
objectsCommented:
 
CEHJCommented:
Those are precisely the APIs that HttpUnit uses ;) The only way you're going to get anything faster is to do (probably) more work yourself.

If you're lucky, you might find that a pure Java parser works (adapt the following to the appropriate table tag(s)):

http://www.exampledepot.com/egs/javax.swing.text.html/GetLinks.html
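
To illustrate the idea for tables rather than links, here is a rough sketch using the parser callback from javax.swing.text.html that ships with the JDK. The class name TableExtractor and its extract method are made up for this example, and it assumes the tables are not nested and that the wanted table is addressed by a zero-based index (as with getTables()[2] in the question). Treat it as a starting point, not a drop-in replacement for HttpUnit's WebTable.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper; collects the cell text of one table in the page.
class TableExtractor extends HTMLEditorKit.ParserCallback {
    private final int targetIndex;        // zero-based index of the wanted table
    private int tablesSeen = 0;
    private boolean inTarget = false;
    private final List<List<StringBuilder>> rows = new ArrayList<>();

    TableExtractor(int targetIndex) {
        this.targetIndex = targetIndex;
    }

    @Override
    public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        if (tag == HTML.Tag.TABLE) {
            // Assumption: tables are not nested, so a simple counter is enough
            inTarget = (tablesSeen++ == targetIndex);
        } else if (inTarget && tag == HTML.Tag.TR) {
            rows.add(new ArrayList<>());
        } else if (inTarget && (tag == HTML.Tag.TD || tag == HTML.Tag.TH)) {
            if (rows.isEmpty()) {
                rows.add(new ArrayList<>());
            }
            rows.get(rows.size() - 1).add(new StringBuilder());
        }
    }

    @Override
    public void handleEndTag(HTML.Tag tag, int pos) {
        if (tag == HTML.Tag.TABLE) {
            inTarget = false;
        }
    }

    @Override
    public void handleText(char[] data, int pos) {
        if (!inTarget || rows.isEmpty()) {
            return;
        }
        // Text inside <a> or <font> still belongs to the most recent cell
        List<StringBuilder> row = rows.get(rows.size() - 1);
        if (!row.isEmpty()) {
            row.get(row.size() - 1).append(data);
        }
    }

    /** Returns one line per row, with the cells joined by " --> ". */
    public static String extract(String html, int tableIndex) throws IOException {
        TableExtractor callback = new TableExtractor(tableIndex);
        new ParserDelegator().parse(new StringReader(html), callback, true);
        StringBuilder out = new StringBuilder();
        for (List<StringBuilder> row : callback.rows) {
            List<String> cells = new ArrayList<>();
            for (StringBuilder cell : row) {
                cells.add(cell.toString().trim());
            }
            out.append(String.join(" --> ", cells)).append('\n');
        }
        return out.toString();
    }
}
```

With the page text already in a string, checkActivity.append(TableExtractor.extract(result, 2)) would stand in for the nested loops in the question. For rows with two cells this gives the same "cell --> cell" layout; rows with more cells get a separator between every pair, which differs slightly from the original loop.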
Question has a verified solution.
