Simulate a web browser to browse a web site

Hi. I need to write a Java application that does the following:

- Simulate browsing to a web page that has a login form, and post a given username and password as if I had entered them and pressed Login.
- Then maintain the session so the application can fetch the contents of other pages. This also has to be done by posting form parameters and reading the returned content.


Does anyone have an idea how to do this?

I think Jakarta's HttpClient fits here, but I need some help with how to use it in a case like this.

Thanks in advance to every contributor.
Asked by: tech_lover
 
aozarov commented:
I think Jakarta JMeter is a better fit for this task: http://jakarta.apache.org/jmeter/
The website has a good manual, and there is also a nice tutorial at http://www.onjava.com/lpt/a/3066

Another option is HttpUnit: http://httpunit.sourceforge.net/doc/cookbook.html
HttpUnit is better suited to JUnit-style testing, while JMeter is better suited to load simulation and performance testing.
 
Mig-O commented:
WebClient.java:

/*
 * Created on 08.01.2004
 *
 * To change the template for this generated file go to
 * Window>Preferences>Java>Code Generation>Code and Comments
 */
package de.nurgenial.utils.net.webclient;

import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.Hashtable;

import org.apache.commons.httpclient.HostConfiguration;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpException;
import org.apache.commons.httpclient.HttpMethodBase;
import org.apache.commons.httpclient.HttpState;
import org.apache.commons.httpclient.NameValuePair;
import org.apache.commons.httpclient.methods.EntityEnclosingMethod;
import org.apache.commons.httpclient.methods.GetMethod;
import org.apache.commons.httpclient.methods.PostMethod;

import de.nurgenial.utils.streams.StreamUtils;

/**
 * This class simulates a browser, including its full identification
 * headers. It uses the various protocol clients under
 * de.nurgenial.utils.net.protocols. The browser can accept cookies,
 * follow redirects automatically, use proxies, and send the referrer
 * along with each request.
 */
public class WebClient {
      
      private WebClientConfig config = new WebClientConfig();
      private String previousLocation = null;
      private String currentLocation = "";
      private String currentReferer = null;
    private HttpClient httpClient = new HttpClient();
    private ArrayList cookies = new ArrayList();
    private HttpMethodBase lastMethod;
   
      public void setConfig(WebClientConfig config) {
            this.config = config;
      }

      public WebClientConfig getConfig() {
            return config;
      }
      
      public GetMethod doGet(String location) throws HttpException, IOException  {
        return doGet(location,null);
      }

      public GetMethod doGet(String location, Hashtable parameters) throws HttpException, IOException  {

            previousLocation = currentLocation;
 
            // Make the location absolute
            location = resolvNewLocation(location);
 
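            // Create and configure the request
            GetMethod get = new GetMethod(location);
            get.setRequestHeader("User-Agent","Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031225 Firebird/0.7");
            // Send the referer if one was set via setReferrer()
            if( currentReferer != null ) {
                  get.setRequestHeader("Referer", currentReferer);
            }
            // Pass the parameters, if any, as the query string
            // (this replaces a query string already present in the location)
            if( parameters != null ) {
                  get.setQueryString(convertToNameValuePair(parameters));
            }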
 
            HttpState state = httpClient.getState();      
            HostConfiguration hostConfig = httpClient.getHostConfiguration();        
            hostConfig.setProxy(config.getProxyHost(),config.getProxyPort());
            System.setProperty("org.apache.commons.logging.simplelog.defaultlog","info");
            
            //httpClient.setConnectionTimeout(3000);
            httpClient.setHostConfiguration(hostConfig);
            httpClient.setState(state);
               
            // Open the connection
            httpClient.executeMethod(get);            
 
            // On success, remember the location as the referer for the next request
            previousLocation = currentLocation;
            currentLocation = location;
            
            lastMethod = get;
            return get;
      }
      
      public GetMethod doPost(String location, Hashtable parameters) throws HttpException, IOException  {

            previousLocation = currentLocation;

            // Make the location absolute
            
            location = resolvNewLocation(location);
 
            // Create and configure the request
            PostMethod post = new PostMethod(location);
            post.setFollowRedirects(config.isFollowRedirects());
            post.setRequestHeader("User-Agent","Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031225 Firebird/0.7");
            post.setRequestBody(convertToNameValuePair(parameters));
            
            HttpState state = httpClient.getState();
            HostConfiguration hostConfig = httpClient.getHostConfiguration();
       
            hostConfig.setProxy(config.getProxyHost(),config.getProxyPort());
            System.setProperty("org.apache.commons.logging.simplelog.defaultlog","debug");
            httpClient.setConnectionTimeout(15000);
            httpClient.setHostConfiguration(hostConfig);
            httpClient.setState(state);

            // Open the connection
          int resultcode = httpClient.executeMethod(post);
          
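          // Follow a redirect, if the server sent one with a Location header
          if( config.isFollowRedirects() && resultcode >= 300 && 
                       resultcode < 400 &&
                       post.getResponseHeader("location") != null ) {
              String newLocation = post.getResponseHeader("location").getValue();
              if( resultcode == 302 ) {
                    return doGet(newLocation);
                    
              }
                int redirectRetries = 10;
                while( resultcode >= 300 && 
                       resultcode < 400 && 
                       redirectRetries-- > 0) {
                        currentLocation=newLocation;
                    System.out.println("Redirecting from " + location + " to "+newLocation);  
                      // recycle() resets the method, so the path, header and body
                      // have to be set again afterwards
                      post.recycle();
                      post.setPath( newLocation );
                      post.setRequestHeader("User-Agent","Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031225 Firebird/0.7");
                      post.setRequestBody(convertToNameValuePair(parameters));
                        resultcode = httpClient.executeMethod(post);
                      // Pick up the next target if the server redirects again
                      if( post.getResponseHeader("location") != null ) {
                            newLocation = post.getResponseHeader("location").getValue();
                      }
                }
            }
 
            // On success, remember the location as the referer for the next request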
            previousLocation = currentLocation;
            currentLocation = location;
            
            lastMethod = post;
            return post;
      }
      
      public String getLocation() {
            return currentLocation;            
      }
      
      public void saveLocationAs(String location, String filename) throws HttpException, IOException {
            StreamUtils.saveInputStreamTo(
                doGet(location).getResponseBodyAsStream(),filename);
      }
      
      public InputStream getLocationAsInputStream(String location) throws HttpException, IOException {
            return doGet(location).getResponseBodyAsStream();
      }
      
      public InputStream getLocationAsInputStream(String location, Hashtable parameters) throws HttpException, IOException {
            return doGet(location, parameters).getResponseBodyAsStream();
      }
      
      public InputStream postLocationAsInputStream(String location, Hashtable parameters) throws HttpException, IOException {
        return doPost(location, parameters).getResponseBodyAsStream();
      }
      
      
      /* ---------------------------- */
      
      public void setReferrer(String referer) {
            currentReferer = referer;
      }
      
      public String getResponseHeaderField(String name) {
            return lastMethod.getResponseHeader(name).getValue();
      }
      
      /* -------------------- INTERNALS ----------------------- */
      
      
      private String resolvNewLocation(String newLocation) {
            // Test if newLocation is absolute with protocol
            if( newLocation.indexOf("://")!=-1 ) {
                  return newLocation;
            }
            // Test if newLocation is just absolute on same protocol
            if( newLocation.startsWith("//") ) {
                  return "http:" + newLocation;
            }
            // Test if newLocation is just absolute on same server
            if( newLocation.startsWith("/") ) {
                  return getServerPartFromLocation(currentLocation) + newLocation;
            }
            // return location that is based on relative path
            String currentLocationsPath = getPathFromLocation(currentLocation);
            newLocation = currentLocationsPath + newLocation;      

            if( newLocation.indexOf("://")==-1 ) {
                  newLocation = "http://" + newLocation;
            }
            return newLocation;
      }
      
      private String getPathFromLocation(String location) {
            if( location.indexOf("://")!=-1 ) {
                  if( location.lastIndexOf("/") < 8 ) {
                        return location + "/";
                  } else {
                        return location.substring(0,location.lastIndexOf("/")) + "/";
                  }       
            } else {
                  if( location.indexOf("/") == -1 ) {
                        if( location.equals("") ) {
                              return "";
                        } else {
                              return "http://" + location + "/";
                        }
                  } else {       
                        return "http://" + location.substring(0,location.lastIndexOf("/") )+ "/";
                  }
            }
      }
      
      private String getServerPartFromLocation(String location) {
            if( location.indexOf("://")!=-1 ) {
                  if( location.lastIndexOf("/") < 7 ) {
                        return location + "/";
                  } else {
                        return location.substring(0,location.indexOf("/",7));
                  }       
            } else {
                  if( location.indexOf("/") == -1 ) {
                        if( location.equals("") ) {
                              return "";
                        } else {
                              return "http://" + location + "/";
                        }
                  } else {       
                        return "http://" + location;
                  }
            }
      }
      
      private NameValuePair[] convertToNameValuePair(Hashtable table) {
            NameValuePair[] retVal = new NameValuePair[table.size()];
            // "enum" is a reserved word since Java 5, so the variable is named "keys"
            Enumeration keys = table.keys();
            for( int i=0; i<table.size(); i++ ) {
                  String nextKey = (String)keys.nextElement();                  
                  retVal[i] = new NameValuePair(nextKey, (String)table.get(nextKey));
            }
            return retVal;
      }
      
      /* ---------------------- TESTS ------------------------- */
      
      public static void main(String args[]) throws Exception {
            //testSaveToFile();
            //testProxy();
            //testReferrer();
            testHTTPS();
      }
      
      public static void testSaveToFile() {
            System.out.println("Saving www.Mig-O.de to /home/Mig-O/test.html ...");
            WebClient client = new WebClient();
            WebClientConfig config = client.getConfig();
            try {
                  client.saveLocationAs("134.106.121.92/index.html","/home/Mig-O/test.html");
            } catch( Exception ex ) {
                  System.out.println("Could not save Document: "+ex);
            }    
            System.out.println("done.");
      }
      
      public static void testProxy() {
            System.out.println("Saving www.Mig-O.de to /home/Mig-O/test.html using Proxy 134.106.121.2 ...");
            WebClient client = new WebClient();
            WebClientConfig config = client.getConfig();
            config.setProxy("134.106.121.2:81");
            try {
                  client.saveLocationAs("134.106.121.92/index.html","/home/Mig-O/test.html");
            } catch( Exception ex ) {
                  System.out.println("Could not save Document: "+ex);
            }    
            System.out.println("done.");
      }
      
      public static void testReferrer() {
            System.out.println("Saving www.Mig-O.de to /home/Mig-O/test.html using Referrer www.google.com ...");
            WebClient client = new WebClient();
            client.setReferrer("http://www.google.com");
            try {
                  client.saveLocationAs("134.106.121.92/index.html","/home/Mig-O/test.html");
            } catch( Exception ex ) {
                  System.out.println("Could not save Document: "+ex);
            }    
            System.out.println("done.");
      }
      
      public static void testHTTPS() {
            System.out.println("Saving https://service.gmx.net/de/cgi/nreg?AREA=2&TARIF=0&NT=1 to C:\\test.html ...");
            WebClient client = new WebClient();
            WebClientConfig config = client.getConfig();
            try {
                  client.saveLocationAs("https://service.gmx.net/de/cgi/nreg?AREA=2&TARIF=0&NT=1","C:\\test.html");
            } catch( Exception ex ) {
                  System.out.println("Could not save Document: "+ex);
            }    
            System.out.println("done.");
            
      }
      
      
}
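
The hand-written URL handling in resolvNewLocation above can probably be replaced by the JDK's own resolver. What follows is only a minimal sketch, assuming plain java.net.URI is acceptable for the job. The class name LocationResolver and the example.com URLs are made up for illustration and are not part of the posted code.

```java
import java.net.URI;

public class LocationResolver {

    /**
     * Resolves a possibly relative location against the URL of the
     * current page, using the resolution rules built into java.net.URI.
     */
    public static String resolve(String currentLocation, String newLocation) {
        return URI.create(currentLocation).resolve(newLocation).toString();
    }

    public static void main(String[] args) {
        // Hypothetical base URL, used only to show the four cases
        String base = "http://example.com/app/login.jsp";
        System.out.println(resolve(base, "home.jsp"));                // relative path
        System.out.println(resolve(base, "/index.html"));             // absolute path on the same server
        System.out.println(resolve(base, "//other.example.com/x"));   // same protocol, other server
        System.out.println(resolve(base, "https://secure.example.com/a")); // fully absolute
    }
}
```

Unlike resolvNewLocation, this keeps the scheme of the current page for "//host/path" locations instead of always assuming http.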
 
Mig-O commented:
Download the whole source code (12 classes) from here: www.Mig-O.de/webscraper.zip
It contains my full web scraper, which uses Apache's HttpClient and provides functions for filling out forms, extracting links, navigating through the document, parsing it with JTidy, and so on...

Should be more than enough for a short "how to do it?" :)
 
edwardiii commented:
Hi, tech lover.

Regarding logging in programmatically using Jakarta's HttpClient, the following link might be helpful.  It's found in the Sample Code section as "FormLoginDemo.java" (http://svn.apache.org/viewcvs.cgi/jakarta/commons/proper/httpclient/branches/HTTPCLIENT_2_0_BRANCH/src/examples/FormLoginDemo.java?view=markup).
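
For comparison, the same login-then-browse flow can be sketched with the HTTP client built into the JDK (Java 11 and later). This is a swapped-in technique, not Jakarta HttpClient and not what FormLoginDemo.java does. The class name FormLoginSketch, the form field names "username" and "password", and whatever URLs get passed in are assumptions; the real login form may use different names.

```java
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class FormLoginSketch {

    // One client for the whole session: its CookieManager stores the cookies
    // set by the login response and sends them with every later request.
    private final HttpClient client = HttpClient.newBuilder()
            .cookieHandler(new CookieManager(null, CookiePolicy.ACCEPT_ALL))
            .followRedirects(HttpClient.Redirect.NORMAL)
            .build();

    /** Encodes form fields as an application/x-www-form-urlencoded body. */
    public static String formEncode(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            body.add(URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    /** Posts the form fields to the given URL and returns the response body. */
    public String post(String url, Map<String, String> fields) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formEncode(fields)))
                .build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    /** Logs in, then fetches a second page within the same cookie session. */
    public String loginThenFetch(String loginUrl, String pageUrl,
                                 String user, String password) throws Exception {
        Map<String, String> credentials = new LinkedHashMap<>();
        credentials.put("username", user);     // field names are assumptions
        credentials.put("password", password);
        post(loginUrl, credentials);

        HttpRequest page = HttpRequest.newBuilder(URI.create(pageUrl)).GET().build();
        return client.send(page, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```

Pages that are themselves reached by submitting a form can be requested with post(...) on the same FormLoginSketch instance, so the session cookie is reused.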
 
tech_lover (author) commented:
Thanks Mig-O. It worked for me.