I need to write a script that automates a login and then grabs the information from a page once logged in.

Posted on 2004-04-15
Last Modified: 2010-04-06
Scenario:

I am building a screen-scraping program that will gather information from a site every day. The site does not have an RSS feed, so I cannot gather the information that way, but its pages are fairly static, which makes screen scraping practical. I have permission to scrape the site and a valid user name and password; my problem is that I do not know how to script the login.

I need a script in VB.NET or VBScript that will basically do the following:

1. Open a web browser
2. Go to http://www.HowDoIDoThis.com/Login.asp // Arbitrary page
3. Somehow type in a given user name and password
4. Submit the form
5. Once logged in, redirect to http://www.HowDoIDoThis.com/Information.asp // Arbitrary page
6. Grab all of the html from that page.

The problem is that Information.asp is protected: users who have not logged in cannot go to it directly. You have to log in first, which makes sense, because the site wants only paying customers (which I am) to view the content. I just need to know how to do the login part!

Thanks in advance!

Flyin
Question by:flyin69
Accepted Solution

by: Darix
Try using or adapting this class (it is C#, but the same .NET classes are available from VB.NET):

using System;
using System.Configuration;
using System.Net;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

namespace Tamas
{
    // NOTE: PostWebRequest and ContentType are helper types that are
    // not included in this post.
    public class AisPoster
    {
        private AisPoster() {}

        private string baseUrl;
        private string user;
        private string password;
        private string xml;
        private string regex1;

        public AisPoster(string user, string password, string xml)
        {
            // The site URL and the response-parsing regex are read from
            // the appSettings section of the application's config file.
            System.Collections.Specialized.NameValueCollection config
                = System.Configuration.ConfigurationSettings.AppSettings;
            this.baseUrl = config["baseUrl"];
            this.regex1 = config["regex1"];
            if (!this.baseUrl.EndsWith("/"))
                this.baseUrl = this.baseUrl + "/";
            this.user = user;
            this.password = password;
            this.xml = xml;
        }

        /// <summary>
        /// Posts data to AIS site and returns response stream.
        /// </summary>
        /// <returns></returns>
        public System.IO.Stream Post()
        {
            // Step 1: log in by posting the login form fields.
            string loginUrl = baseUrl + "main/login.php";
            PostWebRequest pReq = new PostWebRequest(loginUrl);
            pReq.AddData("Action", "Login");
            pReq.AddData("Sub_User", user);
            pReq.AddData("Sub_Pass", password);

            string response1 = pReq.GetResponseString(pReq.GetResponse());

            // Step 2: upload the XML as a multipart form post,
            // reusing the same request object.
            pReq.Url = baseUrl + "import/import_priimti.php";
            pReq.ContentType = ContentType.Multipart;
            pReq.AddData("Dokumentas", "DAA");
            pReq.AddData("Tikslas", "duomenu_baze");
            pReq.AddData("Formatas", "xml");
            pReq.AddData("userfile", @"c:\upload.xml", xml);

            string response = pReq.GetResponseString(pReq.GetResponse());

            // Extract the named group "value" from the upload response;
            // it is sent as "Filename" in the requests below.
            Regex regex = new Regex(regex1, RegexOptions.Compiled);
            Match m = regex.Match(response);
            if (!m.Success)
                throw new Exception("Response error");
            response = m.Result("${value}");

            // Step 3: start the import of the uploaded file.
            pReq.Url = baseUrl + "aapjp/pj_frm_daa.php";
            pReq.AddData("Action", "import");
            pReq.AddData("Filename", response);
            pReq.AddData("Tikslas", "duomenu_baze");
            pReq.AddData("Formatas", "xml");
            pReq.ContentType = ContentType.NonMultipart;
            string response2 = pReq.GetResponseString(pReq.GetResponse());

            // Step 4: request the import page and return its raw stream.
            // The caller is responsible for reading and closing it.
            pReq.Url = baseUrl + "import/import_daa.php";
            pReq.AddData("DokNr", "");
            pReq.AddData("Action", "");
            pReq.AddData("Filename", response);
            pReq.AddData("Tikslas", "duomenu_baze");
            pReq.AddData("Formatas", "xml");
            pReq.ContentType = ContentType.NonMultipart;
            return pReq.GetResponse().GetResponseStream();
        }
    }
}
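
The class above depends on a PostWebRequest helper that is not shown, so here is a minimal self-contained sketch of just the login-then-fetch part of the question, using the standard HttpWebRequest class. It is not the helper's actual implementation. The class name LoginScraper, its method names, and the form field names "username" and "password" are assumptions made up for illustration; the real field names have to match the input names in the login form's HTML. The idea it shows is that every request shares one CookieContainer, so the session cookie set by the login response is sent along when the protected page is requested.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper, for illustration only. Possible usage:
//   LoginScraper scraper = new LoginScraper();
//   scraper.Login("http://www.HowDoIDoThis.com/Login.asp",
//       LoginScraper.BuildFormBody("username", "me", "password", "secret"));
//   string html = scraper.GetPage("http://www.HowDoIDoThis.com/Information.asp");
public class LoginScraper
{
    // One cookie container shared by every request made through this object.
    private readonly CookieContainer cookies = new CookieContainer();

    // Builds a percent-encoded form body from alternating names and values.
    public static string BuildFormBody(params string[] namesAndValues)
    {
        if (namesAndValues.Length % 2 != 0)
            throw new ArgumentException("Expected name/value pairs.");

        StringBuilder body = new StringBuilder();
        for (int i = 0; i < namesAndValues.Length; i += 2)
        {
            if (body.Length > 0)
                body.Append('&');
            body.Append(Uri.EscapeDataString(namesAndValues[i]));
            body.Append('=');
            body.Append(Uri.EscapeDataString(namesAndValues[i + 1]));
        }
        return body.ToString();
    }

    // POSTs the login form. The response body is read and discarded;
    // any cookies the server sets are kept in the shared container.
    public void Login(string loginUrl, string formBody)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(loginUrl);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.CookieContainer = cookies;

        byte[] data = Encoding.UTF8.GetBytes(formBody);
        request.ContentLength = data.Length;
        using (Stream output = request.GetRequestStream())
        {
            output.Write(data, 0, data.Length);
        }

        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            reader.ReadToEnd();
        }
    }

    // GETs a page, sending the cookies collected so far, and returns its HTML.
    public string GetPage(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.CookieContainer = cookies;

        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

This only works when the site tracks the login with a cookie and a plain form post. If the login form also carries hidden fields, they would need to be read from the login page first and included in the form body.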