I need to write a script that automates a login process and then grabs the information from a page once logged in.


I am making a screen scraping program that will gather information from a site every day. The site does not have an RSS feed, so I cannot gather the information that way, but its pages are fairly static, so screen scraping should work. I have proper permission to screen scrape and I have a valid user name and password, but my problem is that I do not know how to script a login.

I need a script in VB.NET or VBScript that will basically do the following.

1. Open a web browser
2. Go to http://www.HowDoIDoThis.com/Login.asp // Arbitrary page
3. Somehow type in a given user name and password
4. Submit the form
5. Once logged in, redirect to http://www.HowDoIDoThis.com/Information.asp // Arbitrary page
6. Grab all of the html from that page.

My problem is that the Information.asp page is protected: users who have not logged in cannot go directly to it. You have to log in first, which makes sense because they want only their paying customers to view the content. I am a paying customer, so I just need to know how to do this bit!

Thanks ahead of time!

Darix commented:
Try using or adapting this class:

using System;
using System.Configuration;
using System.Net;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

namespace Tamas
{
      public class AisPoster
      {
            private AisPoster(){}

            private string baseUrl;
            private string user;
            private string password;
            private string xml;
            private string regex1;

            public AisPoster(string user,string password,string xml)
            {
                  System.Collections.Specialized.NameValueCollection config
                        = System.Configuration.ConfigurationSettings.AppSettings;
                  this.user=user;
                  this.password=password;
                  this.xml=xml;
                  // baseUrl and regex1 are presumably read from config;
                  // the setting names are not shown in this post.
            }

            /// <summary>
            /// Posts data to AIS site and returns response stream.
            /// </summary>
            /// <returns></returns>
            public System.IO.Stream Post()
            {
                  // PostWebRequest is a custom helper class (not part of the
                  // .NET Framework); its source is not included in this post.
                  string loginUrl=baseUrl+"main/login.php";
                  PostWebRequest pReq=new PostWebRequest(loginUrl);

                  // First request: load the login page.
                  string response1=pReq.GetResponseString(pReq.GetResponse());

                  // Second request: check the response against the expected pattern.
                  string response=pReq.GetResponseString(pReq.GetResponse());
                  System.Text.RegularExpressions.Regex regex=new Regex(regex1);
                  Match m=regex.Match(response);
                  if (!m.Success)
                        throw new Exception("Response error");

                  string response2=pReq.GetResponseString(pReq.GetResponse());

                  return pReq.GetResponse().GetResponseStream();
            }
      }
}
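
If the class above is more than you need, the same idea (post the login form, keep the session cookie, then request the protected page) can be sketched with the built-in HttpClient. This is only a minimal sketch under assumptions, not a drop-in solution: the class name LoginScraper and the form field names "username" and "password" are made up for illustration, so check the name attributes of the inputs in the real login form and substitute them.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public static class LoginScraper
{
    // Builds the fields sent with the login POST.
    // ASSUMPTION: "username" and "password" are hypothetical field names;
    // replace them with the names used by the real login form.
    public static Dictionary<string, string> BuildLoginFields(string user, string password)
    {
        return new Dictionary<string, string>
        {
            { "username", user },
            { "password", password }
        };
    }

    // Logs in, then downloads the protected page using the same cookies.
    public static async Task<string> FetchProtectedPageAsync(
        string loginUrl, string protectedUrl, string user, string password)
    {
        // The CookieContainer stores the session cookie set by the login
        // response and sends it back on the following request.
        var handler = new HttpClientHandler
        {
            CookieContainer = new CookieContainer(),
            UseCookies = true
        };

        using (var client = new HttpClient(handler))
        {
            using (var form = new FormUrlEncodedContent(BuildLoginFields(user, password)))
            using (HttpResponseMessage loginResponse = await client.PostAsync(loginUrl, form))
            {
                // Throws on HTTP error status codes only; it does not prove
                // that the site accepted the credentials.
                loginResponse.EnsureSuccessStatusCode();
            }

            // Same client, same cookies: request the page behind the login.
            return await client.GetStringAsync(protectedUrl);
        }
    }
}
```

With the URLs from the question, the call would look like `string html = await LoginScraper.FetchProtectedPageAsync("http://www.HowDoIDoThis.com/Login.asp", "http://www.HowDoIDoThis.com/Information.asp", user, password);`. Many sites answer a failed login with a normal 200 page, so it is worth checking the returned HTML for something that only appears when logged in. If the login form also contains hidden inputs, those usually have to be posted along with the user name and password.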