Browser Bot -- Automate Browsing Sequences With C++ (PART TWO)

DanRollins
In PART ONE of this series, we created a simple dialog-based program that opens a web page and displays it... all in preparation for filling in some text input boxes and clicking a [Submit] button.

In this article, PART TWO, we will access the web browser control and "pull the strings" on the page.  We'll automate the task of logging in to a specific website.

In PART THREE, we will automatically surf to a specific page in the site and collect some information from it.

[Figure: Final Result of 3-part Article]

Start out by following the steps from PART ONE.  You are working in Visual Studio 2008 and have created a project named WebRobot.  You have added a control-type variable and some event handlers to your main window (a CDialog-derived object named CWebRobotDlg).

1. Getting Element IDs From the Source Page

Obtain the id values of the controls we want to manipulate.

Use a browser to get to the desired web page.  Right-click in the window and select View Source. On the example page, https://login.yahoo.com/config/login , I found these pieces of HTML:

   <input name="login"  id="username" value="" type="text" ...etc...
   
   <input name="passwd" id="passwd" value="" type="password" ...etc...

   <input type="submit" name=".save" value="Sign In">

Our task now is to fill in the two input boxes with text and then click the [Sign In] button.
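
If the page source is long, hunting for the input tags by eye gets tedious.  Here is a minimal sketch of a standalone helper that lists the name and id of each input tag in a chunk of saved HTML.  The names InputInfo, AttrValue, and ListInputs are made up for this illustration and are not part of the WebRobot project.  It assumes lowercase tag names, double-quoted attribute values, and no spaces around the = sign, so treat it as a convenience and not as a real HTML parser.

```cpp
#include <string>
#include <vector>

// Identifying attributes of one <input> tag (empty string if absent).
struct InputInfo {
    std::string name;
    std::string id;
};

// Returns the value of attr="..." inside the text of one tag, or "" if
// the attribute is not present.  The leading space in the search key
// keeps "id" from matching inside a longer attribute name.
static std::string AttrValue(const std::string& tag, const std::string& attr)
{
    const std::string key = " " + attr + "=\"";
    std::string::size_type pos = tag.find(key);
    if (pos == std::string::npos) return "";
    pos += key.size();
    std::string::size_type end = tag.find('"', pos);
    if (end == std::string::npos) return "";
    return tag.substr(pos, end - pos);
}

// Scans HTML text and collects name/id for every <input ...> tag.
std::vector<InputInfo> ListInputs(const std::string& html)
{
    std::vector<InputInfo> found;
    std::string::size_type pos = 0;
    while ((pos = html.find("<input", pos)) != std::string::npos) {
        std::string::size_type end = html.find('>', pos);
        if (end == std::string::npos) break;   // unterminated tag: stop
        const std::string tag = html.substr(pos, end - pos + 1);
        InputInfo info;
        info.name = AttrValue(tag, "name");
        info.id   = AttrValue(tag, "id");
        found.push_back(info);
        pos = end + 1;
    }
    return found;
}
```

Run against the three tags shown above, a helper like this would report that the [Sign In] button has a name (".save") but no id, which is why that element is looked up by name later in this article.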

2. Write a Function to Get a DOM Pointer

Let's add a utility function to our program that will return an element pointer, given its HTML ID value.

Open the WebRobotDlg.h header file and add this to the "public" section:
void* ElemFromID( LPWSTR szID, IID nTypeIID );


Now add this function to the end of the WebRobotDlg.cpp file:
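#include <mshtml.h>
#include <atlbase.h>
#include <comdef.h>
//-------------------------------------------------------
// Returns a desired type of IHTMLElement pointer
// to provide access to the object with a specific ID.
// Returns NULL on failure.  On success, the caller owns
// the pointer and should Release() it when finished.
//
void* CWebRobotDlg::ElemFromID( LPWSTR szID, IID nTypeIID )
{
    // get_Document() hands back an IDispatch (NULL until a page has loaded)
    CComPtr<IDispatch> pDocDisp;
    pDocDisp.Attach( m_ctlBrowser.get_Document() );
    CComQIPtr<IHTMLDocument2> pDoc( pDocDisp.p );
    if ( !pDoc ) return( NULL );

    CComPtr<IHTMLElementCollection> pAll;
    HRESULT hr= pDoc->get_all( &pAll );
    if ( FAILED(hr) || !pAll ) return( NULL );

    CComVariant vElement( szID ); // the id or name of the control
    CComVariant vIndex(0,VT_I4);  // 0 (presume it's not a collection)

    CComPtr<IDispatch> pDisp;
    hr= pAll->item(vElement,vIndex,&pDisp);
    if ( FAILED(hr) || !pDisp ) return( NULL ); // no such element

    void* pElement= NULL;   // will coerce to desired type later
    hr= pDisp->QueryInterface( nTypeIID,(void**)&pElement);

    return( SUCCEEDED(hr) ? pElement : NULL );
}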


That source code there, dear reader, is the core of this entire operation.  In order to manipulate the DOM (Document Object Model), we need to:

1) Get a pointer to the DOM (here, we get it from the ActiveX object),
2) Get a collection of all of the elements on the page,
3) Get one of them from that list, and
4) Get an interface to the right type of object so that we can call its member functions.

The first tricky part is accessing the "all" collection.  It is an object of type
   IHTMLElementCollection*
We could enumerate through this collection if we needed to, but in this case, we know the name or id of each of the objects we want ("username", "passwd", and ".save" ).

The next tricky part is locating the desired item in the returned collection.  We need to use the pAll->item() method, and it expects two VARIANT parameters.  The first is obvious -- the id of the element (we can also use the name of the element if we don't have its id).  The second VARIANT parameter is more obscure.  If you pass in an id that is used by several elements, then you would use the second parameter to select from among them.  We happen to know that on this particular web page, there is only one of each of the elements of interest, so in this case, we can set the second parameter to a value of 0.
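
To make the role of that second parameter concrete, here is a plain C++ model of the selection rule as described above.  It does not call MSHTML at all: Elem and FindByIdOrName are stand-ins invented for this illustration, and the real item() method works with VARIANTs and IDispatch pointers instead.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for a page element: only the two attributes
// that matter for the lookup.
struct Elem {
    std::string id;
    std::string name;
};

// Model of the rule: consider every element whose id OR name equals
// 'key', in document order, and return the one at position 'index'
// among those matches.  Returns NULL when there is no such match.
const Elem* FindByIdOrName( const std::vector<Elem>& all,
                            const std::string& key,
                            std::size_t index )
{
    std::size_t seen= 0;
    for ( std::size_t i= 0; i < all.size(); ++i ) {
        if ( all[i].id == key || all[i].name == key ) {
            if ( seen == index ) return &all[i];
            ++seen;
        }
    }
    return NULL;
}
```

In this model, a page with exactly one element called "username" always yields that element at index 0, which is the situation on the login page.  If two elements shared a name, index 1 would pick the second of them.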

That handy little ElemFromID() function will be used three times in our automated login sequence, and it will be used whenever we need to get an interface to a particular HTML element object.

3. Write the 'Real' DoLogin Function

So now we can pound out the code for the [Do LogIn] button we added in PART ONE.  
Replace the existing function with this:
void CWebRobotDlg::OnBnClickedDologin()
{
    void* pElem= ElemFromID(L"username", IID_IHTMLInputElement) ;
    IHTMLInputElement* pTextBoxUser= (IHTMLInputElement*)pElem;
    if ( pTextBoxUser == NULL ) return;  // element not available

    bstr_t sUser( L"MyUserName" );  // <<<-- your login name here

    pTextBoxUser->put_value( sUser ); // populate a text box!
    pTextBoxUser->Release();          // done with this interface

    pElem= ElemFromID(L"passwd", IID_IHTMLInputElement) ;
    IHTMLInputElement* pTextBoxPswd= (IHTMLInputElement*)pElem;
    if ( pTextBoxPswd == NULL ) return;

    bstr_t sPswd( L"MyPassword" );   // <<<-- your login password here

    HRESULT hr= pTextBoxPswd->put_value( sPswd );
    pTextBoxPswd->Release();

    //------------------------------ now click the "Sign In" button
    pElem= ElemFromID(L".save", IID_IHTMLElement);
    IHTMLElement* pSubmit= (IHTMLElement*)pElem;
    if ( pSubmit == NULL ) return;

    hr= pSubmit->click(); // GO!
    pSubmit->Release();
}


Of course, I've used phony credentials in the sUser and sPswd variables.  You need to put valid values in there.

Note that in the calls to our ElemFromID() function, we pass in the name of the desired element and an "IID" value.  Those are constants that are defined in the <mshtml.h> header.  All we really need to know about them (for now) is that the spelling is the same as the Interface name, but with IID_ inserted at the front.

How do we know which IID value and which interface pointer type to use?  It boils down to experience and knowledge of the DOM.  You look at the MSDN documentation for an element interface and see if it does what you need.  For instance, there is no click() function for IHTMLInputElement but there is one for IHTMLElement.  So I had to use the latter.

4. Try It Out!

Build the program and run it.  When you click the [Do Login] button, the program fills in both input boxes and then clicks [Sign In]; the browser control processes that click and submits the form to the host.  If your name and password are valid, the host will log you in to the site and advance you to a new page.

Note:  It's worthwhile to comment out the pSubmit->click() line and run the program one time to try the function without actually submitting the form.  That way, you can visually verify that the two input boxes do indeed get filled in by the earlier lines of code.

Review:

We saw how to access Internet Explorer's DOM by...
* Using the get_Document() function of the Web Browser ActiveX control and...
* Obtaining a pointer to the all collection and...
* Using the item() function to select a desired element by id or name.

We used IHTMLInputElement pointers and the put_value() function to automatically enter some text into a standard input box and a password input box.

We used an IHTMLElement pointer and its click() function to automatically click a button and (in this case) submit the form to the host.

In PART THREE, we will continue with our web bot development by navigating to a specific web page and pulling some specific information from it.

References:
Interfaces and Scripting Objects
http://msdn.microsoft.com/en-us/library/aa741322(VS.85).aspx

IHTMLDocument2 Interface
http://msdn.microsoft.com/en-us/library/aa752574(VS.85).aspx

IHTMLElementCollection Interface
http://msdn.microsoft.com/en-us/library/aa703928(VS.85).aspx

IHTMLElement Interface
http://msdn.microsoft.com/en-us/library/aa752279(VS.85).aspx

IHTMLInputElement Interface
http://msdn.microsoft.com/en-us/library/aa703817(VS.85).aspx

Comments (3)

Subrat (C++ windows/Linux), Software Engineer

Commented:
Really a nice article. It would be great if you wrote another article using C++ Win32.

Commented:
1. Maybe here is a small mistake:
void* ElemFromID( LPCSTR szID, long nTypeIID );

Seems like it should be:
void* ElemFromID( LPCWSTR szID,  IID nTypeIID );

2. for GMail these input elements are L"email", L"passwd" and L"singIn".

My "yes" is above.

Author

Commented:
Indeed, the header line in step 2 was a mismatch for the subsequent CPP file code.  Thanks for pointing that out.  I've corrected the text.
