Web based program


I have a question regarding a web based application in MFC C++.  I am wondering if it is possible to program an application in Visual C++ that would interact with a specific web page on the internet.

To further explain, I would like to create a program that would be able to point to a specific edit box on a web page, and be able to set data in and retrieve data from this box just as you would with an MFC C++ edit box.  Also, if there are multiple dialog buttons on this web page, I would like the program to be able to differentiate between them, and 'click' on the appropriate button when the program calls it.

Any ideas on how to do this would be appreciated.

DanRollins Commented:
1) For an obscure reason, this is called "screen scraping."   You need to know what the page looks like and hope that the provider does not alter the layout drastically.  You download the page using any of several techniques, then search the text for some distinctive text.  Having found that, you know (for instance) that the next 4 characters are the ticker symbol, the next 6 characters are the current price, and the next 8 characters are the date and time that the price was in effect, etc.
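To make the fixed-offset idea concrete, here is a minimal sketch in plain standard C++ (no MFC needed). The marker text, the Quote struct, the ExtractQuote name and the 4/6/8 field widths are all hypothetical, taken from the "for instance" above; a real page will need its own marker and widths.

```cpp
#include <cstddef>
#include <string>

// Hypothetical record layout, assumed for illustration only:
// marker text, then a 4-char ticker, a 6-char price, an 8-char time stamp.
struct Quote {
    bool found;
    std::string ticker;
    std::string price;
    std::string stamp;
};

// Searches pageText for marker and slices the fixed-width fields after it.
// Returns found == false if the marker is missing or the text is too short.
Quote ExtractQuote(const std::string& pageText, const std::string& marker)
{
    const std::size_t kTicker = 4, kPrice = 6, kStamp = 8;

    Quote q;
    q.found = false;

    std::size_t pos = pageText.find(marker);
    if (pos == std::string::npos)
        return q;                       // layout changed or wrong page

    pos += marker.size();               // first character after the marker
    if (pageText.size() - pos < kTicker + kPrice + kStamp)
        return q;                       // page was truncated

    q.ticker = pageText.substr(pos, kTicker);
    q.price  = pageText.substr(pos + kTicker, kPrice);
    q.stamp  = pageText.substr(pos + kTicker + kPrice, kStamp);
    q.found  = true;
    return q;
}
```

Checking the found flag before using the fields matters here, because the provider can change the layout at any time.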

One way to programmatically get the text of a page is to use code like this in the above example:

     MSHTML::IHTMLElementPtr pBody= pDoc->Getbody();
     bstr_t bstr= pBody->GetinnerText();
     CString s= (LPCSTR)bstr;

     int nOffset= s.Find(...etc...)

Another way is to use the lower-level CHttpXxxx classes (CHttpConnection, CHttpFile, and so on).  See the MFC 'Tear' sample (it 'tears off a page' from the Inet).


2) ...select specific dialog buttons and check boxes on a web page.

The 'all' collection will have an item for each button.  The MSHTML::IHTMLOptionButtonElementPtr interface provides a Putchecked function for setting radio buttons.

The MSHTML::IHTMLElementPtr interface provides a click() fn that simulates clicking on the element.

-- Dan
>> I am wondering if it is possible to program an application in Visual C++ that would interact with a specific web page on the internet.


Do you have a specific page in mind?  You need to analyze the page and determine how it interacts with the server.  Then you just open the URL and use either GET or POST to pass object data to the server.

Reading information is a bit more complicated, since you must parse the HTML that is returned by the server, but if the page is static in its presentation, it can be done.
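As a rough illustration of the parsing side, here is a small sketch in plain standard C++ that reduces returned HTML to its visible text so it can be searched. StripTags is a hypothetical helper name, and the approach is deliberately naive: it assumes no '<' or '>' characters appear inside attribute values, scripts, or comments, and it does not decode entities such as &amp;.

```cpp
#include <cstddef>
#include <string>

// Naive tag stripper: copies every character that lies outside a <...> pair.
// Assumption: tags are well formed and never contain a literal '>' in an
// attribute value.  Good enough for a static page, not a real HTML parser.
std::string StripTags(const std::string& html)
{
    std::string text;
    bool inTag = false;

    for (std::size_t i = 0; i < html.size(); ++i) {
        const char c = html[i];
        if (inTag) {
            if (c == '>')
                inTag = false;          // end of the current tag
        } else if (c == '<') {
            inTag = true;               // start of a tag, skip its contents
        } else {
            text += c;                  // ordinary visible character
        }
    }
    return text;
}
```

With the tags gone, an ordinary text search can locate the value you are after.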
MFC provides some very powerful tools to do what you want.  Let's get started:

1) Use the AppWizard to create a dialog-based app

2) Add an ActiveX Control: Microsoft WebBrowser

3) Click it and press Ctrl+W to bring up the ClassWizard.  Have the ClassWizard create a 'Control' category member variable.  Let it automatically generate the file webbrowser2.cpp.  Name the control m_ctlBrowser.

4) Add a button to the dialog.   Double-click it and create this handler:

        m_ctlBrowser.Navigate("http://www.yahoo.com", 0,0,0,0 );

You now have a program that hosts a web browser control and will download and display specified page.  Play around with it.  If you want, you can examine the ClassView pane to see some of the functions available.  Also the ClassWizard (Ctrl+W) will let you easily provide your own handlers for some events.  For instance, with a couple of clicks, you can provide a handler for "BeforeNavigate2" and get a look at each URL before it is displayed.  So you can either stop it or redirect it or write your own code to handle the click of a hyperlink in a custom way.

Next: how to access the elements in the HTML document.

-- Dan
How to access the HTML Document Object Model:

Add these lines to StdAfx.h:

#pragma warning(disable : 4192)
#pragma warning(disable : 4049)
#pragma warning(disable : 4146)
//--------- note: For Win9x use Windows instead of winnt  
#import "c:\winnt\system32\mshtml.tlb"
#pragma warning(default: 4192)
#pragma warning(default: 4049)
#pragma warning(default: 4146)

(I don't know why these warnings occur, but an MSDN technote -- Q231931 -- says they are benign)

Now add a second button to your dialog.  Double-click it and add this handler code:

//-------------------------- IDC_BUTTON2 Handler
void CMyDlg::OnButton2()
{
     // Get the document and its collection of all elements
     MSHTML::IHTMLDocument2Ptr pDoc= m_ctlBrowser.GetDocument();
     MSHTML::IHTMLElementCollectionPtr pAllElems= pDoc->all;

     // int nCnt= pAllElems->length; // eyeball check.  It WORKS!

     // Look up the text-input element named "p" and set its value
     MSHTML::IHTMLInputTextElementPtr pTextInput= pAllElems->item("p");
     pTextInput->value= "here is some input!";
}

This code places some text into a text-input element named "p".  I found the ID of this element by examining the source code of the yahoo.com page.

When you run the program, click button 1, wait for the page to load, and then click button 2.  In your final app, you need to be careful to wait until the page loads before trying to access the DOM.  Typically, you must add a handler for the DocumentComplete event, and do things like this only when it gets fired.
You may get confused by all of the IHTMLThis and IHTMLThat refs.  There are scads of documentation on the methods and attributes.  Here is a good starting point for seeing what elements are available:


It is also in MSDN, in the Index (just type in IHTML and start looking around).

Note that this is not the only way to do some of what you want to do.  For instance, you can set up all of the form data and do a POST directly to a web host without needing to actually load the page.  In this method, you are basically 'pretending' that you have opened the page, filled in all of the inputs, and clicked the [Submit] button.  An example of this is in an EE Question I have answered before:

   (sorry, EE search is broken again)
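In the meantime, here is a rough sketch in plain standard C++ of the part that is the same whichever HTTP API you use: building the form data for the POST. UrlEncode and BuildFormBody are hypothetical helper names, and the field name "p" is just the one from the yahoo.com example above. The sketch builds only the body text; actually sending it (for instance with CHttpFile::SendRequest and a "Content-Type: application/x-www-form-urlencoded" header) is left out.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Encodes one name or value for an application/x-www-form-urlencoded body:
// letters, digits, '-', '_' and '.' pass through, space becomes '+',
// everything else becomes %XX.
std::string UrlEncode(const std::string& s)
{
    static const char kHex[] = "0123456789ABCDEF";
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        const bool safe = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                          (c >= '0' && c <= '9') ||
                          c == '-' || c == '_' || c == '.';
        if (safe) {
            out += static_cast<char>(c);
        } else if (c == ' ') {
            out += '+';
        } else {
            out += '%';
            out += kHex[c >> 4];
            out += kHex[c & 0x0F];
        }
    }
    return out;
}

// Joins name=value pairs with '&', as a browser does when a form is submitted.
std::string BuildFormBody(
    const std::vector<std::pair<std::string, std::string> >& fields)
{
    std::string body;
    for (std::size_t i = 0; i < fields.size(); ++i) {
        if (i > 0)
            body += '&';
        body += UrlEncode(fields[i].first);
        body += '=';
        body += UrlEncode(fields[i].second);
    }
    return body;
}
```

The field names must match the name attributes of the inputs in the page's form, which you can find by viewing the page source.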

-- Dan
PastorDwayne (Author) Commented:

The comments that you made were terrific...  I tested the code that you provided and it works great!

I just have a few more questions in regards to this:

1.  You gave a good example of inputting text into a web page; how would you go about retrieving text from a web page?  For example, I would like to retrieve stock market data for a specific stock from a web page, and plug the stock price and other data into a CString.

2.  I am also wondering about how to select specific dialog buttons and check boxes on a web page.

thanks very much for your help.
Question has a verified solution.
