Web based program

Posted on 2002-03-03
Last Modified: 2008-02-01

I have a question regarding a web based application in MFC C++.  I am wondering if it is possible to program an application in Visual C++ that would interact with a specific web page on the internet.

To further explain, I would like to create a program that can point to a specific edit box on a web page and set and retrieve its data, just as you would with an MFC C++ edit box.  Also, if there are multiple buttons on this web page, I would like the program to be able to differentiate between them and 'click' the appropriate button when the program calls for it.

Any ideas on how to do this would be appreciated.

Question by:PastorDwayne
LVL 32

Expert Comment

ID: 6837414
>> I am wondering if it is possible to program an application in Visual C++ that would interact with a specific web page on the internet.


Do you have a specific page in mind?  You need to analyze the page and determine how it interacts with the server.  Then you just open the URL and use either GET or POST to pass the form data to the server.

Reading information is a bit more complicated, since you must parse the HTML that is returned by the server, but if the page is static in its presentation it can be done.
LVL 49

Expert Comment

ID: 6837904
MFC provides some very powerful tools to do what you want.  Let's get started:

1) Use the AppWizard to create a dialog-based app

2) Add an ActiveX control: Microsoft WebBrowser

3) Click it and press Ctrl+W to bring up the ClassWizard.  Have the ClassWizard create a 'Control' category member variable.  Let it automatically generate the file webbrowser2.cpp.  Name the control m_ctlBrowser.

4) Add a button to the dialog.   Double-click it and  create this handler:

        m_ctlBrowser.Navigate("", 0,0,0,0 );

You now have a program that hosts a web browser control and will download and display the specified page.  Play around with it.  If you want, you can examine the ClassView pane to see some of the functions available.  Also, the ClassWizard (Ctrl+W) will let you easily provide your own handlers for some events.  For instance, with a couple of clicks, you can provide a handler for "BeforeNavigate2" and get a look at each URL before it is displayed.  You can then stop it, redirect it, or write your own code to handle the click of a hyperlink in a custom way.

Next: how to access the elements in the HTML document.

-- Dan
LVL 49

Expert Comment

ID: 6837987
How to access the HTML Document Object Model:

Add these lines to StdAfx.h:

#pragma warning(disable : 4192)
#pragma warning(disable : 4049)
#pragma warning(disable : 4146)
//--------- note: For Win9x use Windows instead of winnt  
#import "c:\winnt\system32\mshtml.tlb"
#pragma warning(default: 4192)
#pragma warning(default: 4049)
#pragma warning(default: 4146)

(I don't know why these warnings occur, but an MSDN technote -- Q231931 -- says they are benign)

Now add a second button to your dialog.  Double-click it and add this handler code:

//-------------------------- IDC_BUTTON2 Handler
void CMyDlg::OnButton2()
{
     MSHTML::IHTMLDocument2Ptr pDoc= m_ctlBrowser.GetDocument();
     MSHTML::IHTMLElementCollectionPtr pAllElems= pDoc->all;

     // int nCnt= pAllElems->length; // eyeball check.  It WORKS!

     // Look up the text input named "p" in the page's 'all' collection
     MSHTML::IHTMLInputTextElementPtr pTextInput= pAllElems->item("p");
     pTextInput->value= "here is some input!";
}

This code places some text into a text-input element named "p".  I found the ID of this element by examining the source code of the page.

When you run the program, click button 1, wait for the page to load, and then click button 2.  In your final app, you need to be careful to wait until the page loads before trying to access the DOM.  Typically, you must add a handler for the DocumentComplete event, and do things like this only when it gets fired.

You may get confused by all of the IHTMLThis and IHTMLThat refs.  There are scads of documentation on the methods and attributes.  Here is a good starting point for seeing what elements are available:

It is also in MSDN, in the Index (just type in IHTML and start looking around).

Note that this is not the only way to do some of what you want to do.  For instance, you can set up all of the form data and do a POST directly to a web host without needing to actually load the page.  In this method, you are basically 'pretending' that you have opened the page, filled in all of the inputs, and clicked the [Submit] button.  An example of this is in an EE Question I have answered before:

   (sorry, EE search is broken again)

-- Dan

Author Comment

ID: 6838704

The comments that you made were terrific...  I tested the code that you provided and it works great!

I just have a few more questions in regards to this:

1.  You gave a good example of inputting text into a web page; how would you go about retrieving text from a web page?  For example, I would like to retrieve stock market data for a specific stock on a web page, and plug the stock price and other data into a CString.

2.  I am also wondering about how to select specific dialog buttons and check boxes on a web page.

thanks very much for your help.
LVL 49

Accepted Solution

DanRollins earned 300 total points
ID: 6840020
1) For an obscure reason, this is called "screen scraping."   You need to know what the page looks like and hope that the provider does not alter the layout drastically.  You download the page using any of several techniques, then search the text for some distinct marker text.  Having found that, you know (for instance) that the next 4 characters are the ticker symbol, the next 6 characters are the current price, the next eight characters are the date and time that that price was in effect... etc.

One way to programmatically get the text of a page is to use code like this in the above example:

     MSHTML::IHTMLElementPtr pBody= pDoc->Getbody();
     bstr_t bstr= pBody->GetinnerText();
     CString s= (LPCSTR)bstr;

     int nOffset= s.Find(...etc...)

Another way is to use the lower-level CHttpXxxx classes.  See the 'Tear' example (it 'tears off a page' from the Inet).

2) Specific dialog buttons and check boxes on a web page:

The 'all' collection will have items for each button.  The MSHTML::IHTMLOptionButtonElementPtr interface provides a Putchecked function for setting radio buttons and check boxes.

The MSHTML::IHTMLElementPtr interface provides a click() function that simulates clicking on the element.

-- Dan

Question has a verified solution.
