tma050898

Looking to enumerate HTML elements on a .NET Web page

I'm running a .NET Web application that I would like to automate via Visual C++/MFC/COM. How can I enumerate the fields/controls on the page to get their IDs and names?
itsmeandnobodyelse

If you have a CHtmlView-derived class for showing the web page, you should be able to enumerate all controls with GetWindow calls:

void MyHtmlView::OnInitialUpdate()
{
     // start with the first child window of this view
     CWnd* pWnd = GetWindow(GW_CHILD);

     while (pWnd != NULL)
     {
            // here the pWnd is one of the controls you were looking for
            ...
            // next control: ask the current child for its sibling, not the view
            pWnd = pWnd->GetWindow(GW_HWNDNEXT);
     }
}
tma050898 (ASKER)

Sorry guys. I must have worded this very poorly. I'll try to do better this time.

My company has an ASP.NET application whose source code I have no access to. I'm simply an end-user. Being 99% a Windows/MFC developer, I know how to manipulate Windows apps very well. However, what I want to do is to automate this blasted Web "app".

As an example, the page has several edit controls and listboxes in which I need to enter data and make selections before hitting the Submit button. This is what I want to automate. However, I can't figure out the IDs or names of the controls on the page (View Source didn't help). I'm using the MSHTML library, but am not getting back the element pointer. So instead of posting my code, I figured I would ask a generic question: how can I enumerate all of the controls' IDs/names on the page from an MFC/COM application?

Hopefully, that makes it clearer. Please ask any questions. I'll share the small bit of source code I have if that'll help, but that code actually attempts to get one specific field.

Thanks!
Tom
>>>>> "How can I enumerate all of the controls' IDs/Names"
You want to get them without programming?

A control is a window, and you need its HWND handle to retrieve its ID, which is a numeric value. You would also need the control's style and window class to find out whether it is a combo box, an edit control, or a button. The only non-programmatic approach I can think of is to use Spy++ from the Visual Studio tools (but I wonder what you want to do with that information if you have no program). Note that automation is a different thing from enumerating controls. First, the web application you want to control must expose an automation interface, and you have to obtain a pointer to that interface. Once you have that pointer, you can call the methods the interface provides.
[itsmeandnobodyelse] You want to get them without programming?
[Tom] As I mentioned, I'm attempting to automate this Web app from an MFC/COM application.

My code is loading the active IE document, but I can't figure out how to access the specific fields within the ASP.NET application as I can't see the code-behind and therefore, don't know the field names. Therefore, I'd like to enumerate all of the application's fields/controls.



>>>> My code is loading the active IE document,
The ASP.NET application is running on some web server. Its code is only available to you if the author of the website has provided an automation interface. What most likely comes to your local computer is simple HTML. No controls, no IDs. Anything else would require a locally installed component. If the web application is a .NET assembly, your local .NET environment could provide some local functions, but I really doubt that the page you see on screen is more than a single child window of IE that was simply "painted" and that contains nothing like controls you could enumerate.
DanRollins
It sounds like you want to access the HTML page as a document ("DOM manipulation"), which contains collections such as INPUT controls, Images, links, etc.  That is certainly possible using a number of techniques.  The easiest might be to create an HTA that loads the page and uses simple JavaScript syntax like:
    var oTextArea= gwnd.document.all.SomeTextFieldId;
    oTextArea.value= "Hi there!";
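
If the ids aren't known up front, the same script approach can list them first. Here is a minimal sketch, assuming gwnd.document (from the snippet above) is the loaded page's document object. listControls is a hypothetical helper name, not part of any library:

```javascript
// Sketch: list tag, id, name, and type for every form control in a document.
// "doc" is assumed to be an HTML document object, e.g. gwnd.document.
function listControls(doc) {
    var tags = ["input", "select", "textarea", "button"];
    var found = [];
    for (var t = 0; t < tags.length; t++) {
        // getElementsByTagName returns every element with that tag
        var nodes = doc.getElementsByTagName(tags[t]);
        for (var i = 0; i < nodes.length; i++) {
            found.push({
                tag: tags[t],
                id: nodes[i].id || "",     // empty string when no id was rendered
                name: nodes[i].name || "", // the name posted back on Submit
                type: nodes[i].type || ""  // e.g. "text", "hidden", "submit"
            });
        }
    }
    return found;
}
```

Calling listControls(gwnd.document) once the page has loaded should give one entry per control. The id and name values are whatever the server rendered into the HTML, so they can be used without seeing the code-behind.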

Since you are using MFC, I suggest making a simple dialog-based app and placing a webbrowser control on it.  Then you load the page and use C++ statements to access the objects available in the IHTMLDocument2 interface.  There are many examples of that here on EE and elsewhere.  Here's a pretty good one:

    Web based program
    http://www.experts-exchange.com/Programming/Languages/CPP/Q_20272700.html
but you can search EE (or elsewhere) for the word
   IHTMLDocument2Ptr
to get a lot of info.
One important "gotcha" with this technique is that you MUST wait until the OnDocumentComplete function is called before trying to read or modify objects in the DOM.
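
For the HTA/script route, a comparable guard is to check the document's readyState before touching the DOM. Here is a small sketch, assuming the document object exposes readyState the way IE's does. runWhenComplete is a hypothetical helper name:

```javascript
// Sketch: run action(doc) only if the document reports it has finished loading.
// Returns true when the action ran, false when the caller should retry later.
function runWhenComplete(doc, action) {
    if (doc && doc.readyState === "complete") {
        action(doc);
        return true;
    }
    return false;
}
```

If it returns false, try again a little later (for example from a timer) rather than reading the DOM early.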
ASKER CERTIFIED SOLUTION
Naveen_R

FYI: Thanks to everyone that posted. I'm heads-down on some other tasks right now, but will read through and test these different solutions this weekend.

Thanks again!
Tom