Solved

Retrieve ‘name’ from ‘textarea’ or ‘input’ html tag using IHTMLElement interface

Posted on 2003-11-05
Last Modified: 2013-11-20
Does anyone know if it’s possible to retrieve the ‘name’ attribute from a ‘textarea’ or ‘input’ html tag using the IHTMLElement interface or do I have to use IHTMLTextAreaElement / IHTMLInputElement interfaces? Using IHTMLTextAreaElement isn’t such an issue but as the IHTMLInputElement interface requires IE 5.0 I was hoping I might be able to avoid it as some of my users may still be on IE 4.0.

Any kind of workaround would be welcome. To give some background, my program searches sequentially through a page and grabs the inner text and the name of any ‘textarea’ or ‘input’ elements on the page.

Any help would be much appreciated.
Thanks!
Question by:wjdashwood
10 Comments
 
LVL 12

Assisted Solution

by:williamcampbell
williamcampbell earned 400 total points
ID: 9688651
Using the code below you can pull in the *entire* page.
Once you have the entire page, you can parse it yourself for input tags and extract the info you need.

CComPtr<IHTMLElementCollection> pHTMLElement;
HRESULT hr;

// retrieve a reference to the ALL collection
if (SUCCEEDED(hr = spDoc->get_all(&pHTMLElement)))
{
    long cElems;

    // retrieve the count of elements in the collection
    if (SUCCEEDED(hr = pHTMLElement->get_length(&cElems)))
    {
        // ask for item 0, the first element in the collection
        VARIANT vIndex;
        VariantInit(&vIndex);
        vIndex.vt = VT_I4;      // lVal is only valid when vt is VT_I4
        vIndex.lVal = 0;
        VARIANT var2;
        VariantInit(&var2);     // VT_EMPTY: no secondary index
        CComPtr<IDispatch> pDisp;   // smart pointers release on scope exit

        if (SUCCEEDED(hr = pHTMLElement->item(vIndex, var2, &pDisp)) && pDisp)
        {
            CComPtr<IHTMLElement> pElem;
            if (SUCCEEDED(hr = pDisp->QueryInterface(IID_IHTMLElement, (LPVOID*)&pElem)))
            {
                CComBSTR text;  // frees the BSTR automatically
                pElem->get_innerHTML(&text);
                // text contains everything between <HTML> and </HTML>
            }
        }
    }
}
 
LVL 49

Expert Comment

by:DanRollins
ID: 9693075
Not all <input> or <textarea> tags have a name or ID.  

What you should do is cycle through all of the elements by getting the IHTMLElementCollection of the document (or maybe just document.all) and working through each item.  Each item will be an IHTMLElement and you can use the .tagName to see if it is an input or textarea.

-- Dan
 

Author Comment

by:wjdashwood
ID: 9693251
Thanks for the comments. Firstly, I've tried using get_innerHTML although the entire body doesn't seem to fit in the BSTR. It's not an ideal solution but not a bad idea :)

I also tried implementing it inside a loop and evaluated each element to find an input or textarea tag. Is there a way of retrieving the whole tag and contents as a string from the IHTMLElement? get_innerHTML only gets the text from inside the textarea tag. If it's any help, my code is shown below.


IHTMLDocument2 *pHtmlDoc = NULL;
if (SUCCEEDED(GetDHtmlDocument(&pHtmlDoc)) && (pHtmlDoc != NULL))
{
      IHTMLElementCollection* pColl = NULL;
      if (SUCCEEDED(pHtmlDoc->get_all(&pColl)) && (pColl != NULL))
      {
            long nLength = 0;
            pColl->get_length(&nLength);
            for (long nCount = 0; nCount < nLength; nCount++)
            {
                  COleVariant vIdx(nCount, VT_I4);
                  IDispatch* pElemDispatch = NULL;
                  IHTMLElement * pElem = NULL;
                  if (SUCCEEDED(pColl->item(vIdx, vIdx, &pElemDispatch))
                        && (pElemDispatch != NULL))
                  {
                        if (SUCCEEDED(pElemDispatch->QueryInterface(IID_IHTMLElement,
                              (void**)&pElem)) && (pElem != NULL))
                        {
                              BSTR bstrTagName;
                              CString sTempTagName;
                              if (!FAILED(pElem->get_tagName(&bstrTagName)))
                              {
                                    sTempTagName = bstrTagName;
                                    sTempTagName.MakeLower();
                                    SysFreeString(bstrTagName);
                              }
                              if ((sTempTagName == _T("input")) || (sTempTagName == _T("textarea")))
                              {
                                    BSTR bstrText;
                                    pElem->get_innerHTML(&bstrText);
                                    SysFreeString(bstrText);
                              }
                              pElem->Release();
                        }
                        pElemDispatch->Release();
                  }
            }
            pColl->Release();
      }
      pHtmlDoc->Release();
}


Secondly, Dan, I'm writing the HTML for the program so all <input> or <textarea> tags must have a name. I only use IDs if I need to detect a click on that item. As shown above I work through each item, but how do I find the name when I have an input or textarea tag in my grasp?!

Many thanks!
 

Author Comment

by:wjdashwood
ID: 9694098
I've come across getAttribute (don't know why I didn't notice it earlier) which should be exactly what I want, but I'm having trouble using it. The VARIANT data type is not something I'm used to, so perhaps I'm using it wrong. I had to include "comutil.h", which I assume is the correct header file.

Any ideas why varInputValue.bstrVal is NULL (0xcccccccc)?

VARIANT varInputName;
if (SUCCEEDED(pElem->getAttribute((BSTR)(_T("name")),0,&varInputName)) && varInputName.bstrVal)
{
      VARIANT varInputValue;
      if (SUCCEEDED(pElem->getAttribute((BSTR)(_T("value")),0,&varInputValue)) && varInputValue.bstrVal)
      {
            bstrText = varInputValue.bstrVal;
      }
}

The source code I found used _variant_t and _bstr_t instead of VARIANT and (BSTR) but they gave me the following compile error:

LNK2019: unresolved external symbol "void __stdcall _com_issue_error(long)" (?_com_issue_error@@YGXJ@Z) referenced in function "public: __thiscall _bstr_t::_bstr_t(char const *)" (??0_bstr_t@@QAE@PBD@Z)

Thanks once again for your help.
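
A note on the 0xcccccccc value: that is the pattern MSVC debug builds write into uninitialized stack memory, so it suggests nothing ever stored a string in the variant. A VARIANT is a tagged union, and bstrVal is only meaningful when vt says VT_BSTR. The sketch below illustrates that check with simplified stand-in types so it compiles without the Windows headers. FakeVariant, fake_variant_init, fake_get_attribute and string_or_null are hypothetical names, not the real COM declarations, and the sketch assumes the real getAttribute reports a missing attribute as VT_NULL without touching the rest of the variant.

```cpp
#include <cassert>
#include <cstdint>
#include <cwchar>

// Stand-in type tags; the numeric values match the real VT_EMPTY, VT_NULL, VT_BSTR.
enum FakeVarType : std::uint16_t { FAKE_VT_EMPTY = 0, FAKE_VT_NULL = 1, FAKE_VT_BSTR = 8 };

// Simplified stand-in for VARIANT: a type tag plus a union of payloads.
struct FakeVariant {
    FakeVarType vt;
    union {
        long lVal;
        const wchar_t* bstrVal;
    };
};

// Plays the role of VariantInit(): put the variant in a known state before use.
inline void fake_variant_init(FakeVariant& v) {
    v.vt = FAKE_VT_EMPTY;
    v.bstrVal = nullptr;
}

// Hypothetical attribute lookup: only "name" exists on this pretend element.
inline void fake_get_attribute(const wchar_t* attr, FakeVariant& out) {
    assert(attr != nullptr);
    if (std::wcscmp(attr, L"name") == 0) {
        out.vt = FAKE_VT_BSTR;
        out.bstrVal = L"comments";
    } else {
        out.vt = FAKE_VT_NULL;   // the union is deliberately left untouched
    }
}

// Read the string only when the tag says the union actually holds one.
inline const wchar_t* string_or_null(const FakeVariant& v) {
    return (v.vt == FAKE_VT_BSTR) ? v.bstrVal : nullptr;
}
```

In the real code the equivalent steps would be calling VariantInit on the VARIANT before passing it in, then testing varInputName.vt == VT_BSTR rather than testing bstrVal on its own.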
 
LVL 49

Accepted Solution

by:
DanRollins earned 1600 total points
ID: 9696580
You are working too hard!:

void CDlgWebDlg::OnButton2()
{
      MSHTML::IHTMLDocument2Ptr pDoc= m_ctlBrowser.GetDocument();
      MSHTML::IHTMLElementCollectionPtr pAllElems= pDoc->all;

      long nLength= pAllElems->length;
      for (long j= 0; j<nLength; j++) {
            MSHTML::IHTMLElementPtr pElem= pAllElems->item( _variant_t(j,VT_I4) );
            CString sTag= pElem->getAttribute("tagName",0).bstrVal;
            if ( sTag == "INPUT" ) {
                  CString sType= pElem->getAttribute("type",0).bstrVal;
                  if ( sType== "text" ) {
                        CString sName= pElem->getAttribute("name",0).bstrVal;
                        TRACE( "Found <input type=text>.  name= %s\r\n", (LPCSTR)sName );
                  }
            }
            if ( sTag == "TEXTAREA" ) {
                  CString sName= pElem->getAttribute("name",0).bstrVal;
                  TRACE( "Found <textarea>.  name= %s\r\n", (LPCSTR)sName );
            }
      }
}
 

Author Comment

by:wjdashwood
ID: 9697879
Cheers for the code Dan. If I put it directly in my SaveUserInput function it’s not a happy bunny! I guess because I’m using CDHtmlDialog and not hosting a WebBrowser control. Is it possible to stick with what I have? I’m sure the code I already have will work with a bit more tweaking.

Do I have to initialise the VARIANT variable or include a different header file?

Thanks.
 
LVL 49

Expert Comment

by:DanRollins
ID: 9698020
You should be able to make just one change... to get the pointer to the document (I don't know HTML Dialogs at all, but there is *always* a way to get that).

My guess is that it will involve a call to
    GetDHtmlDocument()

>>Do I have to initialise the VARIANT variable or include a different header file?

You should just use
        _variant_t
and
        _bstr_t
objects.  They simplify all access to these COM datatypes; for instance, they free the associated memory when they destruct at the end of the scope.

-- Dan
 

Author Comment

by:wjdashwood
ID: 9698046
Thanks but I think I've cracked it! It was all to do with:

pElem->getAttribute((BSTR)(_T("name")),0,&varInputName)

which should have been:

pElem->getAttribute(L"name",0,&varInputName)

Just what do the L and the _T do anyway? Just out of interest, why did I get unresolved external symbol errors when I used _variant_t and _bstr_t instead of VARIANT and BSTR?

Many thanks for your time and effort. Well worth some points anyway.
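
One caveat about passing L"name" directly: a real BSTR is not just a wide string. It carries a 4-byte length prefix immediately before the first character and is normally created with SysAllocString (or a wrapper such as _bstr_t or CComBSTR). A plain literal has no such prefix, so it tends to work only when the callee never asks for the length. The sketch below models that layout in portable C++. The names model_alloc_bstr, model_bstr_len and model_free_bstr are made up for illustration and stand in for SysAllocStringLen, SysStringLen and SysFreeString; they are not the real implementations.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Allocate [4-byte byte count][characters][terminator] and return a pointer
// to the characters, which is what a BSTR pointer refers to.
inline char16_t* model_alloc_bstr(const char16_t* src, std::uint32_t len) {
    assert(src != nullptr);
    const std::uint32_t bytes = len * static_cast<std::uint32_t>(sizeof(char16_t));
    unsigned char* raw = static_cast<unsigned char*>(
        std::malloc(sizeof(bytes) + bytes + sizeof(char16_t)));
    if (!raw) return nullptr;
    std::memcpy(raw, &bytes, sizeof(bytes));             // length prefix
    char16_t* chars = reinterpret_cast<char16_t*>(raw + sizeof(bytes));
    std::memcpy(chars, src, bytes);                      // character data
    chars[len] = u'\0';                                  // terminator
    return chars;
}

// Read the character count back from the hidden prefix.
inline std::uint32_t model_bstr_len(const char16_t* b) {
    assert(b != nullptr);
    std::uint32_t bytes = 0;
    std::memcpy(&bytes,
                reinterpret_cast<const unsigned char*>(b) - sizeof(bytes),
                sizeof(bytes));
    return bytes / static_cast<std::uint32_t>(sizeof(char16_t));
}

// Free the whole allocation, starting at the prefix.
inline void model_free_bstr(char16_t* b) {
    if (b) std::free(reinterpret_cast<unsigned char*>(b) - sizeof(std::uint32_t));
}
```

Calling model_bstr_len on an ordinary literal would read whatever bytes happen to sit before it, which is the reason the wrapper classes are the safer habit.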
 
LVL 49

Expert Comment

by:DanRollins
ID: 9698131
I don't know why you would get unresolved externals.  The headers are in
    #include <comdef.h>
which also pulls in several libraries...
    #pragma comment(lib, "comsupp.lib")
    #pragma comment(lib, "user32.lib")
    #pragma comment(lib, "ole32.lib")
    #pragma comment(lib, "oleaut32.lib")
They may have changed something in MFC 7 (aka VC++ .NET).

>>Just what do the L and the _T do anyway?
L"xxxx" forces the compiler to store the string literal internally as UNICODE (16-bit wide) characters.  All COM interfaces expect 16-bit wide characters.

_T("xxxxx") forces the compiler to store the string literal as either UNICODE or 8-bit characters, depending upon whether you have defined _UNICODE (normally set together with UNICODE) in your preprocessor settings.

Given that
        pElem->getAttribute((BSTR)(_T("name")),0,&varInputName)
failed but
        pElem->getAttribute(L"name",0,&varInputName)
worked, I'd say that you are working with a non-UNICODE build.

-- Dan
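
To make the difference concrete, here is a small portable sketch. MY_T and UNICODE_BUILD are hypothetical stand-ins for _T and the define it checks; the real macro lives in <tchar.h>. The last function shows why the cast in the failing call could not work in a non-Unicode build: a cast only relabels the pointer, it does not widen the characters.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for _T(): paste an L prefix onto the literal
// only when UNICODE_BUILD is defined.
#ifdef UNICODE_BUILD
#define MY_T(s) L##s
#else
#define MY_T(s) s
#endif

// Size in bytes of one character of each kind of literal.
inline std::size_t narrow_char_size() { return sizeof("name"[0]); }
inline std::size_t wide_char_size()   { return sizeof(L"name"[0]); }
// Narrow unless UNICODE_BUILD is defined.
inline std::size_t tchar_size()       { return sizeof(MY_T("name")[0]); }

// Reinterpret the bytes of a narrow string as one wide character, which is
// what the callee effectively sees after a (BSTR) cast on a narrow literal.
inline wchar_t first_unit_when_misread_as_wide() {
    const char narrow[] = "name";
    assert(sizeof(narrow) >= sizeof(wchar_t));
    wchar_t unit = 0;
    std::memcpy(&unit, narrow, sizeof(wchar_t));
    return unit;   // several narrow characters packed together, not L'n'
}
```

With UNICODE_BUILD left undefined, tchar_size() matches narrow_char_size(), and the misread first unit is not L'n', so an attribute lookup by that "name" has nothing to match.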
 

Author Comment

by:wjdashwood
ID: 9698232
Ah I see. Many thanks for the help! Much appreciated. Night night.

Question has a verified solution.
