Cropping Text

Hi experts...

I would like to ask about how to do this:

For example:

<FORM name=myLoginForm onsubmit="return checkrequired(this)" method=post><INPUT type=hidden name=forgotPasswordClicked> <INPUT type=hidden value=false name=cs> <INPUT type=hidden value=Billing.jsp?siteId=1&amp;jid=FFADF9475X6858E5D3X1CAXDEC29ED21&amp;platformId=2&amp;eGift=&amp;NL=true name=caller>
<TD colSpan=3><B><FONT color=#ff6600>Sign in for rewards and faster shopping </FONT></B></TD></TR>
<TD colSpan=3>We’ll fill in your preferences, reward points and saved billing information.<BR></TD></TR>
<TD colSpan=3><B>Handango Member ID (email address)</B><BR><INPUT class=Input name=requiredEmail> <BR></TD></TR>
<TD colSpan=3><B>Password</B><BR><INPUT class=Input type=password maxLength=20 value="" name=requiredPassword></TD></TR>
<TD vAlign=top colSpan=3><!--<input type="image" name="getPassword" src="images/english/buttons/forgot_pass.gif" height="15" border="0" onClick="TurnOffVerify()">--><A href="javascript:forgotPasswordSubmit()"><B>Forgot your password?</B></A> <BR>(We'll send an email with your password.) </TD></TR>
<TD vAlign=top width=20><INPUT type=checkbox value=true name=RememerLoginEmail></TD>
<TD vAlign=top>Remember my email when I return.&nbsp;</TD></TR>

I have this HTML source (it may not always look like this, but it is always a FORM).

The question is:

How can I crop that source down to this:
for every line that contains <input....
I just want to get the name="..... value (not including the name="" itself, just what is in between the " ")

If that is not clear, here is another sample:
INPUT type=checkbox value=true name=RememerLoginEmail> becomes this ---> "RememerLoginEmail"
They all begin with input or INPUT.

You could either use a regular expression to match the <INPUT> tag and then pull out the name,

or

use the MSHTML object to parse the HTML...that object has a forms collection...each form has an elements collection...iterate through that, pulling out the name...but the source URL (which can be a file) must be valid HTML
''You have to add a reference to the Microsoft HTML Object Library

''this should work
    Dim objMSHTML As New MSHTML.HTMLDocument
    Dim objDocument As MSHTML.HTMLDocument
    Set objDocument = objMSHTML.createDocumentFromUrl(Url, vbNullString)

    ''wait until the d/l is complete
    While objDocument.readyState <> "complete"
        DoEvents ''let the download make progress while we wait
        ''<TODO> add timer/counter here to eventually timeout
    Wend

    Dim objForm As HTMLFormElement
    Dim objInput As HTMLInputElement
    Dim obj As Object
    For Each objForm In objDocument.Forms
        For Each obj In objForm.elements
            If UCase(obj.nodeName) = "INPUT" Then
                Set objInput = obj
                ''the name is now available as objInput.Name
                Debug.Print objInput.Name
            End If
        Next obj
    Next objForm
You can also use objMSHTML.createDocumentFragment,
so long as the tags match up...they don't in your example.
abangbatax (Author) commented:
Hi SweetsGreen,

If you don't mind, what would the code look like for webbrowser_downloadcomplete instead of MSHTML.HTMLDocument?
Simply replace

Set objDocument = objMSHTML.createDocumentFromUrl(Url, vbNullString)

with

Set objDocument = wb.document ''where wb is your webbrowser control


I don't believe there is a way to parse out elements using the webbrowser control alone.
You might or might not need the reference to the "Microsoft HTML Object Library " since the WebBrowser control is based off of MSHTML...but I'm not sure off the top of my head.
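
To make the replacement above concrete, here is a rough sketch of what the event handler could look like. It assumes a WebBrowser control named wb on the form and a project reference to the Microsoft HTML Object Library, and it uses the control's DocumentComplete event rather than DownloadComplete (my assumption, since DocumentComplete hands you the object that finished loading). Treat it as a starting point, not tested code.

```vb
''Sketch only: assumes a WebBrowser control named wb and a reference
''to the Microsoft HTML Object Library.
Private Sub wb_DocumentComplete(ByVal pDisp As Object, URL As Variant)
    ''DocumentComplete also fires for every frame, so only act on the
    ''top-level document
    If Not (pDisp Is wb.Object) Then Exit Sub

    Dim objDocument As MSHTML.HTMLDocument
    Dim objForm As MSHTML.HTMLFormElement
    Dim obj As Object

    Set objDocument = wb.Document
    For Each objForm In objDocument.Forms
        For Each obj In objForm.elements
            If UCase$(obj.nodeName) = "INPUT" Then
                Debug.Print obj.Name ''the value of the name attribute
            End If
        Next obj
    Next objForm
End Sub
```

Since the page is already loaded when the event fires, the readyState wait loop is not needed here.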
abangbatax (Author) commented:
Can anyone convert this VB.NET source to VB?

    Private Function FilterInput(ByVal htmlText As String, ByVal listB As ListBox) As String()

        Dim inputPattern As String = "<INPUT(?<input>[\w\s" & Chr(34) & "=']+)"
        Dim namePattern As String = "NAME=" & Chr(34) & "?(?<name>[A-Za-z0-9_]+)"

        Dim regexInput As New Regex(inputPattern, RegexOptions.IgnoreCase)
        Dim regexName As New Regex(namePattern, RegexOptions.IgnoreCase)

        For Each inputElements As Match In regexInput.Matches(htmlText)
            Dim input As String = inputElements.Groups("input").Value

            Dim name As Match = regexName.Match(input)

            If name.Success Then
                listB.Items.Add(name.Groups("name").Value)
            End If
        Next inputElements

    End Function
Well I guess you are not going the MSHTML route.

As for your request to convert the code, here you go....
but there are a few things you should know.
1. VB6 does not have the built-in support for regular expressions that .NET has, so you have to use the VBScript regex engine and add a reference to "Microsoft VBScript Regular Expressions".
2. The regexes you supplied would not work...I changed them and they should be fine now.
3. If you are not familiar with regular expressions, I would have gone with the MSHTML solution, since it does all the parsing work for you and you don't have to worry about using an incorrect regular expression.

    Dim inputPattern As String
    inputPattern = "<(INPUT)[^<>]+>"
    Dim namePattern As String
    namePattern = "name\s*=('|" & Chr(34) & "|\s)*\w*('|" & Chr(34) & "|\s)*\s*"
    Dim regexInput As New RegExp
    regexInput.IgnoreCase = True
    regexInput.Global = True ''since we want all matches
    regexInput.Pattern = inputPattern
    Dim regexName As New RegExp
    regexName.IgnoreCase = True
    regexName.Pattern = namePattern
    Dim regexCleanup As New RegExp
    regexCleanup.Global = True
    regexCleanup.IgnoreCase = True
    regexCleanup.Pattern = "(name\s*=\s*|'|" & Chr(34) & ")"

    Dim name As Match
    Dim inputString As String
    Dim inputElements As Match
    For Each inputElements In regexInput.Execute(HTMLText)
        inputString = inputElements.Value

        For Each name In regexName.Execute(inputString)
            ''remove any name=, ' and ", then trim leftover spaces
            ''(a VB6 ListBox uses AddItem, not the .NET Items.Add)
            listB.AddItem Trim$(regexCleanup.Replace(name.Value, ""))
            ''Debug.Print regexCleanup.Replace(name.Value, "")
        Next name
    Next inputElements

make sure the reference you use is "Microsoft VBScript Regular Expressions 5.5"
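
If the three-regex approach feels heavy, a single pattern with a capture group might also do the job. The sketch below is only an illustration: GetInputNames is a hypothetical helper name, it assumes the same "Microsoft VBScript Regular Expressions 5.5" reference, and like any regex over HTML it can be fooled by unusual markup (for example, it will also pick up INPUT tags inside HTML comments).

```vb
''Sketch only: GetInputNames is a hypothetical helper, not part of the code above.
''Requires a reference to "Microsoft VBScript Regular Expressions 5.5".
Private Function GetInputNames(ByVal htmlText As String) As Collection
    Dim re As New RegExp
    Dim m As Match
    Dim result As New Collection

    re.IgnoreCase = True
    re.Global = True ''we want every INPUT tag, not just the first
    ''the capture group holds the name value, with or without quotes
    re.Pattern = "<INPUT\b[^<>]*?\bname\s*=\s*['" & Chr(34) & "]?(\w+)"

    For Each m In re.Execute(htmlText)
        result.Add m.SubMatches(0)
    Next m

    Set GetInputNames = result
End Function
```

You could then loop over the returned Collection and call listB.AddItem for each entry, which avoids the separate cleanup regex.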
