Cropping Text

Hi experts...

I would like to ask about how to do this:

For example:

<FORM name=myLoginForm onsubmit="return checkrequired(this)" method=post><INPUT type=hidden name=forgotPasswordClicked> <INPUT type=hidden value=false name=cs> <INPUT type=hidden value=Billing.jsp?siteId=1&amp;jid=FFADF9475X6858E5D3X1CAXDEC29ED21&amp;platformId=2&amp;eGift=&amp;NL=true name=caller>
<TD colSpan=3><B><FONT color=#ff6600>Sign in for rewards and faster shopping </FONT></B></TD></TR>
<TD colSpan=3>We’ll fill in your preferences, reward points and saved billing information.<BR></TD></TR>
<TD colSpan=3><B>Handango Member ID (email address)</B><BR><INPUT class=Input name=requiredEmail> <BR></TD></TR>
<TD colSpan=3><B>Password</B><BR><INPUT class=Input type=password maxLength=20 value="" name=requiredPassword></TD></TR>
<TD vAlign=top colSpan=3><!--<input type="image" name="getPassword" src="images/english/buttons/forgot_pass.gif" height="15" border="0" onClick="TurnOffVerify()">--><A href="javascript:forgotPasswordSubmit()"><B>Forgot your password?</B></A> <BR>(We'll send an email with your password.) </TD></TR>
<TD vAlign=top width=20><INPUT type=checkbox value=true name=RememerLoginEmail></TD>
<TD vAlign=top>Remember my email when I return.&nbsp;</TD></TR>

I have this HTML source (it may not always look like this, but it is always a FORM).

The question is:

How can I crop that source down to this:
for every line that contains <input...,
I just want the value of name=... (not the name="" part itself, only what is between the quotes).

If that is not clear, here is another sample:
INPUT type=checkbox value=true name=RememerLoginEmail> becomes ---> "RememerLoginEmail"
All the tags begin with input or INPUT.


You could either use a regular expression to match the <INPUT> tag and then pull out the name,

or

use the MSHTML object to parse the HTML. That object has a forms collection, each form has an elements collection, and you can iterate through that, pulling out the name. The source URL (which can be a file) must be valid HTML, though.
''You have to add a reference to the Microsoft HTML Object Library

''this should work
    Dim objMSHTML As New MSHTML.HTMLDocument
    Dim objDocument As MSHTML.HTMLDocument
    Set objDocument = objMSHTML.createDocumentFromUrl(Url, vbNullString)

    ''wait until the d/l is complete
    While objDocument.readyState <> "complete"
        ''<TODO> add timer/counter here to eventually timeout
        DoEvents
    Wend

    Dim objForm As HTMLFormElement
    Dim objInput As HTMLInputElement
    Dim obj As Object
    For Each objForm In objDocument.Forms
        For Each obj In objForm.elements
            If UCase(obj.nodeName) = "INPUT" Then
                Set objInput = obj
                ''objInput.Name holds the value of the name attribute
                Debug.Print objInput.Name
            End If
        Next obj
    Next objForm
You can also use objMSHTML.createDocumentFragment, so long as the tags match up... they don't in your example.
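The timeout left as a TODO in the wait loop could be handled roughly like this. This is only a sketch: WaitForDocument and timeoutSeconds are names made up for illustration, and it assumes the same reference to the Microsoft HTML Object Library.

```vb
''Hypothetical helper: returns True if the document finished loading,
''False if timeoutSeconds elapsed first.
Public Function WaitForDocument(ByVal doc As MSHTML.HTMLDocument, _
                                ByVal timeoutSeconds As Single) As Boolean
    Dim startTime As Single
    Dim elapsed As Single

    startTime = Timer   ''Timer returns the seconds since midnight
    Do While doc.readyState <> "complete"
        DoEvents        ''let pending messages be processed while waiting
        elapsed = Timer - startTime
        If elapsed < 0 Then elapsed = elapsed + 86400   ''the clock crossed midnight
        If elapsed > timeoutSeconds Then Exit Function  ''leaves the result as False
    Loop
    WaitForDocument = True
End Function
```

You would call it right after createDocumentFromUrl, for example If WaitForDocument(objDocument, 30) Then ... and only walk the forms collection when it returns True.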
abangbatax (Author) Commented:
Hi SweetsGreen,

If you don't mind, what would the source look like using webbrowser_downloadcomplete instead of MSHTML.HTMLDocument?

Simply replace

Set objDocument = objMSHTML.createDocumentFromUrl(Url, vbNullString)

with

Set objDocument = wb.document ''where wb is your webbrowser control


I don't believe there is a way to parse out elements using the webbrowser control alone.
You might or might not need the reference to the "Microsoft HTML Object Library " since the WebBrowser control is based off of MSHTML...but I'm not sure off the top of my head.
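Put together, the WebBrowser variant might look something like the sketch below. It uses the control's DocumentComplete event rather than DownloadComplete, because DocumentComplete passes the object that finished loading. The control name wb and the ListBox lstNames are assumptions, and the typed variables do need the reference to the Microsoft HTML Object Library.

```vb
''Assumes a WebBrowser control named wb and a ListBox named lstNames on the form
Private Sub wb_DocumentComplete(ByVal pDisp As Object, URL As Variant)
    ''The event fires once per frame; only handle the top-level document
    If Not (pDisp Is wb.Object) Then Exit Sub

    Dim objDocument As MSHTML.HTMLDocument
    Dim objForm As MSHTML.HTMLFormElement
    Dim obj As Object

    Set objDocument = wb.Document
    For Each objForm In objDocument.Forms
        For Each obj In objForm.elements
            If UCase(obj.nodeName) = "INPUT" Then
                lstNames.AddItem obj.Name   ''late-bound read of the name attribute
            End If
        Next obj
    Next objForm
End Sub
```

If the control has navigated to something that is not an HTML page, the Set objDocument line will raise an error, so you may want an On Error handler around it.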
abangbatax (Author) Commented:
Can anyone convert this VB.NET source to VB?

    Private Function FilterInput(ByVal htmlText As String, ByVal listB As ListBox) As String()

        Dim inputPattern As String = "<INPUT(?<input>[\w\s" & Chr(34) & "=']+)"
        Dim namePattern As String = "NAME=" & Chr(34) & "?(?<name>[A-Za-z0-9_]+)"

        Dim regexInput As New Regex(inputPattern, RegexOptions.IgnoreCase)
        Dim regexName As New Regex(namePattern, RegexOptions.IgnoreCase)

        For Each inputElements As Match In regexInput.Matches(htmlText)
            Dim input As String = inputElements.Groups("input").Value

            Dim name As Match = regexName.Match(input)

            If name.Success Then
            End If
        Next inputElements

    End Function
Well, I guess you are not going the MSHTML route.

As for your request to convert the code, here you go...
but there are a few things you should know.
1. VB6 does not have the built-in support for regular expressions that .NET has, so you have to use the VBScript regex engine: add a reference to "Microsoft VBScript Regular Expressions".
2. The regexes you supplied would not work... I changed them and they should be fine now.
3. If you are not familiar with regular expressions, I would have gone with the MSHTML solution, since it does all the parsing work for you and you don't have to worry about using an incorrect regular expression.

    Dim inputPattern As String
    inputPattern = "<(INPUT)[^<>]+>"
    Dim namePattern As String
    namePattern = "name\s*=('|" & Chr(34) & "|\s)*\w*('|" & Chr(34) & "|\s)*\s*"
    Dim regexInput As New RegExp
    regexInput.IgnoreCase = True
    regexInput.Global = True ''since we want all matches
    regexInput.Pattern = inputPattern
    Dim regexName As New RegExp
    regexName.IgnoreCase = True
    regexName.Pattern = namePattern
    Dim regexCleanup As New RegExp
    regexCleanup.Global = True
    regexCleanup.IgnoreCase = True
    regexCleanup.Pattern = "(name\s*=\s*|'|" & Chr(34) & ")"

    Dim nameMatch As Match
    Dim inputString As String
    Dim inputElements As Match
    For Each inputElements In regexInput.Execute(HTMLText)
        inputString = inputElements.Value

        For Each nameMatch In regexName.Execute(inputString)
            ''remove any name= and quotes; Trim$ drops trailing spaces the pattern can pick up
            ''(a VB6 ListBox uses AddItem, not Items.Add)
            listB.AddItem Trim$(regexCleanup.Replace(nameMatch.Value, ""))
            ''Debug.Print regexCleanup.Replace(nameMatch.Value, "")
        Next nameMatch
    Next inputElements
make sure the reference you use is "Microsoft VBScript Regular Expressions 5.5"
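As a variation on the regex approach, the cleanup pass can be avoided by capturing the name in a sub-match. The following is a sketch only: GetInputNames is a name made up for illustration, the patterns are my own rather than the ones posted in this thread, and it assumes the same "Microsoft VBScript Regular Expressions 5.5" reference.

```vb
''Hypothetical helper: returns a Collection holding the name of every <INPUT> tag
Public Function GetInputNames(ByVal htmlText As String) As Collection
    Dim names As New Collection
    Dim regexInput As New RegExp
    Dim regexName As New RegExp
    Dim inputMatch As Match
    Dim nameMatches As MatchCollection

    ''Find each complete <INPUT ...> tag, in either case
    regexInput.IgnoreCase = True
    regexInput.Global = True
    regexInput.Pattern = "<INPUT\b[^<>]*>"

    ''Capture the name value with or without surrounding quotes;
    ''the leading \s keeps attributes that merely end in "name" from matching
    regexName.IgnoreCase = True
    regexName.Pattern = "\sname\s*=\s*[""']?([^\s""'>]+)"

    For Each inputMatch In regexInput.Execute(htmlText)
        Set nameMatches = regexName.Execute(inputMatch.Value)
        If nameMatches.Count > 0 Then
            names.Add nameMatches(0).SubMatches(0)   ''the text captured by the parentheses
        End If
    Next inputMatch

    Set GetInputNames = names
End Function
```

You can then loop over the result with For Each and add each item to the list box. Note that any regex approach will also pick up <input> tags inside HTML comments, such as the commented-out getPassword image button in the sample, which the MSHTML route would skip.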