• Status: Solved
  • Priority: Medium
  • Security: Public

Cropping Text

Hi experts...

I would like to ask how to do this:

For example:

<FORM name=myLoginForm onsubmit="return checkrequired(this)" method=post><INPUT type=hidden name=forgotPasswordClicked> <INPUT type=hidden value=false name=cs> <INPUT type=hidden value=Billing.jsp?siteId=1&amp;jid=FFADF9475X6858E5D3X1CAXDEC29ED21&amp;platformId=2&amp;eGift=&amp;NL=true name=caller>
<TBODY>
<TR>
<TD colSpan=3><B><FONT color=#ff6600>Sign in for rewards and faster shopping </FONT></B></TD></TR>
<TR>
<TD colSpan=3>We’ll fill in your preferences, reward points and saved billing information.<BR></TD></TR>
<TR>
<TD colSpan=3><B>Handango Member ID (email address)</B><BR><INPUT class=Input name=requiredEmail> <BR></TD></TR>
<TR>
<TD colSpan=3><B>Password</B><BR><INPUT class=Input type=password maxLength=20 value="" name=requiredPassword></TD></TR>
<TR>
<TD vAlign=top colSpan=3><!--<input type="image" name="getPassword" src="images/english/buttons/forgot_pass.gif" height="15" border="0" onClick="TurnOffVerify()">--><A href="javascript:forgotPasswordSubmit()"><B>Forgot your password?</B></A> <BR>(We'll send an email with your password.) </TD></TR>
<TR>
<TD vAlign=top width=20><INPUT type=checkbox value=true name=RememerLoginEmail></TD>
<TD vAlign=top>Remember my email when I return.&nbsp;</TD></TR>
<TR>

I have this HTML source (it may not always look like this, but it is a FORM).

The question is:

How can I crop that source down to the following?
For every line that contains <input ...,
I just want the value of name="..." (not the name="" part itself, only what is between the quotes).

If that is not clear, here is another sample:
INPUT type=checkbox value=true name=RememerLoginEmail> becomes ---> "RememerLoginEmail"
All the tags of interest begin with input or INPUT.

Thanks
Asked by: abangbatax
1 Solution
 
SweetsGreen Commented:
Either you could use a regular expression to match the <INPUT> tag and then pull out the name

--or--

use the MSHTML object to parse the HTML. That object has a forms collection, and each form has an elements collection; iterate through the elements, pulling out each name. Note that the source URL (which can be a file) must be valid HTML.
''You have to add a reference to the Microsoft HTML Object Library

''this should work
    Dim objMSHTML As New MSHTML.HTMLDocument
    Dim objDocument As MSHTML.HTMLDocument
    Set objDocument = objMSHTML.createDocumentFromUrl(Url, vbNullString)

    ''wait until the d/l is complete
    While objDocument.readyState <> "complete"
        ''<TODO> add timer/counter here to eventually time out
        DoEvents
    Wend

    Dim objForm As HTMLFormElement
    Dim objInput As HTMLInputElement
    Dim obj As Object

    For Each objForm In objDocument.Forms
        For Each obj In objForm.elements
            If UCase(obj.nodeName) = "INPUT" Then
                Set objInput = obj
                ''read the name from objInput.Name
            End If
        Next
    Next
 
SweetsGreen Commented:
You can also use objMSHTML.createDocumentFragment,
so long as the tags match up (they don't in your example).
 
abangbatax (Author) Commented:
Hi SweetsGreen,

If you don't mind, what would the code look like using webbrowser_downloadcomplete instead of MSHTML.HTMLDocument?
 
SweetsGreen Commented:
Simply replace

Set objDocument = objMSHTML.createDocumentFromUrl(Url, vbNullString)

with

Set objDocument = wb.document ''where wb is your webbrowser control

''''

I don't believe there is a way to parse out elements using the WebBrowser control alone.
You may or may not need the reference to the "Microsoft HTML Object Library", since the WebBrowser control is built on MSHTML, but I'm not sure off the top of my head.
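
A minimal sketch of that substitution, assuming a form that hosts a WebBrowser control named wb and a ListBox named lstNames (both names are hypothetical). It hooks the control's DocumentComplete event rather than DownloadComplete, and uses late binding (As Object) so it does not depend on the MSHTML reference:

```vb
''Assumes a WebBrowser control named wb and a ListBox named lstNames on the form
Private Sub wb_DocumentComplete(ByVal pDisp As Object, URL As Variant)
    ''DocumentComplete also fires for each frame; only handle the top-level document
    If Not (pDisp Is wb.Object) Then Exit Sub

    Dim objForm As Object
    Dim obj As Object

    ''walk every form and every element in it, keeping only INPUT elements
    For Each objForm In wb.Document.Forms
        For Each obj In objForm.elements
            If UCase$(obj.nodeName) = "INPUT" Then
                lstNames.AddItem obj.Name
            End If
        Next
    Next
End Sub
```

Handling the event avoids the DoEvents polling loop, since the handler only runs once the page has finished loading.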
 
abangbatax (Author) Commented:
Can anyone convert this VB.NET source to VB?

    Private Function FilterInput(ByVal htmlText As String, ByVal listB As ListBox) As String()

        Dim inputPattern As String = "<INPUT(?<input>[\w\s" & Chr(34) & "=']+)"
        Dim namePattern As String = "NAME=" & Chr(34) & "?(?<name>[A-Za-z0-9_]+)"

        Dim regexInput As New Regex(inputPattern, RegexOptions.IgnoreCase)
        Dim regexName As New Regex(namePattern, RegexOptions.IgnoreCase)

        For Each inputElements As Match In regexInput.Matches(htmlText)
            Dim input As String = inputElements.Groups("input").Value

            Dim name As Match = regexName.Match(input)

            If name.Success Then
                listB.Items.Add(name.Groups("name").Value)
            End If
        Next inputElements

    End Function
 
SweetsGreen Commented:
Well, I guess you are not going the MSHTML route.

As for your request to convert the VB.NET code, here you go,
but there are a few things you should know.
1. VB6 does not have the regular expression support that .NET has. You have to use the VBScript regex engine, so you have to add a reference to "Microsoft VBScript Regular Expressions".
2. The regexes you supplied would not work. I changed them and they should be fine now.
3. If you are not familiar with regular expressions, I would have gone with the MSHTML solution, since it does all the parsing work for you and you don't have to worry about getting a regular expression wrong.

    ''HTMLText (String) and listB (ListBox) come from the enclosing function
    Dim inputPattern As String
    inputPattern = "<(INPUT)[^<>]+>"

    ''Chr(34) is the double quote; it has to be concatenated into the pattern
    Dim namePattern As String
    namePattern = "name\s*=('|" & Chr(34) & "|\s)*\w*('|" & Chr(34) & "|\s)*\s*"

    Dim regexInput As New RegExp
    regexInput.IgnoreCase = True
    regexInput.Global = True ''since we want all matches
    regexInput.Pattern = inputPattern

    Dim regexName As New RegExp
    regexName.IgnoreCase = True
    regexName.Pattern = namePattern

    Dim regexCleanup As New RegExp
    regexCleanup.Global = True
    regexCleanup.IgnoreCase = True
    regexCleanup.Pattern = "(name\s*=\s*|'|" & Chr(34) & ")"

    Dim nameMatch As Match
    Dim inputString As String
    Dim inputElements As Match

    For Each inputElements In regexInput.Execute(HTMLText)
        inputString = inputElements.Value

        For Each nameMatch In regexName.Execute(inputString)
            ''remove any name=, ' and ", then trim leftover whitespace
            ''(a VB6 ListBox uses AddItem, not Items.Add)
            listB.AddItem Trim$(regexCleanup.Replace(nameMatch.Value, ""))
            ''Debug.Print regexCleanup.Replace(nameMatch.Value, "")
        Next
    Next
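
As a possible variation on the code above (a sketch, not the approach posted in this thread), the cleanup regex can be avoided by capturing the name in a group and reading it back through SubMatches. The function name InputNames and its Collection return type are assumptions made for illustration:

```vb
''Requires a reference to "Microsoft VBScript Regular Expressions 5.5"
''InputNames is a hypothetical helper: it returns the name attribute
''values of all <INPUT> tags found in htmlText
Private Function InputNames(ByVal htmlText As String) As Collection
    Dim regexInput As New RegExp
    regexInput.IgnoreCase = True
    regexInput.Global = True ''we want every <INPUT> tag
    regexInput.Pattern = "<INPUT\b[^<>]*>"

    ''name=, an optional quote, then the name itself captured in group 1
    Dim regexName As New RegExp
    regexName.IgnoreCase = True
    regexName.Pattern = "\bname\s*=\s*['" & Chr(34) & "]?(\w+)"

    Dim result As New Collection
    Dim inputTag As Match
    Dim found As MatchCollection

    For Each inputTag In regexInput.Execute(htmlText)
        Set found = regexName.Execute(inputTag.Value)
        If found.Count > 0 Then
            ''SubMatches(0) is the text captured by the first group
            result.Add found(0).SubMatches(0)
        End If
    Next

    Set InputNames = result
End Function
```

The caller can then loop over the result with For Each and pass each item to listB.AddItem. Like any regex approach, this can be fooled by unusual markup (for example, a value attribute that itself contains name=), which is where the MSHTML route is more robust.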
 
SweetsGreen Commented:
Make sure the reference you use is "Microsoft VBScript Regular Expressions 5.5".