csferguson

How do you programmatically fill (autofill) web forms from a standalone application using VB.NET?

I am trying to write a standalone application (or an IE toolbar) that will automatically populate any web form for me.  I am trying to use the MS HTML Object Library and the Microsoft Internet Controls.  So far I have been able to iterate through all open browsers and read the titles and location names, but that is as far as I've been able to get.  I have used JavaScript and the DOM to populate fields in my own web pages before, but this seems to be significantly more difficult.  I would prefer to stick with VB.NET.  Any help's appreciated.
S-Twilley

I might be wrong, but which pages are you trying to auto-fill? I have a feeling that with secure web pages you can't access their forms to edit them, though I could be wrong.

ASKER

For instance, Amazon.com.

If you go to check out as a new user you have to fill in all your personal info.  If you do a view source you can see all the html tags and field names.  As far as editing goes, I would only be trying to edit the values of the text boxes.  I would have to read in the HTML objects' names to know which ones were there so I know which values to have input into which textboxes.  There are many other applications out there that do what I am trying to do.  So, I would think it wouldn't be too terribly difficult.
I tested this on the front page of Amazon UK. My form had a WebBrowser control called WebA. I got its Document property and looped through all the input tags. I'm assuming you will have a collection of string pairs matching each tag's name (e.g. Forename, Lastname) to its intended value; here I just made a quick example. If I've misunderstood or been too vague, I'll be happy to clarify or be corrected.


        Dim myDoc As mshtml.IHTMLDocument2
        Dim inpElement As mshtml.IHTMLInputElement

        myDoc = CType(WebA.Document, mshtml.IHTMLDocument2)

        For Each inpElement In myDoc.all.tags("input")
            If inpElement.name = "field-keywords" Then
                inpElement.value = "Final Fantasy"
            End If
        Next
I'd suggest testing this code on different pages, preferably ones that have more than one form with text inputs, ones where the text boxes aren't inside a form, and so on. It may also be worth testing on secure pages; should it not work there (which I don't think it will), put in an If statement so your code doesn't crash when trying to edit a secure form.
Is WebA a reference to the IE object you have open or is it an actual object on your form?  I don't have a webBrowser control showing up in my toolbox.  Should I??  I've referenced mshtml and SHDocVw.

Are you doing this in your own web browser or an IE browser that you have manually navigated to?
I'm getting a cast type error trying to cast the document as type mshtml.IHTMLDocument2.

Do you know where I can find some real good reading on the mshtml and SHDocVw objects?
Alright, I found the WebBrowser control.  Can I do this w/out it though?  I want to automatically fill textboxes in a web form I already have open in Internet Explorer.
Sorry for not replying, I'm not getting e-mail alerts. If you have created an Internet Explorer instance from your VB.NET program, I think you can use it without a WebBrowser control.

Say for instance
        Dim myIE As New SHDocVw.InternetExplorer()
        myIE.Visible = True

then you can modify the code

        Dim myDoc As mshtml.IHTMLDocument2
        Dim inpElement As mshtml.IHTMLInputElement

        myDoc = CType(myIE.Document, mshtml.IHTMLDocument2)

        For Each inpElement In myDoc.all.tags("input")
            If inpElement.name = "field-keywords" Then
                inpElement.value = "Final Fantasy"
            End If
        Next

..........

in your code, you may want to add an event handler so that when your instance of IE ... so this is a different implementation


    Dim myIE As New SHDocVw.InternetExplorer()

    Private Sub Button3_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button3.Click
        ' Attach the handler first so it is in place before the page starts loading
        AddHandler myIE.DocumentComplete, AddressOf PageChange
        myIE.Navigate2("http://www.amazon.co.uk")
        myIE.Visible = True
    End Sub

' you could use the sub below with different instances of an IE object, since it relies on the pDisp object (much like the sender object in other events)

    Sub PageChange(ByVal pdisp As Object, ByRef url As Object)
        Dim thisURL As String

        Try
            thisURL = CStr(url)

            If thisURL.ToLower.StartsWith("http://www.amazon.co.uk") Then
                Dim ieObj As SHDocVw.InternetExplorer
                ieObj = CType(pdisp, SHDocVw.InternetExplorer)
                Dim myDoc As mshtml.IHTMLDocument2
                Dim inpElement As mshtml.IHTMLInputElement

                myDoc = CType(ieObj.Document, mshtml.IHTMLDocument2)

                For Each inpElement In myDoc.all.tags("input")
                    If inpElement.name = "field-keywords" Then
                        inpElement.value = "Final Fantasy"
                    End If
                Next
            End If
        Catch
            ' Ignore any failure and leave the page untouched
        End Try
    End Sub
As for obtaining the Document object of some IE instance that was created outside of your program, I don't know, sorry
Also... as for references/guides to using mshtml, when I was originally learning about it, I didn't find any... I just looked at the Object Browser and played about. Someone else might be able to give you a good reference but I always find trying to look for good VB.NET guides quite hard (other than this site of course :oP )
This is the code I used to get the document title and location name from an IE instance created outside my program.

    Dim SWs As New SHDocVw.ShellWindows
    Dim IE As SHDocVw.InternetExplorer

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        Dim doc
        ListBox1.Items.Clear()
        ListBox2.Items.Clear()


        TextBox1.Text = SWs.Count

        For Each IE In SWs
            ListBox1.Items.Add(IE.LocationName)

            doc = IE.Document
            If TypeOf doc Is mshtml.HTMLDocument Then
                ListBox2.Items.Add(doc.Title)
            End If
        Next
    End Sub




It was giving me the cast error when I tried to cast this document object to type IHTMLDocument2.  Any more suggestions?
Hi... just tested your code and it works for me... did modify it slightly out of curiosity.

        Dim doc
        Dim trueDoc As mshtml.IHTMLDocument2
        ListBox1.Items.Clear()
        ListBox2.Items.Clear()


        TextBox1.Text = SWs.Count

        For Each IE In SWs
            ListBox1.Items.Add(IE.LocationName)

            doc = IE.Document
            If TypeOf doc Is mshtml.IHTMLDocument2 Then
                trueDoc = CType(doc, mshtml.IHTMLDocument2)
                ListBox2.Items.Add(trueDoc.title & " " & trueDoc.all.length)

            End If
        Next

=============================================

If you can let me know which window it threw the error on, I can try to reproduce it; that will help.

PS: I didn't know about the ShellWindows class, so thanks for teaching me something new.
Hi again... I created some code using your ShellWindows object. You'll need to modify it to load rules from a database or XML file or something.



    Private Sub SearchWindows()
       ' Starting point... searches through all IE instances
        Dim doc
        Dim trueDoc As mshtml.IHTMLDocument2


        For Each IE In SWs
            If IE.ReadyState = SHDocVw.tagREADYSTATE.READYSTATE_COMPLETE Then
                ' Will only attempt to modify a document if it is ready
                doc = IE.Document
                If TypeOf doc Is mshtml.IHTMLDocument2 Then
                    ' If a proper IE instance
                    trueDoc = CType(doc, mshtml.IHTMLDocument2)
                    ParseDocument(trueDoc, IE.LocationURL)
                End If
            End If
        Next
    End Sub

    Sub ParseDocument(ByRef thisDoc As mshtml.IHTMLDocument2, ByVal sURL As String)
        Dim inpElement As mshtml.IHTMLInputElement
        Dim matchedValue As String = ""

        ' loops through all input tags; this may need to be changed as I'm not sure how good it is at finding all input tags
        ' (it could miss hidden ones)

        For Each inpElement In thisDoc.all.tags("input")
            matchedValue = ""

            ' passes tag name and url to function to determine if there is a rule applicable to it
            If matchFound(sURL, inpElement.name, matchedValue) Then
                inpElement.value = matchedValue
            End If
        Next
    End Sub

    Function matchFound(ByVal surl As String, ByVal ElementName As String, ByRef NewValue As String) As Boolean
        ' All this code you'll probably replace with rules loaded from a database or file

        ' this rule depends on the url (IndexOf returns -1 when the text is not found)
        If surl.ToLower.IndexOf("amazon.co.uk") >= 0 Then
            If ElementName.ToLower = "field-keywords" Then
                NewValue = "Final Fantasy"
            End If
        End If

        ' this rule is a generic rule independent of url
        If ElementName.ToLower.IndexOf("forename") >= 0 Then
            NewValue = "steven"
        End If

        Return (NewValue.Length > 0)
    End Function

    Private Sub cmdList_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles cmdList.Click
        ' this was just a button to test it, you might have it on a timer to find new instances of an IE window or maybe an icon on the system tray
       ' which when clicked shows a list of all IE windows in a popup menu, and on clicking, it parses that window
        SearchWindows()
    End Sub
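Once the rules are loaded from a file or database, the lookup in matchFound could be driven by a list instead of hard-coded If blocks. The following is only a minimal sketch, assuming VB 2005 or later (it uses List(Of T)). FillRule, FindValue and the sample rules are hypothetical names made up for illustration, and unlike matchFound it returns the first rule that matches rather than letting a later rule override an earlier one.

```vbnet
Imports System
Imports System.Collections.Generic

Module RuleDemo
    ' Hypothetical rule record: an optional URL fragment, a field-name fragment,
    ' and the value to put into the matching input.
    Structure FillRule
        Public UrlPart As String
        Public NamePart As String
        Public Value As String

        Public Sub New(ByVal urlPart As String, ByVal namePart As String, ByVal value As String)
            Me.UrlPart = urlPart
            Me.NamePart = namePart
            Me.Value = value
        End Sub
    End Structure

    ' Returns the value of the first matching rule, or "" when no rule applies.
    Function FindValue(ByVal rules As List(Of FillRule), ByVal url As String, ByVal elementName As String) As String
        If url Is Nothing OrElse elementName Is Nothing Then Return ""

        Dim lowerUrl As String = url.ToLower()
        Dim lowerName As String = elementName.ToLower()

        For Each r As FillRule In rules
            ' An empty UrlPart means the rule applies on any site.
            ' IndexOf returns -1 when the fragment is absent, so compare against 0 explicitly.
            Dim urlOk As Boolean = (r.UrlPart.Length = 0) OrElse (lowerUrl.IndexOf(r.UrlPart) >= 0)
            If urlOk AndAlso lowerName.IndexOf(r.NamePart) >= 0 Then
                Return r.Value
            End If
        Next

        Return ""
    End Function

    Sub Main()
        ' Sample rules only; rule fragments are expected in lower case.
        Dim rules As New List(Of FillRule)
        rules.Add(New FillRule("amazon.co.uk", "field-keywords", "Final Fantasy"))
        rules.Add(New FillRule("", "forename", "steven"))

        Console.WriteLine(FindValue(rules, "http://www.amazon.co.uk/", "field-keywords")) ' prints: Final Fantasy
        Console.WriteLine(FindValue(rules, "http://example.com/", "txtForename"))         ' prints: steven
    End Sub
End Module
```

ParseDocument could then call FindValue and assign the result to inpElement.value only when it is not empty.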
I tried your code.  The only thing I added was:

Dim IE as SHDocVw.InternetExplorer
&
Dim SWs As SHDocVw.ShellWindows

Is that correct???

I'm getting this error:
An unhandled exception of type 'System.NullReferenceException' occurred in WindowsApplication10.exe

Additional information: Object reference not set to an instance of an object.

It's giving me the error on the For Loop when it's looking for an instance of IE.
Do you mean on this line:


    Private Sub SearchWindows()
       ' Starting point... searches through all IE instances
        Dim doc
        Dim trueDoc As mshtml.IHTMLDocument2


        For Each IE In SWs            ' <== THIS LINE
            If IE.ReadyState = SHDocVw.tagREADYSTATE.READYSTATE_COMPLETE Then
                ' Will only attempt to modify a document if it is ready
                doc = IE.Document
                If TypeOf doc Is mshtml.IHTMLDocument2 Then
                    ' If a proper IE instance
                    trueDoc = CType(doc, mshtml.IHTMLDocument2)
                    ParseDocument(trueDoc, IE.LocationURL)
                End If
            End If
        Next
    End Sub
If that's the case, just use some Try/Catch blocks for safety. I'm typing this straight into the text box without testing it in VB.NET, so there may be slip-ups, but they should be easy to correct. If this fixes your problem, use similar Try/Catch blocks elsewhere to reduce the chance of crashing or faulting.

    Private Sub SearchWindows()
        ' Starting point... searches through all IE instances
        Dim doc As Object
        Dim trueDoc As mshtml.IHTMLDocument2

        Try
            For Each IE In SWs
                Try
                    If IE.ReadyState = SHDocVw.tagREADYSTATE.READYSTATE_COMPLETE Then
                        ' Will only attempt to modify a document if it is ready
                        doc = IE.Document

                        If Not doc Is Nothing Then
                            ' If the window actually has a document
                            If TypeOf doc Is mshtml.IHTMLDocument2 Then
                                ' If the document matches the desired type
                                trueDoc = CType(doc, mshtml.IHTMLDocument2)
                                ParseDocument(trueDoc, IE.LocationURL)
                            End If
                        End If
                    End If
                Catch e2 As Exception
                    ' Looks like there was an error with this instance of IE; ignore it and go on to the next instance
                End Try
            Next
        Catch e1 As Exception
            ' Don't do anything
        End Try
    End Sub
ASKER CERTIFIED SOLUTION
S-Twilley