gkamckenney

Help reading HTML from Active Browser Window

I have a program that runs minimized and uses a global hotkey to copy manually selected text from a web browser. This works great.

I have a new feature request to have my program search for a key word and scrape it automatically, without the user needing to select it first.

I am struggling with this because every example I have seen requests the page from a URL and then reads the response. I do not have that luxury: in my case the page is already loaded in the web browser, and the results are static HTML dumped from a db query.

Can someone help with sample code to identify the active browser window and read its HTML contents, so that I can parse it for my keyword?

Any help would be a lifesaver!!
Bob Learned

Are you talking about scraping HTML out of a Web Browser control, or an external Internet Explorer window?

Bob
gkamckenney

ASKER

Off of an external Internet Explorer window.

Clarification:

The user has the HTML on the page in their Web Browser. I need to get some text off of that HTML page without calling it first.
SOLUTION
Bob Learned
OK, so this returns the URLs of all open windows in IE, correct? So once this is accomplished, I am still a little unclear as to the most effective way to search for the value that I want to scrape. Example:

IntegrationID     00019920192831388

These are both stored as static HTML data on the screen. Should I use RegEx, or do you have any other bright ideas? Sorry, kind of a newcomer...

(Attached is the HTML output of the page)
PSFT-Regular-Entry-Frame-HTML-So.txt
Process:

1) Enumerate open Internet Explorer windows
2) Get the URLs
3) Use an HttpWebRequest to get the HTML
4) Use Microsoft.mshtml namespace to parse the HTML (or use regular expression if the HTML isn't too complex)

Bob
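
For reference, a minimal sketch of the regular-expression route mentioned in step 4. It assumes the value sits in the first PSEDITBOX_DISPONLY span after an "Integration ID" label, as in the markup quoted later in this thread. The module name, function name, and sample string are made up for illustration. The attached page reportedly contains several such labels, so the first match may not be the right one, and a DOM-based parser is likely more robust.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module IntegrationIdRegexSketch

    ' Returns the text of the first PSEDITBOX_DISPONLY span that follows an
    ' "Integration ID" label, or an empty string when nothing matches.
    ' The value is kept as a String, so leading zeros are preserved.
    Function FindIntegrationId(ByVal html As String) As String
        Dim pattern As String = "Integration\s*ID\s*</label>.*?<span[^>]*PSEDITBOX_DISPONLY[^>]*>\s*([^<]*?)\s*</span>"
        Dim m As Match = Regex.Match(html, pattern, RegexOptions.IgnoreCase Or RegexOptions.Singleline)
        If m.Success Then
            Return m.Groups(1).Value
        End If
        Return String.Empty
    End Function

    Sub Main()
        ' Hypothetical sample, trimmed from the markup quoted in the thread.
        Dim sample As String = "<label for='VOUCHER_PACKSLIP_NO' class='PSEDITBOXLABEL' >Integration ID</label></td></tr><tr><td colspan='4'><span class='PSEDITBOX_DISPONLY' >000000010001</span>"
        Console.WriteLine(FindIntegrationId(sample))
    End Sub

End Module
```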
OK, but back to my issue, if I use HttpWebRequest, aren't I then just recalling the URL? If so, this will not work. This particular web site is PeopleSoft, a sophisticated ERP system and will not return squat.

Or am I missing something?
Also, the web page I want to access is always the "Active" window. Can I simplify the code to just return the HTML of the current active browser window? Essentially, all I need to do is read some HTML from an open window.

Any help is much appreciated!
SOLUTION
Sample usage:


    Dim windowList As Dictionary(Of String, String) = InternetExplorerWindows.GetWindows()
 
    Dim htmlText As String = windowList("http://www.google.com/search?hl=en&q=SHDocVw+shellwindows")
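
The InternetExplorerWindows class used above is not shown in this thread. A rough sketch of what such a helper might look like follows. It is a guess, not the original class: it assumes COM references to "Microsoft Internet Controls" (SHDocVw) and "Microsoft HTML Object Library" (mshtml), and the class name is made up.

```vbnet
' Assumes COM references to SHDocVw and mshtml have been added to the project.
Imports System.Collections.Generic

Friend Class IeWindowSketch

    ' Returns a map of URL -> body HTML for each open Internet Explorer window.
    Public Shared Function GetWindows() As Dictionary(Of String, String)
        Dim result As New Dictionary(Of String, String)()
        Dim shellWindows As New SHDocVw.ShellWindows()

        For Each item As Object In shellWindows
            Dim browser As SHDocVw.InternetExplorer = TryCast(item, SHDocVw.InternetExplorer)
            If browser Is Nothing Then Continue For

            ' Windows Explorer folder windows also appear in ShellWindows; their
            ' Document is not an HTML document, so TryCast yields Nothing.
            Dim doc As mshtml.IHTMLDocument2 = TryCast(browser.Document, mshtml.IHTMLDocument2)
            If doc Is Nothing OrElse doc.body Is Nothing Then Continue For

            ' Indexer assignment overwrites on a repeated URL instead of throwing.
            result(browser.LocationURL) = doc.body.outerHTML
        Next

        Return result
    End Function

End Class
```

Because this reads the document that is already loaded in the browser, it does not re-request the URL.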


Bob,

You have been so extremely helpful, but I am not a usual programmer, and could still use a little help.

Below is the code from my project that runs when a global hot key is triggered.

The idea is that a user has data in a field called IntegrationID. (refer to the html I sent you earlier)
The way my program works now is the user has to drag their mouse over the data and copy to their clipboard. Now when they hit the hot key it stores the data to a variable that I then use to construct a URL to another application.

The problem is that the user does not want to have to manually select the data, instead they want to hit the global hot key and have the code search for the word IntegrationID and then pad over to read the 17 character code.

As you can tell, they expect far more sophistication than I had anticipated.

So as you can see, getting the URL and HTML back is a start, but I really just need to copy that data from the active window, which will always be this html page.

Sorry for the long-winded explanation and constant requests from a newbie.
Private Sub hotKey_Pressed(ByVal sender As System.Object, ByVal e As HotKeyPressedEventArgs)
 
        Dim iData As IDataObject = Clipboard.GetDataObject()
 
        ' Determines whether the data is in a format you can use.
        If iData.GetDataPresent(DataFormats.Text) Then
            ' Yes it is, so display it in a text box.
            txtPropertyValue.Text = CType(iData.GetData(DataFormats.Text), String)
 
        Else
            ' No it is not.
            txtPropertyValue.Text = "Bad Data"
        End If
 
        'construct the URL.
 
        Dim sURL As String
        sURL = constructURL()
        'launch IE after constructing URL
        OpenBrowser(sURL)
 
    End Sub


1) I have been at this game for a long time

2) I have the reputation for killer amounts of patience, almost to a fault.

3) What happens if you have multiple Internet Explorer windows open?  

4) Is there an expected URL that the browser will have?

5) I can't find 'IntegrationID' in the attached HTML that you included above (PSFT-Regular-Entry-Frame-HTML-So.txt).

Bob
The url is generated after a dynamic call, so not only is it not expected but returns you to a login screen if you recall it.

In the text file you will find Integration ID about halfway down. The data we always want to capture is the next field which is a 12 character value, in this sample "000000010001"

If I can somehow get that text field automatically populated into my "txtPropertyValue.Text" from the active, focused IE window, I would be in the clear.

I really do appreciate your help! If you ever need help with Documentum or any other Document Management system, I would be glad to repay the favor!
Oh, Integration ID, not IntegrationID (d'oh):

<label for='VOUCHER_PACKSLIP_NO' class='PSEDITBOXLABEL' >Integration ID</label>
</td>
</tr>
<tr>
<td height='51' colspan='2'></td>
<td colspan='4'  valign='top' align='LEFT'>
<span class='PSEDITBOX_DISPONLY' >000000010001</span>
SOLUTION
Man, this is good stuff, we are almost there, and you have been such a big help I really do hate to ask.

Questions / Issues:

1. I get errors when my program executes. The first error is a Runtime error where it is complaining about "setupTimeout()" in the HTML that I sent you. The additional warning is "Microsoft JScript runtime error: Object expected"
2. Then I get another runtime error on Line 1, syntax error.
3. Once we get through these errors, the program continues and does return the value to my text box. The problem is it is trimming the leading zeros. I need these leading zeros.
4. Final question: this is great, but how do I now set it up to pull the HTML from the active browser window and not a local file?

I would reward you with a million points if I could!

John
Here is the code from my post above. It has both the code you included and my function that returns the id to my text box and continues my program.

So I just need to combine it with the code you included earlier to gather the IE window / document?

    Private Sub hotKey_Pressed(ByVal sender As System.Object, ByVal e As HotKeyPressedEventArgs)
 
        Dim html As String = My.Computer.FileSystem.ReadAllText(Environment.GetFolderPath(Environment.SpecialFolder.DesktopDirectory) & "\PSFT-Regular-Entry-Frame-HTML.htm")
        Dim id As String = HtmlParser.FindIntegrationID(html)
 
        Dim iData As IDataObject = Clipboard.GetDataObject()
 
        ' Determines whether the data is in a format you can use.
        If iData.GetDataPresent(DataFormats.Text) Then
            ' Yes it is, so display it in a text box.
            'txtPropertyValue.Text = CType(iData.GetData(DataFormats.Text), String)
            txtPropertyValue.Text = id
 
        Else
            ' No it is not.
            txtPropertyValue.Text = "Bad Data"
        End If
 
        'construct the URL.
        Dim sURL As String
 
        sURL = constructURL()
 
        'launch IE after constructing URL
        OpenBrowser(sURL)
 
    End Sub
 
    Friend Class HtmlParser
 
        ' Look for:
        ' <label for='VOUCHER_PACKSLIP_NO' class='PSEDITBOXLABEL' >Integration ID</label>
        ' </td>
        ' </tr>
        ' <tr>
        ' <td height='51' colspan='2'></td>
        ' <td colspan='4'  valign='top' align='LEFT'>
        ' <span class='PSEDITBOX_DISPONLY' >000000010001</span>
 
        Public Shared Function FindIntegrationID(ByVal pageText As String) As String
            Dim document As New mshtml.HTMLDocument()
 
            ' The HTML document class doesn't have the correct 'write' method, 
            ' so use the IHTMLDocument2 interface.
            CType(document, mshtml.IHTMLDocument2).write(pageText)
 
            Dim labelFound As Boolean = False
            Dim count As Integer = 0
 
            For Each table As mshtml.HTMLTable In document.getElementsByTagName("table")
 
                If table.className = "PSPAGECONTAINER" Then
 
                    ' Loop through all the table cells (<td> elements)
                    For Each cell As mshtml.HTMLTableCell In table.getElementsByTagName("td")
 
                        If Not labelFound Then
                            ' Look for the last 'Integration ID' label, since we need the lowest
                            ' level <td> element.
                            For Each label As mshtml.HTMLLabelElement In cell.getElementsByTagName("label")
                                If label.innerText = "Integration ID" Then
                                    If count < 4 Then
                                        count += 1
                                    Else
                                        ' The right <label> was found, so quit looking, and start looking for the next <span>.
                                        labelFound = True
                                        Exit For
                                    End If
                                End If
                            Next label
                        Else
                            ' The next element that has a <span> child should be the 'Integration ID' value.
                            For Each span As mshtml.HTMLSpanElement In cell.getElementsByTagName("span")
                                Return Val(span.innerText)
                            Next span
                        End If
 
                    Next cell
 
                End If
 
            Next table
 
            Return ""
 
        End Function
 
    End Class


FYI, I tried Format(txtProperty.text,"000000000000") after I set txtProperty.text = id in Line 12, but it still returns 10001. I am thinking that the leading zeros are being skipped in the code you sent me, but I could also be wrong.
SOLUTION
I understand, you are the most patient person I have ever met!

I get the first Runtime error immediately after I hit the hot-key combination. Once I get through this, the next Runtime errors (3 or 4 in a row) all fire, one after another; see attachment.
runtime-errors.pdf
BTW, that formatting is not working either. I am convinced the problem is at HtmlParser.FindIntegrationID(html); I think it is not getting the leading zeros. I wrote in a msgbox to display the id (the variable that holds the FindIntegrationID result) and it only returns 10001.
OK, see, I am learning from your direction. The text was never getting to the control correctly because it was never formatted when the data was retrieved from the HTML. So I changed the code you provided me:

Return Val(span.innerText)

With:

Return Format(Val(span.innerText), "000000000000")

Awesome! So now we just need to get through the errors and figure out how to read this from the active IE Browser HTML instead of a file on the desktop.

We are so close I can taste it.
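
As an aside on the leading zeros: Val converts the text to a number, which is where the zeros are lost, and formatting the number afterwards only restores them for a fixed width. One alternative, sketched below under the assumption that the ID is always meant to be 12 characters, is to keep the span text as a String and pad only if it is short. The module and function names are invented for the example.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module LeadingZeroSketch

    ' Keeps the ID as text so leading zeros survive; pads short values to 12 characters.
    Function NormalizeId(ByVal raw As String) As String
        If raw Is Nothing Then Return String.Empty
        Return raw.Trim().PadLeft(12, "0"c)
    End Function

    Sub Main()
        ' Val returns a Double, so the leading zeros are no longer part of the value.
        Console.WriteLine(Val("000000010001"))

        ' Text that already carries its zeros passes through unchanged (after trimming).
        Console.WriteLine(NormalizeId(" 000000010001 "))

        ' A short value is padded back out to 12 characters.
        Console.WriteLine(NormalizeId("10001"))
    End Sub

End Module
```

In the parser this would amount to returning the span's innerText directly rather than passing it through Val.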
Are you able to debug Javascript, or do you need a lesson on that, too?

Bob
Sorry to say, I will need help. I am traditionally an ASP guy with VB6 experience; other than that, a newbie.

I told you I was a newbie... and please stop me if I am overstepping my bounds and using up too much of your time.

If we can get through these errors and read the active IE browser HTML, we would be home free.

Either way, any help is appreciated. I have been plugging away at this all day and have very little to show but my formatted text (yeah) and not much else.

John
BTW, did I mention that when you attempt to debug the JScript errors, nothing happens?
SOLUTION
Thank you for this. I am clear until step 3. Where would I add the JavaScript again? The HTML page cannot be modified in the end because it comes from another application. Am I missing something obvious?
AH, I think I know what's going on... The script errors are caused by the HTML page (Peoplesoft) not having access to the primary frame and underlying system. I do think that once I am able to test this against a "live" instance of this page, the errors will go away.

So, Bob, if I could be so bold. If you can just help me tie all of your examples together, I just need:

1. Read the current open IE window's HTML document into your parser function, and I am home free.

Thank you,

John
Bob,

I have confirmed that the errors are not occurring on a workstation that has access to the Peoplesoft application.

The only remaining issue that I am struggling with is to take your sample usage:

Dim html As String = My.Computer.FileSystem.ReadAllText(Environment.GetFolderPath(Environment.SpecialFolder.DesktopDirectory) & "\PSFT-Regular-Entry-Frame-HTML.htm")
Dim id As String = HtmlParser.FindIntegrationID(html)

And modify it to read the document from the active browser window. I tried using your InternetExplorerWindows.GetWindows() function, but have had no success.

Thank you in advance!

John
SOLUTION
OK, this seems like a good approach, but I can't compile because I am getting an error:

Value of type 'System.Collections.Generic.KeyValuePair(Of String, String)' cannot be converted to 'String'.

Am I missing something?
Now, THAT is an error that my internal compiler doesn't catch:

For Each url As String In windowList.Values

That was code typed into the Comment block, and not in a code module.

Bob
So, see the attached code derived from your example. I always know that the URL will have "ENTER_VOUCHER_INFORMATION" so I am simulating that command by calling the local HTML file on my desktop, which has PSFT in the URL.

When I run in debug mode I get the error referenced in the attached PDF.

Also, the code generating this error is:
        If iData.GetDataPresent(DataFormats.Text) Then
 
            Dim windowList As Dictionary(Of String, String) = InternetExplorerWindows.GetWindows()
 
            For Each url As String In windowList.Values
                'If url.Contains("ENTER_VOUCHER_INFORMATION") = True Then
                If url.Contains("PSFT") = True Then
 
                    Dim html As String = windowList(url)
                    Dim id As String = HtmlParser.FindIntegrationID(html)
 
                    txtPropertyValue.Text = id
 
                End If
 
            Next url


KeyNotFoundException.pdf
Bob,

OK, now that I look at the attached code, it is obvious it won't work.

I am struggling on how to return a single url to pass to the "Dim html As String = " from your previous example a while back.

If I can just use the example you gave me (see below), then I would modify the qualifier to be:

If url.Contains("Integration ID") = True Then
                Dim html As String = windowList(url)
                Dim id As String = HtmlParser.FindIntegrationID(html)
Because when I write a msgbox(html) I get the whole web page dumped to a messagebox.

I would be home free, right, but I am lost on this final step.

John


Dim windowList As Dictionary(Of String, String) = InternetExplorerWindows.GetWindows()
 
For Each url As String In windowList.Values
   If url = "http://www.google.com" Then
       Dim html As String = windowList(url)
       Dim id As String = HtmlParser.FindIntegrationID(html)
   End If
Next url


Oh, me thinks I doth screwed up before (d'oh):

For Each url As String In windowList.Keys

Bob
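
To spell out the two errors above: iterating the dictionary itself yields KeyValuePair items (hence the compile error), and iterating .Values and then indexing with a value causes the KeyNotFoundException. A small self-contained sketch follows, with stand-in URLs and HTML in place of the real windowList contents.

```vbnet
Imports System
Imports System.Collections.Generic

Module DictionaryIterationSketch

    Sub Main()
        ' Stand-in data: URL -> HTML, the same shape as windowList in the thread.
        Dim windowList As New Dictionary(Of String, String)()
        windowList("http://example.com/ENTER_VOUCHER_INFORMATION") = "<html>voucher page</html>"
        windowList("http://example.com/other") = "<html>other page</html>"

        ' Option 1: iterate the keys and look each value up.
        For Each url As String In windowList.Keys
            If url.Contains("ENTER_VOUCHER_INFORMATION") Then
                Console.WriteLine(windowList(url))
            End If
        Next url

        ' Option 2: iterate the pairs, which avoids the second lookup.
        For Each pair As KeyValuePair(Of String, String) In windowList
            If pair.Key.Contains("ENTER_VOUCHER_INFORMATION") Then
                Console.WriteLine(pair.Value)
            End If
        Next pair
    End Sub

End Module
```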
Bob,

That did the trick, it worked as expected with multiple URLs, it picked the one with the matching URL, see below.

As I expected, when I tested this against the live system, it did not work. The reason is that PeopleSoft is frame-based. I will start digging into this to see how we can modify this to get the right frame. The other issue is that, I am guessing, if multiple IE windows are open matching the URL, my code will launch that many browsers with that many search results. I will need to tweak the code to make sure it only searches against the "Active" browser window.

Any pointers on these remaining issues to bring this project to a close?

Any help would be great, you have been such a big help in this effort. I could not have done this without you.

Thanks,

John
Dim windowList As Dictionary(Of String, String) = InternetExplorerWindows.GetWindows()
 
        For Each url As String In windowList.Keys
            'This is what we should be able to grab on their workstation(s)
            If url.Contains("ENTER_VOUCHER_INFORMATION") = True Then
                'If url.Contains("PSFT") = True Then
                Dim html As String = windowList(url)
                Dim id As String = HtmlParser.FindIntegrationID(html)
 
                txtPropertyValue.Text = id
 
                'construct the URL.
                Dim sURL As String
                sURL = constructURL()
 
                'launch IE after constructing URL
                OpenBrowser(sURL)


John,

If you are looking to prevent multiple windows from being opened with the same URL, you can test to see if the 'windowList' dictionary has that URL, and not open a new browser if it does.

Bob
Not exactly. I don't care how many windows are open, or about preventing them from opening; I am looking to identify the active window and only read that URL. It is expected that the client may have many windows open for the same URL, but these pages contain different ids. I am only looking to search the active window when they hit my global keys, as this tells me the current page has the id they are looking for.

Also, I know which frame I need to search (yes, this program uses frames). Is there an easy way to set the URL search or other control to only read the <frame name="TargetContent"> HTML document?

Thanks,

John
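
One possible way to pick out the active window, offered as a sketch rather than a tested solution: compare each IE window's handle with the foreground window reported by the Win32 GetForegroundWindow function. This assumes COM references to SHDocVw and mshtml, and that the hotkey program does not take focus when the hotkey fires. The class name is made up. In tabbed versions of IE, every tab in a window may report the same handle, so this may need a further URL check.

```vbnet
' Assumes COM references to SHDocVw and mshtml have been added to the project.
Imports System
Imports System.Runtime.InteropServices

Friend Class ActiveIeWindowSketch

    <DllImport("user32.dll")> _
    Private Shared Function GetForegroundWindow() As IntPtr
    End Function

    ' Returns the body HTML of the IE window that currently has focus,
    ' or Nothing when the foreground window is not an IE window.
    Public Shared Function GetActiveWindowHtml() As String
        Dim foreground As IntPtr = GetForegroundWindow()

        For Each item As Object In New SHDocVw.ShellWindows()
            Dim browser As SHDocVw.InternetExplorer = TryCast(item, SHDocVw.InternetExplorer)
            If browser Is Nothing Then Continue For

            ' Skip every window whose handle is not the foreground window.
            If Not foreground.Equals(New IntPtr(browser.HWND)) Then Continue For

            Dim doc As mshtml.IHTMLDocument2 = TryCast(browser.Document, mshtml.IHTMLDocument2)
            If doc IsNot Nothing AndAlso doc.body IsNot Nothing Then
                Return doc.body.outerHTML
            End If
        Next

        Return Nothing
    End Function

End Class
```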
I think I made it through the "which window is active" issue. All I really need help with now is gaining access to the specific frame "TargetContent" on the URL that matches the criteria I specify on the

If url.Contains("ENTER_VOUCHER_INFORMATION") = True Then line.

Thanks,

John
Let's break down some important parts for the parser:

1) Get all the tables from the document:

For Each table As mshtml.HTMLTable In document.getElementsByTagName("table")

2) Get all the cells from those tables:

For Each cell As mshtml.HTMLTableCell In table.getElementsByTagName("td")

What you need to do is to wedge a call to get the IFrame with document.getElementsByName before the 'Get all tables', and then use that element to call getElementsByTagName("table") instead of the 'document' element.

Bob
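
Two notes on this step. getElementsByName matches the name attribute, so the argument would be "TargetContent" rather than the tag name "frame". Also, the HTML text of a frameset page only contains the <frame src=...> tags, not the framed content, so the content has to come from the live document. The sketch below shows one way that might be done through the document's frames collection. It assumes a COM reference to mshtml and a live IHTMLDocument2 taken from the browser. The class and function names are made up, and access can be refused when a frame is served from a different domain.

```vbnet
' Assumes a COM reference to "Microsoft HTML Object Library" (mshtml).
Imports System

Friend Class FrameHtmlSketch

    ' Returns the body HTML of the frame whose name matches frameName,
    ' or Nothing when no such frame is reachable.
    Public Shared Function GetFrameHtml(ByVal doc As mshtml.IHTMLDocument2, ByVal frameName As String) As String
        Dim frames As mshtml.FramesCollection = doc.frames

        For i As Integer = 0 To frames.length - 1
            ' item takes its index ByRef as an Object, so box the counter first.
            Dim index As Object = i
            Try
                Dim frameWindow As mshtml.IHTMLWindow2 = TryCast(frames.item(index), mshtml.IHTMLWindow2)
                If frameWindow Is Nothing Then Continue For

                If String.Equals(frameWindow.name, frameName, StringComparison.OrdinalIgnoreCase) Then
                    Return frameWindow.document.body.outerHTML
                End If
            Catch ex As UnauthorizedAccessException
                ' Frames served from another domain typically refuse access; skip them.
            End Try
        Next

        Return Nothing
    End Function

End Class
```

The returned text could then be passed to FindIntegrationID unchanged, for example with GetFrameHtml(doc, "TargetContent").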
Bob,

Thank you. I have been trying to get the frame; am I missing something? See the code below and the attached HTML source that contains the frames.

Thank you,

John
    Friend Class HtmlParser
 
        'This is the main logic for the PeopleSoft scrape. Parse through the HTML looking for the required data.
        ' Look for:
        ' <label for='VOUCHER_PACKSLIP_NO' class='PSEDITBOXLABEL' >Integration ID</label>
        ' </td>
        ' </tr>
        ' <tr>
        ' <td height='51' colspan='2'></td>
        ' <td colspan='4'  valign='top' align='LEFT'>
        ' <span class='PSEDITBOX_DISPONLY' >000000010001</span>
 
        Public Shared Function FindIntegrationID(ByVal pageText As String) As String
            Dim document As New mshtml.HTMLDocument()
 
            ' The HTML document class doesn't have the correct 'write' method, 
            ' so use the IHTMLDocument2 interface.
            CType(document, mshtml.IHTMLDocument2).write(pageText)
 
            Dim labelFound As Boolean = False
            Dim count As Integer = 0
 
            For Each frame As mshtml.HTMLIFrame In document.getElementsByName("frame")
 
                If frame.name = "TargetContent" = True Then
 
                    MsgBox("Found it")
 
                    For Each table As mshtml.HTMLTable In document.getElementsByTagName("table")
 
                        If table.className = "PSPAGECONTAINER" Then
 
                            ' Loop through all the table cells (<td> elements)
                            For Each cell As mshtml.HTMLTableCell In table.getElementsByTagName("td")
 
                                If Not labelFound Then
                                    ' Look for the last 'Integration ID' label, since we need the lowest
                                    ' level <td> element.
                                    For Each label As mshtml.HTMLLabelElement In cell.getElementsByTagName("label")
                                        If label.innerText = "Integration ID" Then
                                            If count < 4 Then
                                                count += 1
                                            Else
                                                ' The right <label> was found, so quit looking, and start looking for the next <span>.
                                                labelFound = True
                                                Exit For
                                            End If
                                        End If
                                    Next label
                                Else
                                    ' The next element that has a <span> child should be the 'Integration ID' value.
                                    For Each span As mshtml.HTMLSpanElement In cell.getElementsByTagName("span")
                                        Return Format(Val(span.innerText), "000000000000")
                                    Next span
                                End If
 
                            Next cell
 
                        End If
 
 
                    Next table
 
                    Return ""
                End If
            Next frame
 
        End Function
 
    End Class


Regular-Entry-Main.txt
SOLUTION
Bob,

The code below (from your example) gets the document, so we don't need to get it here, right? The src will always change. I am just looking to identify the right frame, because right now the program does read the page by the "If url.Contains("Regular") = True Then" statement.

So really, aren't I just trying to identify the right frame so that the parser can continue?

John
Private Sub hotKey_Pressed(ByVal sender As System.Object, ByVal e As HotKeyPressedEventArgs)
 
        'JWM Debug Comment:From desktop for testing, remove after final test
        'Dim html As String = My.Computer.FileSystem.ReadAllText(Environment.GetFolderPath(Environment.SpecialFolder.DesktopDirectory) & "\PSFT-Regular-Entry-Frame-HTML.htm")
        'Dim id As String = HtmlParser.FindIntegrationID(html)
 
        Dim windowList As Dictionary(Of String, String) = InternetExplorerWindows.GetWindows()
 
        For Each url As String In windowList.Keys
            'This is what we should be able to grab on their workstation(s)
            'Put this back after testing on dev!
            'If url.Contains("ENTER_VOUCHER_INFORMATION") = True Then
            If url.Contains("Regular") = True Then
                Dim html As String = windowList(url)
                Dim id As String = HtmlParser.FindIntegrationID(html)
                'MsgBox(url)
                txtPropertyValue.Text = id
            Else
                'Really cant do anything because their will always be an else if multiple windows are open.
            End If
        Next url
 
        'construct the URL.
        Dim sURL As String
        sURL = constructURL()
 
        'launch IE after constructing URL
        OpenBrowser(sURL)
 
        'UnComment the following section in order for the copy to clipboard to work.
        'This change was made for the Five Star Quality Care Project
        'This would typically copy the value from the Clipboard to the text box, thus passing to the URL
        'txtPropertyValue.Text = CType(iData.GetData(DataFormats.Text), String)
 
    End Sub
 
    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        Me.Close()
    End Sub
 
    Friend Class HtmlParser
 
        'This is the main logic for the PeopleSoft scrape. Parse through the HTML looking for the required data.
        ' Look for:
        ' <label for='VOUCHER_PACKSLIP_NO' class='PSEDITBOXLABEL' >Integration ID</label>
        ' </td>
        ' </tr>
        ' <tr>
        ' <td height='51' colspan='2'></td>
        ' <td colspan='4'  valign='top' align='LEFT'>
        ' <span class='PSEDITBOX_DISPONLY' >000000010001</span>
 
        Public Shared Function FindIntegrationID(ByVal pageText As String) As String
            Dim document As New mshtml.HTMLDocument()
 
            ' The HTML document class doesn't have the correct 'write' method, 
            ' so use the IHTMLDocument2 interface.
            CType(document, mshtml.IHTMLDocument2).write(pageText)
 
            Dim labelFound As Boolean = False
            Dim count As Integer = 0
 
            For Each frame As mshtml.HTMLIFrame In document.getElementsByName("frame")
 
                'Testing frame collection
                If frame.name = "TargetContent" = True Then
 
                    MsgBox("Found it")
 
                    For Each table As mshtml.HTMLTable In document.getElementsByTagName("table")
 
                        If table.className = "PSPAGECONTAINER" Then
 
                            ' Loop through all the table cells (<td> elements)
                            For Each cell As mshtml.HTMLTableCell In table.getElementsByTagName("td")
 
                                If Not labelFound Then
                                    ' Look for the last 'Integration ID' label, since we need the lowest
                                    ' level <td> element.
                                    For Each label As mshtml.HTMLLabelElement In cell.getElementsByTagName("label")
                                        If label.innerText = "Integration ID" Then
                                            If count < 4 Then
                                                count += 1
                                            Else
                                                ' The right <label> was found, so quit looking, and start looking for the next <span>.
                                                labelFound = True
                                                Exit For
                                            End If
                                        End If
                                    Next label
                                Else
                                    ' The next element that has a <span> child should be the 'Integration ID' value.
                                    For Each span As mshtml.HTMLSpanElement In cell.getElementsByTagName("span")
                                        Return Format(Val(span.innerText), "000000000000")
                                    Next span
                                End If
 
                            Next cell
 
                        End If
 
 
                    Next table
 
                    Return ""
                End If
            Next frame
 
        End Function
 
    End Class


John,

You attached Regular-Entry-Main.txt, which had the frameset/frame HTML, but that HTML didn't have the content for the frames.  

Does the 'pageText' that you get from this line contain the 'IntegrationID' HTML?

    Public Shared Function FindIntegrationID(ByVal pageText As String) As String

Bob
Oh, sorry. Yes, the frameset contains the local reference for the "TargetContent" frame. This is the frame that has the Integration ID, the page that worked effectively before testing on their systems.

So, the parser works if that page is loaded singly, but when launched within the frame, no dice.

So, with the Regular-Entry-Main.txt (HTML version) loaded, I need to gain access to the document in the "TargetContent" frame to read into your FindIntegrationID routine. This would be a realistic test that would validate that this would work in their live environment.

The attached zip has all of the files referenced in the frameset. (The page that loads in "TargetContent" is the one page I gave you when you originally wrote FindIntegrationID.)

Thanks,
John

Regular-Voucher.txt
John,

What I see, and what I tried to explain, is that you get the <frame> element that you need, and then use the 'src' to read the referenced file, and then parse the HTML that you get from the file.

Bob


Dim html As String = My.Computer.FileSystem.ReadAllText(sourceFile)


Bob,

I understand, but as you can probably tell since you have been working with me, sometimes I need a little direction :)

Since the examples above are local files, and the real application will be a "live" website, I know I need to identify the right frame in the code attached below.

So I just need to figure out how best to identify the frame in this code block while keeping the rest intact, which works perfectly without frames.

Thanks,

John
    Private Sub hotKey_Pressed(ByVal sender As System.Object, ByVal e As HotKeyPressedEventArgs)
 
        Dim windowList As Dictionary(Of String, String) = InternetExplorerWindows.GetWindows()
 
        For Each url As String In windowList.Keys
            'This is what we should be able to grab on their workstation(s)
            'Put this back after testing on dev!
            'If url.Contains("ENTER_VOUCHER_INFORMATION") = True Then
            If url.Contains("Regular") = True Then
                Dim html As String = windowList(url)
                Dim id As String = HtmlParser.FindIntegrationID(html)
                'MsgBox(url)
                txtPropertyValue.Text = id
            Else
                'Really cant do anything because their will always be an else if multiple windows are open.
            End If
        Next url
 
        'construct the URL.
        Dim sURL As String
        sURL = constructURL()
 
        'launch IE after constructing URL
        OpenBrowser(sURL)
 
        'UnComment the following section in order for the copy to clipboard to work.
        'This would typically copy the value from the Clipboard to the text box, thus passing to the URL
        'txtPropertyValue.Text = CType(iData.GetData(DataFormats.Text), String)
 
    End Sub


If you need a little help, then I need "real live" HTML to show you what you need to do.

Bob
Agreed, but that's the problem. The client will not allow access to the PeopleSoft system. That is one of the main reasons this project has been such a burden. Every time you make something work in development, you deploy to a "live" site and you run into more gotchas.

Live site or not, I think what you have already provided me is 99% of what is needed. I know in the "live" site the frame I need is "TargetContent". So, unless I am wrong, I really just need to add this to the "windowList" logic you provided me to get the right frame. Then we use the FindIntegrationID logic "as is".

So, just like the local HTML example I gave you, we have a good replica, but we just cannot rely on the "src",  we need to rely on the frame "TargetContent" and we can read  the "src" from:

For Each url As String In windowList.Keys
            'This is what we should be able to grab on their workstation(s)
            'Put this back after testing on dev!
            'If url.Contains("ENTER_VOUCHER_INFORMATION") = True Then
            If url.Contains("Regular") = True Then
                Dim html As String = windowList(url)

Somehow reusing the "Dim html As String = windowList(url), but reading the "TargetContent" frame instead of the URL.

Am I oversimplifying this or missing something? My problem is that I cannot verify that I am getting the frame, but I know I am still getting the URL from the "If url.Contains("Regular") = True Then" check, because if I set it to something that I do not have running, nothing happens.

Thanks,
John :)
John,

>>The client will not allow access to the Peoplesoft system
Do you need access to Peoplesoft to see the HTML from the View Source menu in the browser?

Bob
Yes, but I have the "actual" source if you want it. The problem is that it is dynamic, so it may change every time it is accessed. See attached; rename to .zip.

Thanks,

John
tst.txt
John,

After all that, if the 'src' attributes are dynamic, then I am not sure you can get the HTML that you are looking for.

The URL that you need to reference would look something like this:

Bob
src="http://209.135.62.132/psc/B88FFIV/EMPLOYEE/ERP/c/ENTER_VOUCHER_INFORMATION.VCHR_EXPRESS.GBL?Folder=MYFAVORITES&amp;PortalActualURL=http%3a%2f%2f209.135.62.132%2fpsc%2fB88FFIV%2fEMPLOYEE%2fERP%2fc%2fENTER_VOUCHER_INFORMATION.VCHR_EXPRESS.GBL%3fFolder%3dMYFAVORITES&amp;PortalContentURL=http%3a%2f%2f209.135.62.132%2fpsc%2fB88FFIV%2fEMPLOYEE%2fERP%2fc%2fENTER_VOUCHER_INFORMATION.VCHR_EXPRESS.GBL&amp;PortalContentProvider=ERP&amp;PortalCRefLabel=Regular%20Entry&amp;PortalRegistryName=EMPLOYEE&amp;PortalServletURI=http%3a%2f%2f209.135.62.132%2fpsp%2fB88FFIV%2f&amp;PortalURI=http%3a%2f%2f209.135.62.132%2fpsc%2fB88FFIV%2f&amp;PortalHostNode=ERP&amp;NoCrumbs=yes"


ASKER CERTIFIED SOLUTION
Most helpful and knowledgeable person I have ever worked with!
Bob,

I cannot thank you enough for your help and patience in helping me with this. You are obviously very talented and experienced, but most of all, you are a good person and I thank you!

John
John,

I learned a lot from this question, too. I started with a simple class that enumerates Internet Explorer windows, and wound up with a class that can get HTML text both from those same windows and from their internal IFrame elements.

Bob
Bob,

Found a small issue: if the user has more than one document open with an Integration ID (multiple browser windows), they get an unhandled exception at the "list.Add(window.LocationName, windowDocument.body.outerHTML)" statement. What's the easiest way to limit this list so that only one result is returned?

Thank you,

John
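
A likely cause, assuming list is a Dictionary(Of String, String): Dictionary.Add throws an ArgumentException when the key already exists, and two windows showing the same page share the same LocationName. The sketch below uses made-up window titles and HTML in place of the real browser data. It keeps the first window seen for each title. Whether the first one is the right one is a separate question, which is why filtering on the active window beforehand may be the better fix.

```vbnet
Imports System
Imports System.Collections.Generic

Module DuplicateKeySketch

    Sub Main()
        Dim list As New Dictionary(Of String, String)()

        ' Stand-in for two browser windows open on the same page plus one other.
        Dim titles() As String = {"Regular Entry", "Regular Entry", "Other"}

        For i As Integer = 0 To titles.Length - 1
            Dim html As String = "<html>window " & i.ToString() & "</html>"

            ' Add would throw on the second "Regular Entry"; check for the key first.
            If Not list.ContainsKey(titles(i)) Then
                list.Add(titles(i), html)
            End If
        Next

        Console.WriteLine(list.Count) ' prints 2
    End Sub

End Module
```

Using the indexer instead (list(key) = html) would also avoid the exception, but it keeps the last window rather than the first.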