Link to home
Start Free TrialLog in
Avatar of kbay808
kbay808Flag for United States of America

asked on

How to create a vba code to extract data by the Element ID from a website with iframes into an excel spreadsheet?

On the attached screen shot, I’m trying to capture all of the records in the activity log located in the red box.  The number of records will vary, but my goal is to get the first 2 records so that I can add 98 more.  ltlbearand3 created the attached IE-Debug file for a similar question that I posted.  I used that to capture all of the target data.  I also attached 2 copies of the debug files.  There are 2 files because there are 2 scenarios that causing the iframes to be different and I need it to work for both.

To make it easier to locate the target data, I changed all of the target data to match the labels listed in the example below and highlighted all of them green.  I also changed the font on the entire activity log to red.

Example of desired output:
(Date/Time 1), (Type 1), (Operator 1), (Description 1)
(Date/Time 2), (Type 2), (Operator 2), (Description 2)
EE-DebugFile-Closed-Sanitized-.docx
EE-DebugFile-Open-Sanitized-.docx
IE-Debug.xlsm
Screen-Shot-of-Target-Data.JPG
Avatar of ltlbearand3
ltlbearand3
Flag of United States of America image

kbay8808,

Since you keep going down this path of extraction from the Web site, I would like to take time while we make this work to help you be able to code this yourself.  Take a look at the code and let me know that parts that you don't understand.  We want to get you to the point where you can do this work yourself and only ask when you are stuck.  

As a general overview of what the code is doing:

The website is using frames to display the data (this is what was throwing me off way back on the first question).  Some of the frames have content within that frame including more frames.  

Since you are working from a web site that is already open, we are using the Shell Object to find the open IE window.  Then we use the Internet Explorer object to access the window.
    Dim objIE As SHDocVw.InternetExplorer
    Dim objShell As SHDocVw.ShellWindows 

Open in new window


We loop through all the open windows to find one that has an HTMLDocument type (that is the web browser)
    For Each objIE In objShell
        If TypeOf objIE.Document Is MSHTML.HTMLDocument Then

Open in new window

           
Then we get the main frames HTML code and put it into an HTMLDocument and we check the URL to make sure we found the one we want:
            Set objIEDoc = objIE.Document
            If Not (objIEDoc Is Nothing) Then
                If objIEDoc.URL = strURL Then

Open in new window

               
               
The debug file then outputs the HTML code and looks for all frames in that document.  If it finds a frame, it prints any HTML associate with that frame and looks for more frames within that frame.  This helped me and IrogSinta (Ron) identify that the first frame was the one with the action.  This is also how we figured out the frame names were changing.  

Once we have that, now we check to make sure we have frames in the document.  Based on the debug logs, it appears the frame names change but the order does not.  Therefore, we grab the first frame from the main page:
        ' Make Sure we had some frames returned
        If objIEDoc.frames.Length > 0 Then
            ' Get the Document from the 1st Frame
            Set varFrame = objIEDoc.frames.Item(0)
            Set objFrameDoc = varFrame.Document

Open in new window


Now we need the second frame from within the first frame:
            ' Get the Detail Frame (Sub Frame #2)
            If objFrameDoc.frames.Length > 1 Then
                Set varFrame2 = objFrameDoc.frames.Item(1)
                Set objFrameDoc2 = varFrame2.Document

Open in new window

               
Now we use the getElementbyID property to find the values you want in the HTML code.  For example here is one value you wanted to find:
<DIV id=X165_2Border style="BORDER-TOP-STYLE: none; BORDER-LEFT-STYLE: none; BORDER-BOTTOM-STYLE: none; BORDER-RIGHT-STYLE: none; WIDTH: 100%; BACKGROUND-COLOR: transparent" MyAttrib="EditBorder"><SPAN onclick="handleOnClick(this, event);" tabIndex=-1 onfocus="handleOnFocus(this, event);" onblur="&#10;    applyToSameControl(this);&#10;  " id=X165_2 class=FormatTableElementReadonly ButtonId="Column5" dvdVar="" scripttype="text" scType="Text" dataType="string" ref="references/model[@id='number.vj']/instance[2]/operator" name="references/model[@id='number.vj']/instance[2]/operator" onchange="handleOnChange(this, event);" readonly="readonly">(Operator 2)</SPAN></DIV></A></DIV></TD>

Open in new window


Notice the start of the section with <DIV which markes the beginning of this piece.  Then it give the ID X165_2.  That is what we need to look for to pull out the value of (Operator 2)

In this new section notice each row has and ID that ends in a _X where x is the row number.  Therefore we can loop through until we don't find any more.


Try this attached code.  Please post the debug if it does not work.  Thanks.
IE-Pull-Data.Q28632832.v1.xlsm
Avatar of kbay808

ASKER

I’m sorry for the long delay.  I was called out of town.  I appreciate you breaking the code down for me so that I can learn how to do it myself.  There are a couple of things that I don’t quite get though.  In the attached debug file, how can I tell which frame/sub frame the target data is in?  And second, the code is looking for the element ID without the last digit after the "_". As it loops does it just look for all element IDs that start the same?  

For example: X161_1, X161_2, X161_3, …
Set objFound = objFrameDoc2.getElementById("X161_" & intLineNum)

Open in new window


I highlighted the target data in green and change the entire range where the target data is found to red.
Debug-File--Sanitized-.docx
Avatar of kbay808

ASKER

I noticed that the target data was set to be entered into Range(“A”).  Since this is going to be in a loop, how do I modify the code to enter the first set of data in Range A5:D5 and then each of the following onto it's own new line?
I had it set to just keep going down column A - although it looks like I forgot to increment the counter to keep moving down - sorry.  We can use the Offset method to move to the right one column for each row.  Give this sheet a try.
IE-Pull-Data.Q28632832.v2.xlsm
Avatar of kbay808

ASKER

I forgot to mention that I got an error when running the code.  Please see the attached screen shots for details.  I bypassed the error and ran the code and here is the debug file from it.  No data was pulled.  When I look at the debug file, for what I can tell, it looks like the target data is in frame 0.  When I look at the code it looks like it’s looking for the target data in frame 1.  I assume that I’m reading it wrong.  Could you please point out what I’m missing?
Error.JPG
Debug.JPG
Debug-File--Sanitized-.docx
OK.  That error was in creating the debug file.  The data looks different than the files you posted earlier.  Give this one a try.
IE-Pull-Data.Q28632832.v3.xlsm
Avatar of kbay808

ASKER

After reading your last post, I realized that the difference in the 2 files was the background tab which changes things.  I need to either account for the different background tabs or have the users set it up their view the same way every time.  That got me thinking.  What I think I need is a tool that I will extract an element ID from every frame.  That way, I can just keep adding searches to the one that worked.  I attached a spreadsheet with my idea.  What do you thing?
Find-iFrame-for-Element-ID-Tool.xlsm
Your posted file did not contain any code that I could see.  I tried to code for the two different source pages that showed in the debug files.  See how this works:
IE-Pull-Data.Q28632832.v4.xlsm
Avatar of kbay808

ASKER

There are a couple of issues.  First, it’s not finding the description.  The other issue is that it’s only finding the 5th row on target data.  I attached I screen shot and highlighted the row that was captured.  I also attached a copy of the IE Pull Data file with the results.
Debug-File--Sanitized-.docx
IE-Pull-Data.Q28632832.v4.xlsm
ScreenShot.jpg
Avatar of kbay808

ASKER

I entered the code into the tool with my limited knowledge of writing code.  With that said, it does not work.  What I’m attempting to do is create a master tool that will look in every frame within the website for the same element ID.  My hope is to avoid having to create and search debug files to see what frame I need to be looking in.  With all that you have learned about our website, is a tool like this possible?  If it is then it would make me self-dependent.
Find-iFrame-for-Element-ID-Tool.xlsm
Ok.  I found the problem related to only showing row 5 and the missing the description (They were related).  Try this code.

For your second post, I don't understand the question.  If your wanting a tool that can scour each page and find information related to a particular ticket, I don't believe that can be done.  The element ID you need for each page will very likely be different.
IE-Pull-Data.Q28632832.v5.xlsm
Avatar of kbay808

ASKER

It worked great!!!  Is there any way to get it to start on row 4?  

As for my master tool idea, I finally had a revelation.  You have taught me how to use the debug tool that you created for me to search for the element ID.  The issue that I had was that I did not understand the iFrames.  I did not know how many frames that there were or could be.  After vigorously testing the 2 codes that you have created for me, I figured out that those two codes are all that I need.  I was able to find the test target data after I figured out what the correct element ID was for that particular scenario.  With that said, I updated the code for my master tool, but I could use a little help.
 
The Macro_1 works, but it freezes after it pulls the data.  I attached a screen shot of the debug screen.  Also, when the right element ID is used for wrong macro, I get an error.  How can it make it ignore the errors and keep running the macro?  And finally, how can I get the code to look for the element ID from cell “B6” instead of manually changing the code?
Find-iFrame-for-Element-ID-Tool.xlsm
Debug-screen-shot.JPG
OK.  I think I see what you are doing.  I have updated the code.  First I put all the code into one module.  I like to keep things together.  Also, it is usually best to put code that is reused in multiple procedures in a separate procedure or function.  I have done that with the web page access code.

Now, you just need to enter the proper element ID in the spreadsheet.  Understand that for the ID on the second macro, you will need to put the row number in the ID.  For example for row 4 you would put X165_4 or X169_4_1.
Find-iFrame-for-Element-ID-Tool.v4.xlsm
Avatar of kbay808

ASKER

The tool worked perfect!!!  Thank you very much.  

As for the starting on the 4th row, I meant that I wanted the output to start with the first row of the target data, but entered starting in cells A4:D4 instead of A1:D1.  In my tool that I will use this code in has a title and headers that take up the first 3 rows in the spreadsheet.
ASKER CERTIFIED SOLUTION
Avatar of ltlbearand3
ltlbearand3
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of kbay808

ASKER

Awesome!!!  Thank you very much for all of your help.