_Mark_
asked on
Get URL of image displayed in IFRAME?
Hi,
I asked this question a while ago and was surprised that nobody had an answer. I'll try my luck again, this time with a slightly rephrased description of the problem. Hopefully somebody can point me in the right direction.
Many webpages place ads in IFRAMEs, usually by displaying an image inside the IFRAME. Left-clicking the image opens the linked page. Right-clicking it lets you either save the image (Save Image As) or view its properties. The property window reveals the full address of the image, for example:
https://img.web.de/_Muster/bookmark/credit.gif
MY QUESTION: IS IT POSSIBLE TO PROGRAMMATICALLY RETRIEVE THE SAME INFORMATION – THE ADDRESS OF THE IMAGE?
The source code of the webpage only contains an src to the document linked in the IFRAME but not the address of the image, for example:
<IFRAME src="/iframe.ng/site=freemail&category=login&special=top&adsize=468x60&content=webde" width="640" height="70" scrolling="no" marginwidth="0" marginheight="0" frameborder="0">
</IFRAME>
Accessing the content of the document linked in the IFRAME is not possible because of cross-frame security.
Here is an example site: http://freemail.web.de
The second element from the top is an IFRAME, containing five different ads.
I fooled around with this, without much success:
Private Sub setEvent(doc As HTMLDocument)
    Dim sElement As IHTMLElement
    Dim testforError As String
    Dim cnt As Long
    On Error Resume Next
    ' MsgBox "setEvent " & doc.location.href
    For Each sElement In doc.All
        testforError = sElement.tagName
        If Err.Number = 0 Then
            With sElement
                If .tagName = "IMG" Or .tagName = "A" Then ….
                If sElement.tagName = "FRAME" Then ….
                '==================================================
                'This approach only provides the src to the document linked in the IFRAME
                'How to get the address of the image displayed in the IFRAME? Obviously
                'Internet Explorer knows how to get it, since it shows it in the property window
                If sElement.tagName = "IFRAME" Then
                    ' doc.All is zero-based, so the last valid index is length - 1
                    For cnt = 0 To doc.All.length - 1
                        If doc.All(cnt).sourceIndex = sElement.sourceIndex Then MsgBox doc.All(cnt).src
                    Next
                End If
                '==================================================
                If Err.Number <> 0 Then
                    Debug.Print Err.Description
                    Err.Clear
                End If
            End With
        Else
            ' Reset the error so the remaining elements are not skipped
            Err.Clear
        End If
    Next
End Sub
Thanks for your help!
Ok ... so in your subroutine, is the HTML document that is passed in the page that contains the banner ad (or rather the IFRAME that points to the banner ad), or is it the actual banner ad page?
I think I just confused myself ...
Example, let's say we have 2 pages ...
MyPage.htm - which contains an IFRAME to a banner
BannerPage.htm - which contains the A tag and IMG tag which you want to find the source
If you can parse through MyPage.htm you should be able to get the SRC of the banner IFRAME and then open BannerPage.htm and parse through that page ... once you get in BannerPage.htm you should be golden, right? Or am I missing something?
Let me know ...
ASKER
Thanks for your input, guys.
COBOLdinosaur:
Maybe this is a naive assumption, but does this mean that Internet Explorer can access the information in spite of cross-domain security (it's in the property window of the image), but I can't?
dfu23:
You are right. It would defeat the purpose of my app, though, if I had to leave the current page and navigate to BannerPage.htm (in order to be allowed to parse its content). Opening it in a second (hidden) browser window is not very tempting either. Another problem: even if I did navigate to BannerPage.htm, how would I know which one of the parsed images (the page might have more than one) is the one that was displayed in the IFRAME on MyPage.htm?
Thanks.
Just out of curiosity - why do you need to access the URL of the image in an iframe, as opposed to the image itself?
I know you can do a scrape from a URL in the background (something like this):
// This uses types from the System.Net and System.IO namespaces
string url = "http://a.banner.ad/path/to/page";
string result = "";
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
request.Method = "GET";
request.ContentType = "application/x-www-form-urlencoded";
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
// The using block disposes the reader, so no explicit Close() is needed
using (StreamReader reader = new StreamReader(response.GetResponseStream()))
{
    result = reader.ReadToEnd();
}
Now you should have the source of BannerPage.htm. As for the best way to get what you need out of it, I'm not really sure.
ASKER
seanpowell:
I want to be able to programmatically save the image to disk. This particular IFRAME image doesn't get stored in the IE cache, so I need to download it. Therefore I need the URL.
dfu23:
I am coding this app in VB6 and, to be honest, I haven't delved into the .NET Framework yet. I guess the above can't be translated into VB6, can it?
The iframe is implemented as a window object. When you right-click on the image, the focus goes to the instance of window in the iframe, so the browser is not going cross-domain in that case. However, when you try to reference the iframe window from the window in the main page, you are going cross-domain.
If you are trying to automate the saving of the image, you don't need a browser at all: you can do an HTTP GET server to server and save the overhead and security hassle.
Cd&
I remember doing a "screen scrape" in classic ASP (VBScript) but don't remember the object that I used to get it to work ... maybe the XMLHTTP?
Sorry, I'm not very familiar with VB.
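For what it's worth, the browserless HTTP GET suggested above can be approximated in VB6 with the XMLHTTP object mentioned here. The following is only a minimal sketch under assumptions: SaveUrlToFile is a made-up helper name, it relies on MSXML and ADO being installed, and it needs the image URL as input, so it does not by itself solve the problem of finding that URL. Also keep in mind that a fresh request to an ad server may return a different banner than the one currently on screen.

```vb
Option Explicit

' Hypothetical helper (not from the thread): downloads url and writes the
' raw response bytes to filePath. Returns True if the server answered 200.
Public Function SaveUrlToFile(ByVal url As String, ByVal filePath As String) As Boolean
    Dim http As Object
    Dim stm As Object

    On Error GoTo Failed
    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "GET", url, False          ' False = synchronous request
    http.send

    If http.Status = 200 Then
        Set stm = CreateObject("ADODB.Stream")
        stm.Type = 1                     ' 1 = adTypeBinary
        stm.Open
        stm.Write http.responseBody      ' response body as a byte array
        stm.SaveToFile filePath, 2       ' 2 = adSaveCreateOverWrite
        stm.Close
        SaveUrlToFile = True
    End If
Failed:
    ' Any run-time error (no connection, bad path, ...) leaves the result False
End Function
```

A call would look like `SaveUrlToFile "https://img.web.de/_Muster/bookmark/credit.gif", "C:\temp\credit.gif"`, where the target path is just an example.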
ASKER
Sorry for the delay. In my part of the world (Austria) it was time to get some sleep ...
Interesting information on IE, COBOLdinosaur, thanks.
I have a small Browser Helper Object that allows "one click" saving of any image on a page. It works well, even on pages with nested frames, but I feel it is incomplete as long as it doesn't also work for images displayed in iframes.
The example page I mentioned at the beginning of this thread contains an iframe, which itself contains five iframes. Each of these five has a different image assigned to it. If I click on the iframe in the main page, the focus is on this window object and from there (I assume) it is passed on to the underlying, nested iframe, which is again a window object. So I am now at the level of this sub-iframe, and IE can access its properties and provide the image URL.
It seems to be very difficult to do the same in code. This is how I imagine it:
To find out which nested iframe was clicked, I would have to store the position of the mouse over the top iframe. Then, I guess, I would have to navigate to the address this main iframe links to. There I should be able to locate the desired sub-iframe, using the mouse coordinates. I extract the src of the sub-iframe, navigate to it, and parse it for .tagName="IMG". Considering that there might be more than one nesting level, this seems to become quite impossible.
Hopefully there is a more realistic way to do this?
Thanks.
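The nesting part of the procedure described above can be sketched without a browser: fetch the page source, pull out the src values of IMG and IFRAME tags, and recurse into each IFRAME up to a fixed depth. This is a rough VB6 sketch under stated assumptions, not a tested solution. All helper names (ExtractSrc, ResolveUrl, FetchText, CollectImages) are invented for illustration, the regular expression is not a real HTML parser, and frames or images written by script will not be found. It also does not answer the question of which image was the one clicked.

```vb
Option Explicit

' WinINet API that combines a base URL with a relative one.
Private Declare Function InternetCombineUrl Lib "wininet.dll" Alias "InternetCombineUrlA" ( _
    ByVal lpszBaseUrl As String, ByVal lpszRelativeUrl As String, _
    ByVal lpszBuffer As String, ByRef lpdwBufferLength As Long, _
    ByVal dwFlags As Long) As Long

' Returns relUrl as an absolute URL based on baseUrl, or "" on failure.
Private Function ResolveUrl(ByVal baseUrl As String, ByVal relUrl As String) As String
    Dim buf As String, n As Long
    buf = String$(2084, vbNullChar)
    n = Len(buf)
    If InternetCombineUrl(baseUrl, relUrl, buf, n, 0) <> 0 Then
        ResolveUrl = Left$(buf, InStr(buf, vbNullChar) - 1)
    End If
End Function

' Returns the page source as text, or "" if the request fails.
Private Function FetchText(ByVal url As String) As String
    Dim http As Object
    On Error GoTo Failed
    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "GET", url, False
    http.send
    If http.Status = 200 Then FetchText = http.responseText
Failed:
End Function

' Returns the src values of all <tagName ...> tags found in html.
Public Function ExtractSrc(ByVal html As String, ByVal tagName As String) As Collection
    Dim re As Object, m As Object
    Set ExtractSrc = New Collection
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.IgnoreCase = True
    re.Pattern = "<" & tagName & "\b[^>]*\ssrc\s*=\s*[""']?([^""'\s>]+)"
    For Each m In re.Execute(html)
        ExtractSrc.Add m.SubMatches(0)
    Next
End Function

' Adds absolute image URLs from pageUrl, and from the pages its IFRAMEs
' point to (down to maxDepth nesting levels), to the found collection.
Public Sub CollectImages(ByVal pageUrl As String, ByVal maxDepth As Long, ByVal found As Collection)
    Dim html As String, src As Variant, absUrl As String
    html = FetchText(pageUrl)
    For Each src In ExtractSrc(html, "img")
        absUrl = ResolveUrl(pageUrl, CStr(src))
        If Len(absUrl) > 0 Then found.Add absUrl
    Next
    If maxDepth > 0 Then
        For Each src In ExtractSrc(html, "iframe")
            absUrl = ResolveUrl(pageUrl, CStr(src))
            If Len(absUrl) > 0 Then CollectImages absUrl, maxDepth - 1, found
        Next
    End If
End Sub
```

To try it, create a `New Collection`, call `CollectImages "http://freemail.web.de/", 2, found`, and loop over the collection. The depth limit is there so that frames pointing at each other cannot cause endless recursion.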
ASKER CERTIFIED SOLUTION
This solution is only available to members of Experts Exchange.
ASKER
What I meant by "navigate" was to actually load the cross-domain page in a separate window. In that case access would, as you said, become possible. This wouldn't be an acceptable solution anyway, though.
I accept your answer and thank you for your help.
One more thought:
Under "IE security settings" - "Miscellaneous", there are the options "Access data sources across domains" and "Navigate sub-frames across different domains". I thought setting both to "Allow" would make cross-domain access possible. It did not work, though. Do these settings only work with scripts embedded in the webpage and not with BHOs?
Those user settings allow the use of cross-domain forms and redirection into sub-frames without security alerts. Neither involves scripting; they only govern what the browser is allowed to do to access resources, and they do not make anything additional available to the scripting engine.
BTW, the purpose of the security restriction is that without it, a malicious web site could present a form from, say, Amazon and then use scripting to steal credit card information that the user thought they were sending only to Amazon.
Thanks for the A. :^)
Cd&
Dim Document As IHTMLDocument2
Dim pFramesCollection As IHTMLFramesCollection2
Dim pDisp As IDispatch
Dim IWindow2 As IHTMLWindow2
Dim i As Integer
Dim VarIndex As OleVariant
Dim FrameDocument As IHTMLDocument2
' NOTE: Assumes Document is set to the document you want to access the frames from
Set pFramesCollection = Document.Frames
If Not pFramesCollection Is Nothing Then
    For i = 0 To pFramesCollection.Length - 1
        VarIndex = i
        Set pDisp = pFramesCollection.Item(VarIndex)
        pDisp.QueryInterface IHTMLWindow2, IWindow2
        If Not IWindow2 Is Nothing Then
            If Not IWindow2.Document Is Nothing Then
                Set FrameDocument = IWindow2.Document
            End If
        End If
    Next i
End If
ASKER
Thanks turbo1212.
Looks interesting. I'll test it soon.
ASKER
I had to change the code a little to make it work with VB (the changed lines are commented):
Dim pFramesCol As IHTMLFramesCollection2
Dim pDisp As Object                     ' changed
Dim IWindow2 As IHTMLWindow2
Dim i As Integer
Dim varIndex As Variant                 ' changed
Dim frameDoc As IHTMLDocument2
Set pFramesCol = doc.frames
If Not pFramesCol Is Nothing Then
    For i = 0 To pFramesCol.length - 1
        varIndex = i
        Set pDisp = pFramesCol.Item(varIndex)
        ' pDisp.QueryInterface IHTMLWindow2, IWindow2   ' changed: commented out
        Set IWindow2 = pDisp            ' changed
        If Not IWindow2 Is Nothing Then
            If Not IWindow2.Document Is Nothing Then    ' <== Error: Access denied!
                Set frameDoc = IWindow2.Document
            End If
        End If
    Next i
End If
Unfortunately, on pages with cross-domain iframes the result was the usual: Access denied.
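Since the "Access denied" error is raised only when the frame's document is touched, one way to at least degrade gracefully is to trap that error per frame and fall back to the IFRAME's src, which the parent document does expose. The following is a hedged VB6 sketch and not a fix for the restriction itself: ListFrameTargets is an invented name, doc is assumed to be the parent page's document, and late binding (As Object) is used so that no particular MSHTML interface is assumed. The src value may be relative to the parent page.

```vb
Option Explicit

' Sketch: for every IFRAME in a document the BHO can read, try the frame's
' own document first and fall back to the src URL if access is denied.
Public Sub ListFrameTargets(ByVal doc As Object)
    Dim frames As Object
    Dim el As Object
    Dim inner As Object
    Dim i As Long

    Set frames = doc.getElementsByTagName("IFRAME")
    For i = 0 To frames.length - 1
        Set el = frames.Item(i)
        Set inner = Nothing

        On Error Resume Next        ' the next line fails for cross-domain frames
        Set inner = el.contentWindow.document
        On Error GoTo 0

        If inner Is Nothing Then
            ' No DOM access: the only thing left is the address of the framed page
            Debug.Print "Cross-domain frame, fetch separately: " & el.src
        Else
            Debug.Print "Readable frame with " & inner.images.length & " image(s)"
        End If
    Next i
End Sub
```

For frames that land in the first branch, the framed page would still have to be requested and parsed separately, with the caveat already raised in this thread that the result may not match the banner on screen.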