paygo

HOW TO SCREEN SCRAPE IFRAME with ASP / ASP.NET / Can't get IFRAME Contents of a non local domain - Thanks

Here's what I've heard so far-

1) I can't get to the HTML contents of an IFRAME that is pointing to a non-local URL such as src=http://www.yahoo.com.

2) I can't use ActiveXObject("Msxml2.XMLHTTP.4.0") because a) access is denied and b) a security risk message appears.

so.

Is there a way to screen scrape the contents of an IFRAME using asp.net?

Here's an example Iframe
<body>
<form id="form1">
<IFRAME src="http://www.google.com" name="myIframeId"></IFRAME>
</form>
</body>

Thanks
fritz_the_blank

Any luck with this?

Function GetHTML(strURL)
      Dim objXMLHTTP, strReturn
      Set objXMLHTTP = Server.CreateObject("MSXML2.ServerXMLHTTP")
      objXMLHTTP.Open "GET", strURL, False
      objXMLHTTP.Send
      strReturn = objXMLHTTP.responseText
      Set objXMLHTTP = Nothing
      GetHTML = strReturn
End Function
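
A minimal usage sketch for the function above, assuming it lives in a classic ASP page. The URL is just the one from the question, and strHTML is a name made up for illustration. Server.HTMLEncode is applied so the fetched markup shows up as text instead of being rendered by the browser:

```vbscript
' Hypothetical call site: fetch the page the IFRAME points at, server-side
Dim strHTML
strHTML = GetHTML("http://www.google.com")

' Write the markup out as escaped text so it can be inspected
Response.Write Server.HTMLEncode(strHTML)
```

Because the request is made by the web server rather than by script in the browser, the cross-domain restriction on reading the IFRAME's contents does not come into play.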
Two problems:

1) It's a tricky problem if you are trying to scrape a dynamic page that is embedded as an IFRAME, because it will always scrape the original URL (i.e. www.google.com) and not what has been searched.

2) You need to have MSXML2.XMLHTTP installed on your server to use it. If it isn't, you can't use it.

But here is how you scrape a page.

'trap runtime errors so the Err.Number check below is reachable
On Error Resume Next

'create an instance of the ServerXMLHTTP component
Set objXML = Server.CreateObject("MSXML2.ServerXMLHTTP")

'build up the url and store it in strURL variable
strURL = "http://www.google.com"

'open a synchronous GET request for strURL
objXML.Open "GET", strURL, False, "", ""

'send the request
objXML.Send

'we have no errors
If Err.Number = 0 Then
      'and the server returned 200 OK
      If objXML.Status = 200 Then
            'store all of the downloaded data in strFileContent
            strFileContent = objXML.responseText
      Else
            'the request did not succeed; display a message
            Response.Write "Incorrect URL"
      End If
Else
      'if we do have an error display the description of the error
      Response.Write Err.Description
End If

'clear up
Set objXML = Nothing
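
On point 1 above, one possible way around it is to build the full URL of the dynamic page yourself and fetch that instead of the fixed address. This is only a sketch: the "/search" path and the "q" parameter are assumptions about how the framed page takes its input, and strQuery is a made-up variable, so check the real page's URL format first.

```vbscript
' Assumption: the framed page reads its search term from a "q"
' query-string parameter; adjust to the real page's URL format.
Dim strQuery, strURL
strQuery = "screen scraping"   ' hypothetical search term

' Server.URLEncode escapes spaces and other unsafe characters
strURL = "http://www.google.com/search?q=" & Server.URLEncode(strQuery)

' strURL can now be passed to objXML.Open in place of the fixed address
```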
So, it sounds like we are on the same page, then?

FtB
paygo

ASKER

thx - I'm writing/testing some code now - will keep you posted shortly
paygo

ASKER

Hi, I'm not seeing 'ServerXMLHTTP' under MSXML2.**** in IntelliSense.

Do you know which version of MSXML*.dll contains this object?
I am not sure, but just for yucks, try my code with a page that you know will work (just about any public page will do) to see if you get a response.
paygo

ASKER

Do you have this code in C# Scripting somewhere?

Thanks
I am afraid that I don't. I use this in ASP classic with VBScript.

FtB
paygo

ASKER

Thanks Fritz,

I tried this in VB and received the following message.

'Let' and 'Set' assignment statements are no longer supported.

Is there a vb work around?
ASKER CERTIFIED SOLUTION
fritz_the_blank
paygo

ASKER

Thanks Fritz - you answered my question about this.

I'm going to post a second question regarding the C# scripting.

I am running ASP.NET and feel comfortable with C# from the object side.

Good time for me to learn about C# scripting - can't be too much different than what I already know

Regards

Steve
Best of luck. If I come across a C# example, I'll post it.

FtB
Sorry about the double post Ftb,

It took me a while to write the code and then I realized that you had posted earlier.

:) SD
Not a problem--I'll take a look at yours to see what I can learn from it.

FtB