gilad_no

URLOpenBlockingStream with sessions

How can I use URLOpenBlockingStream when connecting to a site that requires a session?

I know how to download a single page, but how can I use it when I need multiple pages?

I want to log in to a site, navigate to a specific page and parse the data from there. I don't want to use WinInet.

For example:

LPSTREAM pStream = NULL;
URLOpenBlockingStream(NULL,"http://www.mypage.com/login.asp",&pStream,0,pCallback);

... - Post username/password

URLOpenBlockingStream(NULL,"http://www.mypage.com/main.asp",&pStream,0,pCallback);

URLOpenBlockingStream(NULL,"http://www.mypage.com/page1.asm",&pStream,0,pCallback);

pStream->Read(...);   // Read the page

URLOpenBlockingStream(NULL,"http://www.mypage.com/logout.asp",&pStream,0,pCallback);
DanRollins

What is preventing you from doing that?

Of course, URLOpenBlockingStream is, well... BLOCKING, which means that the call does not return until it has finished reading all of the data.

Is it your desire to read several pages simultaneously?  If so, and if you want to use URLOpenBlockingStream, then you will need to create several threads of execution.  Do you need help with that?

-- Dan
gilad_no

ASKER

This wasn't my question :)

Suppose I need to download some data from a page which requires a login. I need the navigation mechanism to keep the session alive until I log out from the site. Does URLOpenBlockingStream support sessions?

Suppose I want to create a simple agent to go to Yahoo!, log in with username/password, read the messages and log out. I don't want to use wininet to do so. How can I implement it using URLOpenBlockingStream?
There are two varieties of authentication.  In one, your browser will show some HTML input boxes.  In the other, you will see a popup grey box requesting your username and password.

In order to use URLOpenBlockingStream with the latter, you need to provide an IAuthenticate handler for your IBindStatusCallback interface.  This is somewhat complicated in C++, but here's some VB code that does it.
   http://www.domaindlx.com/e_morcillo/scripts/cod/ie.asp
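
For background on the "grey box" case, here is a rough, portable C++ sketch of what the credentials turn into on the wire, assuming the server asks for the HTTP Basic scheme (it may well use Digest or NTLM instead, in which case this does not apply). The names base64_encode and basic_auth_header are made up for illustration; they are not part of urlmon or WinInet, and with an IAuthenticate handler the system builds this header for you.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Base64-encode a byte string (standard alphabet, '=' padding).
std::string base64_encode(const std::string& in) {
    static const char kAlphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve(((in.size() + 2) / 3) * 4);
    std::size_t i = 0;
    // Full 3-byte groups become 4 output characters.
    while (i + 2 < in.size()) {
        unsigned n = (static_cast<unsigned char>(in[i]) << 16) |
                     (static_cast<unsigned char>(in[i + 1]) << 8) |
                     static_cast<unsigned char>(in[i + 2]);
        out += kAlphabet[(n >> 18) & 63];
        out += kAlphabet[(n >> 12) & 63];
        out += kAlphabet[(n >> 6) & 63];
        out += kAlphabet[n & 63];
        i += 3;
    }
    // A trailing group of 1 or 2 bytes is padded with '='.
    std::size_t rem = in.size() - i;
    if (rem == 1) {
        unsigned n = static_cast<unsigned char>(in[i]) << 16;
        out += kAlphabet[(n >> 18) & 63];
        out += kAlphabet[(n >> 12) & 63];
        out += "==";
    } else if (rem == 2) {
        unsigned n = (static_cast<unsigned char>(in[i]) << 16) |
                     (static_cast<unsigned char>(in[i + 1]) << 8);
        out += kAlphabet[(n >> 18) & 63];
        out += kAlphabet[(n >> 12) & 63];
        out += kAlphabet[(n >> 6) & 63];
        out += '=';
    }
    return out;
}

// Hypothetical helper: the request header a client sends for Basic auth.
std::string basic_auth_header(const std::string& user, const std::string& password) {
    return "Authorization: Basic " + base64_encode(user + ":" + password);
}
```

Note that Basic auth is only an encoding, not encryption, so over plain HTTP the password is effectively sent in the clear.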

If the login is regular HTML input boxes, then odds are, you will get a cookie that will authenticate you for the rest of the session.

If you are dead set on using C++, then you have some work ahead of you.  However, if you just want to download some files from a password-protected site, then you might try this simple technique.  Create a text file with these contents:

=-=-=-=-=-=-=-=-=-=-=-=- start of file
<html>
<script>
function DownloadNow( sFullURL, sFile )
{
      goXmlHttp= new ActiveXObject("Microsoft.XMLHTTP");
      gnXmlHttpStatus= 0;
      goXmlHttp.open("GET", sFullURL, false, "username", "password" );
      goXmlHttp.send("");

     //-------------------------- here is the trick to saving binary data!
     var adTypeBinary = 1;
     var adSaveCreateOverwrite = 2;
     var adModeReadWrite= 3;
     
     var stream = new ActiveXObject("adodb.stream");
     stream.type = adTypeBinary;
     stream.mode = adModeReadWrite;
     stream.open();
     stream.write( goXmlHttp.responseBody );
     stream.savetofile( sFile, adSaveCreateOverwrite );
     stream.close();

}
function DoClickDownload()
{
     DownloadNow( "http://somesite.com/somefile.jpg", "c:\\temp\\somefile.jpg" );
     alert("done!");
}
</script>

<input type=button value="START!" onClick="DoClickDownload();"><br>

</html>
=-=-=-=-=-=-=-=-=-=-=-=- end of file

Save the file as DoDownload.HTA and then double-click it.

-- Dan
I still don't understand :(

The site doesn't use regular authentication, so I can't use IAuthenticate.

My goal is to write a client which can simulate web browser navigation. The user can then create scripts to perform navigation and data extraction. For example:

Navigate "www.yahoo.com/main.asp"

var frm as Form
frm.username="myusername"
frm.password="mypassword"
frm.language="en"

Post "www.yahoo.com/login.asp", frm

Extract "<TD>(.*?)</TD><TD>(.*?)</TD>"     // Extract using regular expressions

Navigate "www.yahoo.com/logout.asp"
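
To make the Extract step of the sample script concrete, here is a minimal C++ sketch using std::regex with the same pattern. The function name extract_cell_pairs is hypothetical, and the case-insensitive flag is my own assumption so that lowercase <td> tags also match. It only works on a page that has already been downloaded into a string, by whatever API.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Returns every (first cell, second cell) pair matched in the HTML text.
std::vector<std::pair<std::string, std::string>> extract_cell_pairs(const std::string& html) {
    // Same pattern as the script's Extract line; icase so <td> also matches.
    // Note: '.' does not match a newline here, so each cell pair must sit on one line.
    static const std::regex pattern("<TD>(.*?)</TD><TD>(.*?)</TD>", std::regex::icase);

    std::vector<std::pair<std::string, std::string>> rows;
    for (std::sregex_iterator it(html.begin(), html.end(), pattern), end; it != end; ++it) {
        // Capture group 1 is the first cell, group 2 the second.
        rows.emplace_back((*it)[1].str(), (*it)[2].str());
    }
    return rows;
}
```

Calling it on a fragment such as "<TD>Alice</TD><TD>3</TD>" should give one pair ("Alice", "3"). Regular expressions are fragile against real-world HTML (attributes on the tags, nested markup), so treat this as a starting point only.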



This was just a sample. Currently, I am using wininet to do the navigation. I am creating a CInternetSession object and navigating to the desired page. When there is data to post, I use CHttpFile::SendRequest to post the data.

I want to change my code to use URLOpenBlockingStream instead. I don't just need to download a single web page (I could use URLDownloadToFile for that). I want to build a mechanism that performs a full session navigation.

Thanks for the help
I think you are asking about a situation in which a site expects the user to type name and password into some HTML input boxes and click submit (as opposed to a site that causes a grey box to pop up).

In that case, assuming that you can fill in the form and submit it, I think you are done.  That sort of authentication usually works by placing a cookie on the client machine, and that cookie will normally be sent with each subsequent request.  If not, then you should drop back to using WinInet functions such as InternetOpenUrl, which do all of that automatically.
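
To illustrate the cookie mechanism described above, here is a simplified C++ sketch of what a client does with a session cookie: remember the name=value pair from each Set-Cookie response header and send it back in a Cookie request header. The CookieJar class is hypothetical and deliberately ignores domain, path, expiry and header-name case; the system's own cookie handling is far more complete.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical minimal cookie store: name -> value, no domain/path/expiry handling.
class CookieJar {
public:
    // Record one "Set-Cookie: name=value; attr..." response header line.
    // Simplification: the header name is matched case-sensitively.
    void store(const std::string& set_cookie_header) {
        const std::string prefix = "Set-Cookie:";
        if (set_cookie_header.compare(0, prefix.size(), prefix) != 0) return;
        std::string rest = set_cookie_header.substr(prefix.size());
        // Keep only the name=value pair before the first ';'.
        std::string pair = rest.substr(0, rest.find(';'));
        std::size_t eq = pair.find('=');
        if (eq == std::string::npos) return;
        cookies_[trim(pair.substr(0, eq))] = trim(pair.substr(eq + 1));
    }

    // Build the "Cookie:" request header to send with the next request
    // (empty string when no cookies are stored).
    std::string request_header() const {
        std::string out;
        for (const auto& kv : cookies_) {
            if (!out.empty()) out += "; ";
            out += kv.first + "=" + kv.second;
        }
        return out.empty() ? out : "Cookie: " + out;
    }

    // Forget all cookies, e.g. after visiting the site's logout page.
    void clear() { cookies_.clear(); }

private:
    static std::string trim(const std::string& s) {
        std::size_t b = s.find_first_not_of(" \t");
        if (b == std::string::npos) return "";
        std::size_t e = s.find_last_not_of(" \t");
        return s.substr(b, e - b + 1);
    }
    std::map<std::string, std::string> cookies_;
};
```

A session cookie in this sense is simply one with no expiry date, so it lives only as long as the client keeps it in memory.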

Also, take a look at the Yahoo Mail login page.  Note that it has an OnSubmit handler in Javascript to encode some stuff.  I doubt that just downloading that page data and sending the form value will work.

Is this a theoretical question or is there a specific problem that you are encountering?

-- Dan
This was my question :)

If IE places a cookie, I think everything should work, but how can I remove it? Is it a session cookie? If so, how can I log out (and remove the cookie)?
That is "roll yer own" authentication and the way the host handles it would vary from site to site.  Why not try a few experiments and see what you learn?

-- Dan
I've tried, but it doesn't work. I can't post the data. I am trying to post it using GetBindInfo (I allocate using GlobalAlloc) but it does not work. Using wininet I've managed to post my data.
I can't see anything that lets URLOpenBlockingStream send form data -- or any other HTTP headers.  With the WinInet functions, these are well documented.

Do you have a plan in that regard?

-- Dan
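
Whichever API ends up sending the request, the form data discussed above normally travels as an application/x-www-form-urlencoded body. Here is a minimal C++ sketch of building that body; url_encode and build_form_body are hypothetical helper names, and the field names in the usage note are taken from the sample script earlier in this thread.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Percent-encode one form field name or value.
// Letters, digits and "-_.*" pass through, space becomes '+', the rest becomes %XX.
std::string url_encode(const std::string& s) {
    static const char kHex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : s) {
        if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') ||
            c == '-' || c == '_' || c == '.' || c == '*') {
            out += static_cast<char>(c);
        } else if (c == ' ') {
            out += '+';
        } else {
            out += '%';
            out += kHex[c >> 4];
            out += kHex[c & 15];
        }
    }
    return out;
}

// Join name=value pairs with '&' to form the POST body.
std::string build_form_body(const std::vector<std::pair<std::string, std::string>>& fields) {
    std::string body;
    for (const auto& f : fields) {
        if (!body.empty()) body += '&';
        body += url_encode(f.first) + "=" + url_encode(f.second);
    }
    return body;
}
```

For the fields username=myusername, password=mypassword and language=en this yields "username=myusername&password=mypassword&language=en". The request would also need a "Content-Type: application/x-www-form-urlencoded" header so the server parses the body as form data.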
So if I need to post data, my only option is to use wininet?
ASKER CERTIFIED SOLUTION
DanRollins