URLOpenBlockingStream with sessions

How can I use URLOpenBlockingStream when connecting to a site that requires a session?

I know how to download a single page, but how can I use it when I need multiple pages?

I want to log in to a site, navigate to a specific page and parse the data from there. I don't want to use WinInet.

For example:

LPSTREAM pStream = NULL;
URLOpenBlockingStream(NULL, "http://www.mypage.com/login.asp", &pStream, 0, pCallback);

// ... post username/password here

URLOpenBlockingStream(NULL, "http://www.mypage.com/main.asp", &pStream, 0, pCallback);

URLOpenBlockingStream(NULL, "http://www.mypage.com/page1.asm", &pStream, 0, pCallback);

pStream->Read(...);   // Read the page

URLOpenBlockingStream(NULL, "http://www.mypage.com/logout.asp", &pStream, 0, pCallback);
gilad_no asked:
DanRollins commented:
What is preventing you from doing that?

Of course a URLOpenBlockingStream is, well...., BLOCKING -- which means that the call does not return until it is finished reading all of the data.

Is it your desire to read several pages simultaneously?  If so, and if you want to use URLOpenBlockingStream, then you will need to create several threads of execution.  Do you need help with that?

-- Dan

gilad_no (author) commented:
This wasn't my question :)

Suppose I need to download some data from a page which requires login. I need the navigation mechanism to keep the session alive until I log out of the site. Does URLOpenBlockingStream support sessions?

Suppose I want to create a simple agent that goes to Yahoo!, logs in with a username/password, reads the messages and logs out. I don't want to use WinInet to do so. How can I implement it using URLOpenBlockingStream?

DanRollins commented:
There are two varieties of authentication.  In one, your browser will show some HTML input boxes.  In the other, you will see a popup grey box requesting your username and password.

In order to use URLOpenBlockingStream with the latter, you need to provide an IAuthenticate handler on the object that implements your IBindStatusCallback interface.  This is somewhat complicated in C++, but here's some VB code that does it.
   http://www.domaindlx.com/e_morcillo/scripts/cod/ie.asp

If the login is regular HTML input boxes, then odds are, you will get a cookie that will authenticate you for the rest of the session.

If you are dead set on using C++, then you have some work ahead of you.  However, if you just want to download some files from a password-protected site, then you might try this simple technique.  Create a text file with these contents:

=-=-=-=-=-=-=-=-=-=-=-=- start of file
<html>
<script>
function DownloadNow( sFullURL, sFile )
{
      goXmlHttp= new ActiveXObject("Microsoft.XMLHTTP");
      gnXmlHttpStatus= 0;
      goXmlHttp.open("GET", sFullURL, false, "username", "password" );
      goXmlHttp.send("");

     //-------------------------- here is the trick to saving binary data!
     var adTypeBinary = 1;
     var adSaveCreateOverwrite = 2;
     var adModeReadWrite= 3;
     
     var stream = new ActiveXObject("adodb.stream");
     stream.type = adTypeBinary;
     stream.mode = adModeReadWrite;
     stream.open();
     stream.write( goXmlHttp.responseBody );
     stream.savetofile( sFile, adSaveCreateOverwrite );
     stream.close();

}
function DoClickDownload()
{
     DownloadNow( "http://somesite.com/somefile.jpg", "c:\\temp\\somefile.jpg" );
     alert("done!");
}
</script>

<input type=button value="START!" onClick="DoClickDownload();"><br>

</html>
=-=-=-=-=-=-=-=-=-=-=-=- end of file

Save the file as DoDownload.HTA and then double-click it.

-- Dan
gilad_no (author) commented:
I still don't understand :(

The site doesn't use regular authentication, so I can't use IAuthenticate.

My goal is to write a client which can simulate a web browser navigation. The user can then create scripts to perform navigation and data extraction. For example:

Navigate "www.yahoo.com/main.asp"

var frm as Form
frm.username="myusername"
frm.password="mypassword"
frm.language="en"

Post "www.yahoo.com/login.asp", frm

Extract "<TD>(.*?)</TD><TD>(.*?)</TD>"     // Extract using regular expressions

Navigate "www.yahoo.com/logout.asp"



This was just a sample. Currently, I am using wininet to do the navigation. I am creating a CInternetSession object and navigating to the desired page. When there is data to post, I use CHttpFile::SendRequest to post the data.

I want to change my code to use URLOpenBlockingStream. I don't just need to download a web page; for that I could use URLDownloadToFile. I want to build a mechanism that performs a full session navigation.
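
For the Extract step in the sample script above, here is a minimal sketch in portable C++ using std::regex with the same pattern. It only illustrates the extraction and is independent of how the page was downloaded; the function name extractCellPairs is hypothetical.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Collect every pair of adjacent <TD> cells from already-downloaded HTML.
// Assumption: tags are upper-case and have no attributes, exactly as in the
// pattern from the sample script. Note that '.' does not match a newline in
// the default ECMAScript grammar, so each pair must sit on one line.
std::vector<std::pair<std::string, std::string>>
extractCellPairs(const std::string& html) {
    static const std::regex kCells("<TD>(.*?)</TD><TD>(.*?)</TD>");
    std::vector<std::pair<std::string, std::string>> rows;
    for (std::sregex_iterator it(html.begin(), html.end(), kCells), end;
         it != end; ++it) {
        // Capture group 1 is the first cell, group 2 the second.
        rows.emplace_back((*it)[1].str(), (*it)[2].str());
    }
    return rows;
}
```

Calling extractCellPairs on the text read from the stream would return one entry per matched row, in document order.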

Thanks for the help

DanRollins commented:
I think you are asking about a situation in which a site expects the user to type name and password into some HTML input boxes and click submit (as opposed to a site that causes a grey box to pop up).  

In that case, and assuming that you can fill in the form, and submit it, then I think you are done.  That sort of authentication usually works by placing a cookie on the client machine and that cookie will normally be sent with each subsequent request.  If not, then you should drop back to using WinInet functions such as InternetOpenUrl which do all of that automatically.
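
To make the cookie round trip concrete, here is a minimal, portable C++ sketch of the bookkeeping involved: keep the name=value part of a Set-Cookie response header and send it back in a Cookie request header. This is only an illustration of the mechanism, not something you normally write yourself when the HTTP layer handles cookies for you; the helper names cookiePairFromSetCookie and buildCookieHeader are hypothetical and not part of any Windows API.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Return the "name=value" part of a Set-Cookie header value, dropping
// attributes such as Path, Expires and HttpOnly that follow the first ';'.
std::string cookiePairFromSetCookie(const std::string& setCookieValue) {
    const std::string pair = setCookieValue.substr(0, setCookieValue.find(';'));
    // Trim surrounding spaces and tabs.
    const char* kWhitespace = " \t";
    const std::string::size_type first = pair.find_first_not_of(kWhitespace);
    if (first == std::string::npos) {
        return std::string();
    }
    const std::string::size_type last = pair.find_last_not_of(kWhitespace);
    return pair.substr(first, last - first + 1);
}

// Build the request header that returns the stored cookies to the server.
std::string buildCookieHeader(const std::vector<std::string>& pairs) {
    std::string header = "Cookie: ";
    for (std::size_t i = 0; i < pairs.size(); ++i) {
        if (i > 0) {
            header += "; ";
        }
        header += pairs[i];
    }
    return header;
}
```

A session cookie is typically one sent without an expiry attribute, so it is expected to disappear when the client session ends.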

Also, take a look at the Yahoo Mail login page.  Note that it has an OnSubmit handler in JavaScript to encode some stuff.  I doubt that just downloading that page data and sending the form values will work.

Is this a theoretical question or is there a specific problem that you are encountering?

-- Dan

gilad_no (author) commented:
This was my question :)

If IE places a cookie, I think everything should work, but how can I remove it? Is it a session cookie? If so, how can I log out (and remove the cookie)?

DanRollins commented:
That is "roll yer own" authentication and the way the host handles it would vary from site to site.  Why not try a few experiments and see what you learn?

-- Dan

gilad_no (author) commented:
I've tried, but it doesn't work. I can't post the data. I am trying to post it from IBindStatusCallback::GetBindInfo (I allocate the buffer using GlobalAlloc), but it does not work. Using WinInet I've managed to post my data.
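
Whichever API ends up sending the request, an HTML form post normally carries its fields as an application/x-www-form-urlencoded body. Here is a minimal, portable C++ sketch of building that body from the fields in the sample script. The names formEncode and buildFormBody are hypothetical, and the sketch does not cover how the buffer is handed to the HTTP layer.

```cpp
#include <cctype>
#include <string>
#include <utility>
#include <vector>

// Encode one field name or value: letters, digits and "-_.*" pass through,
// a space becomes '+', and every other byte becomes %XX.
std::string formEncode(const std::string& text) {
    static const char kHex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char ch : text) {
        if (std::isalnum(ch) || ch == '-' || ch == '_' || ch == '.' || ch == '*') {
            out += static_cast<char>(ch);
        } else if (ch == ' ') {
            out += '+';
        } else {
            out += '%';
            out += kHex[ch >> 4];
            out += kHex[ch & 0x0F];
        }
    }
    return out;
}

// Join the encoded fields as name=value pairs separated by '&'.
std::string buildFormBody(
        const std::vector<std::pair<std::string, std::string>>& fields) {
    std::string body;
    for (const auto& field : fields) {
        if (!body.empty()) {
            body += '&';
        }
        body += formEncode(field.first) + '=' + formEncode(field.second);
    }
    return body;
}
```

The request would also need a Content-Type header of application/x-www-form-urlencoded for the server to parse the body as form data.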

DanRollins commented:
I can't see anything that lets URLOpenBlockingStream send form data -- or any other HTTP headers.  With the WinInet functions, these are well documented.  

Do you have a plan in that regard?

-- Dan

gilad_no (author) commented:
So if I need to post data, my only option is to use WinInet?

DanRollins commented:
I don't know, but that is how I've done it.  The other way I've done it is via the XMLHttpRequest object, as shown above.

-- Dan