Simulate web browser - NON VISUAL and simulate cookies & session vars.

I need to find a solution to simulate a web browser.  I know how to use the web browser control in a GUI environment, but can it be used in a console application?  We have the following goals:

1) store session variables
2) need to do form posts
3) need to support (simulate?) cookies

A client has asked us to log into his site and navigate through to grab one of his files automatically, and this can't be done with a graphical interface and sendkeys.  I'm sure there are some evil hackers that don't want to show their face but this is completely legit, and I need to find a way around this ASAP!

How could I do this?  

Points awarded when all 3 goals have been met.
bswiftly asked:
 
raterus commented:
Look at this program,
http://www.siliconwold.com/interceptor/help.htm

There are others like it, but you can view the HTTP Headers of Requests/Responses.  Probably a little easier than using a packet sniffer.  Sorry this software requires you buy it, you may be able to find something similar that is free.

Anyway, once you get a picture of how the requests/responses look, it is up to you to program this into the HttpWebRequest/HttpWebResponse objects, something like so...
       
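        ' Requires: Imports System.IO and Imports System.Net
        Dim siteUri As New Uri("http://www.mysite.com")
        Dim rq As HttpWebRequest = CType(WebRequest.Create(siteUri), HttpWebRequest)

        ' CookieContainer is Nothing by default, so create one before adding cookies.
        ' Passing the Uri supplies the cookie's domain, which CookieContainer requires.
        rq.CookieContainer = New CookieContainer()
        rq.CookieContainer.Add(siteUri, New Cookie("myCookie", "some value"))
        rq.Headers.Add("SomeHeaderName", "SomeHeaderValue")

        Dim rsp As HttpWebResponse = CType(rq.GetResponse(), HttpWebResponse)

        ' Cookies the server set on this response (the collection may be empty).
        Dim c As Cookie = Nothing
        If rsp.Cookies.Count > 0 Then c = rsp.Cookies(0)

        ' Read the whole page body into a string.
        Dim reader As New StreamReader(rsp.GetResponseStream())
        Dim html As String = reader.ReadToEnd()
        reader.Close()
        rsp.Close()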

        'Ok, next page, same process, remember to return cookies the server returned to you!

I don't know what more you need to know to do this; it seems pretty cut and dried to me!
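
As a rough sketch of "returning the cookies" between pages: assigning the same CookieContainer to every request lets .NET store the cookies from each response and send them back on later requests to the same site. A server-side session is normally tied to one of those cookies, so session variables should need no extra client code. The URLs, the module name, and the FetchPage helper below are hypothetical placeholders, not anything from the client's site.

```vbnet
Imports System
Imports System.IO
Imports System.Net

Module SessionSketch
    ' One container shared by every request acts as the browser's cookie jar.
    Private ReadOnly Jar As New CookieContainer()

    ' Hypothetical helper: GET a page, letting the shared jar send and collect cookies.
    Function FetchPage(ByVal url As String) As String
        Dim rq As HttpWebRequest = CType(WebRequest.Create(url), HttpWebRequest)
        rq.CookieContainer = Jar
        Using rsp As HttpWebResponse = CType(rq.GetResponse(), HttpWebResponse)
            Using reader As New StreamReader(rsp.GetResponseStream())
                Return reader.ReadToEnd()
            End Using
        End Using
    End Function

    Sub Main()
        ' Placeholder URLs; replace them with the pages seen in the captured traffic.
        Dim first As String = FetchPage("http://www.mysite.com/start")
        Dim second As String = FetchPage("http://www.mysite.com/next")

        ' Show how many cookies the jar now holds for the site.
        Dim held As CookieCollection = Jar.GetCookies(New Uri("http://www.mysite.com/"))
        Console.WriteLine("Cookies held for the site: " & held.Count)
    End Sub
End Module
```

Because the jar is shared, the second request automatically carries whatever cookies the first response set.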

raterus commented:
This is crazy. Why can't your client just put the file up on an FTP site to download? Or at least give you a link that takes you directly to the page the client sees?

If you had to do it, you could use the WebClient/WebRequest classes to simulate this, though I don't know what you are talking about as far as session variables, the client never sees those...

bswiftly (author) commented:
Our client has a subscription to a website.

He does the same login every day, goes through a lengthy configuration to set up a query, and that generates a new file with current data every day.

We want to automate his process without it popping up on his screen.

It's not the client's file, so he can't put it on an FTP site. He doesn't have the file; that's the problem. We have to get it for him from his subscription site.

1) login,  
2) navigate to page
3) form post to generate file  
4) download file.
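
For step 4, one possible sketch, assuming the file is served as the body of an ordinary HTTP response: stream the response to disk instead of reading it into a string. The URL, the output path, and the SaveResponse helper are hypothetical, and the CookieContainer passed in is assumed to already hold the login cookies from the earlier steps.

```vbnet
Imports System
Imports System.IO
Imports System.Net

Module DownloadSketch
    ' Hypothetical helper: GET a URL with the session's cookies and save the body to a file.
    Sub SaveResponse(ByVal url As String, ByVal path As String, ByVal jar As CookieContainer)
        Dim rq As HttpWebRequest = CType(WebRequest.Create(url), HttpWebRequest)
        rq.CookieContainer = jar
        Using rsp As HttpWebResponse = CType(rq.GetResponse(), HttpWebResponse)
            Using source As Stream = rsp.GetResponseStream()
                Using target As New FileStream(path, FileMode.Create, FileAccess.Write)
                    ' Copy in 8 KB chunks so large files are never held fully in memory.
                    Dim buffer(8191) As Byte
                    Dim count As Integer = source.Read(buffer, 0, buffer.Length)
                    While count > 0
                        target.Write(buffer, 0, count)
                        count = source.Read(buffer, 0, buffer.Length)
                    End While
                End Using
            End Using
        End Using
    End Sub

    Sub Main()
        ' In a real run this jar would be the one used for the login and form post.
        Dim jar As New CookieContainer()
        SaveResponse("http://www.mysite.com/files/report", "report.dat", jar) ' placeholder URL and path
    End Sub
End Module
```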

raterus commented:
Well, you can do it using WebClient / WebRequest. GOOD LUCK though; it's not going to be pretty, or easy for that matter. You are basically writing your own web browser, so you'll have to keep track of all the possible HTTP status codes the web server is going to spit back at you, and fool with cookies and anything else. And if that's not daunting enough, the whole process will be ruined the minute the site changes the layout/structure used to get the files!
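
A minimal sketch of keeping track of the status codes, assuming HttpWebRequest is used: successful responses expose StatusCode directly, while 4xx/5xx responses surface as a WebException whose Response property carries the server's reply. The URL and the GetStatus helper are placeholders.

```vbnet
Imports System
Imports System.Net

Module StatusSketch
    ' Hypothetical helper: return the HTTP status code for a URL, including 4xx/5xx.
    Function GetStatus(ByVal url As String) As HttpStatusCode
        Dim rq As HttpWebRequest = CType(WebRequest.Create(url), HttpWebRequest)
        rq.AllowAutoRedirect = True ' redirects are followed automatically (this is the default)
        Try
            Using rsp As HttpWebResponse = CType(rq.GetResponse(), HttpWebResponse)
                Return rsp.StatusCode
            End Using
        Catch ex As WebException
            ' GetResponse throws for error statuses; the response, if any, rides on the exception.
            Dim failed As HttpWebResponse = TryCast(ex.Response, HttpWebResponse)
            If failed Is Nothing Then Throw ' no HTTP response at all (DNS failure, timeout, ...)
            Dim status As HttpStatusCode = failed.StatusCode
            failed.Close()
            Return status
        End Try
    End Function

    Sub Main()
        Dim code As HttpStatusCode = GetStatus("http://www.mysite.com/") ' placeholder URL
        Console.WriteLine(CInt(code).ToString() & " " & code.ToString())
    End Sub
End Module
```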

You may want to try and reverse engineer the website and see if you can't find a "hack", where you can make a request to the exact page you need, given you have the correct cookies set, that will let you in, this will be much easier, but likely isn't possible.

If the client/this site have enough of a relationship, they may be able to allow you to get these files easier, if that is possible, I'd really go this route.

bswiftly (author) commented:
i agree with you raterus but we don't want to have to ask our client to get in touch with the webmaster because he probably won't like to hear one of his subscribers wants a program to interact with his website.

I guess I'm not sure if you had any info for me..  I pretty much knew everything you said, but I'm not sure how to attack it.  How do I fool this with cookies?  How do i set a session variable?  How would I track HTTP codes.. not sure how I would even accept them..   webclient.navigate,... and then listen for events, but what events?  

I need a lot more help than that!  But I agree, it is going to be daunting.   I can offer a lot more points if that's what's holding anyone back from offering a solution.


raterus commented:
Probably the easiest way, if I were going to do this, is to get to the core of how a client and server interact through HTTP.  All it boils down to is HTTP requests and HTTP responses, which are all plain text.  Familiar with packet sniffers?  Just to give you an idea of what the requests are going to have to look like, you can sniff what your browser is sending to and receiving from the server.  Sorry, I don't know a better way to get at this info other than a packet sniffer, and even then I've never really done it; I just know it can be done!

Let's say you get this, and you have a clear picture of the HTTP requests you will have to send and the HTTP responses you will receive. Now you have to mimic this.  If these requests are static, where nothing on the server has changed, you may be able to just ignore the responses from the server and make these requests, in order, leading up to the eventual download of the file.  This likely won't work if they are using a modern server technology, so you'll have to interpret the responses from the server in order to figure out what to do next.

Once you have a clear picture of the Request you'll have to make, it should be pretty easy to plug values into the WebRequest Class.  Cookies, posted values, and request codes.  Any of this helping?

--Michael
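
To illustrate plugging posted values into the request, here is a hedged sketch of a form post with HttpWebRequest. The URL, the field names (username, password), and the PostForm helper are assumptions for illustration; the real names would come from the captured browser request.

```vbnet
Imports System
Imports System.IO
Imports System.Net
Imports System.Text

Module PostSketch
    ' Hypothetical helper: POST URL-encoded form fields and return the response body.
    Function PostForm(ByVal url As String, ByVal formData As String, ByVal jar As CookieContainer) As String
        Dim body As Byte() = Encoding.UTF8.GetBytes(formData)
        Dim rq As HttpWebRequest = CType(WebRequest.Create(url), HttpWebRequest)
        rq.Method = "POST"
        rq.ContentType = "application/x-www-form-urlencoded"
        rq.ContentLength = body.Length
        rq.CookieContainer = jar ' cookies set by the response land in this jar

        ' Write the encoded fields as the request body.
        Using outStream As Stream = rq.GetRequestStream()
            outStream.Write(body, 0, body.Length)
        End Using

        Using rsp As HttpWebResponse = CType(rq.GetResponse(), HttpWebResponse)
            Using reader As New StreamReader(rsp.GetResponseStream())
                Return reader.ReadToEnd()
            End Using
        End Using
    End Function

    Sub Main()
        Dim jar As New CookieContainer()
        ' Placeholder field names and values; escape each value before joining with "&".
        Dim fields As String = "username=" & Uri.EscapeDataString("someUser") & _
                               "&password=" & Uri.EscapeDataString("somePassword")
        Dim html As String = PostForm("http://www.mysite.com/login", fields, jar)
        Console.WriteLine(html.Length)
    End Sub
End Module
```

Reusing the same jar for the requests that follow the login is what keeps the server treating them as one session.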

bswiftly (author) commented:
it is, I guess.. just will take a few headaches to figure it out.  Looking at msdn and playing around with a webrequest class ..  hard to figure out how to get started using these classes..  never done packet sniffing either.  

bswiftly (author) commented:
Well, I'm going to close this question if no one has any suggestions, or at least some sample code or some other way of doing it!

raterus commented:
What more would you like?  If you have fooled with the WebRequest/WebResponse then you are already past the point of samples.  I likely can't search google any better than you can to find something more specific.  Sometimes you get to the point where you're trudging on new turf, and samples just don't exist because nobody has done it!  I don't see why you can't do this using the methods I've described.

bswiftly (author) commented:
Well, I'm still no further than where I began. I don't know how to use a packet sniffer, or how to begin developing a browser.   I need some sample code for a starting point, I guess.

1) how to save cookies
2) session variables
3) how to navigate from page to page without any visual elements.


without any of those i'm at square one!

bswiftly (author) commented:
k thanks

raterus commented:
If you use Firefox at all, here is an extension to view the request/response headers.  It does the same as that other program, and it's free :-)
http://livehttpheaders.mozdev.org/
Question has a verified solution.
