bswiftly

Simulate web browser - NON VISUAL and simulate cookies & session vars.

I need to find a solution to simulate a web browser.  I know how to use the web browser control in a GUI environment, but can it be used in a console application?  We have the following goals:

1) store session variables
2) need to do form posts
3) need to support (simulate?) cookies

A client has asked us to log into his site and navigate through to grab one of his files automatically, and this can't be done with a graphical interface and sendkeys.  I'm sure there are some evil hackers that don't want to show their face but this is completely legit, and I need to find a way around this ASAP!

How could I do this?  

Points awarded when all 3 goals have been met.
raterus

This is crazy. Why can't your client just put the file up on an FTP site to download?  Or at least give you a link that takes you directly to the page the client has?

If you have to do it, you could use the WebClient/WebRequest classes to simulate this. I don't know what you mean by session variables, though: those live on the server, and the client never sees them...
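
To illustrate the point about session variables: the only thing a client ever holds is a cookie carrying a session ID, which it sends back on each request. Below is a minimal sketch in Python (standard library only, used here purely for illustration; the thread itself is about .NET). The `Set-Cookie` header, the cookie name `SESSIONID`, and its value are all made up.

```python
from http.cookies import SimpleCookie

# Hypothetical response header; the cookie name and value are invented.
set_cookie_header = "SESSIONID=abc123xyz; Path=/; HttpOnly"

# Parse the header the way a client would.
cookie = SimpleCookie()
cookie.load(set_cookie_header)


def cookie_header_for_next_request(parsed):
    """Build the value of the Cookie header to send back to the server."""
    return "; ".join(f"{name}={morsel.value}" for name, morsel in parsed.items())


# The session ID is all the client knows; the variables stay on the server.
session_id = cookie["SESSIONID"].value
next_header = cookie_header_for_next_request(cookie)
```

So "storing session variables" on the client side really means keeping this cookie and returning it with every later request.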
bswiftly

ASKER

Our client has a subscription to a website.

He does the same login every day, goes through a lengthy configuration to set up a query, and that generates a new file with current data every day.

We want to automate his process without it popping up on his screen.

It's not the client's file, so he can't put it on an FTP site. He doesn't have the file; that's the problem. We have to get it for him from his subscription site.

1) log in
2) navigate to page
3) form post to generate file
4) download file
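
The four steps above could be sketched roughly as follows. This is Python with the standard library only, used for illustration; in .NET the corresponding pieces would be `HttpWebRequest` with a `CookieContainer` attached. Every path (`/login`, `/query`, `/download`) and form field name (`user`, `pass`) is a hypothetical placeholder, since the real site's pages are not known.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def make_session():
    """Return an opener that stores cookies and sends them back automatically."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def post_form(opener, url, fields):
    """Send a form-encoded POST and return the response body as bytes."""
    data = urllib.parse.urlencode(fields).encode("ascii")
    with opener.open(url, data=data) as response:
        return response.read()


def fetch_daily_file(base_url, username, password, query_fields):
    """Log in, visit the query page, post the query form, download the file.

    All paths and field names here are assumptions for illustration.
    """
    opener, _jar = make_session()
    # 1) Log in; the session cookie from the response lands in the jar.
    post_form(opener, base_url + "/login", {"user": username, "pass": password})
    # 2) Navigate to the query page (some sites set more cookies here).
    with opener.open(base_url + "/query") as response:
        response.read()
    # 3) Post the form that generates the file.
    post_form(opener, base_url + "/query", query_fields)
    # 4) Download the generated file.
    with opener.open(base_url + "/download") as response:
        return response.read()
```

No window is ever shown: each step is just an HTTP request, and the cookie jar carries the login from one request to the next.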
Well, you can do it using WebClient/WebRequest. GOOD LUCK though: it's not going to be pretty, or easy for that matter. You are basically writing your own web browser, so you'll have to keep track of all the possible HTTP status codes the web server is going to spit back at you, and fool with cookies and anything else.  And if that's not daunting enough, the whole process will be ruined the minute the site changes the layout/structure used to get the files!

You may want to try to reverse engineer the website and see if you can find a "hack": a request to the exact page you need that will let you in, given you have the correct cookies set. This would be much easier, but likely isn't possible.

If the client and this site have enough of a relationship, the site may be able to give you an easier way to get these files. If that is possible, I'd really go this route.
I agree with you, raterus, but we don't want to have to ask our client to get in touch with the webmaster, because he probably won't like to hear that one of his subscribers wants a program to interact with his website.

I guess I'm not sure if you had any info for me. I pretty much knew everything you said, but I'm not sure how to attack it.  How do I fool with cookies?  How do I set a session variable?  How would I track HTTP codes? I'm not sure how I would even accept them: webclient.navigate, and then listen for events, but what events?

I need a lot more help than that!  But I agree, it is going to be daunting.   I can offer a lot more points if that's what's holding anyone back from offering a solution.
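
On the question of tracking HTTP codes: with a request/response library there are no events to listen for. Each request returns one response, and the status code is a field on it. A minimal sketch, again in Python's standard library purely for illustration (the helper name `fetch` is invented):

```python
import urllib.error
import urllib.request


def fetch(opener, url, data=None):
    """Return (status_code, final_url, body) without raising on 4xx/5xx."""
    try:
        with opener.open(url, data=data) as response:
            # Redirects (301/302/...) are followed automatically, so the
            # final URL may differ from the one that was requested.
            return response.status, response.geturl(), response.read()
    except urllib.error.HTTPError as err:
        # 4xx and 5xx arrive as exceptions; code and body are still readable.
        return err.code, err.geturl(), err.read()
```

The caller can then branch on the code, for example treating a 200 on the login page as "still not logged in" or a 404 as "the site layout changed".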

Probably the easiest way, if I were going to do this, is to get to the core of how a client and server interact through HTTP.  It all boils down to HTTP requests and HTTP responses, which are all plain text.  Familiar with packet sniffers?  To get an idea of what the requests will have to look like, you can sniff what your browser is sending to and receiving from the server.  Sorry, I don't know a better way to get at this info than a packet sniffer, and even then I've never really done it; I just know it can be done!

Let's say you get this, and you have a clear picture of the HTTP requests you will have to send and the HTTP responses you will receive. Now you have to mimic this.  If these requests are static, where nothing on the server changes between runs, you may be able to ignore the responses from the server and just make these requests, in order, leading up to the eventual download of the file.  This likely won't work if the site uses a modern server technology, so you'll have to interpret the responses from the server to figure out what to do next.

Once you have a clear picture of the requests you'll have to make, it should be pretty easy to plug values into the WebRequest class: cookies, posted values, and request codes.  Any of this helping?
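
To make "plain text" concrete, here is roughly what a form POST looks like on the wire. The sketch builds the text in Python (standard library, for illustration only); the host, path, field names, and cookie value are all invented, and a real browser would send additional headers such as `User-Agent`.

```python
from urllib.parse import urlencode


def build_raw_post(host, path, fields, cookie=None):
    """Return the text of an HTTP/1.1 form POST, as a sniffer would show it."""
    body = urlencode(fields)  # e.g. "user=bob&pass=secret"
    lines = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body)}",
    ]
    if cookie:
        lines.append(f"Cookie: {cookie}")
    # A blank line separates the headers from the body.
    return "\r\n".join(lines) + "\r\n\r\n" + body


print(build_raw_post("example.com", "/login", {"user": "bob", "pass": "secret"}))
```

Comparing sniffed traffic against text like this shows which headers and form fields the site actually expects.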

--Michael
It is, I guess. It will just take a few headaches to figure out.  I'm looking at MSDN and playing around with the WebRequest class; it's hard to figure out how to get started using these classes. I've never done packet sniffing either.
Well, I'm going to close this question if no one has any suggestions, or at least some sample code for some other way of doing it!
What more would you like?  If you have fooled with WebRequest/WebResponse then you are already past the point of samples.  I likely can't search Google any better than you can to find something more specific.  Sometimes you get to the point where you're treading on new turf, and samples just don't exist because nobody has done it!  I don't see why you can't do this using the methods I've described.
Well, I'm still no further than where I began. I don't know how to use a packet sniffer, or how to begin developing a browser.   I need some sample code for a starting point, I guess.

1) how to save cookies
2) session variables
3) how to navigate from page to page without any visual elements.


Without any of those I'm at square one!
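
For the first item, saving cookies, one possible starting point is a cookie jar that can be written to and reloaded from a file, so a later run can reuse an earlier login while the server still honours it. This is a sketch in Python's standard library, for illustration only; the helper names `load_session` and `save_session` are invented.

```python
import urllib.request
from http.cookiejar import LWPCookieJar


def load_session(cookie_file):
    """Return (opener, jar); cookies saved by an earlier run are reloaded."""
    jar = LWPCookieJar(cookie_file)
    try:
        # ignore_discard=True keeps session cookies, which have no expiry date.
        jar.load(ignore_discard=True)
    except FileNotFoundError:
        pass  # First run: nothing has been saved yet.
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def save_session(jar):
    """Write the jar to its file, including session cookies."""
    jar.save(ignore_discard=True)
```

A typical run would call `load_session`, make its requests through the returned opener, then call `save_session` before exiting. Whether a saved session cookie is still valid the next day depends entirely on the site.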
ASKER CERTIFIED SOLUTION
raterus
This solution is only available to members.
k thanks
If you use Firefox at all, here is an extension to view the request/response headers.  It does the same as that other program, and it's free :-)
http://livehttpheaders.mozdev.org/