DoveTails
asked on
Logon to Web Site and Download File Programmatically using Python, urllib2 module
I wish to log on to a web site so that I can download data programmatically.
I'm looking for assistance with Python and the urllib2 module.
At this page:
https://www.nifc.blm.gov/cgi/WfmiHome.cgi
there is a Logon button which uses a GET (not a POST) referring to:
https://www.nifc.blm.gov/cgi/WfmiHome.cgi/Page/Logon
which redirects to: https://www.nifc.blm.gov/cgi/WfmiHome.cgi/Page/DoiMonitor
I'm hoping to understand how to navigate through the web pages to eventually log on with my account.
I've tried many variations of the code below without success.
I need help understanding how to use urllib2 to navigate to the page with the "I Agree" button and then to the actual page where the username and password are entered.
import urllib
import urllib2
import cookielib

# Cookie storage shared by every request made through the opener
cj = cookielib.CookieJar()
opener = urllib2.build_opener(
    urllib2.HTTPCookieProcessor(cj),
    urllib2.HTTPRedirectHandler
)

#### First page
url = 'https://www.nifc.blm.gov/cgi/WfmiHome.cgi'
request = urllib2.Request(url)
# Use the opener, not urllib2.urlopen, so cookies are saved to cj
response = opener.open(request)
html = response.read()

# Print to screen
print html
Appreciate help ... Thanks !
ASKER
Thanks for the response, Walter.
My thinking behind navigating through the first few pages with the "Logon" and "I Agree" buttons is to acquire the necessary cookies. If I attempt to navigate directly to the Authentication Page, I am directed back to the main Home page with the "Logon" button (basically back to page 1).
I'm assuming that pressing "I Agree" in a standard browser sets a cookie which lasts for that session.
Hopefully the code I have worked on for a POST with the logon information will work, but I cannot programmatically get to the logon page. My guess is that this is because I do not yet have the "I Agree" cookie.
Any thoughts?
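One way to test the cookie guess is to fetch each page through a single cookie-aware opener and look at what lands in the cookie jar after every step. The following is only a sketch: the helper names (make_session, cookie_names, visit) are made up for illustration, and whether this site really sets an "I Agree" cookie is still an assumption. The imports fall back to the Python 3 names for urllib2 and cookielib so the same code runs on either version.

```python
try:  # Python 2 module names, as used in the question
    import urllib2 as urlrequest
    import cookielib as cookiejar
except ImportError:  # Python 3 names for the same modules
    import urllib.request as urlrequest
    import http.cookiejar as cookiejar


def make_session():
    """Return an opener and the cookie jar it stores cookies in."""
    jar = cookiejar.CookieJar()
    opener = urlrequest.build_opener(urlrequest.HTTPCookieProcessor(jar))
    return opener, jar


def cookie_names(jar):
    """List the names of the cookies currently held in the jar."""
    return sorted(cookie.name for cookie in jar)


def visit(opener, jar, url):
    """Fetch url through the shared opener and report what changed.

    Returns the final URL (after any redirects were followed) and the
    names of cookies that were not in the jar before this request.
    """
    before = set(cookie_names(jar))
    response = opener.open(url)
    response.read()
    new_cookies = sorted(set(cookie_names(jar)) - before)
    return response.geturl(), new_cookies
```

Calling visit(opener, jar, url) for the Home page, then the Logon URL, then whatever URL the "I Agree" button requests, would show at which step (if any) a new cookie appears and where each redirect ends up.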
ASKER
Thank you. More options than I expected.
Appreciate your input !
1) write code similar to yours to submit the information for the logon page;
2) make sure that a session is maintained between requests;
3) call the download URL, using the code you have shown.
It is not a question of navigating through pages, but a matter of generating the session that you need with the logon page and then calling the download URL directly.
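The three steps above might be sketched roughly as follows. This is not a tested solution for the site in question: the form field names ("username", "password") and the helper names are assumptions, and the real logon URL, field names, and download URL would have to be read from the logon page's HTML. The imports fall back to the Python 3 names for urllib2 and cookielib so the same code runs on either version.

```python
try:  # Python 2 module names, as used in the question
    import urllib2 as urlrequest
    import cookielib as cookiejar
    from urllib import urlencode
except ImportError:  # Python 3 equivalents
    import urllib.request as urlrequest
    import http.cookiejar as cookiejar
    from urllib.parse import urlencode


def make_opener():
    """Step 2: one opener with a cookie jar keeps the session alive."""
    jar = cookiejar.CookieJar()
    return urlrequest.build_opener(urlrequest.HTTPCookieProcessor(jar))


def build_login_request(login_url, username, password,
                        user_field="username", pass_field="password"):
    """Step 1: build the POST that submits the logon form.

    The default field names are placeholders, not the site's real ones.
    """
    fields = [(user_field, username), (pass_field, password)]
    body = urlencode(fields).encode("ascii")
    # Supplying data makes the request a POST instead of a GET
    return urlrequest.Request(login_url, data=body)


def login_and_download(opener, login_request, download_url, out_path):
    """Steps 1 and 3: log on, then fetch the file with the same opener."""
    opener.open(login_request).read()
    payload = opener.open(download_url).read()
    with open(out_path, "wb") as out:
        out.write(payload)
    return len(payload)
```

The important design point is that every request goes through the same opener, so any session cookie set by the logon response is sent automatically with the download request.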