hexo dark

How to retrieve all cookies with Python

Hi,

How can I retrieve all cookies with Python?

I want to get the token key stored in the cookies, but I do not know how to get the full set of cookies.

Thanks.
Dave Baldwin

'Cookies' are normally stored in web browsers, and each browser has its own set of cookies for each web site that is visited.  I don't know what you mean by "tokens key".  More info here: http://en.wikipedia.org/wiki/HTTP_cookie
hexo dark

ASKER

Hmm, I have an example Python script. How do I adapt it for the web site?

import cookielib, urllib2, urllib, fileinput, sys, re

def login(username,password):
	opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookie_jar))
	opener.addheaders =[('Referer', 'http://website.fr/index.php/connexion'),
						('User-Agent','Mozilla/5.0 (Windows NT 6.1; rv:26.0) Gecko/20100101 Firefox/26.0'),
						('Content-Type','application/x-www-form-urlencoded')]

	url ='http://website.fr/index.php/connexion'
	data = {'formname' : 'RpcApiUser_Login', 'fail_url' : 'http://website.fr/index.php/connexion', 'user' : username, 'login' : password}
	req = urllib2.Request(url, urllib.urlencode(data))
	res = opener.open(req)


try:
	with open('cookies.txt'): pass
except IOError:
	cookie_jar = cookielib.MozillaCookieJar('cookies.txt')
	cookie_jar.save()
if sys.argv[1] == 'no':
	print 'No cookies created.'
	sys.exit()
else:
	cookie_jar = cookielib.MozillaCookieJar('cookies.txt')
	cookie_jar.load()
	username = raw_input('Username: ')
	password = raw_input('Password: ')
	login(username,password)
	opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookie_jar))
	opener.addheaders =[('User-Agent','Mozilla/5.0 (Windows NT 6.1; rv:26.0) Gecko/20100101 Firefox/26.0'),
						('Connection','keep-alive')]
	url = 'http://website.fr/'
	req = opener.open(url)
	site = req.read()

	if re.search(username+'(?i)',site):
		print 'Login successful.'
		cookie_jar.save()

		for line in fileinput.input('cookies.txt',inplace =1):
			line = line.strip()
			if not 'c_visitor' in line:
				print line
	else:
		print 'Login failed.'
		sys.exit()


That code will only get the cookies that are set by that web site when you try to connect to it.  Is that what you want?
yes
Then I would just put in the correct web site addresses and try it.
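For reference, once an opener built with a cookie processor has made a request, every cookie the site set is held in the cookie jar, and the jar can be iterated directly. The following is only a minimal Python 3 sketch (the script above is Python 2, where the module is called `cookielib`). The helper names `cookies_as_dict` and `demo_cookie`, the cookie names and values, and the domain `example.invalid` are all made up for illustration; a real jar would be filled by the site rather than by hand.

```python
from http.cookiejar import Cookie, CookieJar


def cookies_as_dict(jar):
    """Return every cookie in the jar as a {name: value} dictionary."""
    return {cookie.name: cookie.value for cookie in jar}


def demo_cookie(name, value, domain="example.invalid"):
    """Build a cookie by hand (hypothetical data, for offline demonstration only)."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


# Normally the jar is filled by HTTPCookieProcessor during a request;
# here it is filled manually so the example runs without a network.
jar = CookieJar()
jar.set_cookie(demo_cookie("session", "abc123"))
jar.set_cookie(demo_cookie("token", "xyz789"))

# A CookieJar is iterable, so listing all cookies is a plain loop.
for cookie in jar:
    print(cookie.domain, cookie.path, cookie.name, cookie.value)

all_cookies = cookies_as_dict(jar)
```

Looking up a single value such as a token is then a dictionary access on `all_cookies`, assuming the cookie name is known.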
That is what I did, but it does not work.

The link is http://animedigitalnetwork.fr/index.php/connexion
The login form on that page is more complicated than what is shown in your code.  You will need to include all the data required by the form to get it to work.

You might want to do a test version on a simpler page that you have control over so you can make something that works before you tackle that page again.
I would like to know what information I need in order to succeed, and how do I get that information?
Look at the source for that page.  All of the 'input' values including the 'hidden' ones need to be submitted to the 'action' page.  There are 3 'forms' on that page but this is the one that you need to look at.  I'm not sure that your code will do a 'post' method to the web site.
<form action="/index.php/connexion" method="post" id="login-form" >
		<fieldset class="userdata">
	<p id="form-login-username">
		<label for="modlgn-username">Pseudo</label>
		<input id="modlgn-username" type="text" name="username" class="inputbox"  size="18" />
	</p>
	<p id="form-login-password">
		<label for="modlgn-passwd">Mot de passe</label>
		<input id="modlgn-passwd" type="password" name="password" class="inputbox" size="18"  />
	</p>
		<p id="form-login-remember">
		<label for="modlgn-remember">Se souvenir de moi</label>
		<input id="modlgn-remember" type="checkbox" name="remember" class="inputbox" value="yes"/>
	</p>
		<button type="submit" name="Submit" class="button" >Connexion</button><br/><a target="_blank" class="button login-createaccount" href="/index.php/component/account/?view=create&amp;return=aHR0cDovL2FuaW1lZGlnaXRhbG5ldHdvcmsuZnIvaW5kZXgucGhwL2Nvbm5leGlvbg==">
					Créer un compte</a>
				<input type="hidden" name="option" value="com_users" />
	<input type="hidden" name="task" value="user.login" />
	<input type="hidden" name="return" value="aHR0cDovL2FuaW1lZGlnaXRhbG5ldHdvcmsuZnIvaW5kZXgucGhwL2Nvbm5leGlvbg==" />
	<input type="hidden" name="20eb3eb9b0b326538448b5e479d0ccac" value="1" />	</fieldset>
	<ul>
		<li>
			<a target="_blank" href="/index.php/oubli-mot-de-passe">
			Mot de passe oublié ?</a>
		</li>
		<li>
			<a target="_blank" href="/index.php/identifiant-perdu">
			Pseudo oublié ?</a>
		</li>
	</ul>
	</form>


Here is the Python page on using 'urllib2' :  https://docs.python.org/2/howto/urllib2.html
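One way to see which name/value pairs a form expects is to parse its 'input' tags rather than read the source by eye. This is a rough Python 3 sketch using only the standard library; the class `FormInputCollector` and the function `collect_form_fields` are illustrative names, and `SAMPLE` is a shortened copy of the form shown above. Unchecked checkboxes are skipped because a browser would not send them. The hidden field with the long hexadecimal name looks like a per-session token (an assumption), so it would have to be read from a freshly fetched page each time rather than hard-coded.

```python
from html.parser import HTMLParser


class FormInputCollector(HTMLParser):
    """Collect name/value pairs from <input> tags inside one <form>."""

    def __init__(self, form_id):
        super().__init__()
        self.form_id = form_id
        self.in_form = False
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and attrs.get("id") == self.form_id:
            self.in_form = True
        elif tag == "input" and self.in_form and attrs.get("name"):
            kind = (attrs.get("type") or "text").lower()
            # Browsers only submit checkboxes/radios that are checked.
            if kind in ("checkbox", "radio") and "checked" not in attrs:
                return
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def collect_form_fields(html, form_id):
    """Return {name: value} for the inputs of the form with the given id."""
    parser = FormInputCollector(form_id)
    parser.feed(html)
    return parser.fields


# Shortened copy of the login form quoted in this thread.
SAMPLE = """
<form action="/index.php/connexion" method="post" id="login-form">
  <input id="modlgn-username" type="text" name="username" />
  <input id="modlgn-passwd" type="password" name="password" />
  <input id="modlgn-remember" type="checkbox" name="remember" value="yes"/>
  <input type="hidden" name="option" value="com_users" />
  <input type="hidden" name="task" value="user.login" />
  <input type="hidden" name="20eb3eb9b0b326538448b5e479d0ccac" value="1" />
</form>
"""

fields = collect_form_fields(SAMPLE, "login-form")
for name, value in fields.items():
    print(name, "=", repr(value))
```

The empty `username` and `password` entries are the ones to fill in before submitting; the rest would be sent back unchanged.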
ok

<div class="login-login">
    <h2 class="cr-header">Connexion</h2>
    <div class="login-tip1">
      Vous avez déjà un compte ? Connectez-vous ci-dessous.    </div>


    <form id="RpcApiUser_Login" method="post" action="https://www.website.com/?a=formhandler">
      <input type="hidden" name="formname" value="RpcApiUser_Login" />
                  <input type="hidden" name="fail_url" value="http://www.website.com/login" />
            <table>
        <tr>
          <th>E-mail ou nom d'utilisateur:</th>
          <td><input type="text" name="name" value="" /></td>
        </tr>
        <tr>
          <th>Mot de passe:</th>
          <td><input type="password" name="password" /></td>
        </tr>
                <tr>
          <td colspan="2">
            <a class="submit" href="#" onclick="$('#RpcApiUser_Login').submit();return false;">Connexion<span class="right"></span></a>
            <input type="submit" class="ie-hidden-submit" />
          </td>
        </tr>
      </table>
    </form>



This is the part of the web page's code for the site where the script connects normally. I reused this script for another web site (http://animedigitalnetwork.fr/index.php/connexion). I am trying to understand this line:

data = {'formname' : 'RpcApiUser_Login', 'fail_url' : 'http://website.fr/index.php/connexion', 'user' : username, 'login' : password}



so that I can adapt it to animedigitalnetwork. Thanks for the link :) and for your help.
According to the Python page, the 'data' statement is a collection of the name/value pairs associated with the 'input' statements in the form.  Some form of that statement is used in all languages that are able to send a POST request to a web page.  In the first item, 'formname' : 'RpcApiUser_Login',  'formname' is the 'name' attribute from the 'input' statement and 'RpcApiUser_Login' will be the value you want to submit for it.  

However, "'user' : username, 'login' : password" are not part of the form above.  That form uses 'name' and 'password' for the names and you have to supply the values.  Like " 'name' : username, 'password' : password ".  Your script must match the name/value pairs that are found in each form you want to submit data to.
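To make the matching concrete, here is a small Python 3 sketch of a 'data' dictionary whose keys are taken from the 'name' attributes of the RpcApiUser_Login form quoted earlier. The function name `build_login_data` and the sample credentials are made up for illustration. In Python 2 the equivalent of `urllib.parse.urlencode` is `urllib.urlencode`, as used in the original script.

```python
from urllib.parse import parse_qs, urlencode


def build_login_data(username, password):
    """Name/value pairs matching the <input> tags of the RpcApiUser_Login form."""
    return {
        # Hidden inputs: sent back exactly as they appear in the form.
        "formname": "RpcApiUser_Login",
        "fail_url": "http://www.website.com/login",
        # Visible inputs: the form names them 'name' and 'password',
        # not 'user' and 'login'.
        "name": username,
        "password": password,
    }


# Hypothetical credentials, for demonstration only.
body = urlencode(build_login_data("alice", "s3cret")).encode("ascii")
print(body)

# Decoding the body again shows what the server would receive.
decoded = parse_qs(body.decode("ascii"))
```

Passing `body` as the second argument of a request makes it a POST, which is what the form's `method="post"` requires.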
Would you have any idea how I could set this up? After several attempts I get nothing :(

Edit: this site uses a Joomla login!
They don't have to let you log in and there are ways to prevent you from doing that using methods like the one you have been trying.  But you don't know (yet) how to code a successful login which is why I suggested you create a simple test site and get it working on that first.
Can the protections you mention be bypassed ('contournables')? OK, I am going to try with a Joomla install.
I would really suggest something much simpler for initial testing.  I don't know what 'contournables' means.  More info here: http://www.w3schools.com/html/html_forms.asp and here: http://www.w3schools.com/php/php_forms.asp
I have a question: is it possible to use Firefox from the terminal command line to connect to a web site and get my information (cookie files, etc.)?
No.  Firefox runs in the GUI, not the terminal.
It partially works:

import pycurl
import urllib
import StringIO

def test(debug_type, debug_msg):
    if len(debug_msg) < 300:
        print "debug(%d): %s" % (debug_type, debug_msg.strip())    

pf = {'username' : 'user', 'password' : 'pw' }
fields = urllib.urlencode(pf)
pageContents = StringIO.StringIO()

p = pycurl.Curl()
p.setopt(pycurl.FOLLOWLOCATION, 1)
p.setopt(pycurl.COOKIEFILE, './cookie_test.txt')
p.setopt(pycurl.COOKIEJAR, './cookie_test.txt')
p.setopt(pycurl.POST, 1)
p.setopt(pycurl.POSTFIELDS, fields)
p.setopt(pycurl.WRITEFUNCTION, pageContents.write)
p.setopt(pycurl.VERBOSE, True)
p.setopt(pycurl.DEBUGFUNCTION, test)
p.setopt(pycurl.URL, 'http://animedigitalnetwork.fr/index.php/connexion')
p.perform()

p.close() # This is mandatory.

pageContents.seek(0)
print pageContents.readlines()



How do I get the information from the Set-Cookie headers? The cookies are not complete.
To get the cookies, you may have to visit the page once in 'pycurl' before you try to login.  What you are showing is very similar to 'curl' in PHP.
I do not see how to adjust the script. Can you show me? I am a beginner in this language, and I simply want to have the complete cookies in a text file.
You may need to run the script twice: first as a 'GET' to load the cookies, and second as a 'POST' to try to log in.
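The GET-then-POST idea can be sketched roughly as below. This is not the accepted solution from this thread, only an illustration in Python 3 with the standard library (`http.cookiejar` and `urllib.request` replace the Python 2 `cookielib` and `urllib2`). The function names `make_session` and `login_and_save` are made up, and whether a given site accepts such a login is not something this sketch can promise. Both requests share one cookie jar, so cookies set by the first GET are sent along with the POST.

```python
import urllib.request
from http.cookiejar import MozillaCookieJar
from urllib.parse import urlencode


def make_session(cookie_path):
    """Create an opener and a cookie jar that will be saved to cookie_path."""
    jar = MozillaCookieJar(cookie_path)
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def login_and_save(opener, jar, url, form_data):
    """GET the page to collect cookies, POST the form, then save all cookies."""
    # Step 1: a plain GET, so the site can set its initial cookies.
    opener.open(url).read()

    # Step 2: passing data makes the request a POST with the same cookies.
    body = urlencode(form_data).encode("utf-8")
    opener.open(url, data=body).read()

    # Session cookies have no expiry date and are skipped by save()
    # unless ignore_discard=True is given.
    jar.save(ignore_discard=True, ignore_expires=True)
    return [cookie.name for cookie in jar]
```

A call would look like `opener, jar = make_session('cookies.txt')` followed by `login_and_save(opener, jar, login_url, form_data)`, where `form_data` holds every input of the login form, hidden ones included, and `login_url` is the form's 'action' address. The resulting `cookies.txt` is in the Netscape/Mozilla text format.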
Would it be possible to get an example? I'm better at understanding if i can see it.
ASKER CERTIFIED SOLUTION
Dave Baldwin

I am giving you the points :) even though I did not succeed. I am not giving up, but for the moment I can do nothing more.
Perfect.