How to take into account a line break with BeautifulSoup

I am trying to understand how to handle a line break when the value I want is the “name” attribute on line 2:
<input type="hidden" name="return" value="sommetoken=1" />
  line 2 ->  <input type="hidden" name="**sommetoken2**" value="1" /></form>

ty
aeko sato Asked:
Mark Brady (Principal Data Engineer) Commented:
I don't follow your question. Why do you need to worry about line breaks? BeautifulSoup should load the source code just fine.

Try this example
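from bs4 import BeautifulSoup  # assumed import; with BeautifulSoup 3 use: from BeautifulSoup import BeautifulSoup

s = '<form><input type="hidden" name="return" value="sommetoken=1" />\n'
s += '<input type="hidden" name="**sommetoken2**" value="this is your value" /></form>'

# notice there is a line break in the string above - but it doesn't matter

soup = BeautifulSoup(s)
# look the input up by its name attribute, then read its value attribute
my_input = soup.find('input', {'name': '**sommetoken2**'})
print my_input['value']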

The result is the value of the input named **sommetoken2**: this is your value

Try it.
aeko sato (Author) Commented:
Hello,

Here is my code:

#!/usr/bin/env python
import requests
from BeautifulSoup import BeautifulSoup

headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
s = requests.session()
response = s.get('http://website.fr/index.php/connexion', headers=headers)


# extract the token
soup = BeautifulSoup(response.content)
#token = soup.find('input', {'name':'return'})['value']
#token2 = soup.find('input', {'type':'hidden'})['name']   ##here the result is "return"##

#print token
#print token2


# now post to that login page with some valid credentials and the token
auth = {
    'userName': 'batman'
    , 'password': 'j0kersuck5'
    , '_csrf_token': 'fhjdjs'
}

# now we should be authenticated, try visiting a protected page
response = s.get('http://website.fr/index.php?option=com_vodvideo&view=player&format=raw&video_id=5984&load=1&date=', headers=headers)
print response.text

<input type="hidden" name="return" value="sommetoken1=" />
    <input type="hidden" name="sommetoken2" value="1" /></form> ##I must retrieve the token 1 and 2

Oh, I forgot to mention this: the token is random.
Mark Brady (Principal Data Engineer) Commented:
As long as you know the "name" value you can find it with BeautifulSoup. If you need to find all of the input tags you can do something like this.

Using my code as an example:
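from bs4 import BeautifulSoup  # assumed import; with BeautifulSoup 3 use: from BeautifulSoup import BeautifulSoup

s = '<form><input type="hidden" name="return" value="sommetoken=1" />\n'
s += '<input type="hidden" name="**sommetoken2**" value="this is your value" /></form>'

# notice there is a line break in the string above - but it doesn't matter

soup = BeautifulSoup(s)
my_inputs = soup.findAll('input')  # capital A; soup.findall is not a method and evaluates to None

for tag in my_inputs:
    # tag.name is the tag type ("input"); tag['name'] is the name attribute
    print tag['name'], tag['value']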

Now you see how to get them all and loop through them to find the ones you need?

If you are still having trouble try posting an example of the response you get from your call then tell me exactly what you want.
aeko sato (Author) Commented:
   Traceback (most recent call last):
  File "logins3", line 12, in <module>
    soup = BeautifulSoup(s)
  File "/usr/lib/python2.7/site-packages/BeautifulSoup.py", line 1522, in __init__
    BeautifulStoneSoup.__init__(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/BeautifulSoup.py", line 1147, in __init__
    self._feed(isHTML=isHTML)
  File "/usr/lib/python2.7/site-packages/BeautifulSoup.py", line 1171, in _feed
    smartQuotesTo=self.smartQuotesTo, isHTML=isHTML)
  File "/usr/lib/python2.7/site-packages/BeautifulSoup.py", line 1773, in __init__
    self._detectEncoding(markup, isHTML)
  File "/usr/lib/python2.7/site-packages/BeautifulSoup.py", line 1918, in _detectEncoding
    '^<\?.*encoding=[\'"](.*?)[\'"].*\?>').match(xml_data)
TypeError: expected string or buffer

I am creating a plugin for XBMC (Kodi) to stream content from the video site (for personal use),
and I am writing the login script to get the information.

Edit:
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
import requests
from BeautifulSoup import BeautifulSoup

headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
s = requests.Session()
response = s.get("http://website.fr/index.php/connexion", headers=headers)

#print response.text

#print repr(response)

# extract the token

soup = BeautifulSoup(response.content)
my_inputs = soup.findall('input')

for input in my_inputs:
    print input.name + input['value']

# now post to that login page with some valid credentials and the token
auth = {
    'userName': 'user'
    , 'password': 'pass'
    , '_csrf_token': 'test'
}

# now we should be authenticated, try visiting a protected page
response = s.get('http://website.fr/index.php?option=com_vodvideo&view=player&format=raw&video_id=5984&load=1&date=', headers=headers)

I made some changes and now I have a new error:

TypeError: 'NoneType' object is not callable

Mark Brady (Principal Data Engineer) Commented:
Sounds like you are not getting a response. Try this

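import traceback

try:
    # put your call in here (pass is only a placeholder so the block is valid)
    pass
except Exception:
    # print the full traceback of whatever the call raised
    traceback.print_exc()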


# so leave your login code as it is but when you go to do the final call to get the data you want, put that in the try block so if an exception is thrown you will get a nice traceback that makes sense.

I am unable to test the code I gave you as I am at home and don't have python installed, however if you can print out the response from the call and post that data here, I will help you get what you need tomorrow
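
For reference, the `TypeError: 'NoneType' object is not callable` reported above is also what BeautifulSoup produces when a method name is misspelled, so a missing response is probably not the cause. The script calls `soup.findall(...)` in lowercase. BeautifulSoup treats an unknown attribute as a search for a child tag of that name, finds no `<findall>` tag, and hands back `None`, which the script then tries to call. The sketch below assumes the `bs4` package on Python 3 rather than the BeautifulSoup 3 module used in the thread, and the HTML string is made up for illustration.

```python
from bs4 import BeautifulSoup

html = (
    '<form><input type="hidden" name="return" value="sommetoken=1" />\n'
    '<input type="hidden" name="sommetoken2" value="1" /></form>'
)
soup = BeautifulSoup(html, "html.parser")

# Unknown attribute names are treated as tag lookups, like soup.find("findall").
# There is no <findall> tag in the document, so the lookup yields None.
misspelled = soup.findall
print(misspelled)

error_message = None
try:
    soup.findall("input")  # effectively None("input")
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# The real method is find_all (findAll is the older spelling of the same method).
inputs = soup.find_all("input")
print(len(inputs))
```

Switching to `findAll` (or `find_all` in `bs4`) removes the error without touching the request code.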
aeko sato (Author) Commented:
I got it! Here is the new code:

#!/usr/bin/env python
# -*- coding: UTF-8 -*-
import requests
from BeautifulSoup import BeautifulSoup

headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
s = requests.Session()
response = s.get("http://website.fr/index.php/connexion", headers=headers)

#print response.text

#print repr(response)

# extract the token

soup = BeautifulSoup(response.content)
my_inputs = soup.findAll('input')

for in_put in my_inputs:
    token =  in_put.name , in_put['value']

# now post to that login page with some valid credentials and the token
auth = {
    'userName': 'user'
    , 'password': 'pass'
    , '_csrf_token': token
}

# now we should be authenticated, try visiting a protected page
response = s.get('http://website.fr/index.php?option=com_vodvideo&view=player&format=raw&video_id=5984&load=1&date=', headers=headers)

print response.content

Is it normal that I get this?

input 
input 
input 
input yes
input aW5kZXgucGhwP29wdGlvbj1jb21fdXNlcnMmdmlldz1wcm9maWxl
input 1
input 
input 1
input 1

The website I am trying to connect to is http://animedigitalnetwork.fr/
and the information I am trying to get is a JSON file with SMIL links.
Mark Brady (Principal Data Engineer) Commented:
That doesn't look right. For starters, your loop "for in_put in my_inputs:" is wrong. You are looping over all the inputs and each time assigning a tuple (note the comma) to the same variable, so only the last input is kept. You should keep the loop but look for the specific input that you need, and when you find it, assign its value to a variable.
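
One way to read that advice, as a sketch only: loop over the inputs, test each one, and stop at the one you need. The example assumes the `bs4` package on Python 3 and reuses the two hidden inputs quoted earlier in the thread. `KNOWN_HIDDEN_NAMES` is a hypothetical set of field names that never change; treating any other hidden input as the random token is an assumption about the site, not something the thread confirms.

```python
from bs4 import BeautifulSoup

html = """<form>
<input type="text" name="username" />
<input type="hidden" name="return" value="sommetoken1=" />
<input type="hidden" name="sommetoken2" value="1" />
</form>"""

soup = BeautifulSoup(html, "html.parser")

# Case 1: the name is known, so test the name attribute and stop when it matches.
return_value = None
for tag in soup.find_all("input"):
    # tag.name is always "input"; the name attribute is tag.get("name").
    if tag.get("name") == "return":
        return_value = tag.get("value")
        break

# Case 2: the name is random, so skip the hidden inputs whose names are known.
KNOWN_HIDDEN_NAMES = {"return"}  # assumption: these names are fixed
token_name = None
for tag in soup.find_all("input", {"type": "hidden"}):
    if tag.get("name") not in KNOWN_HIDDEN_NAMES:
        token_name = tag.get("name")
        break

print(return_value)
print(token_name)
```

Using `tag.get(...)` instead of `tag[...]` avoids a `KeyError` on inputs that have no `name` or `value` attribute.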
aeko sato (Author) Commented:
Hello,

Can you give me an example, please?
Mark Brady (Principal Data Engineer) Commented:
Please do what I asked for earlier. Post an example of the data you get back and tell me exactly what you need to find in the response
Mark Brady (Principal Data Engineer) Commented:
Try this example
# coding=utf-8
import traceback
from bs4 import BeautifulSoup


AUTH_ENDPOINT = 'http://adbook2.fattail.com/abn/ws/adbookconnectns2.svc?wsdl'
FATTAIL_ENDPOINT = 'https://adbook2.fattail.com/abn/ws/adbookconnectns2.svc'

USERNAME = 'user@example.com'  # placeholder, not used by this example
PASSWORD = 'password'  # placeholder, not used by this example

class FindMe(object):

	def get_response(self, search_string = None, attribs = None):
		
		response = """
			<html>
				<head>
					<title>Test Page</title>
				</head>
				<body>
					<form name="test">
						<input type="hidden" name="return" value="sommetoken=" />
						<input type="hidden" name="sommetoken" value="1" />
						<input type="text" name="username" value="John" />
						<input type="password" name="password" value="xyz" />
						<input type="submit" value="send" />
					</form>
					<p>
						Hello. This is a test page
					</p>
				</body>
			</html>
		"""		
		soup = BeautifulSoup(response)
				
		try:
			if attribs is not None:
				print 'search by attribute'
				my_input = soup.find('input', attribs)
				if my_input is not None:
					return my_input['value']
				else:
					return 'Input not found in response'
			else:
				print 'search by string'
				inputs = soup.findAll('input')
				
				if inputs is not None:
					for my_input in inputs:
						if my_input['value'] == search_string:
							return my_input['value']
					return '{0} not found in response'.format(search_string)
				else:
					return '{0} not found in response'.format(search_string)
		except:
			traceback.print_exc()

app = FindMe()

attributes = {
	'name':'username'
}

string = 'sommetoken='

result = app.get_response(search_string = string, attribs = None)
print result

Run the file like it is and you will get back the value 'sommetoken='

It is doing a search for a particular "value"

Now change the final call (line 67) so that attribs = attributes, i.e. app.get_response(search_string = string, attribs = attributes)

Save and re-run the file.
Now you will get back "John"

So you should be able to see how I looped through ALL of the inputs in the response and tried to find the one with the value equal to the search string

This Page has some good tips on how to search and parse any result with BeautifulSoup

aeko sato (Author) Commented:
If I have asked for examples, it is because I had not understood what you had explained to me. My English is not very good and this is the only place where I have found any help.  

But from what I see, even with the example I still do not manage to understand. I am going to have to give up for now and take the time to learn more about the subject. All the same, I will explain what I am trying to do.

Connection page:

<form action="/index.php/connexion" method="post" id="login-form" >
		<fieldset class="userdata">
	<p id="form-login-username">
		<label for="modlgn-username">Pseudo</label>
		<input id="modlgn-username" type="text" name="username" class="inputbox"  size="18" />
	</p>
	<p id="form-login-password">
		<label for="modlgn-passwd">Mot de passe</label>
		<input id="modlgn-passwd" type="password" name="password" class="inputbox" size="18"  />
	</p>
		<p id="form-login-remember">
		<label for="modlgn-remember">Se souvenir de moi</label>
		<input id="modlgn-remember" type="checkbox" name="remember" class="inputbox" value="yes"/>
	</p>
		<button type="submit" name="Submit" class="button" >Connexion</button>	<input type="hidden" name="option" value="com_users" />
	<input type="hidden" name="task" value="user.login" />
	<input type="hidden" name="return" value="aHR0cDovL2FuaW1lZGlnaXRhbG5ldHdvcmsuZnkZXgucGhwL2Nvbm5leGlvbg==" />
	<input type="hidden" name="d72fec97d5b2845b23bceabd7483bfe7" value="1" />	</fieldset>

I do not know how the site validates the tokens; for the moment I am only trying to collect them.
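
A common way to handle a token whose name changes on every page load is to avoid looking for it by name at all: copy every hidden input of the login form into the POST data, then add the credentials. The following is a sketch under assumptions, not a tested login for this site. It uses the `bs4` package on Python 3, `build_login_payload` is a hypothetical helper, and the `return` value is shortened to a placeholder.

```python
from bs4 import BeautifulSoup

LOGIN_PAGE = """
<form action="/index.php/connexion" method="post" id="login-form">
  <input id="modlgn-username" type="text" name="username" />
  <input id="modlgn-passwd" type="password" name="password" />
  <input type="hidden" name="option" value="com_users" />
  <input type="hidden" name="task" value="user.login" />
  <input type="hidden" name="return" value="PLACEHOLDER" />
  <input type="hidden" name="d72fec97d5b2845b23bceabd7483bfe7" value="1" />
</form>
"""


def build_login_payload(html, username, password):
    """Collect the hidden inputs of the login form and add the credentials."""
    soup = BeautifulSoup(html, "html.parser")
    form = soup.find("form", {"id": "login-form"})
    payload = {}
    for tag in form.find_all("input", {"type": "hidden"}):
        # The randomly named token is picked up here like any other hidden field.
        payload[tag["name"]] = tag.get("value", "")
    payload["username"] = username
    payload["password"] = password
    return payload


payload = build_login_payload(LOGIN_PAGE, "user", "pass")
for name in sorted(payload):
    print(name, "=", payload[name])
```

With a `requests.Session` named `s`, the payload would then be sent with `s.post(login_url, data=payload, headers=headers)` so that the session keeps the login cookies for later requests. Note that the scripts earlier in the thread build an `auth` dictionary but never post it.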

I have an example captured from Firefox as a curl command:

curl 'http://website.fr/index.php?option=com_vodvideo&view=player&format=raw&video_id=5878&load=1&date=Fri%20Apr%2003%202015%2014:48:01%20GMT-0400%20(EDT)' -H 'Cookie: a2506a7b1c9f80a536e0f254bf8c954d=481447465542+8525556105C10761157+D555B1B4D114654101242+9405340+811717B51+4761E4041411450571319; _gat=1; 18acd9b63ecbf50de0b8c010c2b7289f=c58f5408a6ef6924bd67775f216f91a0; _ga=GA1.2.64935130.1427206961' -H 'Accept-Encoding: gzip, deflate, sdch' -H 'Accept-Language: fr-FR,fr;q=0.8,en-US;q=0.6,en;q=0.4' -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.101 Safari/537.36' -H 'Accept: */*' -H 'Referer: http://animes.fr/video/tail/8878-episode-22' -H 'X-Requested-With: XMLHttpRequest' -H 'Connection: keep-alive' --compressed
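
The curl command above could be reproduced with `requests` roughly as follows. This is a sketch: `build_player_request` is a hypothetical helper, the `date` query parameter is left out on the assumption that the server does not require it, and the cookies are expected to come from the logged-in session rather than being copied by hand.

```python
import requests

PLAYER_URL = "http://website.fr/index.php"


def build_player_request(session, video_id, referer):
    """Prepare (but do not send) the GET request for the player data."""
    request = requests.Request(
        "GET",
        PLAYER_URL,
        params={
            "option": "com_vodvideo",
            "view": "player",
            "format": "raw",
            "video_id": video_id,
            "load": 1,
        },
        headers={
            "Accept": "*/*",
            "Referer": referer,
            "X-Requested-With": "XMLHttpRequest",
        },
    )
    # prepare_request merges the session's cookies and default headers.
    return session.prepare_request(request)


session = requests.Session()
prepared = build_player_request(
    session, 5878, "http://animes.fr/video/tail/8878-episode-22"
)
print(prepared.url)
```

Calling `session.send(prepared)` would perform the request. If the reply is JSON, `response.json()` decodes it.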

Mark Brady (Principal Data Engineer) Commented:
I'm not sure what you are saying sorry. I don't need to see the login code. You said that is working for you so I am only interested to see the data (string) that comes back when you make your call (not the login response). If you can figure out how to post that here then I can help you more.

You should try my example which will help you understand how to get data from a response.
aeko sato (Author) Commented:
I found an example in PHP:

https://www.wareziens.net/forum/topic-21273-decrypter-fichier-png-protection-a-changer-page-3.html

post 55


It looks like what I want to do; the difference is that it does not fetch the right file.
Mark Brady (Principal Data Engineer) Commented:
Yeah that page is not in English
aeko sato (Author) Commented:
Sorry, I will try to translate the script for you :)
aeko sato (Author) Commented:
Here is the script, translated from French to English:

http://pastebin.com/5KdmAJGi
aeko sato (Author) Commented:
I am going to step back from this question while I take the time to learn how all this works :) Thank you for everything.