Gaaara
asked on
How to edit a name from a variable with Python
hello
I use bs4 to get the title tag from the website:
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537)'}
session = requests.Session()
response = session.get("http://animedigitalnetwork.fr/video/assassination-classroom/5886-episode-2-lecon-de-base-ball", headers=headers)
soup = BeautifulSoup(response.content)
oname = soup.find("title")
print oname.text
Result:
Assassination Classroom - Saison 1 Épisode 2 - VOSTFR
I would like to edit this to get the following in a variable:
Assassination Classroom - S 01 Ep 2
and then rename my files.
Thanks in advance.
Python cares about indentation. Lines 33-40 in your pastebin need to be indented so they are part of the elif menu =="2" block.
ASKER
I don't understand, sorry?
This:
elif menu=="2":
    # ask for the link
    olinks=raw_input("Entrer votre liens ")
    # fetch the png & smil files
    subprocess.call(["php", "files/adn.php" , olinks])
    # decrypt the png file
    subprocess.call(["php", "files/AES.class.php" , "tmp/adn.png"])
    # fetch the name of the anime
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537)'}
session = requests.Session()
response = session.get("http://animedigitalnetwork.fr/video/assassination-classroom/5886-episode-2-lecon-de-base-ball", headers=headers)
soup = BeautifulSoup(response.content)
oname = soup.find("title")
oname_cleanedup = re.sub(r'(.*?\s+-\s+S)aison\s+(\d+)\s+É.*?(\d+)(.*)',
    lambda m: "{title} {season:02d} Ep {episode}".format(title=m.group(1), season=int(m.group(2)), episode=m.group(3)),
    oname.text)
print(oname_cleanedup)
    ## end
elif menu=="3":
Needs to be this:
elif menu=="2":
    # ask for the link
    olinks=raw_input("Entrer votre liens ")
    # fetch the png & smil files
    subprocess.call(["php", "files/adn.php" , olinks])
    # decrypt the png file
    subprocess.call(["php", "files/AES.class.php" , "tmp/adn.png"])
    # fetch the name of the anime
    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537)'}
    session = requests.Session()
    response = session.get("http://animedigitalnetwork.fr/video/assassination-classroom/5886-episode-2-lecon-de-base-ball", headers=headers)
    soup = BeautifulSoup(response.content)
    oname = soup.find("title")
    oname_cleanedup = re.sub(r'(.*?\s+-\s+S)aison\s+(\d+)\s+É.*?(\d+)(.*)',
        lambda m: "{title} {season:02d} Ep {episode}".format(title=m.group(1), season=int(m.group(2)), episode=m.group(3)),
        oname.text)
    print(oname_cleanedup)
    ## end
elif menu=="3":
The indentation needs to line up.
ASKER
Thank you for your help.
ASKER
Actually, it does not work, sorry. The name is not edited. I spoke too soon.
ASKER
It works when I run it directly in Python, but not in my script...
You'll need to give me a bit more to go on. What is it doing? Just not matching? Try the following and paste back what it spits out...
oname = soup.find("title")
print("Searching: {0}".format(oname.text))
match = re.search(r'(.*?\s+-\s+S)aison\s+(\d+)\s+É.*?(\d+)(.*)'):
if match:
    print "Found match"
    oname_cleanedup = re.sub(r'(.*?\s+-\s+S)aison\s+(\d+)\s+É.*?(\d+)(.*)',
        lambda m: "{title} {season:02d} Ep {episode}".format(title=m.group(1), season=int(m.group(2)), episode=m.group(3)),
        oname.text)
    print(oname_cleanedup)
else:
    print("Match not found")
I went back to your pastebin and I'm not seeing where you're importing re. You need to make sure to import the regular expression module.
import re
ASKER
Yes, re is imported :)
I have a syntax error:
File "start.py", line 35
match = re.search(r'(.*?\s+-\s+S)aison\s+(\d+)\s+É.*?(\d+)(.*)'):
I deleted the trailing ":" and now I have this error:
Traceback (most recent call last):
File "start.py", line 34, in <module>
print("Searching: {0}".format(oname.text))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xc9' in position 35: ordinal not in range(128)
Change:
print("Searching: {0}".format(oname.text))
To:
print("Searching: {0}".format(oname.text.encode('ascii', 'xmlcharrefreplace')))
And:
print(oname_cleanedup)
To:
print(oname_cleanedup.encode('ascii', 'xmlcharrefreplace'))
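To illustrate what that encode call does, here is a minimal sketch. The title string is copied from the output earlier in the thread, and the variable names are only illustrative. The snippet is written so that it should run on both Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
# \xc9 is the escape for the character É (code point U+00C9).
title = u"Assassination Classroom - Saison 1 \xc9pisode 2 - VOSTFR"

# "xmlcharrefreplace" turns every non-ASCII character into an XML numeric
# character reference, so the result is safe to print on an ASCII console.
safe = title.encode("ascii", "xmlcharrefreplace")
print(safe)

# Other standard error handlers: "replace" substitutes "?",
# "ignore" silently drops the character.
replaced = title.encode("ascii", "replace")
ignored = title.encode("ascii", "ignore")
```

The É comes out as &#201; because 201 is the decimal value of 0xC9. This only changes what is printed; the title held in the variable is untouched.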
ASKER
I have another error :)
Searching: Assassination Classroom - Saison 1 Épisode 2 - VOSTFR
Traceback (most recent call last):
File "start.py", line 35, in <module>
match = re.search(r'(.*?\s+-\s+S)aison\s+(\d+)\s+É.*?(\d+)(.*)')
TypeError: search() takes at least 2 arguments (1 given)
re.search needs the string to search as its second argument:
match = re.search(r'(.*?\s+-\s+S)aison\s+(\d+)\s+É.*?(\d+)(.*)', oname.text)
ASKER
It doesn't work :)
Searching: Assassination Classroom - Saison 1 Épisode 2 - VOSTFR
Match not found
Then I'm guessing you may have encoding problems (e.g., your file/editor may not really be using utf-8 and your É may not be the same É), because that regular expression should match. Try changing the regular expression to:
r'(.*?\s+-\s+S)aison\s+(\d+)\s+\xc9.*?(\d+)(.*)'
ASKER
It doesn't work ^^
Searching: Assassination Classroom - Saison 1 Épisode 2 - VOSTFR
Found match
Assassination Classroom - Saison 1 Épisode 2 - VOSTFR
Is it possible to create a function in a separate Python file that produces the modified name for the main script?
If it printed "found match", then the regular expression matched. And the substitution would have worked too (if you had changed it to use the \xc9). But if you're saying it didn't work, I'm guessing you didn't modify the second regular expression (the one in the sub call) to use the \xc9.
Can you post your entire code? This back and forth is getting us nowhere fast.
ASKER
OK
http://pastebin.com/EF0umjd9
I thought of creating a new Python script to get and modify the name and generate a variable for the main script.
You didn't change the regular expression in the sub call to use a \xc9 rather than the É. The problem is most likely an encoding mismatch: the É typed in your source file isn't the same character sequence as the É that comes back from the webpage. To get around that, use the \xc9 (which is the Unicode code point of É, U+00C9; in UTF-8 the same character is stored as the two bytes \xc3\x89).
In other words, change line 37 to:
oname_cleanedup = re.sub(r'(.*?\s+-\s+S)aison\s+(\d+)\s+\xc9.*?(\d+)(.*)',
And you can also probably change line 40 back to just:
print(oname_cleanedup)
Whether you really can or not depends on the codepage that your console is using. You'll know when you try to print it. If it bombs on that print line it's because whatever codepage your console is in, it doesn't have a translation for that character.
If you want to get rid of the debug output altogether, you can go back with my original post from way earlier in the day, and just use \xc9 rather than the É:
elif menu=="2":
    # ask for the link
    olinks=raw_input("Entrer votre liens ")
    # fetch the png & smil files
    subprocess.call(["php", "files/adn.php" , olinks])
    # decrypt the png file
    subprocess.call(["php", "files/AES.class.php" , "tmp/adn.png"])
    # fetch the name of the anime
    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537)'}
    session = requests.Session()
    response = session.get("http://animedigitalnetwork.fr/video/assassination-classroom/5886-episode-2-lecon-de-base-ball", headers=headers)
    soup = BeautifulSoup(response.content)
    oname = soup.find("title")
    oname_cleanedup = re.sub(r'(.*?\s+-\s+S)aison\s+(\d+)\s+\xc9.*?(\d+)(.*)',
        lambda m: "{title} {season:02d} Ep {episode}".format(title=m.group(1), season=int(m.group(2)), episode=m.group(3)),
        oname.text)
    print(oname_cleanedup)
    ## end
elif menu=="3":
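On the \xc9 point above, the following sketch shows why a literal É in a pattern can fail to match while the escape works. It assumes, as a guess about the asker's setup, that the source file was saved as UTF-8 and the pattern was a plain byte string, so the É arrived in the pattern as two bytes instead of one character. The variable names are illustrative only.

```python
# -*- coding: utf-8 -*-
import re

title = u"Assassination Classroom - Saison 1 \xc9pisode 2 - VOSTFR"

# É is a single character (code point U+00C9), but UTF-8 stores it as two bytes.
e_acute = u"\xc9"
utf8_bytes = e_acute.encode("utf-8")

# Reading those two bytes as two separate characters gives a sequence
# that does not occur anywhere in the title.
mangled = utf8_bytes.decode("latin-1")

# The escape matches the real character; the two-character sequence does not.
found_with_escape = re.search(u"\xc9pisode", title) is not None
found_with_mangled = re.search(mangled + u"pisode", title) is not None
print(found_with_escape, found_with_mangled)
```

Writing the character as an escape sidesteps the question of how the editor saved the file, which is why it is the safer choice here.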
ASKER
Argh :(
I have some difficulty with indentation. Do you have a tool to help me?
Thank you for your post.
You might want to try using a Python IDE. PyCharm is a good choice (https://www.jetbrains.com/pycharm/). There's a free community edition and on occasion they've offered the free full version to students (https://www.jetbrains.com/student/).
ASKER
Hmm, I have a problem.
The modification works, there is just one more bug.
How do I reset the variable at the end of the script?
The information is kept until the end of the script, and when I start the process again it keeps this information and gives the wrong result.
And on another link the result is
Naruto Shippuden - Épisode 392 - VOSTFR
when it is supposed to be
Naruto Shippuden - Ep 392
I'm not understanding what you mean with your first question.
As for the second question, you should be able to use the following:
elif menu=="2":
    # ask for the link
    olinks=raw_input("Entrer votre liens ")
    # fetch the png & smil files
    subprocess.call(["php", "files/adn.php" , olinks])
    # decrypt the png file
    subprocess.call(["php", "files/AES.class.php" , "tmp/adn.png"])
    # fetch the name of the anime
    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537)'}
    session = requests.Session()
    response = session.get("http://animedigitalnetwork.fr/video/assassination-classroom/5886-episode-2-lecon-de-base-ball", headers=headers)
    soup = BeautifulSoup(response.content)
    oname = soup.find("title")
    if oname.text.find('Saison') >= 0:
        regexp = r'(.*?\s+-\s+S)aison\s+(\d+)\s+\xc9.*?(\d+)(.*)'
        subst = "{title} {season:02d} Ep {episode}"
    else:
        regexp = r'(.*?\s+-)(\s+)\xc9.*?(\d+)(.*)'
        subst = "{title} Ep {episode}"
    oname_cleanedup = re.sub(regexp,
        lambda m: subst.format(title=m.group(1), season=int(m.group(2)), episode=m.group(3)),
        oname.text)
    print(oname_cleanedup)
ASKER
error
Traceback (most recent call last):
File "start.py", line 41, in <module>
oname.text)
File "/usr/lib64/python2.7/re.py", line 151, in sub
return _compile(pattern, flags).sub(repl, string, count)
File "start.py", line 40, in <lambda>
lambda m: subst.format(title=m.group(1), season=int(m.group(2)), episode=m.group(3)),
ValueError: invalid literal for int() with base 10: ''
ASKER
For the first question:
My menu is a loop. When the script ends it displays the choice menu again, and if I use the same option I get the same title name.
Example:
title 1: naruto
title 2: bleach
Run the script with the naruto link
Result: naruto Ep 01
Then back to the menu, choose the same option with the bleach link
Result: naruto Ep 01
For the ValueError, change the lambda line to:
lambda m: subst.format(title=m.group(1), season=int(m.group(2)) if m.group(2).find(" ")==-1 else "", episode=m.group(3)),
And it shouldn't do that the second time through... The oname_cleanedup variable is set based on the oname.text variable which should be reset by your request.
So, I'd need to see the real code where you're making your request. Because this line:
response = session.get("http://animedigitalnetwork.fr/video/assassination-classroom/5886-episode-2-lecon-de-base-ball", headers=headers)
Can't be your real code because that request is hardcoded and would always return that assassination classroom episode 2 page.
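The two points above (the season being optional, and the name being recomputed from whatever title the latest request returned) can be sketched as one small helper. This is only one possible approach, not the code from the pastebin: the function name clean_title, the named groups and the exact pattern are all assumptions made for illustration, based on the two title formats quoted in this thread.

```python
# -*- coding: utf-8 -*-
import re

# "Saison N" is optional; \xc9 is the regex escape for the É in "Épisode".
TITLE_RE = re.compile(
    r"^(?P<title>.*?)\s+-\s+"
    r"(?:Saison\s+(?P<season>\d+)\s+)?"
    r"\xc9pisode\s+(?P<episode>\d+)"
)


def clean_title(text):
    """Return a shortened name, or text unchanged if it does not match."""
    m = TITLE_RE.search(text)
    if m is None:
        return text
    if m.group("season"):
        # Season present: zero-pad it to two digits.
        return u"{0} - S {1:02d} Ep {2}".format(
            m.group("title"), int(m.group("season")), m.group("episode"))
    # No season in the title: skip the season part entirely.
    return u"{0} - Ep {1}".format(m.group("title"), m.group("episode"))


print(clean_title(u"Assassination Classroom - Saison 1 \xc9pisode 2 - VOSTFR"))
print(clean_title(u"Naruto Shippuden - \xc9pisode 392 - VOSTFR"))
```

Because the function only depends on its argument, calling it with the title fetched on each pass through the menu loop (for example clean_title(oname.text) after requesting olinks) cannot carry a name over from a previous pass.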
ASKER
Yes, I replaced this with olinks :)
Error:
File "/home/gaaara/adn/test2.py", line 21
oname.text)
^
SyntaxError: invalid syntax
Based on the error it looks like your line 21 is probably just:
oname.text)
I'm guessing you put it on a line by itself and it's supposed to be up on line 20.
And I'm sorry but I think I'm done with this question. It's eaten up way way too much of my time. I think you might want to learn a little bit more about python syntax. You need some basic python knowledge here that I'm guessing might be missing-- indentation, where you can and can't wrap lines. Sorry but I gotta give this question up.
ASKER
OK, thank you for your help :) ^^ It was really appreciated.
Looked at it again and I'm guessing what you did is that you got rid of the ending comma "," on line 20. Anyway... now I really do have to give this one up and good luck to you with your program.
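As a small illustration of the wrapping rule mentioned above: inside parentheses a call may be split over several lines, but every argument still needs its separating comma. The sample text and pattern below are made up for the example.

```python
import re

text = "Show - Saison 1"

# The call spans three lines. The commas at the end of the first two lines
# are what separate the three arguments; the line breaks themselves do not.
result = re.sub(r"Saison\s+(\d+)",
                lambda m: "S {0:02d}".format(int(m.group(1))),
                text)
print(result)
```

If the comma after the lambda line is removed, Python reports a SyntaxError, and it typically points at the following line rather than at the line where the comma is missing.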
ASKER
it works :)
ASKER
my code
http://pastebin.com/t5r5tFgN
error