Solved

How to download raw POST data (JSON) with Python

Posted on 2015-01-02
25
242 Views
Last Modified: 2015-01-12
Hello,
I would like to download playlist JSON files. How can I download these files?

1. Load the URL www.somsite.com/playliste/1
2. Download the raw JSON from http://somesite/index.php?option=com_play&view=playlist&format=raw

Thank you.
0
Comment
Question by:Gaaara
25 Comments
 
LVL 45

Expert Comment

by:aikimark
ID: 40529221
Have you used urllib, mechanize, or requests?
0
 

Author Comment

by:Gaaara
ID: 40529822
I want to know how to get the files. I am a beginner, so I wanted to know what to use, and I would also like some documentation with a working demo.
0
 
LVL 45

Expert Comment

by:aikimark
ID: 40529865
Is there a specific website you are opening?
0

 

Author Comment

by:Gaaara
ID: 40530824
No, I just want to fetch the playlists with a cron job and get an email for every new link on 2-3 websites, so that I miss nothing.
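A minimal sketch of the "report only new links" part of such a cron job, written for Python 3 with the standard library only. The function name `find_new_links` and the state file are hypothetical choices for illustration; fetching the links and sending the email are left out.

```python
import json
from pathlib import Path


def find_new_links(current_links, state_file):
    """Return links not seen on earlier runs and remember them for next time."""
    state_path = Path(state_file)
    # Load the links recorded by previous runs (empty set on the first run).
    if state_path.exists():
        seen = set(json.loads(state_path.read_text()))
    else:
        seen = set()
    # Keep the page order and skip anything already reported or repeated.
    new_links = []
    for link in current_links:
        if link not in seen:
            seen.add(link)
            new_links.append(link)
    # Persist the updated set so the next run only reports later additions.
    state_path.write_text(json.dumps(sorted(seen)))
    return new_links
```

Each cron run would pass the links it just scraped. Only the returned list needs to go into the email, which could be sent with the standard `smtplib` module.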
0
 
LVL 45

Expert Comment

by:aikimark
ID: 40530825
Please post an actual working URL that returns a JSON file.
0
 
LVL 45

Expert Comment

by:aikimark
ID: 40530867
I'm getting a 404 on the link that is supposed to return JSON
0
 

Author Comment

by:Gaaara
ID: 40530902
The JSON link is requested while the page loads, with a free or premium account, and it uses a cookie.

Example with curl, copied from Firebug:
curl 'http://animedigitalnetwork.fr/index.php?option=com_vodvideo&view=playlist&format=raw' -H 'Host: animedigitalnetwork.fr' -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:34.0) Gecko/20100101 Firefox/34.0' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'Accept-Language: fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3' -H 'Accept-Encoding: gzip, deflate' -H 'X-Requested-With: XMLHttpRequest' -H 'Content-Type: application/x-www-form-urlencoded; charset=UTF-8' -H 'Referer: http://animedigitalnetwork.fr/video/naruto-shippuden' -H 'Cookie: _ga=GA1.2.2143731636.1420256453; 18acd9b63ecbf50de0b8c010c2b7289f=m9sbtbulg7mciu7b2f1gpt3337; _gat=1' --data 'playlist=265&season=&order=DESC'


The script needs to collect all the cookies and then get the file.

I don't get a 404 with any of the links?
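One way the curl command above could be translated to Python 3 using only the standard library is sketched below. The function names are hypothetical. The sketch assumes that the cookies the site needs are set when the show page is first loaded, and it does not log in or send the browser's User-Agent header, which the site may or may not require.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar

PLAYLIST_URL = ("http://animedigitalnetwork.fr/index.php"
                "?option=com_vodvideo&view=playlist&format=raw")


def build_playlist_request(playlist_id, referer):
    """Build (but do not send) the POST request that the curl command describes."""
    # Equivalent of curl's --data 'playlist=265&season=&order=DESC'.
    form = urllib.parse.urlencode(
        {"playlist": playlist_id, "season": "", "order": "DESC"}
    ).encode("ascii")
    headers = {
        "X-Requested-With": "XMLHttpRequest",
        "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
        "Referer": referer,
    }
    # Passing data makes urllib issue a POST instead of a GET.
    return urllib.request.Request(PLAYLIST_URL, data=form, headers=headers)


def fetch_playlist(playlist_id, page_url):
    """Visit the page first so the cookie jar fills, then post the form."""
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar())
    )
    opener.open(page_url).close()  # cookies set here are reused below
    with opener.open(build_playlist_request(playlist_id, page_url)) as response:
        return response.read()
```

Note that the playlist number travels in the POST body, not in the URL's query string, which matches the `--data` option in the curl command.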
0
 
LVL 45

Expert Comment

by:aikimark
ID: 40530919
You have signed up for an account with that site, which is why your URLs return results. Log out (sign off) and retry your URL to see what I'm seeing.
0
 

Author Comment

by:Gaaara
ID: 40530928
It works for me :) I don't get a 404 error. What do you want to know about the site?
0
 
LVL 45

Expert Comment

by:aikimark
ID: 40530965
I like to test code prior to posting it.
0
 

Author Comment

by:Gaaara
ID: 40530986
If that is a problem, post the code and I will test it :)
0
 
LVL 45

Expert Comment

by:aikimark
ID: 40532446
Let me rephrase my prior comment.  I don't feel comfortable posting untested code.  Maybe one of the other Python experts will feel better about your testing proposal.
0
 

Author Comment

by:Gaaara
ID: 40532863
Hmm, test with my test account for this site:

user    testpseudo
pass   KytmEgsdKQa9

Do you have a VPN?
0
 
LVL 45

Expert Comment

by:aikimark
ID: 40533602
I was able to sign in, but your second link doesn't return anything.  I added &playlist=265 and didn't see any JSON.
0
 

Author Comment

by:Gaaara
ID: 40537189
I've requested that this question be deleted for the following reason:

...
0
 
LVL 45

Expert Comment

by:aikimark
ID: 40536505
Looks like (rendered) HTML to me
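A quick way to check what a response body actually contains is to try parsing it as JSON and fall back if that fails. This is a minimal Python 3 sketch; `classify_body` is a hypothetical helper, and the "starts with <" test is only a rough heuristic for HTML.

```python
import json


def classify_body(text):
    """Return ("json", parsed) if text parses as JSON, else ("html" or "other", text)."""
    try:
        return "json", json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        pass
    # Rough heuristic: markup normally starts with an opening angle bracket.
    kind = "html" if text.lstrip().startswith("<") else "other"
    return kind, text
```

If the result is "html", an HTML parser such as BeautifulSoup is the right tool for the response, not a JSON decoder.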
0
 

Author Comment

by:Gaaara
ID: 40536602
Is it possible to parse this part with BeautifulSoup?

	

    import urllib2
    from BeautifulSoup import BeautifulSoup

    page = urllib2.urlopen('http://animedigitalnetwork.fr/video/naruto-shippuden')
    soup = BeautifulSoup(page)
    # find() returns only the first matching <div>
    first_div = soup.find('div', {"class": "adn-video"})
    print(first_div)



I got this result:

<div class="adn-video"> <div class="adn-video_screenshot">
<img src="http://image.animedigitalnetwork.fr/license/claymore/tv/web/eps1_328x184.jpg" alt="Claymore 1" /><span class="adn_video_play-button"></span> </div><div class="adn-video_text"><div class="adn-video_title">
<h4>Claymore</h4><span>Épisode 1</span><div class="adn-rating mobile-hide" itemprop="aggregateRating" itemscope="itemscope" itemtype="http://schema.org/AggregateRating"><meta itemprop="ratingValue" content="4.6667" /><meta itemprop="ratingCount" content="10" /><div id="adn-rating"><ul class="adn-rating_empty"><li>&#xe002;</li><li>&#xe002;</li><li>&#xe002;</li><li>&#xe002;</li><li>&#xe002;</li></ul><ul class="adn-rating_rating"><li>&#xe002;</li><li>&#xe002;</li><li>&#xe002;</li><li>&#xe002;</li><li>&#xe002;</li></ul></div><p class="adn-rating-message"></p></div></div><div class="adn-video_link">
<a title="Claymore 1" href="/video/claymore/1849-episode-1-la-claymore">Voir la vidéo</a>
</div></div></div>



This is not Naruto!?
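BeautifulSoup's `find` returns only the first matching element, which may explain the result if a Claymore block happens to come first in the page. To see every block, one option is to collect the links of all `div.adn-video` elements. Below is a minimal Python 3 sketch using only the standard library; the class name, the function name and the second link in the sample HTML are made up for illustration.

```python
from html.parser import HTMLParser


class VideoLinkParser(HTMLParser):
    """Collect the href of every <a> found inside a <div class="adn-video">."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # <div> nesting depth inside the current block (0 = outside)
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            classes = (attrs.get("class") or "").split()
            if self.depth > 0:
                self.depth += 1          # nested <div> inside a video block
            elif "adn-video" in classes:
                self.depth = 1           # entering a video block
        elif tag == "a" and self.depth > 0 and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1


def video_links(html):
    """Return the links of all video blocks in html, in page order."""
    parser = VideoLinkParser()
    parser.feed(html)
    return parser.links


# Sample input: the first link is from the output above, the second is invented.
SAMPLE_HTML = """
<div class="adn-video"><div class="adn-video_link">
<a href="/video/claymore/1849-episode-1-la-claymore">Voir</a></div></div>
<div class="adn-video"><div class="adn-video_link">
<a href="/video/example-show/0000-example-episode">Voir</a></div></div>
<a href="/outside">not in a video block</a>
"""
print(video_links(SAMPLE_HTML))
```

With BeautifulSoup itself, `find_all` (or `select`) plays the same role as this parser.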
0
 
LVL 45

Assisted Solution

by:aikimark
aikimark earned 500 total points
ID: 40537116
There are several ways to parse data out of HTML, and BeautifulSoup is certainly one of them. What you posted in your latest comment is HTML, not JSON.


Try this URL:
http://animedigitalnetwork.fr/video/naruto-shippuden

Then do a view source or inspect element to see the HTML.
0
 

Accepted Solution

by:
Gaaara earned 0 total points
ID: 40537186
It is resolved:

from bs4 import BeautifulSoup
import requests

headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}

# Start a session so cookies from the first request are sent with the second
session = requests.Session()

# Load the show page and read the playlist id from the first link that has one
response = session.get('http://animedigitalnetwork.fr/video/naruto-shippuden', headers=headers)
soup = BeautifulSoup(response.content, 'html.parser')
playlist = soup.find('a', {'data-playlist': True})['data-playlist']

# Post the playlist id to get the list of videos (parsed below as HTML)
url = 'http://animedigitalnetwork.fr/index.php?option=com_vodvideo&view=playlist&format=raw'
response = session.post(url, data={
    'playlist': playlist,
    'season': '',
    'order': 'DESC'
}, headers=headers)

# Print the link of each video block in the response
soup = BeautifulSoup(response.content, 'html.parser')
for video in soup.select('div.adn-video'):
    print(video.a.get('href'))

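In the script above, `soup.find(...)` returns `None` when no link carries a `data-playlist` attribute (for example if the page layout changes), and indexing `None` raises a `TypeError`. A hedged Python 3 sketch of a lookup that reports the missing id explicitly, using only the standard library, is shown below. The names `PlaylistIdParser` and `extract_playlist_id` are hypothetical. Unlike the script above, this version also skips empty attribute values.

```python
from html.parser import HTMLParser


class PlaylistIdParser(HTMLParser):
    """Remember the data-playlist value of the first <a> tag that carries one."""

    def __init__(self):
        super().__init__()
        self.playlist_id = None

    def handle_starttag(self, tag, attrs):
        if tag == "a" and self.playlist_id is None:
            value = dict(attrs).get("data-playlist")
            if value:  # ignore missing or empty attribute values
                self.playlist_id = value


def extract_playlist_id(html):
    """Return the playlist id found in html, or None if the page has none."""
    parser = PlaylistIdParser()
    parser.feed(html)
    return parser.playlist_id
```

A caller can then check for `None` and skip the POST request, or log a warning, so a scheduled job does not stop on an unexpected page.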

0
 

Author Comment

by:Gaaara
ID: 40537190
It is OK.
0
 

Author Comment

by:Gaaara
ID: 40537192
I've requested that this question be closed as follows:

Accepted answer: 0 points for Gaaara's comment #a40537186
Assisted answer: 500 points for aikimark's comment #a40537116

for the following reason:

I am giving you the 500 points for your efforts.
0
 
LVL 45

Expert Comment

by:aikimark
ID: 40537797
There is no need to give me 'effort' points.  You can accept your comment as the solution.
0
 

Author Closing Comment

by:Gaaara
ID: 40544014
I am giving you the 500 points for your efforts.
0


Question has a verified solution.
