Solved

Python scraper

Posted on 2014-12-21
Last Modified: 2014-12-27
I am trying to scrape a website for some information. I found a script and tried to convert it to Python, but the converted code still has some errors. Can anyone help me fix them? Thanks

def scrapeEarningsZacks_(Stock=None,*args,**kwargs):
    varargin = cellarray(args)
    nargin = 1-[Stock].count(None)+len(args)

    s=urlread_(char('http://zacks.thestreet.com/CompanyView.php'),char('post'),[char('ticker'),Stock])
    try:
        etst=strfind_(s,char('Surprise%</strong></div></td>'))
    finally:
        pass
    etend=strfind_(s[etst:end()],char(' </table>'))
    et=s[etst:etst + etend]
    rowend=strfind_(et,char('</tr>'))
    earnings=cell_(length_(rowend) - 2,6)
    for i in arange_(1,(length_(rowend) - 1)).reshape(-1):
        if i == length_(rowend):
            row=et[rowend[i]:end()]
        else:
            row=et[rowend[i]:rowend[i + 1]]
        dst=strfind_(row,char('<td>'))
        for j in arange_(1,6).reshape(-1):
            if j == 6:
                a=row[dst[j]:end() - 23]
            else:
                a=row[dst[j]:dst[j + 1]]
            earnings[i,j]=a[5:(end() - 38)]
    emptyCells=cellfun_(isempty,earnings)
    row,col=find_(emptyCells,nargout=2)
    earnings[row,:]=[]
    return earnings

print scrapeEarningsZacks_(AAPL)

Question by:earngreen
LVL 46

Expert Comment

by:aikimark
ID: 40512319
Have you tried passing a string into the function?
print scrapeEarningsZacks_("AAPL")
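The difference matters because a bare AAPL is treated as a variable name, not as text. A minimal sketch of what happens, written for Python 3 and using a hypothetical stand-in function (scrape_earnings is not the poster's code, it only echoes its argument):

```python
# Hypothetical stand-in for the scraper; it only echoes its argument.
def scrape_earnings(stock):
    return "ticker=" + stock

# A quoted ticker is a string literal and is passed through as text.
print(scrape_earnings("AAPL"))

# A bare AAPL is looked up as a variable name, which is not defined here,
# so Python raises NameError before the function is even called.
try:
    scrape_earnings(AAPL)
except NameError as exc:
    error_message = str(exc)
    print("NameError:", error_message)
```

If the original script fails with a NameError on the last line, this is the likely cause; the other errors in the converted code are separate.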

 
LVL 84

Expert Comment

by:Dave Baldwin
ID: 40512322
It looks like all the data on that page is loaded through JavaScript. Your code will not run the JavaScript to get the data, so it is unlikely that you will be able to scrape that page. In particular, the input for selecting a stock is handled with JavaScript. It is not something you can 'post' to and get a result. This is that code:

<input  type="text" name="search_company"  id="search_company" value="Enter company name" size=18 onFocus="JavaScript:this.value=''" onBlur="JavaScript:Fill_Lookup()" onkeyup="get_ticker_info();">
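One rough way to check this kind of concern is to download the raw page, which never runs any JavaScript, and see whether a piece of text you expect is already in it. The sketch below is only an illustration: fetch_html and marker_in_static_html are hypothetical helper names, the two sample strings are made up, and the 'Surprise%' marker is borrowed from the question's code.

```python
import urllib.request

def fetch_html(url, timeout=10):
    """Download a page and return its body as text. No JavaScript is run."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")

def marker_in_static_html(html, marker):
    """Return True if the marker text is already present in the raw HTML."""
    return marker in html

# Made-up samples: one page with the table in the HTML, one filled in by script.
static_sample = "<table><tr><td><strong>Surprise%</strong></td></tr></table>"
script_sample = "<div id='earnings'></div><script>loadEarnings()</script>"

print(marker_in_static_html(static_sample, "Surprise%"))
print(marker_in_static_html(script_sample, "Surprise%"))

# Against a live page (not executed here):
# html = fetch_html("http://zacks.thestreet.com/CompanyView.php?ticker=AAPL")
# print(marker_in_static_html(html, "Surprise%"))
```

If the marker is found in the raw download, the data can be scraped without a browser; if not, it is probably filled in by script.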
 
LVL 46

Expert Comment

by:aikimark
ID: 40512591
Are you running this code on Windows or Linux?

Author Comment

by:earngreen
ID: 40513143
This is Linux
 
LVL 46

Expert Comment

by:aikimark
ID: 40513222
What libraries have you imported?
 
LVL 25

Accepted Solution

by:
clockwatcher earned 2000 total points
ID: 40514416
It would probably be easier if you just told us what you're hoping to return, rather than having us fix whatever is going on in the code you have there.

From the sample URL

   http://zacks.thestreet.com/CompanyView.php?ticker=AAPL

What would you like your scrapeEarnings to return? The entire table? Here's a Python 3 example of parsing that into Python objects using Beautiful Soup:

from bs4 import BeautifulSoup
import urllib.request

class Earning(object):
    """One row of the earnings table."""
    def __init__(self, table_row):
        # Unpacking assumes the row has exactly six <td> cells.
        (self.date,
         self.period_ending,
         self.estimate,
         self.reported,
         self.surprise,
         self.surprise_percent) = [i.text for i in table_row("td")]

    def __str__(self):
        return "\t".join((self.date, self.period_ending, self.estimate,
                         self.reported, self.surprise, self.surprise_percent))

class Earnings(object):
    """All earnings rows found on a company page."""
    def __init__(self, soup):
        self.soup = soup
        # The earnings table is the second <table> inside the divPrint element.
        self.earnings_table = soup.find(id="divPrint")("table")[1]
        # Skip the first row, which holds the column headers.
        self.earnings_rows = self.earnings_table("tr")[1:]
        self.earnings = [Earning(e) for e in self.earnings_rows]

    def __str__(self):
        return "\n".join([str(e) for e in self.earnings])

def getEarningsForTicker(ticker):
    url = "http://zacks.thestreet.com/CompanyView.php?ticker={0}".format(ticker)
    # Name the parser explicitly; otherwise bs4 guesses one and warns.
    return Earnings(BeautifulSoup(urllib.request.urlopen(url), "html.parser"))

def main():
    print(getEarningsForTicker('AAPL'))

if __name__ == '__main__':
    main()
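To try the row-unpacking idea without hitting the site or installing Beautiful Soup, here is a rough sketch that swaps in the standard library's html.parser instead. It is not the approach used above, only an offline illustration: TableRowParser is a hypothetical helper, and the sample table, including its column headers and values, is made up.

```python
from html.parser import HTMLParser

class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:   # rows with only <th> cells stay empty and are skipped
                self.rows.append(self._row)
            self._row = None
            self._cell = None

# Made-up sample data, not real earnings figures.
SAMPLE = """
<table>
  <tr><th>Date</th><th>Period Ending</th><th>Estimate</th>
      <th>Reported</th><th>Surprise</th><th>Surprise%</th></tr>
  <tr><td>01/01/2000</td><td>12/1999</td><td>1.00</td>
      <td>1.10</td><td>0.10</td><td>10.00%</td></tr>
</table>
"""

parser = TableRowParser()
parser.feed(SAMPLE)
parser.close()

for row in parser.rows:
    # Same six-field unpacking as the Earning class above.
    date, period_ending, estimate, reported, surprise, surprise_percent = row
    print("\t".join(row))
```

Beautiful Soup is still the more forgiving choice for real pages with messy markup; the standard-library parser is shown only because it needs no extra install.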

 

Author Comment

by:earngreen
ID: 40520120
clockwatcher that worked out great. thx