earngreen
asked on
Python scraper
I am trying to scrape a website for some information. I found a script and tried to convert it to Python, but the conversion still has some errors. Can anyone help with the errors? Thanks.
def scrapeEarningsZacks_(Stock=None,*args,**kwargs):
    varargin = cellarray(args)
    nargin = 1-[Stock].count(None)+len(args)
    s=urlread_(char('http://zacks.thestreet.com/CompanyView.php'),char('post'),[char('ticker'),Stock])
    try:
        etst=strfind_(s,char('Surprise%</strong></div></td>'))
    finally:
        pass
    etend=strfind_(s[etst:end()],char(' </table>'))
    et=s[etst:etst + etend]
    rowend=strfind_(et,char('</tr>'))
    earnings=cell_(length_(rowend) - 2,6)
    for i in arange_(1,(length_(rowend) - 1)).reshape(-1):
        if i == length_(rowend):
            row=et[rowend[i]:end()]
        else:
            row=et[rowend[i]:rowend[i + 1]]
        dst=strfind_(row,char('<td>'))
        for j in arange_(1,6).reshape(-1):
            if j == 6:
                a=row[dst[j]:end() - 23]
            else:
                a=row[dst[j]:dst[j + 1]]
            earnings[i,j]=a[5:(end() - 38)]
    emptyCells=cellfun_(isempty,earnings)
    row,col=find_(emptyCells,nargout=2)
    earnings[row,:]=[]
    return earnings
print scrapeEarningsZacks_(AAPL)
It looks like all the data on that page is loaded through JavaScript. Your code does not run the JavaScript that fetches the data, so it is unlikely that you will be able to scrape that page this way. In particular, the input for selecting a stock is handled by JavaScript. It is not something you can 'post' to and get a result. This is that code:
<input type="text" name="search_company" id="search_company" value="Enter company name" size=18 onFocus="JavaScript:this.value=''" onBlur="JavaScript:Fill_Lookup()" onkeyup="get_ticker_info();">
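One way to check this for yourself is to look for the table marker from the original script in the raw HTML you get back. If the marker is not there, the table is being filled in by JavaScript and no amount of string slicing will find it. Below is a minimal Python 3 sketch of that check using only the standard library. The names TableRowParser and extract_rows_after_marker are made up for illustration, and the sample HTML is invented test data, not real output from the site.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collects the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip rows that had no <td> cells
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows_after_marker(html, marker):
    """Return the table rows between `marker` and the next </table>.

    Returns an empty list when the marker is absent, which is what you
    would expect if the page builds the table with JavaScript.
    """
    start = html.find(marker)
    if start == -1:
        return []
    end = html.find("</table>", start)
    fragment = html[start:] if end == -1 else html[start:end]
    parser = TableRowParser()
    parser.feed(fragment)
    parser.close()
    return parser.rows


# Invented sample HTML, only to show the shape of the result.
sample = (
    "<table><tr><td><strong>Period</strong></td>"
    "<td><div><strong>Surprise%</strong></div></td></tr>"
    "<tr><td>Dec 2012</td><td>2.50%</td></tr>"
    "<tr><td>Sep 2012</td><td>-1.20%</td></tr>"
    "</table>"
)
marker = "Surprise%</strong></div></td>"
rows = extract_rows_after_marker(sample, marker)
print(rows)
```

If you feed it the text your request actually returns and get an empty list back, the marker is missing from the static HTML, and you would need a different data source or a tool that runs the page's JavaScript.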
Are you running this code on Windows or Linux?
ASKER
This is on Linux.
What libraries have you imported?
ASKER CERTIFIED SOLUTION
This solution is only available to members.
ASKER
clockwatcher, that worked out great. Thanks.