Solved

Macports Import beautifulsoup4 Problem

Posted on 2016-07-30
Last Modified: 2016-08-01
The screenshot below shows my list of MacPorts installs, and beautifulsoup4 is one of them. But when I try to import it, I get an error. Please help.

https://gyazo.com/3415f2e1e28c095cf2c225ab7dcdbc79

Thanks,
Question by:sharingsunshine
6 Comments
 
LVL 40

Expert Comment

by:Eoin OSullivan
ID: 41736249
Check which modules are installed in Python and what they are called.
In the Python shell, type:
help('modules')
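
As a possible complement to help('modules'), the short sketch below reports which interpreter is running and whether it can import either package name. It uses only the standard library. The helper name `importable` is made up for this illustration; `bs4` and `BeautifulSoup` are the import names of Beautiful Soup 4 and Beautiful Soup 3 respectively.

```python
import importlib
import sys


def importable(names):
    """Map each module name to True if this interpreter can import it."""
    found = {}
    for name in names:
        try:
            importlib.import_module(name)
            found[name] = True
        except ImportError:
            found[name] = False
    return found


# Which Python is actually running this script?
print(sys.executable)
# Which of the two Beautiful Soup import names does it know about?
print(importable(["bs4", "BeautifulSoup"]))
```

If the interpreter path printed is not the one the port was installed for, the package may be present on the machine but invisible to the Python that runs the script.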



I think the Beautiful Soup version in MacPorts is v3:
https://trac.macports.org/browser/trunk/dports/python/py-beautifulsoup/Portfile

In that case, the import should probably be:
import BeautifulSoup

 

Author Comment

by:sharingsunshine
ID: 41736884
Here is what I see:
https://gyazo.com/f57bd11a6165e5991a7f9e66a841fcde

When I do this:
import BeautifulSoup

I get this:

>>> 
============== RESTART: /Users/rjw/Documents/Python/test_web.py ==============

Traceback (most recent call last):
  File "/Users/rjw/Documents/Python/test_web.py", line 25, in <module>
    (html, newText) = getText(url)
  File "/Users/rjw/Documents/Python/test_web.py", line 12, in getText
    html = BeautifulSoup(htmlString, 'html5lib')
TypeError: 'module' object is not callable
>>> 

Here is the code:
from requests import get
import BeautifulSoup
from re import compile, search
url = 'http://sethgodin.typepad.com'
nextPageNumRE = compile(r'page/(\d+?)/')
nextPageNum = '1'
maxPage = 4
totText = []

def getText(url):
    htmlString = get(url).text
    html = BeautifulSoup(htmlString, 'html5lib')
    tags = html.find_all('div', {'class':'entry-body'})
    text = [e.get_text() for e in tags]
    return (html, text)

def getPage(html, regex):
    nextPageTag = html.find('span', {'class':'pager-right'})
    nextPageATag = nextPageTag.find_next('a')
    nextPageURL = nextPageATag.attrs['href']
    nextPageNum = regex.search(nextPageURL).group(1)
    return (nextPageURL, nextPageNum)

while int(nextPageNum) <= maxPage:
    (html, newText) = getText(url)
    totText = totText + newText
    print (str(len(totText))) + ' posts were found'
    (url, nextPageNum) = getPage(html, nextPageNumRE)

 
LVL 40

Accepted Solution

by:
Eoin OSullivan earned 2000 total points
ID: 41737355
Try changing the import line to
from BeautifulSoup import BeautifulSoup
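
The "'module' object is not callable" error comes from calling a module as if it were the class of the same name. A minimal sketch of the same situation, using the standard library's `datetime` as a stand-in (it is not part of the original script, but it also has a module and a class that share one name):

```python
import datetime  # binds the name `datetime` to the module

module_error = None
try:
    datetime(2016, 7, 30)  # calling the module itself raises TypeError
except TypeError as exc:
    module_error = str(exc)

from datetime import datetime  # rebinds the name to the class inside the module

posted = datetime(2016, 7, 30)  # calling the class works

print(module_error)
print(posted.year)
```

`import BeautifulSoup` followed by `BeautifulSoup(...)` fails the same way as the first call; `from BeautifulSoup import BeautifulSoup` corresponds to the second import.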


 

Author Comment

by:sharingsunshine
ID: 41737379
Here is the output from that change:
============== RESTART: /Users/rjw/Documents/Python/test_web.py ==============

Traceback (most recent call last):
  File "/Users/rjw/Documents/Python/test_web.py", line 26, in <module>
    (html, newText) = getText(url)
  File "/Users/rjw/Documents/Python/test_web.py", line 13, in getText
    html = BeautifulSoup(htmlString, 'html5lib')
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/BeautifulSoup.py", line 1522, in __init__
    BeautifulStoneSoup.__init__(self, *args, **kwargs)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/BeautifulSoup.py", line 1147, in __init__
    self._feed(isHTML=isHTML)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/BeautifulSoup.py", line 1189, in _feed
    SGMLParser.feed(self, markup)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/sgmllib.py", line 104, in feed
    self.goahead(0)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/sgmllib.py", line 174, in goahead
    k = self.parse_declaration(i)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/BeautifulSoup.py", line 1463, in parse_declaration
    j = SGMLParser.parse_declaration(self, i)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/markupbase.py", line 109, in parse_declaration
    self.handle_decl(data)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/BeautifulSoup.py", line 1448, in handle_decl
    self._toStringSubclass(data, Declaration)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/BeautifulSoup.py", line 1381, in _toStringSubclass
    self.endData(subclass)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/BeautifulSoup.py", line 1251, in endData
    (not self.parseOnlyThese.text or \
AttributeError: 'str' object has no attribute 'text'



Here is the changed code:
from requests import get
# import BeautifulSoup
from BeautifulSoup import BeautifulSoup
from re import compile, search
url = 'http://sethgodin.typepad.com'
nextPageNumRE = compile(r'page/(\d+?)/')
nextPageNum = '1'
maxPage = 4
totText = []

def getText(url):
    htmlString = get(url).text
    html = BeautifulSoup(htmlString, 'html5lib')
    tags = html.find_all('div', {'class':'entry-body'})
    text = [e.get_text() for e in tags]
    return (html, text)

def getPage(html, regex):
    nextPageTag = html.find('span', {'class':'pager-right'})
    nextPageATag = nextPageTag.find_next('a')
    nextPageURL = nextPageATag.attrs['href']
    nextPageNum = regex.search(nextPageURL).group(1)
    return (nextPageURL, nextPageNum)

while int(nextPageNum) <= maxPage:
    (html, newText) = getText(url)
    totText = totText + newText
    print (str(len(totText))) + ' posts were found'
    (url, nextPageNum) = getPage(html, nextPageNumRE)



This may be past my original issue, but I can't tell because the traceback references a line number that is outside the range of my code. If it is, say the word and I will award you the points, since you got me past the BeautifulSoup hurdle.
 
LVL 40

Expert Comment

by:Eoin OSullivan
ID: 41737403
Your code is now successfully calling the BeautifulSoup module. The error now lies inside the BeautifulSoup code itself, which is why the line numbers refer to that module (BeautifulSoup.py, line 1251).

  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/BeautifulSoup.py", line 1251, in endData
    (not self.parseOnlyThese.text or \
AttributeError: 'str' object has no attribute 'text'
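
To see why a traceback can report line numbers that do not exist in your own script, here is a rough sketch that lists the file each traceback frame belongs to. It uses the standard library's `json` module as a stand-in for BeautifulSoup, and the helper name `frame_files` is hypothetical.

```python
import json
import sys
import traceback


def frame_files(tb):
    """Return the file name of each traceback frame, outermost call first."""
    return [entry[0] for entry in traceback.extract_tb(tb)]


files = []
try:
    json.loads("{not valid json")  # the failure happens inside the json package
except ValueError:
    files = frame_files(sys.exc_info()[2])

# The first entry is the calling script; the last is a file inside the library.
for name in files:
    print(name)
```

Each line number in a traceback is relative to the file named on the same line, so "line 1251" here refers to BeautifulSoup.py, not test_web.py.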

I'm afraid I'm not in a position to debug this, as I'm not running Beautiful Soup on my Mac and would need to in order to replicate it. It could well be that the code is written for beautifulsoup4 but MacPorts is using beautifulsoup3.
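
The traceback seems consistent with that guess: the failing line reads `self.parseOnlyThese.text`, which suggests the string 'html5lib' was taken as Beautiful Soup 3's `parseOnlyThese` argument instead of a parser name. A minimal sketch of the same parsing step written against Beautiful Soup 4 follows, assuming the `bs4` package is installed for the interpreter in use. The sample HTML and the helper name `entry_texts` are invented for illustration, and `html.parser` is used in place of `html5lib` only so the sketch needs no extra parser package.

```python
try:
    from bs4 import BeautifulSoup  # Beautiful Soup 4 import name
except ImportError:
    BeautifulSoup = None  # bs4 is not installed for this interpreter

# Made-up markup standing in for the downloaded page.
SAMPLE_HTML = (
    '<div class="entry-body">First post</div>'
    '<div class="other">skip</div>'
    '<div class="entry-body">Second post</div>'
)


def entry_texts(html_string, parser="html.parser"):
    """Return the text of every <div class="entry-body"> in html_string."""
    if BeautifulSoup is None:
        raise RuntimeError("bs4 is not importable from this Python")
    # In Beautiful Soup 4 the second argument names the parser to use.
    soup = BeautifulSoup(html_string, parser)
    return [tag.get_text() for tag in soup.find_all("div", {"class": "entry-body"})]


if BeautifulSoup is not None:
    print(entry_texts(SAMPLE_HTML))
```

The `find_all` and `get_text` calls in the original script are Beautiful Soup 4 method names, which is another hint that the script was written for version 4.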
 

Author Closing Comment

by:sharingsunshine
ID: 41737459
Thanks for getting me this far and letting me know where to look next.