trevor1940

asked on

Python: Dependency error

I'm attempting to run this Python script.

The scrapy.cfg points to this page, which says to install Scrapy like so:
pip install scrapyd



Which I've done; however, when running the script using the example I get this error:

[screenshot of the error message]
BTW, I've set up Ubuntu 18.04 in VirtualBox; this came with Python preloaded.
Systech Admin

Please try installing it via the command below:

apt-get install scrapyd

Refer to the link below:

https://scrapyd.readthedocs.io/en/stable/install.html
ASKER CERTIFIED SOLUTION
sahil Chopra
trevor1940

ASKER

@sahil Chopra

That doc suggests using a virtualenv. As I'm using VirtualBox, is this still necessary?

So, if I understand correctly, I run both:
sudo apt-get install python-dev python-pip libxml2-dev libxslt1-dev zlib1g-dev libffi-dev libssl-dev
# Then run
pip install scrapy scrapyd


Is this for Python 2 or Python 3? This version of Ubuntu appears to have both installed.
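(From what I can tell, pip --version shows which Python a given pip belongs to, and python3 -m pip targets Python 3 explicitly, assuming the python3-pip package is installed:)

pip --version                           # shows which Python this pip belongs to
python3 -m pip install scrapy scrapyd   # installs explicitly against Python 3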

I'll then be able to run the script like this, from wherever the spider directory is?

scrapy crawl oxford -o oxford.jl

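(My understanding is that scrapy crawl has to be run from within the Scrapy project, i.e. under the directory holding scrapy.cfg; the path below is hypothetical:)

cd ~/projects/oxford_scraper   # hypothetical project root containing scrapy.cfg
scrapy crawl oxford -o oxford.jl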


the "a _init_.py"   has a list or words
once I've got it to work how might i separate this into another file and import it?
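(Something like this sketch is what I have in mind; words.py and WORD_LIST are hypothetical names:)

# words.py -- a separate module holding just the word list
WORD_LIST = [
    "apple",
    "banana",
    "cherry",
]

# then, in the file that needs it (use "from .words import WORD_LIST"
# if words.py lives inside the same package):
from words import WORD_LIST

for word in WORD_LIST:
    print(word)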
Hi,

It's installed in a location that isn't on your execution path yet, i.e. not in your PATH environment variable.
It's probably located somewhere under /usr/local/share/python*, so just add the exact location to your PATH and you're ready to go.
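For example (the directory below is an assumption; substitute wherever scrapy actually landed):

export PATH="$PATH:/usr/local/share/python2.7/bin"   # hypothetical install location
# append the same line to ~/.bashrc to make it permanent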


Cheers
I ran:
sudo apt-get install python-dev python-pip libxml2-dev libxslt1-dev zlib1g-dev libffi-dev libssl-dev

pip install scrapy scrapyd

and got this message:

Segmentation fault (core dumped)

Full text:

pip install scrapy scrapyd
Collecting scrapy
  Using cached https://files.pythonhosted.org/packages/3b/e4/69b87d7827abf03dea2ea984230d50f347b00a7a3897bc93f6ec3dafa494/Scrapy-1.8.0-py2.py3-none-any.whl
Collecting scrapyd
  Using cached https://files.pythonhosted.org/packages/7a/c0/0aaadd16155743b1d0d0b6300286845e5b9871acbde274365c7b4c0a8148/scrapyd-1.2.1-py2.py3-none-any.whl
Collecting pyOpenSSL>=16.2.0 (from scrapy)
  Using cached https://files.pythonhosted.org/packages/9e/de/f8342b68fa9e981d348039954657bdf681b2ab93de27443be51865ffa310/pyOpenSSL-19.1.0-py2.py3-none-any.whl
Collecting protego>=0.1.15 (from scrapy)
Collecting service-identity>=16.0.0 (from scrapy)
  Using cached https://files.pythonhosted.org/packages/e9/7c/2195b890023e098f9618d43ebc337d83c8b38d414326685339eb024db2f6/service_identity-18.1.0-py2.py3-none-any.whl
Collecting cryptography>=2.0 (from scrapy)
  Using cached https://files.pythonhosted.org/packages/e2/67/4597fc5d5de01bb44887844647ab8e73239079dd478c35c52d58a9eb3d45/cryptography-2.8-cp27-cp27mu-manylinux1_x86_64.whl
Collecting Twisted>=16.0.0; python_version == "2.7" (from scrapy)
  Using cached https://files.pythonhosted.org/packages/22/c2/5a30a4ad78af4d3e5df1701ec6a0dd59e1b0213dc2323dbf61b3af342ad5/Twisted-19.10.0-cp27-cp27mu-manylinux1_x86_64.whl
Collecting lxml>=3.5.0 (from scrapy)
  Using cached https://files.pythonhosted.org/packages/25/73/730ad7249847f741c7e622f52b971daa5540f6fb87589bc92c717f5aafba/lxml-4.4.2-cp27-cp27mu-manylinux1_x86_64.whl
Collecting six>=1.10.0 (from scrapy)
  Using cached https://files.pythonhosted.org/packages/65/26/32b8464df2a97e6dd1b656ed26b2c194606c16fe163c695a992b36c11cdf/six-1.13.0-py2.py3-none-any.whl
Collecting parsel>=1.5.0 (from scrapy)
  Using cached https://files.pythonhosted.org/packages/86/c8/fc5a2f9376066905dfcca334da2a25842aedfda142c0424722e7c497798b/parsel-1.5.2-py2.py3-none-any.whl
Collecting cssselect>=0.9.1 (from scrapy)
  Using cached https://files.pythonhosted.org/packages/3b/d4/3b5c17f00cce85b9a1e6f91096e1cc8e8ede2e1be8e96b87ce1ed09e92c5/cssselect-1.1.0-py2.py3-none-any.whl
Collecting w3lib>=1.17.0 (from scrapy)
  Using cached https://files.pythonhosted.org/packages/6a/45/1ba17c50a0bb16bd950c9c2b92ec60d40c8ebda9f3371ae4230c437120b6/w3lib-1.21.0-py2.py3-none-any.whl
Collecting zope.interface>=4.1.3 (from scrapy)
  Using cached https://files.pythonhosted.org/packages/71/4d/cdbbbdebd56fa4e799169abdb388b0a762d1f1cac156192be48c318ccf17/zope.interface-4.7.1-cp27-cp27mu-manylinux1_x86_64.whl
Collecting PyDispatcher>=2.0.5 (from scrapy)
Collecting queuelib>=1.4.2 (from scrapy)
  Using cached https://files.pythonhosted.org/packages/4c/85/ae64e9145f39dd6d14f8af3fa809a270ef3729f3b90b3c0cf5aa242ab0d4/queuelib-1.5.0-py2.py3-none-any.whl
Collecting pyasn1-modules (from service-identity>=16.0.0->scrapy)
  Using cached https://files.pythonhosted.org/packages/52/50/bb4cefca37da63a0c52218ba2cb1b1c36110d84dcbae8aa48cd67c5e95c2/pyasn1_modules-0.2.7-py2.py3-none-any.whl
Collecting pyasn1 (from service-identity>=16.0.0->scrapy)
  Using cached https://files.pythonhosted.org/packages/62/1e/a94a8d635fa3ce4cfc7f506003548d0a2447ae76fd5ca53932970fe3053f/pyasn1-0.4.8-py2.py3-none-any.whl
Collecting ipaddress; python_version < "3.3" (from service-identity>=16.0.0->scrapy)
  Using cached https://files.pythonhosted.org/packages/c2/f8/49697181b1651d8347d24c095ce46c7346c37335ddc7d255833e7cde674d/ipaddress-1.0.23-py2.py3-none-any.whl
Collecting attrs>=16.0.0 (from service-identity>=16.0.0->scrapy)
  Using cached https://files.pythonhosted.org/packages/a2/db/4313ab3be961f7a763066401fb77f7748373b6094076ae2bda2806988af6/attrs-19.3.0-py2.py3-none-any.whl
Collecting enum34; python_version < "3" (from cryptography>=2.0->scrapy)
  Using cached https://files.pythonhosted.org/packages/c5/db/e56e6b4bbac7c4a06de1c50de6fe1ef3810018ae11732a50f15f62c7d050/enum34-1.1.6-py2-none-any.whl
Collecting cffi!=1.11.3,>=1.8 (from cryptography>=2.0->scrapy)
  Using cached https://files.pythonhosted.org/packages/93/5d/c4f950891251e478929036ca07b22f0b10324460c1d0a4434c584481db51/cffi-1.13.2-cp27-cp27mu-manylinux1_x86_64.whl
Collecting PyHamcrest>=1.9.0 (from Twisted>=16.0.0; python_version == "2.7"->scrapy)
  Using cached https://files.pythonhosted.org/packages/9a/d5/d37fd731b7d0e91afcc84577edeccf4638b4f9b82f5ffe2f8b62e2ddc609/PyHamcrest-1.9.0-py2.py3-none-any.whl
Collecting hyperlink>=17.1.1 (from Twisted>=16.0.0; python_version == "2.7"->scrapy)
  Using cached https://files.pythonhosted.org/packages/7f/91/e916ca10a2de1cb7101a9b24da546fb90ee14629e23160086cf3361c4fb8/hyperlink-19.0.0-py2.py3-none-any.whl
Collecting Automat>=0.3.0 (from Twisted>=16.0.0; python_version == "2.7"->scrapy)
  Using cached https://files.pythonhosted.org/packages/e5/11/756922e977bb296a79ccf38e8d45cafee446733157d59bcd751d3aee57f5/Automat-0.8.0-py2.py3-none-any.whl
Collecting incremental>=16.10.1 (from Twisted>=16.0.0; python_version == "2.7"->scrapy)
  Using cached https://files.pythonhosted.org/packages/f5/1d/c98a587dc06e107115cf4a58b49de20b19222c83d75335a192052af4c4b7/incremental-17.5.0-py2.py3-none-any.whl
Collecting constantly>=15.1 (from Twisted>=16.0.0; python_version == "2.7"->scrapy)
  Using cached https://files.pythonhosted.org/packages/b9/65/48c1909d0c0aeae6c10213340ce682db01b48ea900a7d9fce7a7910ff318/constantly-15.1.0-py2.py3-none-any.whl
Collecting functools32; python_version < "3.0" (from parsel>=1.5.0->scrapy)
Collecting setuptools (from zope.interface>=4.1.3->scrapy)
  Using cached https://files.pythonhosted.org/packages/54/28/c45d8b54c1339f9644b87663945e54a8503cfef59cf0f65b3ff5dd17cf64/setuptools-42.0.2-py2.py3-none-any.whl
Collecting pycparser (from cffi!=1.11.3,>=1.8->cryptography>=2.0->scrapy)
Collecting idna>=2.5 (from hyperlink>=17.1.1->Twisted>=16.0.0; python_version == "2.7"->scrapy)
  Using cached https://files.pythonhosted.org/packages/14/2c/cd551d81dbe15200be1cf41cd03869a46fe7226e7450af7a6545bfc474c9/idna-2.8-py2.py3-none-any.whl
Installing collected packages: six, enum34, ipaddress, pycparser, cffi, cryptography, pyOpenSSL, protego, pyasn1, pyasn1-modules, attrs, service-identity, setuptools, PyHamcrest, zope.interface, idna, hyperlink, Automat, incremental, constantly, Twisted, lxml, functools32, w3lib, cssselect, parsel, PyDispatcher, queuelib, scrapy, scrapyd
Successfully installed Automat-0.8.0 PyDispatcher-2.0.5 PyHamcrest-1.9.0 Twisted-19.10.0 attrs-19.3.0 cffi-1.13.2 constantly-15.1.0 cryptography-2.8 cssselect-1.1.0 enum34-1.1.6 functools32-3.2.3.post2 hyperlink-19.0.0 idna-2.8 incremental-17.5.0 ipaddress-1.0.23 lxml-4.4.2 parsel-1.5.2 protego-0.1.15 pyOpenSSL-19.1.0 pyasn1-0.4.8 pyasn1-modules-0.2.7 pycparser-2.19 queuelib-1.5.0 scrapy-1.8.0 scrapyd-1.2.1 service-identity-18.1.0 setuptools-42.0.2 six-1.13.0 w3lib-1.21.0 zope.interface-4.7.1
Segmentation fault (core dumped)


This segmentation fault is because of memory. I think you ran out of memory.

The VirtualBox VM has 4GB; I'll up it to 8 and try again.

I thought it was to do with wrong or out-of-date Python dependencies.
Hi, I tried this again, this time allocating 8GB to the VirtualBox VM, and I still got a Segmentation fault (core dumped) message.
A “segmentation fault” is when your program tries to access memory that it’s not allowed to access. This can be caused by:

trying to dereference a null pointer (you’re not allowed to access the memory address 0)
trying to dereference some other pointer that isn’t in your memory
a C++ vtable pointer that got corrupted and is pointing to the wrong place, which causes the program to try to execute some memory that isn’t executable
some other things that I don’t understand, like I think misaligned memory accesses can also segfault
This “C++ vtable pointer” thing is what was happening to my segfaulting program. I might explain that in a future blog post because I didn’t know any C++ at the beginning of this week and this vtable lookup thing was a new way for a program to segfault that I didn’t know about.

For more, refer to https://jvns.ca/blog/2018/04/28/debugging-a-segfault-on-linux/
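(As an aside, you can reproduce a segfault from pure Python by dereferencing a null pointer through ctypes; this is just an illustration of the first cause above, not related to the Scrapy install itself:)

import ctypes

# reading memory at address 0 (a null pointer) is not allowed,
# so this crashes with "Segmentation fault (core dumped)" on Linux
ctypes.string_at(0)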
FYI for anyone visiting:
I didn't actually solve this, as the project has moved in a different direction.
Spending time debugging a "segmentation fault" isn't something I'm inclined to do.

However, I thank the contributors for their advice.