I'm trying to run a scraping program I wrote in Python using Scrapy on an Ubuntu machine. Scrapy is installed: I can import it in Python with no problem, and when I try pip install scrapy I get
Requirement already satisfied (use --upgrade to upgrade): scrapy in /system/linux/lib/python2.7/dist-packages
When I try to run scrapy from the command line, with scrapy crawl ... for example, I get:
The program 'scrapy' is currently not installed.
What's going on here? Are the symbolic links messed up? Any thoughts on how to fix it?
Without sudo, pip installs into $HOME/.local/bin, $HOME/.local/lib, etc. Add the following line to your ~/.bashrc or ~/.profile (or the appropriate place for other shells):
export PATH="${PATH}:${HOME}/.local/bin"
then open a new terminal or reload .bashrc, and it should find the command.
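To apply the change in your current shell and confirm the command resolves, a quick check (assuming bash and the default ~/.local layout):
source ~/.bashrc
which scrapy
The second command should print something like /home/yourname/.local/bin/scrapy (the path is illustrative).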
I had the same error. Running scrapy in a virtual environment solved it.
Create a virtual env: python3 -m venv env
Activate your env: source env/bin/activate
Install Scrapy with pip: pip install scrapy
Start your crawler: scrapy crawl your_project_name_here
For example, my project name was kitten, so in the last step I just ran:
scrapy crawl kitten
NOTE: I did this on macOS running Python 3+.
I tried sudo pip install scrapy, but was promptly advised by Ubuntu 16.04 that it was already installed.
I had to first use sudo pip uninstall scrapy, then sudo pip install scrapy for it to install successfully.
You should now be able to run scrapy successfully.
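As a quick sanity check after reinstalling:
which scrapy
scrapy version
If which prints a path and a version number appears, the command-line tool is reachable again.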
I faced the same problem and solved it using the following method. I think scrapy was not usable by the current user.
Uninstall scrapy.
sudo pip uninstall scrapy
Install scrapy again using -H.
sudo -H pip install scrapy
It should now work properly.
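To confirm where pip put it, you can run:
pip show scrapy
which scrapy
pip show prints the installed version and location; which tells you whether the executable is on your PATH.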
If you install scrapy only in a virtualenv, then the scrapy command doesn't exist in your system bin directory. You can check it like this:
$ which scrapy
For me it is in (because I installed it with sudo):
/usr/local/bin/scrapy
You could try the full path to your scrapy.
For example, if it is installed in a virtualenv:
(env) linux#hero:~dev/myscrapy$ python env/bin/scrapy
Note: We recommend installing Scrapy inside a virtual environment on all platforms.
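Alternatively, activating the virtualenv prepends its bin directory to your PATH, so the bare command resolves; a short sketch assuming the env layout above:
source env/bin/activate
which scrapy
scrapy version
Here which should point at env/bin/scrapy.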
I had the same issue. sudo pip install scrapy fixed my problem, although I don't know why I had to use sudo.
Make sure you run the activate command first, which on Windows is:
"Scripts\activate.bat"
A good way to work around this is to use pyenv to manage the Python version.
$ brew install pyenv
# Any version 3.6 or above
$ pyenv install 3.7.3
$ pyenv global 3.7.3
# Update Environment to reflect correct python version controlled by pyenv
$ echo -e '\nif command -v pyenv 1>/dev/null 2>&1; then\n eval "$(pyenv init -)"\nfi' >> ~/.zshrc
# Refresh the terminal
$ exec $0
# or: source ~/.zshrc
$ which python
/Users/mbbroberg/.pyenv/shims/python
$ python -V
Python 3.7.3
# Install scrapy
$ pip install scrapy
$ scrapy --version
Reference:
https://opensource.com/article/19/5/python-3-default-mac
scrapy crawl is not how you start a scrapy program. You start one by running
scrapy startproject myprojectname
Then, to actually run a scrapy spider, go into myprojectname/spiders and call
scrapy crawl "yourspidername"
To have scrapy create a spider you can cd into your directory and execute
scrapy genspider mydomain mydomain.com
Additionally, you can test if your scrapy actually works by executing
scrapy shell "google.com"
All this information can be found in their Documentation.
If something happens then you have actually installed scrapy and you are crawling (haha) your way to success!
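Put together, a minimal first session might look like this (the project and spider names are just examples):
scrapy startproject myprojectname
cd myprojectname
scrapy genspider mydomain mydomain.com
scrapy crawl mydomain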
P.S. Scrapy does not work well on Python 3, so if you're running it there and you still have trouble, use Python 2.7!
I installed scrapy via sudo pip install scrapy. It installed the python modules into site-packages and I can import scrapy in my python environment. However, attempting to use the command line tool throws an error:
scrapy startproject demo
fails with the error The program 'scrapy' is currently not installed. and tells me to install python-scrapy.
whereis scrapy has no output. Got tired of trying to track down the install path, so I ran find -name "*crap*", which also turned up nothing useful. It seems that the commandline tool wasn't installed by pip. What am I missing with this pip install?
The problem is that sudo pip install scrapy installs scrapy in a directory not accessible by the current user, if you are not root.
You need to remove scrapy first with sudo pip uninstall scrapy, then reinstall with sudo's -H flag: sudo -H pip install scrapy. This will make the scrapy installation detectable from your command line.
This also does not answer why the scrapy command-line tool is not available, but if scrapy is importable, as you comment, you can use:
$ python -m scrapy.cmdline version -v
$ python -m scrapy.cmdline shell <url>
In fact, the scrapy command is an alias to this, as specified in the entry_points section of Scrapy's setup.py, and it should have been set up by pip install.
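With a reasonably recent pip you can also ask where the console script was placed; pip show -f lists the files a package installed (paths are shown relative to the package location):
pip show -f scrapy | grep bin/scrapy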
This doesn't answer the question of what's wrong with the pip install, but for anyone with a working scrapy package and a non-functional command-line command, you can create a script to run the scrapy command-line tool for you:
#!/usr/bin/python2.7
# Path to Python 2.7 (Python 3 doesn't work well with scrapy at the moment).
import sys
import scrapy.cmdline

# Forward all command-line arguments to Scrapy's own entry point.
sys.exit(scrapy.cmdline.execute())
saved in a file (with execute permissions) called scrapy somewhere in your $PATH.
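For example, to install the wrapper (the ~/bin location is just an assumption; use any directory on your $PATH):
chmod +x scrapy
mkdir -p ~/bin
mv scrapy ~/bin/
export PATH="$HOME/bin:$PATH"
scrapy version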
Verify whether you have these packages or not:
w3lib, cssselect, parsel, attrs, pyasn1-modules, service-identity, PyDispatcher, queuelib, zope.interface, constantly, incremental, Twisted, scrapy
I used:
$ pip install scrapy
on Ubuntu 16.04, and it installed all of these packages. After this I tried:
$ scrapy startproject demo
and it worked for me with this output:
New Scrapy project 'demo', using template directory '/home/*machine_name*/anaconda2/lib/python2.7/site-packages/scrapy/templates/project', created in:
/home/*machine_name*/demo
You can start your first spider with:
cd demo
scrapy genspider example example.com
Scrapy is not installed on your machine. To install it, first run this command, which installs python-dev and the other build dependencies on your system:
sudo apt-get install build-essential libssl-dev libffi-dev python-dev libxml2-dev
Before that, you should run the update and upgrade commands:
sudo apt-get update
and
sudo apt-get upgrade
After these, run
pip install scrapy
When it finishes, run the following to check whether scrapy is installed:
scrapy version
If a version number is printed, you have installed scrapy successfully.
I tried to set up Scrapy on Windows 7 following the steps described at http://doc.scrapy.org/en/latest/intro/install.html. My PC had Python 3.5.1 installed. Although Scrapy does not support this Python version, it installed successfully with the latest Anaconda but failed to run the spider script. I found that Scrapy only works with Python 3.3+, so I uninstalled 3.5.1, uninstalled Anaconda, installed Python 3.3.5, installed pywin32, and installed pip. pip install Scrapy failed, so I installed Anaconda and ran conda install -c scrapinghub scrapy. Scrapy installed, but I saw that the libs installed were for Python 3.5, e.g. scrapy: 1.1.0-py35_0.
Now when I run
c:\python\olxscrapy>scrapy crawl OlxCatalogSpider
I get the error:
File "C:\Anaconda3\lib\site-packages\twisted\internet\stdio.py", line 30, in
module>
from twisted.internet import _win32stdio
ImportError: cannot import name '_win32stdio'
How do I make Scrapy run with Python 3.3+?
On this blog:
https://blog.scrapinghub.com/2016/05/25/data-extraction-with-scrapy-and-python-3/
it says Scrapy on Python 3 doesn't work in Windows environments yet
Edit:
I recently installed scrapy on Ubuntu for Python 3.5 and received a lot of errors. The errors stopped after: "sudo apt-get install python3.5-dev".
I added the following package and it works:
pip install twisted-win==0.5.5
Try to create a virtual env:
pip install virtualenv (installation)
virtualenv -p python3.3.5 envName (creation with a specific Python version)
source ./envName/bin/activate (activates the virtual env)
This way you can guarantee you're on the right Python version. Also, scrapy has some requirements that can't be installed via pip, and this may cause your pip install scrapy to fail.
So install these on your computer (see the one-liner after this list):
python-dev
libxslt1-dev
libxslt1.1
libxml2-dev
libxml2
libssl-dev
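On Ubuntu or Mint, all of these can be installed in one go; a sketch, assuming the apt package names above are current for your release:
sudo apt-get install python-dev libxslt1-dev libxslt1.1 libxml2-dev libxml2 libssl-dev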
After this you should finally be able to install scrapy via pip inside your virtual env (probably).
Sorry for my poor English; it isn't my native language. Hope this works =]
Installation of Scrapy on Windows may fail with an error while installing Twisted.
Download Twisted according to your Python and windows version on this site http://www.lfd.uci.edu/~gohlke/pythonlibs/#twisted
Go to your download folder and run pip install <downloaded filename>, then:
pip install scrapy
I am getting started with Scrapy, but I have two problems with installation on Linux Mint 17.2 (an Ubuntu-based release).
I don't get what the difference is between installing with pip install scrapy and with sudo apt-get install scrapy.
When I install either of the two and try to follow the first Scrapy tutorial using the command scrapy startproject tutorial, it gives me the error /usr/bin: No such file or directory.
I have tried to uninstall and reinstall many times, but it still doesn't work.
1. Installation Source
Both commands pip install scrapy and sudo apt-get install scrapy will install Scrapy on your computer, but the versions may differ. The pip option installs the latest version of Scrapy 1.0, while the one in your repositories is probably outdated.
If you still want to install the package from the repositories and keep it updated, you can add the Scrapy repository:
echo 'deb http://archive.scrapy.org/ubuntu scrapy main' | sudo tee /etc/apt/sources.list.d/scrapy.list
sudo apt-get update && sudo apt-get install scrapy
Source: http://doc.scrapy.org/en/1.0/topics/ubuntu.html
2. Check the PATH variable
Depending on the way you install Scrapy, the binaries folder for the installation might be different. In my case I have it in /usr/local/bin.
Display your PATH variable with echo "$PATH", and check if the folder with the Scrapy binary is included.
You can add more directories to the variable with export PATH=$PATH:/path/to/dir
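To make that change persistent across sessions, append it to your shell startup file; a sketch assuming bash and /usr/local/bin as the folder in question:
echo 'export PATH=$PATH:/usr/local/bin' >> ~/.bashrc
source ~/.bashrc
echo "$PATH"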
The installation guide tells you not to use the packages provided by Ubuntu:

Don't use the python-scrapy package provided by Ubuntu, they are typically too old and slow to catch up with the latest Scrapy. Instead, use the official Ubuntu packages, which already solve all dependencies for you and are continuously updated with the latest bug fixes.
As mentioned, you should install it using the Ubuntu packages on this page instead.
Besides the previous steps, I also had to install service-identity:
sudo pip install service-identity
I am trying to download pip onto my Mac by following the instructions in the pip installation guide, and I am coming up with this error after running the following command:
$ python get-pip.py
/Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python: can't open file 'get-pip.py': [Errno 2] No such file or directory
This is happening after I downloaded the get-pip.py file as the instructions suggest. Do I need to put this file in a certain location before I continue? I am relatively new to downloading programs through the terminal.
Thanks for the help!
It is highly recommended that you NOT use the version of Python that ships with your Mac. Instead, use Homebrew to install a "custom" version of Python (usually the latest). Then proceed to use virtualenv and, optionally, virtualenvwrapper.
Prerequisites:
First, install Xcode from the App Store (it's FREE).
Install Homebrew:
ruby -e "$(curl -fsSL https://raw.github.com/Homebrew/homebrew/go/install)"
Install Python:
brew install python
This will also install pip for you, in /usr/local/bin/.
Install virtualenv:
pip install virtualenv
virtualenv Basic Usage:
virtualenv /path/to/my/env
cd /path/to/my/env
source ./bin/activate
# hack on your python project
deactivate # to go back to your normal shell
Please follow the instructions for virtualenv for more details.
virtualenvwrapper is also really convenient and worth learning.
Update:
More explanation in @dval's comment:
$ curl -O https://raw.github.com/pypa/pip/master/contrib/get-pip.py
and then execute
$ python get-pip.py
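If it runs cleanly, you can confirm pip is now available (output will vary with your setup):
$ pip --version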
None of the above solutions worked for me, so I decided to do a clean install of Python 3.6 from the downloads page at python.org.
After you have completed the Python installer, go into Terminal and type in:
curl -O https://bootstrap.pypa.io/get-pip.py
Wait for the download to complete and then type in:
python3 get-pip.py --user
Then for your pip commands you will use 'pip3'. For example:
pip3 install awsebcli --upgrade --user
After python and pip have been installed they should be in your user Library. So update your PATH in terminal like so:
export PATH=~/Library/Python/3.6/bin:$PATH
I have a .bash_profile, so I also ran the following command in the terminal to load the script into my current session:
source ~/.bash_profile
After this, verify that the component you installed with pip works.
For example:
eb --version
See AWS for the above reference.
Curl did not work for me. I had to use "wget".
$ wget https://raw.github.com/pypa/pip/master/contrib/get-pip.py
and then execute
$ python get-pip.py
I have tried to install Scrapy on Mac OS X 10.8.2. Here's what I did:
In the terminal, I ran this command from within my user directory:
pip install --user scrapy
I got the following message in Terminal:
Successfully installed scrapy
Cleaning up...
Next I did the following from the same directory:
scrapy shell http://example.com
Here's the error I am getting:
-bash: scrapy: command not found
I believe this is a path issue; scrapy has been installed in /Library/Python/2.7/lib/python/site-packages. How do I get scrapy to run?
The --user option is used when you want to install a package into the local user's $HOME; e.g. on Mac it goes into $HOME/Library/Python/2.7/lib/python/site-packages.
The scrapy executable can then be found at $HOME/Library/Python/2.7/bin/scrapy. So you should edit your .bash_login file and modify the PATH environment variable:
PATH="$HOME/Library/Python/2.7/bin/:$PATH"
Or, just reinstall scrapy without the --user flag.
Hope that helps.