I am trying to use Scrapy for my project, and after some initial struggle I started with https://doc.scrapy.org/en/latest/intro/tutorial.html
When I run:
scrapy startproject tutorial
it throws this error:
ubuntu@ip-10-241-62-56:~/Selenim$ scrapy startproject tutorial
Traceback (most recent call last):
File "/usr/local/bin/scrapy", line 7, in <module>
from scrapy.cmdline import execute
File "/usr/local/lib/python2.7/dist-packages/scrapy/__init__.py", line 34, in <module>
from scrapy.spiders import Spider
File "/usr/local/lib/python2.7/dist-packages/scrapy/spiders/__init__.py", line 10, in <module>
from scrapy.http import Request
File "/usr/local/lib/python2.7/dist-packages/scrapy/http/__init__.py", line 10, in <module>
from scrapy.http.request import Request
File "/usr/local/lib/python2.7/dist-packages/scrapy/http/request/__init__.py", line 13, in <module>
from scrapy.utils.url import escape_ajax
File "/usr/local/lib/python2.7/dist-packages/scrapy/utils/url.py", line 15, in <module>
from w3lib.url import _safe_chars, _unquotepath
ImportError: cannot import name _unquotepath
How do I resolve this?
Upgrading w3lib (to 1.15.0) solved the problem for me.
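For reference, a typical way to do the upgrade is shown below (assuming pip points at the same Python 2.7 environment Scrapy is installed in; prefix with sudo if the packages live under /usr/local):
pip install --upgrade w3lib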
I'm trying to use TikTokPy, but an error occurs in the greenlet module:
$ python quickstart.py
Traceback (most recent call last):
File "C:\Users\mngoc\tiktokpy\quickstart.py", line 2, in <module>
from tiktokpy import TikTokPy
File "C:/Users\mngoc\tiktokpy/tiktokpy/__init__.py", line 1, in <module>
from .bot import TikTokPy
File "C:/Users\mngoc\tiktokpy/tiktokpy/bot/__init__.py", line 15, in <module>
from tiktokpy.client import Client
File "C:/Users\mngoc\tiktokpy/tiktokpy/client/__init__.py", line 8, in <module>
from playwright.async_api import Browser, Page, Playwright, PlaywrightContextManager, Response
File "C:/Users\mngoc\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages/playwright/async_api/__init__.py", line 25, in <module>
import playwright.async_api._generated
File "C:/Users\mngoc\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages/playwright/async_api/_generated.py", line 25, in <module>
from playwright._impl._accessibility import Accessibility as AccessibilityImpl
File "C:/Users\mngoc\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages/playwright/_impl/_accessibility.py", line 17, in <module>
from playwright._impl._connection import Channel
File "C:/Users\mngoc\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages/playwright/_impl/_connection.py", line 23, in <module>
from greenlet import greenlet
File "C:/Users\mngoc\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages/greenlet/__init__.py", line 29, in <module>
from ._greenlet import _C_API # pylint:disable=no-name-in-module
ModuleNotFoundError: No module named 'greenlet._greenlet'
I've already installed the greenlet module, but I have no idea about ._greenlet, and there isn't an answer to any related question, so I'm stuck.
You should install the module:
pip3 install greenlet
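Since greenlet is already installed, the error usually means the installed copy is broken or built for a different Python version, so a forced reinstall is worth trying (an assumption, not something confirmed by the question):
pip3 install --upgrade --force-reinstall greenlet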
I've been trying to run a Scrapy spider from a .bat file.
When I run the .bat file, which has this text:
@echo off
REM activate Python venv
CALL "D:\python\scrapy_projects\venv\Scripts\activate.bat"
CD "D:\python\scrapy_projects\digikalasellerdata\digikalasellerdata\spiders"
CALL "D:\python\scrapy_projects\venv\Scripts\python.exe" "D:\python\scrapy_projects\venv\Lib\site-packages\scrapy\cmdline.py" crawl my_deactivated -O kobs.csv
pause
I get this error:
Traceback (most recent call last):
File "D:\python\scrapy_projects\venv\Lib\site-packages\scrapy\cmdline.py", line 8, in <module>
import scrapy
File "D:\python\scrapy_projects\venv\lib\site-packages\scrapy\__init__.py", line 12, in <module>
from scrapy.spiders import Spider
File "D:\python\scrapy_projects\venv\lib\site-packages\scrapy\spiders\__init__.py", line 10, in <module>
from scrapy.http import Request
File "D:\python\scrapy_projects\venv\lib\site-packages\scrapy\http\__init__.py", line 8, in <module>
from scrapy.http.headers import Headers
File "D:\python\scrapy_projects\venv\lib\site-packages\scrapy\http\headers.py", line 3, in <module>
from scrapy.utils.python import to_unicode
File "D:\python\scrapy_projects\venv\lib\site-packages\scrapy\utils\python.py", line 16, in <module>
from scrapy.utils.decorators import deprecated
File "D:\python\scrapy_projects\venv\lib\site-packages\scrapy\utils\decorators.py", line 4, in <module>
from twisted.internet import defer, threads
File "D:\python\scrapy_projects\venv\lib\site-packages\twisted\internet\defer.py", line 44, in <module>
from twisted.internet.interfaces import IDelayedCall, IReactorTime
File "D:\python\scrapy_projects\venv\lib\site-packages\twisted\internet\interfaces.py", line 26, in <module>
from twisted.python.failure import Failure
File "D:\python\scrapy_projects\venv\lib\site-packages\twisted\python\failure.py", line 26, in <module>
from twisted.python import reflect
File "D:\python\scrapy_projects\venv\lib\site-packages\twisted\python\reflect.py", line 22, in <module>
from twisted.python.compat import nativeString
File "D:\python\scrapy_projects\venv\lib\site-packages\twisted\python\compat.py", line 35, in <module>
from http import cookiejar as cookielib
File "D:\python\scrapy_projects\venv\Lib\site-packages\scrapy\http\__init__.py", line 8, in <module>
from scrapy.http.headers import Headers
ImportError: cannot import name 'Headers' from partially initialized module 'scrapy.http.headers' (most likely due to a circular import) (D:\python\scrapy_projects\venv\lib\site-packages\scrapy\http\headers.py)
Any solutions?
Thanks a lot.
Instead of calling the Python file directly, use this:
python -m scrapy crawl my_deactivated -O kobs.csv
That's what you activated the virtual environment for.
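Putting it together, the .bat file could look roughly like this (a sketch; it assumes the Scrapy project root, i.e. the folder containing scrapy.cfg, is D:\python\scrapy_projects\digikalasellerdata, which the question does not show):
@echo off
REM activate the Python venv
CALL "D:\python\scrapy_projects\venv\Scripts\activate.bat"
REM run the crawl from the project root instead of calling cmdline.py by path
CD "D:\python\scrapy_projects\digikalasellerdata"
python -m scrapy crawl my_deactivated -O kobs.csv
pause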
I'm trying to use Scrapy on Windows 10. There is no problem with the installation, but when I use the scrapy command in cmd, I always get the following error:
C:\Users\Isaias HL\Desktop\noticias\noticias\spiders\spider_cbr.py:6: ScrapyDeprecationWarning: Module `scrapy.spider` is deprecated, use `scrapy.spiders` instead
from scrapy.spider import CrawlSpider, Rule
2018-02-16 15:11:52 [scrapy.utils.log] INFO: Scrapy 1.5.0 started (bot: noticias)
Traceback (most recent call last):
File "C:\Users\Isaias HL\Anaconda2\Scripts\scrapy-script.py", line 5, in <module>
sys.exit(scrapy.cmdline.execute())
File "C:\Users\Isaias HL\Anaconda2\lib\site-packages\scrapy\cmdline.py", line 149, in execute
cmd.crawler_process = CrawlerProcess(settings)
File "C:\Users\Isaias HL\Anaconda2\lib\site-packages\scrapy\crawler.py", line 252, in __init__
log_scrapy_info(self.settings)
File "C:\Users\Isaias HL\Anaconda2\lib\site-packages\scrapy\utils\log.py", line 149, in log_scrapy_info
for name, version in scrapy_components_versions()
File "C:\Users\Isaias HL\Anaconda2\lib\site-packages\scrapy\utils\versions.py", line 35, in scrapy_components_versions
("pyOpenSSL", _get_openssl_version()),
File "C:\Users\Isaias HL\Anaconda2\lib\site-packages\scrapy\utils\versions.py", line 43, in _get_openssl_version
import OpenSSL
File "C:\Users\Isaias HL\Anaconda2\lib\site-packages\OpenSSL\__init__.py", line 8, in <module>
from OpenSSL import crypto, SSL
File "C:\Users\Isaias HL\Anaconda2\lib\site-packages\OpenSSL\crypto.py", line 12, in <module>
from cryptography import x509
File "C:\Users\Isaias HL\Anaconda2\lib\site-packages\cryptography\x509\__init__.py", line 7, in <module>
from cryptography.x509 import certificate_transparency
ImportError: cannot import name certificate_transparency
Installing https://pypi.org/project/ctutlz/ fixed the issue for me.
pip install ctutlz
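If that does not help, the missing certificate_transparency name generally points to an outdated cryptography package, so upgrading it directly may also be worth trying (an assumption based on the traceback, not part of the original answer):
pip install --upgrade cryptography pyOpenSSL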
gaoyaqiu:git gaoyaqiu$ scrapy
Traceback (most recent call last):
File "/usr/local/bin/scrapy", line 7, in <module>
from scrapy.cmdline import execute
File "/Library/Python/2.7/site-packages/scrapy/cmdline.py", line 9, in <module>
from scrapy.crawler import CrawlerProcess
File "/Library/Python/2.7/site-packages/scrapy/crawler.py", line 7, in <module>
from twisted.internet import reactor, defer
File "/Library/Python/2.7/site-packages/twisted/internet/reactor.py", line 38, in <module>
from twisted.internet import default
File "/Library/Python/2.7/site-packages/twisted/internet/default.py", line 56, in <module>
install = _getInstallFunction(platform)
File "/Library/Python/2.7/site-packages/twisted/internet/default.py", line 50, in _getInstallFunction
from twisted.internet.selectreactor import install
File "/Library/Python/2.7/site-packages/twisted/internet/selectreactor.py", line 18, in <module>
from twisted.internet import posixbase
File "/Library/Python/2.7/site-packages/twisted/internet/posixbase.py", line 18, in <module>
from twisted.internet import error, udp, tcp
File "/Library/Python/2.7/site-packages/twisted/internet/tcp.py", line 28, in <module>
from twisted.internet._newtls import (
File "/Library/Python/2.7/site-packages/twisted/internet/_newtls.py", line 21, in <module>
from twisted.protocols.tls import TLSMemoryBIOFactory, TLSMemoryBIOProtocol
File "/Library/Python/2.7/site-packages/twisted/protocols/tls.py", line 63, in <module>
from twisted.internet._sslverify import _setAcceptableProtocols
File "/Library/Python/2.7/site-packages/twisted/internet/_sslverify.py", line 38, in <module>
TLSVersion.TLSv1_1: SSL.OP_NO_TLSv1_1,
AttributeError: 'module' object has no attribute 'OP_NO_TLSv1_1'
I came across the same issue. The following thread helped:
pip install Twisted==16.4.1
If you need sudo access, add it to your command.
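For example, the same command with sudo prefixed:
sudo pip install Twisted==16.4.1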
https://github.com/scrapy/scrapy/issues/2473
Actually, it is solved by:
pip install pyopenssl --upgrade
(from the related question: scrapy: 'module' object has no attribute 'OP_SINGLE_ECDH_USE')
I am trying to run this program but I am receiving this error:
python questions_app.py
Traceback (most recent call last):
File "questions_app.py", line 8, in <module>
from filter_daemon import *
File "/home/mona/danac/queshuns/filter_daemon.py", line 5, in <module>
from twython import TwythonStreamer
File "/usr/local/lib/python2.7/dist-packages/twython/__init__.py", line 23, in <module>
from .api import Twython
File "/usr/local/lib/python2.7/dist-packages/twython/api.py", line 14, in <module>
from requests_oauthlib import OAuth1, OAuth2
File "/usr/local/lib/python2.7/dist-packages/requests_oauthlib/__init__.py", line 3, in <module>
from .oauth2_auth import OAuth2
File "/usr/local/lib/python2.7/dist-packages/requests_oauthlib/oauth2_auth.py", line 2, in <module>
from oauthlib.oauth2 import WebApplicationClient, InsecureTransportError
ImportError: cannot import name WebApplicationClient
What are some possible options to solve it?
I am using Ubuntu 13.04 and Python 2.7.
It seems like you do not have oauthlib installed:
pip install oauthlib
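If oauthlib is already installed, it may simply be too old to provide WebApplicationClient, so upgrading it is a reasonable next step (an assumption, not confirmed by the question):
pip install --upgrade oauthlib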