I want to test my Scrapy spider. The idea is to import the spider into a test file, make a test spider from it, and override start_urls, but I have a problem with importing it. Here is the project structure:
...product-scraper\test_spider.py
...product-scraper\oxygen\oxygen\spiders\oxygen_spider.py
...product-scraper\oxygen\oxygen\items.py
The problem is that the spider imports the Product class from items.py:
from oxygen.items import Product
which fails with:
ImportError: No module named items
Running scrapy crawl oxygen_spider from the command line works.
I tried changing sys.path and using site.addsitedir in every way I could think of:
basedir = os.path.abspath(os.path.dirname(__file__))
module_path = os.path.join(basedir, "oxygen\\oxygen")
sys.path.append(basedir)  # also tried module_path
No success. :(
I use Python 2.7 on Windows.
Do you really get the error "No module named items"? Or is it something like "No module named oxygen.items"?
Also, I'm not really sure why you would want to use the os.path calls. Wouldn't this just work:
from items import Product
That is, without the "oxygen." prefix. As far as I know, this only works if Product is a class in your items.py. If it's not a class, I would suggest just using:
import items
If that does not work, please specify what Product is in your items.py
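In case you do want to keep the package-style import in the test, here is a minimal sketch of what test_spider.py could look like, assuming the layout shown in the question (the spider class name OxygenSpider is an assumption): the directory that contains the oxygen package (product-scraper\oxygen, not the package directory itself) has to be on sys.path before the import.
# test_spider.py -- minimal sketch
import os
import sys

basedir = os.path.abspath(os.path.dirname(__file__))
# Put the directory that CONTAINS the "oxygen" package on sys.path,
# i.e. product-scraper\oxygen, not product-scraper or product-scraper\oxygen\oxygen.
sys.path.append(os.path.join(basedir, "oxygen"))

from oxygen.items import Product
from oxygen.spiders.oxygen_spider import OxygenSpider  # class name is an assumption

class TestSpider(OxygenSpider):
    # override start_urls for testing
    start_urls = ["http://example.com/test-page"]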
I have created two classes,
Connections.py and LogObserver.py.
I am trying to import Connections.py in LogObserver.py, but Python keeps throwing an error:
ModuleNotFoundError: No module named 'connections'
The way I am importing it is:
from connections.Connections import Connections
class LogObserver:
The file structure is
If the class in your Connections file is called Connections, then you can try the following:
from connections import Connections
c = Connections.Connections()
Or:
import connections.Connections as myModule
c = myModule.Connections()
Make sure that when you import, you follow this pattern:
from <folder>.<filename> import <class_name>
There is a problem in your file structure.
Try keeping the connections folder inside the queries folder, or specify the path to connections.py correctly.
Hope this resolves the issue.
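For reference, a sketch of a layout under which the import in the question resolves (the surrounding folder names are assumptions, since the file structure isn't shown):
project/                     # assumed layout
    LogObserver.py
    connections/
        __init__.py
        Connections.py       # defines class Connections
With that layout, LogObserver.py can do:
from connections.Connections import Connections

class LogObserver:
    def __init__(self):
        self.conn = Connections()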
I'm struggling to import a folder that contains several engines I need to use. I'm importing from main_file.py.
I thought I could use from engines import qr_code_gen, but I need to import a class named _QRCode_, so I tried from .engines.qr_code_gen import _QRCode_, and it says "module engines was not found".
Structure:
Server/start.sh
Server/wsgi.py
Server/application/main_file.py
Server/application/engines/qr_code_gen.py
Server/application/engines/__init__.py
...
I printed sys.path in main_file.py and got:
['C:\Users\Dzitc\Desktop\winteka2',
'C:\Users\Dzitc\AppData\Local\Programs\Python\Python37\Scripts\flask.exe',
'c:\users\dzitc\appdata\local\programs\python\python37\python37.zip',
'c:\users\dzitc\appdata\local\programs\python\python37\DLLs',
'c:\users\dzitc\appdata\local\programs\python\python37\lib',
'c:\users\dzitc\appdata\local\programs\python\python37',
'C:\Users\Dzitc\AppData\Roaming\Python\Python37\site-packages',
'c:\users\dzitc\appdata\local\programs\python\python37\lib\site-packages',
'c:\users\dzitc\appdata\local\programs\python\python37\lib\site-packages\win32',
'c:\users\dzitc\appdata\local\programs\python\python37\lib\site-packages\win32\lib',
'c:\users\dzitc\appdata\local\programs\python\python37\lib\site-packages\Pythonwin']
Going from the comments, you can import the engines package.
Try this then:
import engines.qr_code_gen

engines.qr_code_gen._QRCode_
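A sketch of the plain absolute-import variant in main_file.py, which works as long as the directory that contains engines/ (application/ in the structure above) is on sys.path; note that the leading-dot form (from .engines...) only works when main_file.py is itself imported as part of a package, not when it is run as a script:
# main_file.py -- sketch; qr_code_gen and _QRCode_ come from the structure in the question
from engines.qr_code_gen import _QRCode_

qr = _QRCode_()  # constructor arguments are unknown here, this just illustrates the import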
I've been trying to import some python classes which are defined in a child directory. The directory structure is as follows:
workspace/
    __init__.py
    main.py
    checker/
        __init__.py
        baseChecker.py
        gChecker.py
The baseChecker.py looks similar to:
import urllib
class BaseChecker(object):
    # SOME METHODS HERE
The gChecker.py file:
import baseChecker # should import baseChecker.py
class GChecker(BaseChecker):  # gives a TypeError: Error when calling the metaclass bases
    # SOME METHODS WHICH USE URLLIB
And finally the main.py file:
import ?????
gChecker = GChecker()
gChecker.someStuff() # which uses urllib
My intention is to be able to run main.py and instantiate the classes under the checker/ directory. I would also like to avoid importing urllib in each file (if that is possible).
Note that both the __init__.py are empty files.
I have already tried calling from checker.gChecker import GChecker in main.py, but it shows ImportError: No module named checker.gChecker.
In the posted code, in gChecker.py, you need to do
from baseChecker import BaseChecker
instead of import baseChecker
Otherwise you get
NameError: name 'BaseChecker' is not defined
Also, with the folder structure you describe, the checker module doesn't need to be on the PYTHONPATH to be visible to main.py.
Then in main.py you can do:
from checker.gChecker import GChecker
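Putting it together, a minimal sketch of main.py under the structure shown in the question (someStuff is the method name from the question):
# main.py -- sketch
from checker.gChecker import GChecker

gChecker = GChecker()
gChecker.someStuff()
Note that urllib only has to be imported in the module whose code actually calls it (baseChecker.py here); main.py does not need to import urllib again just to use GChecker.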
I'm having an issue with the import statement in Python 3. I'm following a book (Python 3 Object Oriented) and am having the following structure:
parent_directory/
    main.py
    ecommerce/
        __init__.py
        database.py
        products.py
        payments/
            __init__.py
            paypal.py
            authorizenet.py
In paypal.py, I'm trying to use the Database class from database.py. So I tried this:
from ecommerce.database import Database
I get this error:
ImportError: No module named 'ecommerce'
so I try with both of these import statements:
from .ecommerce.database import Database
from ..ecommerce.database import Database
and I get this error:
SystemError: Parent module '' not loaded, cannot perform relative import
What am I doing wrong or missing?
Thank you for your time!
Add your parent_directory to Python's search path. For example:
import sys
sys.path.append('/full/path/to/parent_directory')
Alternatively, you can add parent_directory to the environmental variable PYTHONPATH.
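As a sketch, you can also compute that path relative to the file instead of hard-coding it, assuming paypal.py sits two levels below parent_directory as in the structure above:
# paypal.py -- sketch; run this before the Database import
import os
import sys

parent_directory = os.path.abspath(
    os.path.join(os.path.dirname(__file__), "..", ".."))
sys.path.append(parent_directory)

from ecommerce.database import Database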
I have a very basic spider, following the instructions in the getting started guide, but for some reason, trying to import my items into my spider returns an error. Spider and items code is shown below:
from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector
from myProject.items import item
class MyProject(BaseSpider):
    name = "spider"
    allowed_domains = ["website.com"]
    start_urls = [
        "website.com/start"
    ]

    def parse(self, response):
        print response.body
from scrapy.item import Item, Field
class ProjectItem(Item):
    title = Field()
When I run this code, Scrapy either can't find my spider or can't import my items file. What's going on here? This should be a really simple example to run, right?
I also ran into this several times while working with Scrapy. You could add this line at the beginning of your Python modules:
from __future__ import absolute_import
More info here:
http://www.python.org/dev/peps/pep-0328/#rationale-for-absolute-imports
http://pythonquirks.blogspot.ru/2010/07/absolutely-relative-import.html
You are importing a field; you must import a class from items.py,
like from myproject.items import class_name.
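In the code posted above, that would look roughly like this (a sketch; ProjectItem and the title field come from the items.py shown in the question, and the XPath is just an example):
from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector
from myProject.items import ProjectItem

class MyProject(BaseSpider):
    name = "spider"
    allowed_domains = ["website.com"]
    start_urls = ["website.com/start"]

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        item = ProjectItem()
        item['title'] = hxs.select('//title/text()').extract()  # example field assignment
        return item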
So, this was a problem I came across the other day and was able to fix through some trial and error, but I couldn't find any documentation of it, so I thought I'd put this up in case anyone runs into the same problem I did.
This isn't so much an issue with Scrapy as it is an issue with file naming and how Python resolves imports. Basically, if you name your spider file the same thing as the project, your imports are going to break: Python will try to import from the location closest to your current position, which means it tries to resolve the import against the spider module itself rather than the project package, and that isn't going to work.
Basically just change the name of your spider file to something else and it'll all be up and running just fine.
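For illustration, the clash looks like this (the project name myProject is taken from the question above; the rest of the layout is assumed):
myProject/                  # project root
    scrapy.cfg
    myProject/              # the package that "from myProject.items import ..." should resolve to
        items.py
        spiders/
            myProject.py    # spider module shadows the package -- rename it, e.g. myProject_spider.py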
If the structure is like this:
package/
    __init__.py
    subpackage1/
        __init__.py
        moduleX.py
        moduleY.py
    subpackage2/
        __init__.py
        moduleZ.py
    moduleA.py
and you are in moduleX.py, the way to import the other modules is:
from .moduleY import *
from ..moduleA import *
from ..subpackage2.moduleZ import *
Refer to PEP 328 -- Imports: Multi-Line and Absolute/Relative.
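One caveat worth adding: relative imports like these only work when moduleX.py is loaded as part of the package. Running it directly as a script (python package/subpackage1/moduleX.py) raises an import error; run it as a module instead, from the directory that contains package/:
python -m package.subpackage1.moduleX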