How to import my own modules into a scrapy project? - python

I am trying to write a scrapy spider with multiple pipelines. I select which pipeline to use with an attribute of the spider. The attribute is of an enum type I wrote myself. The problem now is importing that enum in the pipeline classes. Every time I try to import it I get the following error:
from data.file_types import FileTypes
builtins.ModuleNotFoundError: No module named 'data'
I have already tried placing the enum class in different locations and switching between relative and absolute imports. If I place the enum class in its own package, independent of the scrapy package, I can import and use the enum when I run the pipeline files directly, but I still get an error when I run the spider from the shell.
My current project structure is:
noveldownloader:
    data
        enum_file.py
        __init__.py
    novelscraper
        novelscraper
            pipelines
            spiders
            etc
            __init__.py
        scrapy.cfg
And my current import is:
from data.file_types import FileTypes
If it helps I uploaded my code to GitHub:
https://github.com/JustACodingFox/NovelDownloader

One alternative folder structure that worked for me while modularising my code was this:
Create a new folder 'data' inside ./noveldownloader/noveldownloader/.
Then you can import it like this:
from noveldownloader.data import enum_file
and use whatever it defines like this:
enum_file.whatever_function_to_call()
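A minimal sketch of why this layout can work, assuming the folder that contains scrapy.cfg ends up on sys.path when the spider runs (treat that as an assumption and check it in your own setup). The script builds a throwaway copy of the layout in a temporary directory. The members of FileTypes are hypothetical, since the real enum is not shown in the question.

```python
import sys
import tempfile
from pathlib import Path

# Throwaway project root; stands in for the folder that holds scrapy.cfg.
project_root = Path(tempfile.mkdtemp())

# data/ lives inside the project package, next to pipelines and spiders.
data_dir = project_root / "noveldownloader" / "data"
data_dir.mkdir(parents=True)
(project_root / "noveldownloader" / "__init__.py").write_text("")
(data_dir / "__init__.py").write_text("")

# Hypothetical enum; the real FileTypes members are not shown in the question.
(data_dir / "enum_file.py").write_text(
    "from enum import Enum\n"
    "\n"
    "class FileTypes(Enum):\n"
    "    EPUB = 'epub'\n"
    "    PDF = 'pdf'\n"
)

# Assumption: the project root is importable, which this line simulates.
sys.path.insert(0, str(project_root))

from noveldownloader.data import enum_file

print(enum_file.FileTypes.EPUB.value)  # prints "epub"
```

Because the import starts at the project package, it resolves as long as the project root is importable, no matter which file inside the project triggers it.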

Related

How to properly load a library with multiple files from source?

I downloaded a Python library from GitHub that contains multiple files. The __init__.py does not import any of these files, for example with from .xyz import xyz_class, so I assumed users are expected to import the required classes directly. However, when I tried this with one of the classes, the import failed because it could not find other classes from the same library. What feels odd to me is that the beginning of this class looks like
from copy import deepcopy
from collections import OrderedDict
from fields import Fields
where fields.py is one of the files in the library's folder. I would have expected this file to be imported using from .fields import Fields, but since the dot is missing in several other imports as well, I don't think the author left it out by accident. So now I am wondering: how can I import these classes the way the library intends, without the dot, or do I need to add it?

How to create a Python package with multiple directories

There is a question like mine in the link below:
creating python package with multiple level of folders
However, after following the answers to the question and trying some other things that I thought might work I did not succeed.
I have created a working package with a number of functions here:
https://github.com/aaronengland/prestige
In the prestige directory is an __init__.py file containing some classes and functions. I have a class named preprocessing and I can call any of the functions from that class using:
from prestige import preprocessing as pre
And then (for example):
pre.Binaritizer()
However, I want to be able to import those functions using:
import prestige.preprocessing as pre
Using the first link (above) I was unsuccessful in doing this. I feel like it should be a simple solution, but for some reason I have not been able to get it to work. Can someone please show me how to make this possible? Thank you in advance!
I was able to solve the problem by organizing the file structure as follows:
prestige
setup.py
__init__.py
general.py
preprocessing.py
setup.py was set up as I normally do, general.py contains functions/classes, and preprocessing.py contains functions/classes. The __init__.py file contains 2 lines of code:
from .preprocessing import *
from .general import *
So, I did not create new directories, I just divided my functions into separate .py files and imported them into my __init__.py file.
Now, I am able to import functions using, for example:
from prestige.preprocessing import Binaritizer
Hopefully this helps someone in the future with a similar question.
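The layout described above can be sketched as follows. This is only an illustration built in a temporary directory: the package name prestige_demo, the body of Binaritizer, and the describe function are hypothetical stand-ins, and only the two lines in __init__.py come from the post.

```python
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "prestige_demo"  # hypothetical name, to avoid clashing with the real package
pkg.mkdir()

# Hypothetical module contents; only the name Binaritizer comes from the post.
(pkg / "preprocessing.py").write_text(
    "class Binaritizer:\n"
    "    def transform(self, values):\n"
    "        return [1 if v else 0 for v in values]\n"
)
(pkg / "general.py").write_text(
    "def describe():\n"
    "    return 'general helpers'\n"
)

# The two re-export lines described in the post.
(pkg / "__init__.py").write_text(
    "from .preprocessing import *\n"
    "from .general import *\n"
)

sys.path.insert(0, str(root))

# Both import styles now resolve to the same class.
import prestige_demo.preprocessing as pre
from prestige_demo import Binaritizer, describe

print(pre.Binaritizer is Binaritizer)        # prints True
print(Binaritizer().transform([0, 3, None]))  # prints [0, 1, 0]
```

The star imports copy every public name from each module into the package namespace, so the names are reachable both from the package and from the individual submodules.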

"No module named" error for modules in one folder and valid PYTHONPATH

I have the following structure in my project:
Python-auto-tests
    Business Layer
        Yandex
            __init__.py
            Authorization.py
            Yandex_requests.py
    Documents
    venv
In the "Authorization.py" file I have only one method.
In "Yandex_requests.py" I am trying to import the "Authorization.py" module:
import Authorization
But I get the following error:
No module named 'Authorization'
My PYTHONPATH environment variable is set to the project path:
C:\Users\anduser\Python-auto-tests
I also checked sys.path and it looks fine; my folders are there:
C:\Users\anduser\Python-auto-tests\venv\Scripts\python.exe "C:\Users\anduser\Python-auto-tests\Business Layer\Yandex\Yandex_requests.py"
C:\Users\anduser\Python-auto-tests\Business Layer\Yandex
C:\Users\anduser\Python-auto-tests
C:\Users\anduser\AppData\Local\Programs\Python\Python37-32\python37.zip
C:\Users\anduser\AppData\Local\Programs\Python\Python37-32\DLLs
C:\Users\anduser\AppData\Local\Programs\Python\Python37-32\lib
C:\Users\anduser\AppData\Local\Programs\Python\Python37-32
C:\Users\anduser\Python-auto-tests\venv
C:\Users\anduser\Python-auto-tests\venv\lib\site-packages
C:\Users\anduser\Python-auto-tests\venv\lib\site-packages\setuptools-40.8.0-py3.7.egg
C:\Users\anduser\Python-auto-tests\venv\lib\site-packages\pip-19.0.3-py3.7.egg
Can you help me solve this issue? I just can't understand why Python doesn't see my module.
The official Python documentation has an example of how to import a module from the same folder, and I do the same.
You have two problems here. Let's tackle the one you are aware of first.
You say you have set PYTHONPATH=C:\Users\anduser\Python-auto-tests. As such, any imports you make must be relative to that path. For example, instead of import Authorization, you have to do from Business Layer.Yandex import Authorization.
Your second problem, and I think you are unaware of it, is Business Layer. Using the default import mechanism, Python does not handle spaces in directory or module names. (Note that it also does not handle hyphens and several other special characters.) You should rename that folder to something like BusinessLayer or Business_Layer. Refer to Package and Module Names in PEP 8 -- Style Guide for Python Code for more information on naming conventions for different Python constructs.
Ultimately, as long as your PYTHONPATH remains the same, the import should be written as something like from BusinessLayer.Yandex import Authorization.
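A short sketch of the naming rule, assuming the folder is renamed to Business_Layer as suggested. Every dotted component in an import statement has to be a valid identifier, which str.isidentifier() can check. The temporary directory plays the role of the PYTHONPATH entry, and the authorize function is a hypothetical stand-in for the single method in Authorization.py.

```python
import sys
import tempfile
from pathlib import Path

# Each dotted component of an import statement must be a valid identifier.
print("Business Layer".isidentifier())     # prints False (space)
print("Python-auto-tests".isidentifier())  # prints False (hyphens)
print("Business_Layer".isidentifier())     # prints True

# Throwaway copy of the layout with the folder renamed.
root = Path(tempfile.mkdtemp())
yandex = root / "Business_Layer" / "Yandex"
yandex.mkdir(parents=True)
(root / "Business_Layer" / "__init__.py").write_text("")
(yandex / "__init__.py").write_text("")

# Hypothetical content for Authorization.py.
(yandex / "Authorization.py").write_text(
    "def authorize():\n"
    "    return 'ok'\n"
)

# This entry plays the role of the PYTHONPATH setting from the question.
sys.path.insert(0, str(root))

from Business_Layer.Yandex import Authorization

print(Authorization.authorize())  # prints "ok"
```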

How do I use modules in Django?

I'm trying to display a graph on a web page. I can get the graph to show up with a simple example that only uses functions defined within the function. However, I want to be able to expand it further. In my original code, I have one main 'graphing' function that uses the functions of other modules so that it stays organized. When I try to import these modules, which exist as files within the Django app folder, it says there is no module with that name. How do I fix this?
Error: File "/Users/andrewho/Desktop/website/charts/views.py", line 48, in <module>
import graphing
ImportError: No module named 'graphing'
I clearly have a file in the app folder called graphing.py, so why does it give me this error?
Use the form "from 'where' import 'what'".
So it may be something like "from . import graphing" if graphing.py is in the same directory as views.py.
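A minimal sketch of the relative import, using a plain Python package instead of a real Django app so it runs without Django installed. The package name charts_demo and the functions build_points and chart_data are hypothetical; only the file names graphing.py and views.py come from the question.

```python
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
app = root / "charts_demo"  # stands in for the Django app folder
app.mkdir()
(app / "__init__.py").write_text("")

# Hypothetical helper module that sits next to views.py.
(app / "graphing.py").write_text(
    "def build_points(n):\n"
    "    return [(x, x * x) for x in range(n)]\n"
)

# views.py uses a relative import, so it does not depend on sys.path
# containing the app folder itself.
(app / "views.py").write_text(
    "from . import graphing\n"
    "\n"
    "def chart_data():\n"
    "    return graphing.build_points(3)\n"
)

sys.path.insert(0, str(root))

from charts_demo import views

print(views.chart_data())  # prints [(0, 0), (1, 1), (2, 4)]
```

A bare "import graphing" only works if the app folder itself is on sys.path, whereas the relative form is resolved against the package that views.py belongs to.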

Using a class from another folder in the current code

My apologies for a seemingly easy question, I'm new to using classes in Python.
I am using Pycharm and my folder structure looks as follows:
The folder constant-contact-python-wrapper has a few classes defined in __init__.py and restful_lib.py (I got this library from GitHub). I would like to use these classes in the file Trial.py contained in the ConstantContact folder. I am using the following code, but it is not able to import the class.
import sys
sys.path.append('C:\\Users\\psinghal\\PycharmProjects\\ConstantContact\\constant-contact-python-wrapper')
import constant-contact-python-wrapper
API_KEY = "KEY" #not a valid key
mConnection = CTCTConnection(API_KEY, "joe", "password123")
Would someone please be able to point me to the right direction?
Part of the problem that you're trying to rectify is that you have two libraries sitting together in the same scope, even though it doesn't look like they necessarily need to be.
The simplest solution would be to simply put constant-contact-python-wrapper in the ConstantContact folder, under a new folder for code that you import but did not write yourself. This way your project is organized for this instance and for future instances where you import code from another library.
Ideally the folder structure would be:
ConstantContact
|___ ConstantContact
|____ExternalLibraries #(or some name similar if you plan on using different libraries)
|___constant-contact-python-wrapper
Using the above model, you now have an organized hierarchy to accommodate imported code very easily.
To facilitate easy importing you would additionally setup the following:
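1. Create an __init__.py file in ExternalLibraries. Hyphens are not valid in import statements, so first rename the wrapper folder to something like constant_contact_python_wrapper. The contents of __init__.py would then be the following:
from .constant_contact_python_wrapper import #The class or function you want to use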
This will facilitate imports and can be extended for future libraries you choose to use.
You can then use import statements in your code written in the ConstantContact folder:
from ExternalLibraries import #The class or function you chose above
If you have multiple classes you would like to import, you can separate them in your import statement with commas. For example:
from Example import foo,bar,baz
Since the __init__.py file in ExternalLibraries imports the functions/classes directly, you can now use them without even having to use dot syntax (i.e. library.func).
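If renaming the downloaded folder is not an option, one alternative (a different technique from the layout described above) is to load it with importlib under a name that is a valid identifier. The sketch below is hedged: the alias ctct_wrapper and the helper load_package are hypothetical, and the class bodies are invented stand-ins written to a temporary directory, not the real wrapper code.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def load_package(alias, package_dir):
    """Load the package found in package_dir under the importable name alias."""
    package_dir = Path(package_dir)
    spec = importlib.util.spec_from_file_location(
        alias,
        str(package_dir / "__init__.py"),
        submodule_search_locations=[str(package_dir)],
    )
    module = importlib.util.module_from_spec(spec)
    # Register before executing so relative imports inside the package resolve.
    sys.modules[alias] = module
    spec.loader.exec_module(module)
    return module


# Throwaway stand-in for the downloaded wrapper; the class bodies are invented.
wrapper_dir = Path(tempfile.mkdtemp()) / "constant-contact-python-wrapper"
wrapper_dir.mkdir()
(wrapper_dir / "restful_lib.py").write_text(
    "class Connection:\n"
    "    pass\n"
)
(wrapper_dir / "__init__.py").write_text(
    "from .restful_lib import Connection\n"
    "\n"
    "class CTCTConnection(Connection):\n"
    "    def __init__(self, api_key, username, password):\n"
    "        self.username = username\n"
)

# The hyphenated folder cannot appear in an import statement, but it can be
# loaded by path and bound to a valid name.
ctct = load_package("ctct_wrapper", wrapper_dir)
connection = ctct.CTCTConnection("KEY", "joe", "password123")
print(connection.username)  # prints "joe"
```

Renaming the folder is usually the simpler fix. Loading by path mainly helps when the directory name has to stay as downloaded.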
Sources and further reading:
"__all__ and import *" Can someone explain __all__ in Python?
"Python Project Skeleton" http://learnpythonthehardway.org/book/ex46.html
"Modules" http://docs.python-guide.org/en/latest/writing/structure/#modules
constant-contact-python-wrapper and ConstantContact are unrelated packages for Python. Create an __init__.py in the same directory as manage.py and it should work as expected.
