python how to run script in folder

This is my python path:
PYTHONPATH = D:\PythonPath
in the PythonPath folder I have MyTests folder that has a Script.py
in the PyThonPath folder I have ScrapyingProject folder
inside the Script.py I do this:
from ScrapyingProject.ScrapyingProject.spiders.XXXSpider import XXXSpider
I got this exception:
ImportError: No module named ScrapyingProjectScrapyingProject.spiders.XXXSpider
Edit:
the XXXSpider is in this location:
D:\PythonPath\ScrapyingProject2\ScrapyingProject2\spiders.py

Take a look at this to read more about Python modules and packages: http://docs.python.org/2/tutorial/modules.html
Turn the folder that contains your Python scripts into a Python package by adding an __init__.py file to it. So, in your case, the directory structure should resemble this:
PYTHONPATH
- ScrapyingProject
  - __init__.py
  - script.py
Now, in this scheme, ScrapyingProject becomes your Python package. Any .py file inside the folder becomes a Python module. You can import a module by its dotted path, starting from a folder on PYTHONPATH. Something like,
from ScrapyingProject.script import XXXSpider
Same logic can be extended by nesting multiple packages inside each other. A nested package, for example looks like
PYTHONPATH
- ScrapyingProject2
  - __init__.py
  - ScrapyingProject2
    - __init__.py
    - script.py
Now, a package-nested script.py can be imported as
from ScrapyingProject2.ScrapyingProject2 import script
Or even
from ScrapyingProject2.ScrapyingProject2.script import XXXSpider
(Assuming you have defined class XXXSpider inside script.py)

For one you say that the directory is D:\PythonPath\ScrapyingProject2\ScrapyingProject2\spiders.py but you import from ScrapyingProject.ScrapyingProject (without the 2s).
If I understand you correctly, the wanted import should look something like this:
from ScrapyingProject2.ScrapyingProject2.spiders import XXXSpider
assuming the class XXXSpider is in the module spiders.py.
Remember to put __init__.py files in the folders you want to import from (turning them into 'packages'). This should include all folders from the PYTHONPATH to the one containing the *.py files. See here for more details on packages.
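To make the nested-package layout concrete, here is a minimal sketch that rebuilds it in a temporary folder and runs the corrected import. The temporary folder is only a stand-in for D:\PythonPath, and the body of XXXSpider is invented for illustration; adding the folder to sys.path plays the role of having it on PYTHONPATH.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway stand-in for D:\PythonPath (assumption: any folder on the search path works).
root = Path(tempfile.mkdtemp())
inner = root / "ScrapyingProject2" / "ScrapyingProject2"
inner.mkdir(parents=True)

# An __init__.py at every level marks the folder as a regular package.
(root / "ScrapyingProject2" / "__init__.py").write_text("")
(inner / "__init__.py").write_text("")

# Hypothetical spider class, only here so the import has something to find.
(inner / "spiders.py").write_text("class XXXSpider:\n    name = 'xxx'\n")

# Equivalent to listing the root folder in PYTHONPATH.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

from ScrapyingProject2.ScrapyingProject2.spiders import XXXSpider

print(XXXSpider.name)
```

If the import still fails in a real project, printing sys.path right before the import line is usually the quickest way to see whether the root folder is actually being searched.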

Related

Getting a ModuleNotFoundError when trying to import from a particular module

The directory I have looks like this:
repository
/src
/main.py
/a.py
/b.py
/c.py
I run my program via python ./main.py and within main.py there's an import statement from a import some_func. I'm getting a ModuleNotFoundError: No module named 'a' every time I run the program.
I've tried running the Python shell and running the commands import b or import c and those work without any errors. There's nothing particularly special about a either, it just contains a few functions.
What's the problem and how can I fix this issue?
repository/
__init__.py
/src
__init__.py
main.py
a.py
b.py
c.py
In the __init__.py in repository, add the following line:
from . import repository
In the __init__.py in src, add the following line:
from . import main
from . import a
from . import b
from . import c
Now from src.a import your_func is going to work in main.py.
Maybe you could try using a relative import, which allows you to import modules from other directories relative to the location of the current file.
Note that you will need to add a dot (.) before the module name when using a relative import, this indicates that the module is in the same directory as the current file:
from . import a
Or try running it from a different directory and appending the /src path like this:
import sys
sys.path.append('/src')
You could also try using the PYTHONPATH (environment variable) to add a directory to the search path:
Open your terminal and navigate to the directory containing the main.py file (/src).
Set the PYTHONPATH environment variable to include the current directory, by running the following command
export PYTHONPATH=$PYTHONPATH:$(pwd)
At last you could try to use the -m flag inside your command, so that Python knows to look for the a module inside the /src directory:
python -m src.main
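The difference between running a file as a script and running it with -m can be checked with a small experiment. This is only a sketch: it assumes main.py uses the package-qualified import from src.a import some_func, and it builds a throwaway copy of the layout in a temporary folder rather than touching a real repository.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Throwaway copy of the repository layout (contents are invented for the demo).
repo = Path(tempfile.mkdtemp())
src = repo / "src"
src.mkdir()
(src / "__init__.py").write_text("")
(src / "a.py").write_text("def some_func():\n    return 'hello from a'\n")
(src / "main.py").write_text("from src.a import some_func\nprint(some_func())\n")

# Run as a script: sys.path[0] is the src directory, so the package 'src' is not visible.
as_script = subprocess.run(
    [sys.executable, "src/main.py"], cwd=repo, capture_output=True, text=True
)

# Run as a module: sys.path[0] is the current directory (repo), so 'src' is importable.
as_module = subprocess.run(
    [sys.executable, "-m", "src.main"], cwd=repo, capture_output=True, text=True
)

print("script exit code:", as_script.returncode)
print("module output:", as_module.stdout.strip())
```

The script run should fail with a ModuleNotFoundError while the -m run prints the greeting, which is why the choice of import style and the way the program is launched have to match.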
I've had similar problems in the past. Imports in Python depend on a lot of things, like how you run your program (as a script or as a module) and what your current working directory is.
Thus I've created a new import library, ultraimport. It gives the programmer more control over imports and lets you do file-system-based relative imports.
Your main.py could look like this:
import ultraimport
a = ultraimport('__dir__/a.py')
This will always work, no matter how you run your code or what your sys.path is, and no init files are necessary.

Basic Python import mechanics

I have the following directory tree:
project/
A/
__init__.py
foo.py
TestA/
__init__.py
testFoo.py
the content of testFoo is:
import unittest
from A import foo
from the project directory I run python testA/testFoo.py
I get a ModuleNotFoundError No module named A
I have two questions: how do I import and run A.foo from TestA.testFoo, and why is the import logic in Python so difficult to grasp? Is there a debugging trick to solve this kind of issue quickly? I'm sorry to bother you with such basic questions.
When you execute a file, Python builds a module search path (sys.path), and imports are resolved against it. The first entry of this path is the directory of the file you are executing, so Python finds modules in that directory and packages (sub directories containing an __init__.py file) below it. If you want to import from a directory on the same level, you need to modify the search path or change the architecture of your project so that the executed file is always at the top level.
You can add a directory to the search path like this:
import sys
sys.path.insert(0, "/path/to/directory")
You can read more on import system : https://docs.python.org/3/reference/import.html
The best way in my opinion is to not touch the python path and include your test directory in the directory where the tested files are:
project/
A/
__init__.py
foo.py
TestA/
__init__.py
testFoo.py
Then run the python -m unittest command from your A or project directory; it will search the current directory and its sub directories for tests and execute them.
More on unittest here : https://docs.python.org/3/library/unittest.html
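As a rough check of the layout above, the following sketch recreates it in a temporary folder and runs test discovery from the project directory. The contents of foo.py and the test case are invented for the demo; only the directory structure comes from the answer.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Recreate project/A/TestA in a temporary folder (file contents are hypothetical).
project = Path(tempfile.mkdtemp())
tests = project / "A" / "TestA"
tests.mkdir(parents=True)
(project / "A" / "__init__.py").write_text("")
(tests / "__init__.py").write_text("")
(project / "A" / "foo.py").write_text("def answer():\n    return 42\n")
(tests / "testFoo.py").write_text(
    "import unittest\n"
    "from A import foo\n"
    "\n"
    "class TestFoo(unittest.TestCase):\n"
    "    def test_answer(self):\n"
    "        self.assertEqual(foo.answer(), 42)\n"
)

# Discovery starts in the project directory and puts it on sys.path,
# so 'from A import foo' resolves without any manual path changes.
result = subprocess.run(
    [sys.executable, "-m", "unittest", "discover"],
    cwd=project,
    capture_output=True,
    text=True,
)

# unittest writes its report to stderr.
print(result.stderr)
```

Note that the default discovery pattern is test*.py, so a file named testFoo.py is picked up without extra options.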
Add the folder project/testA to the system pythonpath first:
import sys
sys.path.insert(0, "/path/to/pythonfile")
and try the import again.
Can you try this?
Create an empty file __init__.py in the subdirectory TestA, and add this at the beginning of the main code:
from __future__ import absolute_import
Then import as below :
import A.foo as testfoo
The recommended way in Python 3 may be like below:
$ pwd
/home/user/project
$ python -m testA.testFoo
Executing the module with python -m is a good way to replace relative references.
Python cannot find A because it looks for modules in the directories listed in sys.path (which includes the entries from PYTHONPATH).
Python automatically adds the directory of the top-level script to sys.path, not the current working directory. So if you add print(sys.path) in testFoo.py, you will see that only project/TestA has been added to sys.path.
In other words, the project directory is not included in sys.path, so how could Python find the module A?
So you have to add the project folder to sys.path yourself. This is only needed in the top-level script, something like the following:
import unittest
import sys
import os
file_path = os.path.abspath(os.path.dirname(__file__)).replace('\\', '/')
lib_path = os.path.abspath(os.path.join(file_path, '..')).replace('\\', '/')
sys.path.append(lib_path)
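The same idea can be tried end to end with pathlib. This is a sketch under assumptions: the project is rebuilt in a temporary folder, foo.answer() is a made-up function, and the test file is run as a plain script from the project directory, as in the question.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Temporary stand-in for the project directory (file contents are hypothetical).
project = Path(tempfile.mkdtemp())
(project / "A").mkdir()
(project / "TestA").mkdir()
(project / "A" / "__init__.py").write_text("")
(project / "TestA" / "__init__.py").write_text("")
(project / "A" / "foo.py").write_text("def answer():\n    return 42\n")

# The top-level script adds its parent's parent (the project root) to sys.path
# before importing anything from the sibling package A.
test_source = textwrap.dedent('''
    import sys
    from pathlib import Path

    sys.path.insert(0, str(Path(__file__).resolve().parent.parent))

    from A import foo
    print(foo.answer())
''')
(project / "TestA" / "testFoo.py").write_text(test_source)

result = subprocess.run(
    [sys.executable, "TestA/testFoo.py"], cwd=project, capture_output=True, text=True
)
print(result.stdout.strip())
```

Because the path is derived from __file__, the script behaves the same regardless of the directory it is launched from.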

Do I need to add my project directory to the system path in every script to import a function from another directory?

I'm trying to keep a data science project well-organized so I've created a directory inside my src directory called utils that contains a file called helpers.py, which contains some helper functions that will be used in many scripts. What is the best practice for how I should import func_name from src/utils/helpers.py into a file in a totally different directory, such as src/processing/clean_data.py?
I see answers to this question, and I've implemented a solution that works, but this feels ugly:
sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.realpath(__file__)))))
Am I doing this right? Do I need to add this to every script that wants to import func_name, like train_model.py?
My current project folder structure:
myproject
/notebooks
notebook.ipynb
/src
/processing
clean_data.py
/utils
helpers.py
/models
train_model.py
__init__.py
Example files:
# clean_data.py
import os
import sys
# Add the project root (three levels up from this file) to the search path.
sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.realpath(__file__)))))
from src.utils.helpers import func_name
func_name()
# helpers.py
def func_name():
    print("I'm a helper function.")
The correct way to do it is to use __init__.py, setup.py and the setuptools Python package:
myPackage/
myPackage/
__init__.py
setup.py
This link has all the steps.
First of all, let me describe the differences between a Python module and a Python package so that both of us are on the same page. ✌
A module is a single .py file (or files) imported under one import statement and used. ✔
import aModuleName
# Here 'aModuleName' is just a regular .py file.
Whereas, a package is a collection of modules in directories that give a package hierarchy. A package contains a distinct __init__.py file. ✔
from aPackageName import aModuleName
# Here 'aPackageName' is a folder with an `__init__.py` file
# and 'aModuleName', which is just a regular .py file.
Therefore, when we have a project directory named proj-dir of the following structure ⤵
proj-dir
--|--__init__.py
--package1
--|--__init__.py
--|--module1.py
--package2
--|--__init__.py
--|--module2.py
🔎 Notice that I've also added an empty __init__.py into the proj-dir itself which makes it a package too.
👍 Now, if you want to import any python object from module2 of package2 into module1 of package1, then the import statement in the file module1.py would be
from package2.module2 import object2
# if you were to import the entire module2 then,
from package2 import module2
I hope this simple explanation clarifies your doubts on Python imports' mechanism and solves the problem. If not then do comment here. 😊
First of all, let me clarify that if you are only going to use part of a module, you do not need to refer to the entire module everywhere. You can use from to import a specific function from a library/package instead. This keeps your namespace small and your calls short; note that Python still loads and executes the whole module either way, so the benefit is readability rather than memory or speed.
To know more refer these:
'import module' or 'from module import'
difference between import and from
Now let us look into the solution.
Before starting with the solution, let me clarify the use of the __init__.py file. It tells the Python interpreter that the *.py files in that directory are importable, which means they are modules and may be part of a package.
So, if you have N sub directories, you have to put an __init__.py file in each of them so that they can also be imported. Inside the __init__.py file you can also add additional information, such as which paths should be included, default functions, variables, scope, etc. To learn more, search for the __init__.py file online or take some Python library and read through its __init__.py file. (Here lies the solution)
More Info:
modules
Be pythonic
So as stated by @Sushant Chaudhary your project structure should be like
proj-dir
--|--__init__.py
--package1
--|--__init__.py
--|--module1.py
--package2
--|--__init__.py
--|--module2.py
So now, If I put __init__.py file under my directory like above, Will
it be importable and work fine?
yes and no.
Yes :
If you are importing the modules within that project/package directory.
for example in your case
you are importing package1.module1 in package2.module2 as from package1 import module1.
Here you have to import the base dir inside the sub modules, Why? the project will run fine if you are running the module from the same place. i.e: inside package2 as python module2.py, But will throw ModuleNotFoundError If you run the module from some other directory. i.e: any other path except under package2 for example under proj-dir as python package2/module2.py. This is what happening in your case. You are running the module from project-dir.
So How to fix this?
1- You have to append basedir path to system path in module2.py as
import sys
dir_path = "/absolute/path/to/proj-dir"
sys.path.insert(0, dir_path)
So that module2 will be able to find package1 (and module1 inside it).
2- You have to add all the sub module paths in __init__.py file under proj-dir.
For example:
# __init__.py under lxml
# this is a package
def get_include():
    """
    Returns a list of header include paths (for lxml itself, libxml2
    and libxslt) needed to compile C code against lxml if it was built
    with statically linked libraries.
    """
    import os
    lxml_path = __path__[0]
    include_path = os.path.join(lxml_path, 'includes')
    includes = [include_path, lxml_path]
    for name in os.listdir(include_path):
        path = os.path.join(include_path, name)
        if os.path.isdir(path):
            includes.append(path)
    return includes
This is the __init__.py file of lxml (a Python library for parsing HTML and XML data). You can refer to the __init__.py file of any Python library that has sub modules. I've mentioned lxml here because I thought it would be easy for you to understand. You can also check the __init__.py file of other libraries/packages. Each has its own way of defining the path for sub modules.
No
If you are trying to import modules from outside the directory, then you have to expose the module path through an environment variable so that other modules can find it. This can be done by adding the absolute path of the base dir to PYTHONPATH. (PATH is only used to locate executables; Python does not consult it when importing modules.)
To know more:
PATH variables in OS
PYTHONPATH variable
So to solve your problem, include the paths to all the sub modules in the __init__.py file under proj-dir and add /absolute/path/to/proj-dir to PYTHONPATH.
Hope the answer explains you about usage of __init__.py and solves your problem.
On Linux, you can just add the path to the parent folder of your src directory to ~/.local/lib/python3.6/site-packages/my_modules.pth. See
Using .pth files. You can then import modules in src from anywhere on your system.
NB1: Replace python3.6 by any version of Python you want to use.
NB2: If you use Python2.7 (don't know for other versions), you will need to create __init__.py (empty) files in src/ and src/utils.
NB3: Any name.pth file is ok for my_modules.pth.
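To see what a .pth file does without editing the real site-packages, the mechanism can be tried in a scratch folder. This sketch makes a few assumptions: site.addsitedir is used to process a temporary folder the way Python processes site-packages at startup, the folder names are invented, and the package is called demo_src instead of src purely to avoid clashing with anything already installed.

```python
import importlib
import site
import sys
import tempfile
from pathlib import Path

base = Path(tempfile.mkdtemp())

# Hypothetical project: the parent folder of the package is what goes into the .pth file.
project_parent = base / "myproject"
utils = project_parent / "demo_src" / "utils"
utils.mkdir(parents=True)
(project_parent / "demo_src" / "__init__.py").write_text("")
(utils / "__init__.py").write_text("")
(utils / "helpers.py").write_text(
    "def func_name():\n    return \"I'm a helper function.\"\n"
)

# Scratch folder standing in for site-packages, with one .pth file inside.
fake_site = base / "fake-site-packages"
fake_site.mkdir()
(fake_site / "my_modules.pth").write_text(str(project_parent) + "\n")

# Process the folder: each existing directory listed in a .pth file is appended to sys.path.
site.addsitedir(str(fake_site))
importlib.invalidate_caches()

from demo_src.utils.helpers import func_name

print(func_name())
```

Each line of a .pth file has to name a directory; lines that point to something that does not exist are skipped.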
Yes, you can only import code from installed packages or from files in your working directory or its subdirectories.
The way I see it, your problem would be solved if you had your module or package installed, like any other package one installs and then imports (numpy, xml, json, etc.)
I also have a package I constantly use in all my projects, ulitilies, and I know it's a pain with the importing.
Here is a description of how to package a Python application to make it pip-installable:
https://marthall.github.io/blog/how-to-package-a-python-app/
Navigate to your python installation folder
Navigate to lib
Navigate to site-packages
Make a new file called any_thing_you_want.pth
Type .../src/utils (the folder that contains helpers.py, not the file itself) inside that file with your favorite text editor
Note: the ellipsis before src/utils will look something like: C:/Users/blahblahblah/python_folders/src... <- YOU DO NEED THIS!
This is a cheap way out but it keeps code clean, and is the least complicated. The downside is that every folder your modules are in needs its own line in the .pth file. Upside: works with Windows all the way up to Windows 10

Import Python file within a different parent directory

I have the following directory structure:
- src
- __init__.py
- scripts
- __init__.py
- preprocessing.py
- project1
- main.py
- project2
- main.py
I'm trying to access the script(s) inside the scripts folder from within both the main.py files.
I've tried adding in the __init__.py (blank) files, and importing with import scripts, from src import scripts, and from .. import scripts. None of these seem to work.
I either get: ValueError: Attempted relative import in non-package, or no module found.
Thanks in advance!
P.S. I assume that the directory structure will get deeper soon (e.g. multiple subdirectories within scripts and project1 / project2). So if there is an easy way to deal with this as well, that would be very much appreciated.
One way to deal with that (albeit not the cleanest one) is to manually add the root directory to the sys.path variable so that python can look for modules in it, for example, in your main.py you may add these lines at the top:
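# /src/project1/main.py
import os
import sys
# Resolve the parent of this file's directory so the result does not depend on the current working directory.
sys.path.append(os.path.join(os.path.dirname(os.path.abspath(__file__)), os.pardir))  # this is the src directory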
This will allow python to look for modules in the directory above the one running the script, and this will work:
import scripts.preprocessing
Keep in mind that python will only look for modules in the directory of the running script or below it. If you start /src/project2/main.py, python doesn't know anything about /src/project1/ or /src/scripts/
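Since the question expects the tree to get deeper, one option is to search upwards for the folder that contains scripts instead of hard-coding the number of parent levels. The helper below is a hypothetical sketch (find_root is not a standard function), tried out on a temporary copy of the layout with an invented preprocessing function.

```python
import sys
import tempfile
from pathlib import Path


def find_root(start: Path, marker: str = "scripts") -> Path:
    """Return the nearest ancestor of `start` that contains a `marker` directory."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).is_dir():
            return candidate
    raise FileNotFoundError(f"no ancestor of {start} contains {marker!r}")


# Temporary copy of the layout, with main.py nested one level deeper than in the question.
src = Path(tempfile.mkdtemp()) / "src"
(src / "scripts").mkdir(parents=True)
(src / "scripts" / "__init__.py").write_text("")
(src / "scripts" / "preprocessing.py").write_text(
    "def clean(text):\n    return text.strip()\n"
)
nested_main = src / "project1" / "deeper" / "main.py"
nested_main.parent.mkdir(parents=True)
nested_main.write_text("")

# In a real main.py this would be find_root(Path(__file__)).
root = find_root(nested_main)
sys.path.insert(0, str(root))

from scripts.preprocessing import clean

print(clean("  raw text  "))
```

The trade-off is that every entry script still needs these few lines; installing the shared code as a package removes that need entirely.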

Path appended but python does not find module

I have the following structure:
~/git/
~/git/folder1
~/git/folder2
in ~/git/folder1 I have main.py, which imports doing the following:
import folder2.future_data as future_data
which throws the following error:
import folder2.future_data as f_d
ImportError: No module named folder2.future_data
Despite my $PATH containing
user@mac-upload:~$ echo $PATH
/home/user/anaconda2/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/home/user/git/folder2
Why am I unable to import from folder2 despite it being in my path?
Am I missing something?
Try putting an empty __init__.py file in each directory (~/git, ~/git/folder1, and ~/git/folder2). Then do export PYTHONPATH=${HOME}/git:$PYTHONPATH (assuming bash shell).
This will also allow you to just set your PYTHONPATH once at the top level and be done with it. If you add more directories (modules) that you need to import, you can just keep adding __init__.py files to your structure (instead of having to constantly modify your PYTHONPATH every time your file/directory structure changes).
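You can also explicitly add the path inside the main.py script before doing the import:
import os
import sys
# '~' is not expanded automatically, so expand it before adding the path.
sys.path.append(os.path.expanduser('~/git/folder2'))
import future_data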
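The underlying confusion here is between PATH and PYTHONPATH, and it can be demonstrated in isolation. The sketch below is an illustration under assumptions: a temporary folder stands in for ~/git, future_data.py contains a made-up VALUE, and the import is run in a child interpreter with one of the two variables set.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Temporary stand-in for ~/git with a folder2 package inside (contents are invented).
git = Path(tempfile.mkdtemp())
(git / "folder2").mkdir()
(git / "folder2" / "__init__.py").write_text("")
(git / "folder2" / "future_data.py").write_text("VALUE = 1\n")

# Empty working directory, so the import cannot succeed by accident.
workdir = tempfile.mkdtemp()
code = "import folder2.future_data as f_d; print(f_d.VALUE)"


def run_with(var_name: str) -> subprocess.CompletedProcess:
    """Run the import in a child interpreter with `git` prepended to the given variable."""
    env = dict(os.environ)
    env.pop("PYTHONPATH", None)  # start from a clean module search path
    existing = env.get(var_name)
    env[var_name] = str(git) if not existing else str(git) + os.pathsep + existing
    return subprocess.run(
        [sys.executable, "-c", code], env=env, cwd=workdir,
        capture_output=True, text=True,
    )


via_path = run_with("PATH")              # the shell's executable search path
via_pythonpath = run_with("PYTHONPATH")  # Python's module search path

print("PATH exit code:", via_path.returncode)
print("PYTHONPATH output:", via_pythonpath.stdout.strip())
```

Only the PYTHONPATH run should succeed. Note also that the entry on the search path is the parent of folder2, not folder2 itself, when the import is written as folder2.future_data.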
