I make a lot of little python packages to help with this that or the other. As such, I have a directory, say /packages where I like to keep all of my handy tools, and I put that on the python path $export PYTHONPATH=/packages.
Now suppose that I put an __init__.py into /packages/my_package. I can $python -c "import my_package" successfully. However, if I do the typical thing with my directory structure (since I still have docs and whatnot) as /packages/my_package/my_package/hello_world.py, then having an __init__.py in /packages/my_package/my_package is insufficient to be able to import hello_world.py. I can add an __init__.py into my root /packages/my_package, but then I have to import my_package.my_package.hello_world, which conflicts with the normal "installed" way of importing. How do I set up my root-level __init__.py so that imports "skip" the intervening "code" directory, and import my_package.hello_world works as expected?
tl;dr
Given this directory structure:
packages/
    my_package/
        __init__.py
        my_package/
            __init__.py
            hello_world.py
with the packages directory in my PYTHONPATH. How do I compose my __init__.pys so that import my_package.hello_world works?
You need to add an import to every __init__.py file. You can try the layout below; it may need some modification for your needs, but overall this is the format.
packages/
    my_package/
        __init__.py        <- from .my_package import main
        my_package/
            __init__.py    <- from .hello_world import main
            hello_world.py
                def main():
                    # Do something
Given the deduplicated directory structure:
packages/
    my_package1/
        __init__.py
        my_package2/
            __init__.py
            hello_world.py
It is sufficient to add the following in the __init__.py file of my_package1:
from .my_package2 import hello_world
That's a relative import of the hello_world module in my_package2 into the parent package my_package1.
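A minimal sketch of that relative import, built in a temporary directory so it can be run anywhere. The function main and its return value are made up for the demonstration, and sys.path.insert stands in for having /packages on PYTHONPATH.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build the deduplicated layout in a throwaway directory.
root = Path(tempfile.mkdtemp())
inner = root / "my_package1" / "my_package2"
inner.mkdir(parents=True)

# The outer __init__.py pulls the module up from the inner package.
(root / "my_package1" / "__init__.py").write_text(
    "from .my_package2 import hello_world\n"
)
(inner / "__init__.py").write_text("")
# Hypothetical module body, only for this demonstration.
(inner / "hello_world.py").write_text(
    "def main():\n    return 'hello from my_package2'\n"
)

sys.path.insert(0, str(root))  # stands in for PYTHONPATH=/packages
my_package1 = importlib.import_module("my_package1")

# hello_world is now reachable as an attribute of the outer package.
greeting = my_package1.hello_world.main()
print(greeting)
```

Note that this only exposes hello_world as an attribute after import my_package1 (so from my_package1 import hello_world works). The statement import my_package1.hello_world still looks for a submodule directly inside my_package1/ and does not find one.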
This solves my particular use-case, in a non-python way. Instead of putting /packages on the PYTHONPATH, I just wrote a script that puts all of its subdirectories on to the PYTHONPATH. This allows me to use all of the modules in the "normal" way, without having to install them, and without having the weird root __init__.py. It does not answer my question, however, and I feel like there must be a nice pythonic way to handle it.
Specifically, I put this at the end of my ~/.bashrc
LOCAL_PYTHON_PACKAGES="[my_package_directory]"
for d in $LOCAL_PYTHON_PACKAGES/*/ ; do
    export PYTHONPATH=$PYTHONPATH:$d
done
Of course, this only works for packages for which the actual source code is one directory down. It would be better to handle this in a package-specific format.
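One package-specific possibility, offered as a sketch rather than an established convention: the root-level __init__.py can append the inner source directory to the package's __path__, so that submodule lookups also search there. None of the answers here proposed this; the layout is recreated in a temporary directory, and the body of hello_world.py is made up for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Recreate packages/my_package/my_package/hello_world.py in a temp directory.
packages = Path(tempfile.mkdtemp())
project = packages / "my_package"  # project root (docs etc. would live here)
code = project / "my_package"      # the actual source directory
code.mkdir(parents=True)

# Root-level __init__.py: also search the inner directory for submodules.
(project / "__init__.py").write_text(
    "import os\n"
    "__path__.append(os.path.join(os.path.dirname(__file__), 'my_package'))\n"
)
(code / "__init__.py").write_text("")
# Hypothetical module body, only for this demonstration.
(code / "hello_world.py").write_text("def main():\n    return 'hello world'\n")

sys.path.insert(0, str(packages))  # stands in for PYTHONPATH=/packages
hello_world = importlib.import_module("my_package.hello_world")
print(hello_world.main())
```

With this approach the inner __init__.py is not executed when importing my_package.hello_world, so anything the package needs at import time would have to live in the root-level file.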
Related
I am trying to improve the project structure while adding to a code base. I found a sample structure here which looks like this:
README.rst
LICENSE
setup.py
requirements.txt
sample/__init__.py
sample/core.py
sample/helpers.py
docs/conf.py
docs/index.rst
tests/test_basic.py
tests/test_advanced.py
I notice in particular that requirements.txt and setup.py are on a higher level than tests/ and sample/
If I add sample/classes.py, I need only write from classes import MyClass in sample/core.py to get it in there. It cannot be imported so easily into tests/test_basic.py, however; Python does not seem to 'look around the corner' like that when importing.
In my case, there is also a MANIFEST.in on the same level with requirements.txt and some files which are not really python but just set things up for the platform on which this runs.
If classes.py were on the same level as requirements.txt, I think it would be easily importable by everything in tests/ and in sample/ and their subdirectories, but it may need an __init__.py. That doesn't feel right somehow.
So where should it go if both tests/ and sample/ need to be able to use it?
Let's make it easy.
If I understand correctly, the problem is how to import the sample module in a test, which means you want to use something like from sample.classes import MyClass.
That's easy: just add your root path to PYTHONPATH before executing python tests/test_basic.py.
That's also what an IDE does for you when you execute tests through it.
Assuming you use Python >= 3.3, you can simply turn the tests folder into a package by adding an __init__.py module to it. Then in that __init__.py (and only there) you add the path of the parent package to sys.path. That is enough for unittest discover to use it for all the modules in tests.
Mine is just:
import os
import sys
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
Then if you need to access classes.py from one of the test modules, you can just use:
from sample import classes
or to directly import MyClass:
from sample.classes import MyClass
It just works because sample is already a package, and its parent folder has been added to sys.path when python unittest has loaded the test package.
Of course, this only works if you can have your tests in a package. If for any reason that is not an option, for example because you need to run the test modules individually, then you should put the sys.path modification directly in all the test files.
Write a path_helper.py file in the tests folder:
import os
import sys
core_path = os.path.abspath(os.path.join(os.path.dirname(__file__), '..'))
if core_path not in sys.path:  # don't add it if it is already here
    sys.path.append(core_path)
You can then import it in all test files:
import path_helper
...
I'm trying to keep a data science project well-organized so I've created a directory inside my src directory called utils that contains a file called helpers.py, which contains some helper functions that will be used in many scripts. What is the best practice for how I should import func_name from src/utils/helpers.py into a file in a totally different directory, such as src/processing/clean_data.py?
I see answers to this question, and I've implemented a solution that works, but this feels ugly:
sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.realpath(__file__)))))
Am I doing this right? Do I need to add this to every script that wants to import func_name, like train_model.py?
My current project folder structure:
myproject
    /notebooks
        notebook.ipynb
    /src
        /processing
            clean_data.py
        /utils
            helpers.py
        /models
            train_model.py
        __init__.py
Example files:
# clean_data.py
import os
import sys
sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.realpath(__file__)))))
from src.utils.helpers import func_name
func_name()
# helpers.py
def func_name():
    print("I'm a helper function.")
The correct way to do it is to use __init__.py, setup.py and the setuptools Python package:
myPackage/
    myPackage/
        __init__.py
    setup.py
This link has all the steps.
First of all, let me describe the differences between a Python module and a Python package so that we are both on the same page.
A module is a single .py file (or files) imported under one import and used.
import aModuleName
# Here 'aModuleName' is just a regular .py file.
Whereas a package is a collection of modules in directories that give a package hierarchy. A package contains a distinct __init__.py file.
from aPackageName import aModuleName
# Here 'aPackageName' is a folder with an `__init__.py` file
# and 'aModuleName', which is just a regular .py file.
Therefore, when we have a project directory named proj-dir with the following structure:
proj-dir
--|--__init__.py
--package1
--|--__init__.py
--|--module1.py
--package2
--|--__init__.py
--|--module2.py
Notice that I've also added an empty __init__.py into the proj-dir itself, which makes it a package too.
Now, if you want to import any Python object from module2 of package2 into module1 of package1, then the import statement in the file module1.py would be
from package2.module2 import object2
# if you were to import the entire module2 then,
from package2 import module2
I hope this simple explanation clarifies your doubts about Python's import mechanism and solves the problem. If not, then do comment here.
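A small runnable sketch of that cross-package import. It recreates the proj-dir layout in a temporary directory; object2 is the placeholder name from the import statement above and its value is made up. The absolute import resolves here because proj-dir itself is put on sys.path, which is an assumption about how the project is run.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Recreate proj-dir with two sibling packages.
proj_dir = Path(tempfile.mkdtemp())
for pkg in ("package1", "package2"):
    (proj_dir / pkg).mkdir()
    (proj_dir / pkg / "__init__.py").write_text("")

# object2 is a placeholder; its value is invented for the demo.
(proj_dir / "package2" / "module2.py").write_text(
    "object2 = 'defined in module2'\n"
)
(proj_dir / "package1" / "module1.py").write_text(
    "from package2.module2 import object2\n"
)

# The import in module1 works because proj-dir is on sys.path.
sys.path.insert(0, str(proj_dir))
module1 = importlib.import_module("package1.module1")
print(module1.object2)
```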
First of all, let me clarify that importing an entire module when you are going to use only part of it is not a good idea. Instead, you can use from to import a specific function from a library/package. This keeps your namespace small and makes it clear what you use; note that Python still loads the whole module either way, so the gain is in readability rather than in memory or speed.
To know more refer these:
'import module' or 'from module import'
difference between import and from
Next, let us look into the solution.
Before starting with the solution, let me clarify the use of the __init__.py file. It tells the Python interpreter that the *.py files present there are importable, which means they are modules and are, or may be, part of a package.
So, if you have N subdirectories, you have to put an __init__.py file in each of them so that they can also be imported. Inside __init__.py you can also add additional information such as which paths should be included, default functions, variables, scope, etc. To learn about these, search for __init__.py or take some Python library and go through its __init__.py file. (Here lies the solution.)
More Info:
modules
Be pythonic
So, as stated by @Sushant Chaudhary, your project structure should look like this:
proj-dir
--|--__init__.py
--package1
--|--__init__.py
--|--module1.py
--package2
--|--__init__.py
--|--module2.py
So now, if I put an __init__.py file in my directories as above, will it be importable and work fine?
Yes and no.
Yes :
If you are importing the modules within that project/package directory.
for example in your case
you are importing package1.module1 in package2.module2 as from package1 import module1.
Here the base directory has to be visible to the submodules. Why? The project will run fine if you run the module from its own directory, i.e. inside package2 as python module2.py, but it will throw ModuleNotFoundError if you run the module from some other directory, i.e. any path other than package2, for example from proj-dir as python package2/module2.py. This is what is happening in your case: you are running the module from proj-dir.
So How to fix this?
1- You have to append basedir path to system path in module2.py as
import sys
dir_path = "/absolute/path/to/proj-dir"
sys.path.insert(0, dir_path)
So that module2 will be able to find package1 (and module1 inside it).
2- You have to add all the sub module paths in __init__.py file under proj-dir.
For example:
#__init__.py under lxml
# this is a package
def get_include():
    """
    Returns a list of header include paths (for lxml itself, libxml2
    and libxslt) needed to compile C code against lxml if it was built
    with statically linked libraries.
    """
    import os
    lxml_path = __path__[0]
    include_path = os.path.join(lxml_path, 'includes')
    includes = [include_path, lxml_path]
    for name in os.listdir(include_path):
        path = os.path.join(include_path, name)
        if os.path.isdir(path):
            includes.append(path)
    return includes
This is the __init__.py file of lxml (a Python library for parsing HTML and XML data). You can refer to any __init__.py file in any Python library that has submodules, e.g. (os, sys). I've mentioned lxml here because I thought it would be easy for you to understand. You can also check the __init__.py file in other libraries/packages. Each will have its own way of defining the path for submodules.
No
If you are trying to import the modules from outside the directory, then you have to export the module path through an environment variable so that other modules can find them. This can be done directly by appending the absolute path of the base dir to PYTHONPATH. (PATH is used to locate executables and does not affect Python's module search.)
To know more:
PATH variables in OS
PYTHONPATH variable
So to solve your problem, include the paths to all the submodules in the __init__.py file under proj-dir and add /absolute/path/to/proj-dir to PYTHONPATH.
I hope this answer explains the usage of __init__.py and solves your problem.
On Linux, you can just add the path to the parent folder of your src directory to ~/.local/lib/python3.6/site-packages/my_modules.pth. See Using .pth files. You can then import modules in src from anywhere on your system.
NB1: Replace python3.6 by any version of Python you want to use.
NB2: If you use Python2.7 (don't know for other versions), you will need to create __init__.py (empty) files in src/ and src/utils.
NB3: Any name.pth file is ok for my_modules.pth.
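A rough sketch of how a .pth file takes effect, using temporary directories so that the real site-packages is left untouched. The directory and module names (pth_demo_src, helpers_demo) are invented for the demo; site.addsitedir processes .pth files in the given directory much as Python does for site-packages at startup.

```python
import importlib
import site
import tempfile
from pathlib import Path

# Where per-user .pth files normally live for this interpreter.
print(site.getusersitepackages())

# Stand-ins for site-packages and for the parent folder of the source dir.
fake_site = Path(tempfile.mkdtemp())
project_root = Path(tempfile.mkdtemp())
(project_root / "pth_demo_src").mkdir()
(project_root / "pth_demo_src" / "helpers_demo.py").write_text("VALUE = 42\n")

# A .pth file is plain text with one directory path per line.
(fake_site / "my_modules.pth").write_text(f"{project_root}\n")

# Process the .pth file; the listed directory is appended to sys.path.
site.addsitedir(str(fake_site))

# The source directory is now importable from anywhere in this process.
helpers_demo = importlib.import_module("pth_demo_src.helpers_demo")
print(helpers_demo.VALUE)
```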
Yes, you can only import code from installed packages or from files in your working directory or subdirectories.
The way I see it, your problem would be solved if you had your module or package installed, like any other package one installs and then imports (numpy, xml, json, etc.).
I also have a package I constantly use in all my projects, utilities, and I know it's a pain with the importing.
Here is a description of how to package a Python application to make it pip-installable:
https://marthall.github.io/blog/how-to-package-a-python-app/
Navigate to your python installation folder
Navigate to lib
Navigate to site-packages
Make a new file called any_thing_you_want.pth
Type .../src/utils inside that file with your favorite text editor (this is the folder that contains helpers.py; a .pth entry has to be a directory, not a .py file)
Note: the ellipsis before src/utils will look something like: C:/Users/blahblahblah/python_folders/src... <- YOU DO NEED THIS!
This is a cheap way out, but it keeps the code clean and is the least complicated. The downside is that the .pth file needs a line for every folder your modules are in. Upside: works with Windows all the way up to Windows 10.
I am currently having a problem with python packaging and references to that.
My structure is as follows:
code/
    package/
        A/
            __init__.py
            a.py
            aa.py
        B/
            __init__.py
            b.py
            bb.py
        C/
            __init__.py
            b.py
            bb.py
        __init__.py #1
    documentation/
        ...
    other_stuff/
        ...
(All __init__.py are empty)
According to everything I have read, I should be able to reference and import things like this (in a.py):
from package.B.bb import whatever
However, this does not work. When I duplicate the outer __init__.py to the 'code' folder, I can import things like this, however:
from code.package.B.bb import whatever
This is obviously non-ideal for most actual uses.
What is it I can do to achieve my target behaviour? (I'm assuming it's something simple which I am just missing)
(Some more details: I am using Python 2.7 and PyCharm 4.03)
You have the parent directory of code listed on sys.path, but you need to have the code directory itself added to sys.path.
In other words, you need to have /full/path/for/code in sys.path, not just /full/path/for.
Note that Python automatically adds the current working directory or the parent directory of a script to sys.path; see the various options listed in the Command Line Interface Options documentation.
For example, a Python script located inside code, when run with python path/for/code/script.py will have the code directory added to sys.path for that run.
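To see this behaviour directly, here is a small sketch that writes a throwaway script.py into a temporary code directory, runs it from a different working directory, and reports the first sys.path entry. The file and directory names are only for the demo, and it assumes the interpreter is not started with options that suppress this entry (such as -P or -I).

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# A throwaway "code" directory containing a one-line script.
code_dir = Path(tempfile.mkdtemp()) / "code"
code_dir.mkdir()
script = code_dir / "script.py"
script.write_text("import sys\nprint(sys.path[0])\n")

# Run the script from a different working directory.
result = subprocess.run(
    [sys.executable, str(script)],
    cwd=tempfile.gettempdir(),
    capture_output=True,
    text=True,
    check=True,
)
first_entry = result.stdout.strip()
print(first_entry)

# The first entry is the script's directory, not the working directory.
is_script_dir = os.path.samefile(first_entry, code_dir)
print(is_script_dir)
```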
I like to keep my source code separate from test code. So, I have my project organized like this:
my_package/
    module1.py
    module2.py
tests/
    units/
        test_a.py
        test_b.py
    perf_tests.py
How should test_a.py import my_package?
Note: I've googled this (including SO) and am not satisfied with the answers:
I don't want to use setup.py, because I want to run from development; this is for testing, after all
I don't want to use symlinks or other hacks
I've tried sys.path.append('../') and sys.path.append(os.path.realpath('../')). Both result in ImportError: No module named my_package. Perhaps something similar can be done with PYTHONPATH - What is the syntax?
I do want to write a proper import statement which can find the correct files
First you have to include an __init__.py file inside the my_package folder so that Python recognizes this folder as a valid package. The __init__.py file can be empty, or contain just the one line pass, for example.
Then, you can do something like this in test_a.py:
import os
bkp = os.getcwd()
os.chdir(r'..\..')
import my_package
os.chdir(bkp)
Or use the other options with PYTHONPATH or sys.path.append().
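On the PYTHONPATH syntax asked about above: it is a list of directories (separated by : on Linux/macOS and ; on Windows) that Python adds to sys.path at startup. A relative entry such as '../' in sys.path is interpreted against the current working directory rather than the location of test_a.py, which is a likely reason the sys.path.append attempts failed. The sketch below recreates the layout in a temporary directory and runs test_a.py with PYTHONPATH pointing at the project root; the contents of module1.py and test_a.py are made up for the demo.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Recreate the project layout in a throwaway directory.
root = Path(tempfile.mkdtemp())
(root / "my_package").mkdir()
(root / "my_package" / "__init__.py").write_text("")
(root / "my_package" / "module1.py").write_text("ANSWER = 42\n")
(root / "tests" / "units").mkdir(parents=True)
test_a = root / "tests" / "units" / "test_a.py"
test_a.write_text("from my_package import module1\nprint(module1.ANSWER)\n")

# Equivalent of: PYTHONPATH=/path/to/project python tests/units/test_a.py
env = dict(os.environ, PYTHONPATH=str(root))
result = subprocess.run(
    [sys.executable, str(test_a)],
    env=env,
    capture_output=True,
    text=True,
)
print(result.returncode, result.stdout.strip())
```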
I have the following directory:
mydirectory
├── __init__.py
├── file1.py
└── file2.py
I have a function f defined in file1.py.
If, in file2.py, I do
from .file1 import f
I get the following error:
SystemError: Parent module '' not loaded, cannot perform relative import
Why? And how to make it work?
Launching modules inside a package as executables is a bad practice.
When you develop something you either build a library, which is intended to be imported by other programs and thus it doesn't make much sense to allow executing its submodules directly, or you build an executable in which case there's no reason to make it part of a package.
This is why in setup.py you distinguish between packages and scripts. The packages will go under site-packages while the scripts will be installed under /usr/bin (or similar location depending on the OS).
My recommendation is thus to use the following layout:
/
├── mydirectory
│   ├── __init__.py
│   └── file1.py
└── file2.py
Where file2.py imports file1.py as any other code that wants to use the library mydirectory, with an absolute import:
from mydirectory.file1 import f
When you write a setup.py script for the project you simply list mydirectory as a package and file2.py as a script and everything will work. No need to fiddle with sys.path.
If you ever, for some reason, really want to actually run a submodule of a package, the proper way to do it is to use the -m switch:
python -m mydirectory.file1
This loads the whole package and then executes the module as a script, allowing the relative import to succeed.
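A minimal sketch of the difference, using the layout from the question (file2.py inside the package, with a relative import of f). The package is created in a temporary directory and file2.py is launched both ways in a subprocess; the body of f is made up for the demo.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Recreate mydirectory/ with a relative import inside file2.py.
root = Path(tempfile.mkdtemp())
pkg = root / "mydirectory"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "file1.py").write_text("def f():\n    return 'f was called'\n")
(pkg / "file2.py").write_text("from .file1 import f\nprint(f())\n")


def run(*args):
    """Run the current interpreter with the given arguments from root."""
    return subprocess.run(
        [sys.executable, *args], cwd=root, capture_output=True, text=True
    )


# Launched as a plain script: the relative import has no parent package.
as_script = run(str(pkg / "file2.py"))
# Launched with -m: the package is imported first, so the import succeeds.
as_module = run("-m", "mydirectory.file2")

print("script exit code:", as_script.returncode)
print("module exit code:", as_module.returncode, as_module.stdout.strip())
```

The exact exception text for the failing case varies between Python versions, so the sketch only looks at the exit codes.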
I'd personally avoid doing this, partly because a lot of people don't even know you can do it and will end up getting the same error as you and think that the package is broken.
Regarding the currently accepted answer, which says that you should just use an implicit relative import from file1 import f because it will work since they are in the same directory:
This is wrong!
It will not work in python3 where implicit relative imports are disallowed and will surely break if you happen to have installed a file1 module (since it will be imported instead of your module!).
Even if it works the file1 will not be seen as part of the mydirectory package. This can matter.
For example if file1 uses pickle, the name of the package is important for proper loading/unloading of data.
When launching a python source file, it is forbidden to import another file, that is in the current package, using relative import.
In documentation it is said:
Note that relative imports are based on the name of the current module. Since the name of the main module is always "__main__", modules intended for use as the main module of a Python application must always use absolute imports.
So, as @mrKelley said, you need to use an absolute import in such a situation.
since file1 and file2 are in the same directory, you don't even need to have an __init__.py file. If you're going to be scaling up, then leave it there.
To import something in a file in the same directory, just do like this
from file1 import f
i.e., you don't need to do the relative path .file1 because they are in the same directory.
If your main function, script, or whatever, that will be running the whole application is in another directory, then you will have to make everything relative to wherever that is being executed.
myproject/
    mypackage
    ├── __init__.py
    ├── file1.py
    ├── file2.py
    └── file3.py
    mymainscript.py
Example to import from one file to another
#file1.py
from mypackage import file2
from mypackage.file3 import MyClass
Import the package example to the mainscript
#mymainscript.py
import mypackage
https://docs.python.org/3/tutorial/modules.html#packages
https://docs.python.org/3/reference/import.html#regular-packages
https://docs.python.org/3/reference/simple_stmts.html#the-import-statement
https://docs.python.org/3/glossary.html#term-import-path
The variable sys.path is a list of strings that determines the interpreter's search path for modules. It is initialized to a default path taken from the environment variable PYTHONPATH, or from a built-in default if PYTHONPATH is not set. You can modify it using standard list operations:
import sys
sys.path.append('/ufs/guido/lib/python')
sys.path.insert(0, '/ufs/guido/myhaxxlib/python')
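Inserting it at the beginning has the benefit of guaranteeing that the path is searched before the other entries (including the default ones) in the case of naming conflicts. Modules compiled into the interpreter, such as sys, are not found through sys.path and are not affected by this.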
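A short sketch of that ordering effect. Two temporary directories each contain a module with the same made-up name, shadow_demo; one directory is appended and the other inserted at position 0, and the import picks the one that comes first on sys.path.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Two directories, each with a module of the same (hypothetical) name.
appended_dir = Path(tempfile.mkdtemp())
inserted_dir = Path(tempfile.mkdtemp())
(appended_dir / "shadow_demo.py").write_text("ORIGIN = 'appended'\n")
(inserted_dir / "shadow_demo.py").write_text("ORIGIN = 'inserted'\n")

sys.path.append(str(appended_dir))     # searched after existing entries
sys.path.insert(0, str(inserted_dir))  # searched before everything else

# The first match on sys.path wins.
shadow_demo = importlib.import_module("shadow_demo")
print(shadow_demo.ORIGIN)
```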