Python package structure - python

I have a Python package with several subpackages.
myproject/
__init__.py
models/
__init__.py
...
controllers/
__init__.py
..
scripts/
__init__.py
myscript.py
Within myproject.scripts.myscript, how can I access myproject.models? I've tried
from myproject import models # No module named myproject
import models # No module named models
from .. import models # Attempted relative import in non-package
I've had to solve this before, but I can never remember how it's supposed to be done. It's just not intuitive to me.

This is the correct version:
from myproject import models
If it fails with ImportError: No module named foo it is because you haven't set PYTHONPATH to include the directory which contains myproject/.
I'm afraid other people will suggest tricks to let you avoid setting PYTHONPATH. I urge you to disregard them. This is why PYTHONPATH exists: to tell Python where to look for code to load. It is robust, reasonably well documented, and portable to many environments. Tricks people play to avoid having to set it are none of these things.
The explicit relative import will work even without PYTHONPATH being set, since it can just walk up the directory hierarchy until it finds the right place, it doesn't need to find the top and then walk down. However, it doesn't work in a script you pass as a command line argument to python (or equivalently, invoke directly with a #!/usr/bin/python line). This is because in both these cases, it becomes the __main__ module of the process. There's nowhere to walk up to from __main__ - it's already at the top! If you invoke the code in your script by importing that module, then it will be fine. That is, compare:
python myproject/scripts/myscript.py
to
python -c 'import myproject.scripts.myscript'
You can take advantage of this by not executing your script module directly, but creating a bin/myscript that does the import and perhaps calls a main function:
import myprojects.scripts.myscript
myprojects.scripts.myscript.main()
Compare to how Twisted's command line scripts are defined: http://twistedmatrix.com/trac/browser/trunk/bin/twistd

Your project is not in your path.
Option A
Install your package so that python can find it via its absolute name from anywhere (using from myproject import models )
Option B
Trickery to add the relative parent to your path
sys.path.append(os.path.abspath('..'))
The former option is recommended.

Related

Resolving module conflict in python [duplicate]

Okay, the scenario is very simple. I have this file structure:
.
├── interface.py
├── pkg
│   ├── __init__.py
│   ├── mod1.py
│   ├── mod2.py
Now, these are my conditions:
mod2 needs to import mod1.
both interface.py and mod2 needs to be run independently as a main script. If you want, think interface as the actual program and mod2 as an internal tester of the package.
So, in Python 2 I would simply do import mod1 inside mod2.py and both python2 mod2.py and python2 interface.py would work as expected.
However, and this is the part I less understand, using Python 3.5.2, if I do import mod1; then I can do python3 mod2.py, but python3 interface.py throws: ImportError: No module named 'mod1' :(
So, apparently, python 3 proposes to use import pkg.mod1 to avoid collisions against built-in modules. Ok, If I use that I can do python3 interface.py; but then I can't python3 mod2.py because: ImportError: No module named 'pkg'
Similarly, If I use relative import:
from . import mod1 then python3 interface.py works; but mod2.py says SystemError: Parent module '' not loaded, cannot perform relative import :( :(
The only "solution", I've found is to go up one folder and do python -m pkg.mod2 and then it works. But do we have to be adding the package prefix pkg to every import to other modules within that package? Even more, to run any scripts inside the package, do I have to remember to go one folder up and use the -m switch? That's the only way to go??
I'm confused. This scenario was pretty straightforward with python 2, but looks awkward in python 3.
UPDATE: I have upload those files with the (referred as "solution" above) working source code here: https://gitlab.com/Akronix/test_python3_packages. Note that I still don't like it, and looks much uglier than the python2 solution.
Related SO questions I've already read:
Python -- import the package in a module that is inside the same package
How to do relative imports in Python?
Absolute import module in same package
Related links:
https://docs.python.org/3.5/tutorial/modules.html
https://www.python.org/dev/peps/pep-0328/
https://www.python.org/dev/peps/pep-0366/
TLDR:
Run your code with python -m pkg.mod2.
Import your code with from . import mod1.
The only "solution", I've found is to go up one folder and do python -m pkg.mod2 and then it works.
Using the -m switch is indeed the "only" solution - it was already the only solution before. The old behaviour simply only ever worked out of sheer luck; it could be broken without even modifying your code.
Going "one folder up" merely adds your package to the search path. Installing your package or modifying the search path works as well. See below for details.
But do we have to be adding the package prefix pkg to every import to other modules within that package?
You must have a reference to your package - otherwise it is ambiguous which module you want. The package reference can be either absolute or relative.
A relative import is usually what you want. It saves writing pkg explicitly, making it easier to refactor and move modules.
# inside mod1.py
# import mod2 - this is wrong! It can pull in an arbitrary mod2 module
# these are correct, they uniquely identify the module
import pkg.mod2
from pkg import mod2
from . import mod2
from .mod2 import foo # if pkg.mod2.foo exists
Note that you can always use <import> as <name> to bind your import to a different name. For example, import pkg.mod2 as mod2 lets you work with just the module name.
Even more, to run any scripts inside the package, do I have to remember to go one folder up and use the -m switch? That's the only way to go??
If your package is properly installed, you can use the -m switch from anywhere. For example, you can always use python3 -m json.tool.
echo '{"json":"obj"}' | python -m json.tool
If your package is not installed (yet), you can set PYTHONPATH to its base directory. This includes your package in the search path, and allows the -m switch to find it properly.
If you are in the executable's directory, you can execute export PYTHONPATH="$(pwd)/.." to quickly mount the package for import.
I'm confused. This scenario was pretty straightforward with python 2, but looks awkward in python 3.
This scenario was basically broken in python 2. While it was straightforward in many cases, it was difficult or outright impossible to fix in any other cases.
The new behaviour is more awkward in the straightforward case, but robust and reliable in any case.
I had similar problem.
I solved it adding
import sys
sys.path.insert(0,".package_name")
into the __init__.py file in the package folder.

Python imports when scripts are run from different folder (ancestor)

I have a large repository with some fixed structure and I have extended it by some folders and python scripts to add extra functionality to it as a whole. The structure looks as follows:
toplevelfolder
featureA
someModuleA.py
__ init __.py
featureB
someModuleB.py
__ init __.py
application
__ init __.py
app.py
Now someModuleA.py and someModuleB.py can be invoked via app.py but at the same time also have be able to be invoked directly, however this invocation must come from the toplevelfolder for the relative paths in the file to resolve correctly, i.e. via python ./featureA/someModuleA.py.
This all works well, but now I need some function definitions from someModuleB in someModuleA and hence I want to import this module. I have tried both absolute and relative imports, but both fail with different errors, the absolute import with
from toplevelfolder.featureA import someModuleA as A
# ModuleNotFoundError: No module named 'toplevelfolder'
and the relative import with
from toplevelfolder.featureA import someModuleA as A
# ImportError: attempted relative import with no known parent package
Now I can see that the relative import would cause problems when python is invoked from the toplevelfolder, as .. would represent the latter's parent directory, rather than the parent directory of featureA. However, I cannot get a hold of the first error message, especially since toplevelfolder should not be a module but a package.
Is there another way to import in Python that I'm not aware of, if possibly without modifying PYTHONPATH or sys.path or something like that?
Not 100% sure on what the goal is here. My advice would be:
Identify clearly what you want your top level modules and packages to be.
Make all imports absolute.
Either:
make your project a real installable project, so that those top level modules and packages are installed in the environment's site-packages directory;
or make sure that the current working directory is the one containing the top level modules and packages.
Make sure to call your code via the executable module or package method instead of the script method, if the "entry point" you want to execute is part of a package
DO (executable module or package):
path/to/pythonX.Y -m toplevelpackage.module
path/to/pythonX.Y -m toplevelpackage.subpackage (assuming there is a toplevelpackage/subpackage/__main__.py file)
DON'T (script within a package):
path/to/pythonX.Y toplevelpackage/module.py
(Optional) Later on, once it all works well and everything is under control, you might decide to change some or all imports to relative. (If things are done right, I believe it could be possible to make it so that it is possible to call the executable modules from any level within the directory structure as the current working directory.)
References:
Old reference, possibly outdated, but assuming I interpreted it right, it says that running scripts that live in a package is an anti pattern, and one should use python -m package.module instead: https://mail.python.org/pipermail/python-3000/2007-April/006793.html -- https://www.python.org/dev/peps/pep-3122/
Don't include the top level directory. In featureB.someModuleB:
from featureA.someModuleA import two
Sample directory.
Try pasting this above your import:
import os,sys,inspect
currentdir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
parentdir = os.path.dirname(currentdir)
sys.path.insert(0,parentdir)
Then you should be able to import a file from the parent-folder.

Why can I import successfully without __init__.py?

What exactly is the use of __init__.py? Yes, I know this file makes a directory into an importable package. However, consider the following example:
project/
foo/
__init__.py
a.py
bar/
b.py
If I want to import a into b, I have to add following statement:
sys.path.append('/path_to_foo')
import foo.a
This will run successfully with or without __init__.py. However, if there is not an sys.path.append statement, a "no module" error will occur, with or without __init__.py. This makes it seem lik eonly the system path matters, and that __init__.py does not have any effect.
Why would this import work without __init__.py?
__init__.py has nothing to do with whether Python can find your package. You've run your code in such a way that your package isn't on the search path by default, but if you had run it differently or configured your PYTHONPATH differently, the sys.path.append would have been unnecessary.
__init__.py used to be necessary to create a package, and in most cases, you should still provide it. Since Python 3.3, though, a folder without an __init__.py can be considered part of an implicit namespace package, a feature for splitting a package across multiple directories.
During import processing, the import machinery will continue to
iterate over each directory in the parent path as it does in Python
3.2. While looking for a module or package named "foo", for each directory in the parent path:
If <directory>/foo/__init__.py is found, a regular package is imported and returned.
If not, but <directory>/foo.{py,pyc,so,pyd} is found, a module is imported and returned. The exact list of extension varies by platform
and whether the -O flag is specified. The list here is
representative.
If not, but <directory>/foo is found and is a directory, it is recorded and the scan continues with the next directory in the parent
path.
Otherwise the scan continues with the next directory in the parent path.
If the scan completes without returning a module or package, and at
least one directory was recorded, then a namespace package is created.
If you really want to avoid __init__.py for some reason, you don't sys.path. Rather, create a module object and set its __path__ to a list of directories.
if I want to import a into b, I have to add following statement:
No! You'd just say: import foo.a. All this is provided you run the entire package at once using python -m main.module where main.module is the entry point to your entire application. It imports all other modules, and the modules that import more modules will try to look for them from the root of this project. For instance, foo.bar.c will import as foo.bar.b
Then it seems that only the system path matters and init.py does not have any effect.
You need to modify sys.path only when you are importing modules from locations that are not in your project, or the places where python looks for libraries. __init__.py not only makes a folder look like a package, it also does a few more things like "export" objects to outside world (__all__)
When you import something it has to either:
Retrieve an already loaded module or
Load the module that was imported
When you do import foo and python finds a folder called foo in a folder on your sys.path then it will look in that folder for an __init__.py to be considered the top level module.
(Note that if the package is not on your sys.path then you would need to append it's location to be able to import it.)
If that is not present it will look for a __init__.pyc version possibly in the __pycache__ folder, if that is also missing then that folder foo is not considered a loadable python package. If no other options for foo are found then an ImportError is raised.
If you try deleting the __init__.pyc file as well you will see that the the initializer script for a package is indeed necessary.

Importing correctly with pytest

I just got set up to use pytest with Python 2.6. It has worked well so far with the exception of handling "import" statements: I can't seem to get pytest to respond to imports in the same way that my program does.
My directory structure is as follows:
src/
main.py
util.py
test/
test_util.py
geom/
vector.py
region.py
test/
test_vector.py
test_region.py
To run, I call python main.py from src/.
In main.py, I import both vector and region with
from geom.region import Region
from geom.vector import Vector
In vector.py, I import region with
from geom.region import Region
These all work fine when I run the code in a standard run. However, when I call "py.test" from src/, it consistently exits with import errors.
Some Problems and My Solution Attempts
My first problem was that, when running "test/test_foo.py", py.test could not "import foo.py" directly. I solved this by using the "imp" tool. In "test_util.py":
import imp
util = imp.load_source("util", "util.py")
This works great for many files. It also seems to imply that when pytest is running "path/test/test_foo.py" to test "path/foo.py", it is based in the directory "path".
However, this fails for "test_vector.py". Pytest can find and import the vector module, but it cannot locate any of vector's imports. The following imports (from "vector.py") both fail when using pytest:
from geom.region import *
from region import *
These both give errors of the form
ImportError: No module named [geom.region / region]
I don't know what to do next to solve this problem; my understanding of imports in Python is limited.
What is the proper way to handle imports when using pytest?
Edit: Extremely Hacky Solution
In vector.py, I changed the import statement from
from geom.region import Region
to simply
from region import Region
This makes the import relative to the directory of "vector.py".
Next, in "test/test_vector.py", I add the directory of "vector.py" to the path as follows:
import sys, os
sys.path.append(os.path.realpath(os.path.dirname(__file__)+"/.."))
This enables Python to find "../region.py" from "geom/test/test_vector.py".
This works, but it seems extremely problematic because I am adding a ton of new directories to the path. What I'm looking for is either
1) An import strategy that is compatible with pytest, or
2) An option in pytest that makes it compatible with my import strategy
So I am leaving this question open for answers of these kinds.
The issue here is that Pytest walks the filesystem to discover files that contain tests, but then needs to generate a module name that will cause import to load that file. (Remember, files are not modules.)
Pytest comes up with this test package name by finding the first directory at or above the level of the file that does not include an __init__.py file and declaring that the "basedir" for the module tree containing a module generated from this file. It then adds the basedir to sys.path and imports using the module name that will find that file relative to the basedir.
There are some implications of this of which you should beware:
The basepath may not match your intended basepath in which case the module will have a name that doesn't match what you would normally use. E.g., what you think of as geom.test.test_vector will actually be named just test_vector during the Pytest run because it found no __init__.py in src/geom/test/ and so added that directory to sys.path.
You may run into module naming collisions if two files in different directories have the same name. For example, lacking __init__.py files anywhere, adding geom/test/test_util.py will conflict with test/test_util.py because both are loaded as import test_util.py, with both test/ and geom/test/ in the path.
The system you're using here, without explicit __init__.py modules, is having Python create implicit namespace packages for your directories. (A package is a module with submodules.) Ideally we'd configure Pytest with a path from which it would also generate this, but it doesn't seem to know how to do that.
The easiest solution here is simply to add empty __init__.py files to all of the subdirectories under src/; this will cause Pytest to import everything using package/module names that start with directory names under src/.
The question How do I Pytest a project using PEP 420 namespace packages? discusses other solutions to this.
import looks in the following directories to find a module:
The home directory of the program. This is the directory of your root script. When you are running pytest your home directory is where it is installed (/usr/local/bin probably). No matter that you are running it from your src directory because the location of your pytest determines your home directory. That is the reason why it doesn't find the modules.
PYTHONPATH. This is an environment variable. You can set it from the command line of your operating system. In Linux/Unix systems you can do this by executing: 'export PYTHONPATH=/your/custom/path' If you wanted Python to find your modules from the test directory you should include the src path in this variable.
The standard libraries directory. This is the directory where all your libraries are installed.
There is a less common option using a pth file.
sys.path is the result of combining the home directory, PYTHONPATH and the standard libraries directory. What you are doing, modifying sys.path is correct. It is something I do regularly. You could try using PYTHONPATH if you don't like messing with sys.path
If you include an __init__.py file inside your tests directory, then when the program is looking to set a home directory it will walk 'upwards' until it finds one that does not contain an init file. In this case src/.
From here you can import by saying :
from geom.region import *
you must also make sure that you have an init file in any other subdirectories, such as the other nested test directory
I was wondering what to do about this problem too. After reading this post, and playing around a bit, I figured out an elegant solution. I created a file called "test_setup.py" and put the following code in it:
import sys, os
sys.path.append(os.path.dirname(os.path.abspath(__file__)))
I put this file in the top-level directory (such as src). When pytest is run from the top-level directory, it will run all test files including this one since the file is prefixed with "test". There are no tests in the file, but it is still run since it begins with "test".
The code will append the current directory name of the test_setup.py file to the system path within the test environment. This will be done only once, so there are not a bunch of things added to the path.
Then, from within any test function, you can import modules relative to that top-level folder (such as import geom.region) and it knows where to find it since the src directory was added to the path.
If you want to run a single test file (such as test_util.py) instead of all the files, you would use:
pytest test_setup.py test\test_util.py
This runs both the test_setup and test_util code so that the test_setup code can still be used.
Are so late to answer that question but usining python 3.9 or 3.10 u just need to add __init__.py folder in tests folders.
When u add this file python interprets this folders as a module.
Wold be like this
src/
main.py
util.py
test/
__init__.py
test_util.py
geom/
vector.py
region.py
test/
__init__.py
test_vector.py
test_region.py
so u just run pytest.
Sorry my poor english
Not the best solution, but maybe the fastest one:
cd path/python_folder
python -m pytest python_file.py

Python project and package directories layout

I created a project in python, and I'm curious about how packages work in python.
Here is my directory layout:
top-level dir
\ tests
__init__.py
\ examples
__init__.py
example.py
module.py
How would I go about including module.py in my example.py module. I know I could set PYTHONPATH to the top-level directory, but that doesn't seem like a good solution. This is how pydev gets around this problem, but I'd like a solution that didn't require updating environment variables.
I could put something at the top of the example.py to update sys.path like this:
from os import path
import sys
sys.path.append( path.dirname(path.abspath(path.dirname(__file__))) )
I don't think this is an appropriate solution either.
I feel like I'm missing some basic part of python packages. I'm testing this on python 2.6. If any further clarification is needed, please let me know.
Python packages are very simple: a package is any directory under any entry in sys.path that has an __init__.py file. However, a module is only considered to be IN a package if it is imported via a relative import such as import package.module or from package import module. Note that this means that in general, someone must set up sys.path to contain the directories above any package you want to be importable, whether via PYTHONPATH or otherwise.
The primary wrinkle is that main modules (those run directly from the command line) are always __main__ no matter their location. They therefore have to do absolute imports, and either rely on PYTHONPATH being set up, or munge sys.path themselves.
In your case, I'd recommend either having a small Python script that runs your examples after setting up the correct path. Say you put it in the top level directory:
#!/usr/bin/env python
import sys
import os.path
sys.path.append(os.path.dirname(__file__))
example = __import__("examples", globals(), locals(), sys.argv[1])
Then your example can do "import module".
Alternately, if module.py is meant to also be in a package, add the PARENT of your "top level directory" to sys.path and then use the from .. import module syntax in your example modules. Also, change the first parameter to the __import__ in the wrapper to "tldname.examples" where tldname is the name of your top-level directory.
If you don't want to rely on an "example runner" module, each example needs sys.path boilerplate, or you need a PYTHONPATH setting. Sorry.
http://docs.python.org/tutorial/modules.html#intra-package-references
from .. import module

Categories

Resources