Given the directory structure:
/home/user/python/mypackage/src/foo.py
/home/user/python/mypackage/tests
/home/user/python/mypackage/tests/fixtures
/home/user/python/mypackage/tests/fixtures/config.json.sample
/home/user/python/mypackage/tests/foo_tests.py
/home/user/python/mypackage/README.md
Where src contains the source code, and tests contains the unit tests, how do I set up a "package" so that the relative imports used in the unit tests located in tests/ can load classes in src/?
Similar questions: Python Relative Imports and Packages and Python: relative imports without packages or modules, but the first doesn't really answer my question (or I don't understand it) and the second relies on symlinks to hack it together.
I figured it out.
You have to have __init__.py in each of the folders like so:
/home/user/python/mypackage/src/__init__.py
/home/user/python/mypackage/src/Foo.py
/home/user/python/mypackage/tests
/home/user/python/mypackage/tests/fixtures
/home/user/python/mypackage/tests/fixtures/config.json.sample
/home/user/python/mypackage/tests/foo_test.py
/home/user/python/mypackage/tests/__init__.py
/home/user/python/mypackage/README.md
/home/user/python/mypackage/__init__.py
This tells python that we have "a package" in each of the directories including the top level directory. So, at this point, I have the following packages:
mypackage
mypackage.tests
mypackage.src
So, because python will only go "down into" directories, we have to execute the unit tests from the root of the top-most package, which in this case is:
/home/user/python/mypackage/
So, from here, I can execute python and tell it to execute the unittest module and then specify which tests I want it to perform by specifying the module using the command line options
python -m unittest tests.foo_test.TestFoo
This tells python:
Execute python and load the module unittest
Tell unittest to run the tests contained in the class TestFoo, which is in the file foo_test.py, which is in the tests directory.
Python is able to find it because __init__.py in each of these directories promotes them to a package that python and unittest can work with.
Lastly, foo_test.py must contain an import statement like:
from src import Foo
Because we are executing from the top level directory, AND we have packages setup for each of the subdirectories, the src package is available in the namespace, and can be loaded by a test.
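To make the steps above concrete, here is a small sketch that builds the same layout in a temporary directory and runs the unittest command from its root. The body of Foo and the test method are made up for illustration; only the file names and the command come from the answer.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Illustrative file contents (assumptions, not the asker's real code).
FOO_SOURCE = '''\
class Foo:
    def greet(self):
        return "hello"
'''

FOO_TEST_SOURCE = '''\
import unittest

from src import Foo  # resolves to the module src/Foo.py


class TestFoo(unittest.TestCase):
    def test_greet(self):
        self.assertEqual(Foo.Foo().greet(), "hello")
'''


def build_layout(root: Path) -> None:
    """Create src/ and tests/ as packages under root."""
    files = {
        "src/__init__.py": "",
        "src/Foo.py": FOO_SOURCE,
        "tests/__init__.py": "",
        "tests/foo_test.py": FOO_TEST_SOURCE,
    }
    for relative_path, content in files.items():
        target = root / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content)


with tempfile.TemporaryDirectory() as tmp:
    build_layout(Path(tmp))
    # Run from the project root so that src and tests are both importable.
    result = subprocess.run(
        [sys.executable, "-m", "unittest", "tests.foo_test.TestFoo"],
        cwd=tmp,
        capture_output=True,
        text=True,
    )

print(result.returncode)
print(result.stderr)  # unittest writes its report to stderr
```

Running the same command from inside tests/ instead of the root would fail, because src would no longer be reachable from the current directory.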
I have the following structure in my project;
project
src
├── A.py
└── B.py
tests
├── test_b.py
and in B.py I import A.py like this;
from A import foo
B.py works fine when I run it.
However when testing B.py in test_b.py I get an error saying
No module named A
I can make the test work with relative imports in B.py, but that fails when I run the module by itself.
Relative imports outside packages are a recipe for nightmares. Everything works fine when you develop and test in the source directory, and problems start to occur as soon as you want to use your code from a different directory.
The workaround: Consistently add the directory of __file__ in sys.path before your local imports. As sys.path is a writable list, it will work. You should at least try to not add the directory if it is already present...
The idiomatic way: If you need local imports, then you probably need a package. It may require some work, because packages are expected to be installed, but it is a big plus if you intend to deploy your code later. The downside is that a package must be started as a module (python -m x.y) and not as a plain script (python x/y.py). With your current structure, I would just add an empty __init__.py file in both the src and tests folders, and add a __main__.py file in src if you want to launch the package directly.
Then you should run everything (including tests and dev runs) from project: python -m src.B [params...]. Same thing for the tests: python -m tests.test_b. Or directly (as the test folder and files start with test): python -m unittest discover
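For the sys.path workaround mentioned above, a minimal sketch could look like this. The helper name add_script_dir_to_path is made up for illustration; in a real module you would pass __file__ rather than the placeholder path used here.

```python
import os
import sys


def add_script_dir_to_path(file_path: str) -> str:
    """Append the directory containing file_path to sys.path, but only once."""
    directory = os.path.dirname(os.path.abspath(file_path))
    if directory not in sys.path:
        sys.path.append(directory)
    return directory


# Placeholder path standing in for __file__, so the snippet also runs
# in an interactive session. The file does not need to exist.
placeholder = os.path.join(os.getcwd(), "hypothetical_src_dir", "B.py")

example_dir = add_script_dir_to_path(placeholder)
add_script_dir_to_path(placeholder)  # second call leaves sys.path unchanged

print(sys.path.count(example_dir))
```

The membership check is what keeps repeated imports from piling duplicate entries onto sys.path.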
The problem
I've found dozens of articles and tutorials about the basics of using import in Python, but none that would provide a comprehensive guide on setting up your own Python project with multiple packages.
This is my project's structure:
codename/
__init__.py
package1.py (has class1 and is a script)
package2.py (has class2)
package3.py (has function1 and is a script)
test/
__init__.py
test_package1.py (has unit tests for package1)
test_package3.py (has unit tests for package3)
How do I setup my imports to have the following requirements met (and do all of them make sense?):
class1, class2 and function1 are in namespace codename, i.e. this works:
import codename
obj = codename.class1()
codename.function1(obj)
they may be imported the same way using from codename import * or from codename import class1
function1 can easily access class1 (how?)
package1 and package2 are executable scripts
so are test_package1.py and test_package3.py
tests are also executable via python -m unittest discover
scripts are also executable via python -m codename.package1
For some reason I'm having trouble meeting all of these at once: when I fix one issue, another one pops up.
What have I tried?
Leaving codename/__init__.py empty satisfies almost all of the requirements, because everything works, but leaves names like class1 in their module's namespaces - whereas I want them imported into the package.
Adding from codename.package1 import class1 et al again satisfies most of the requirements, but I get a warning when executing the script via python -m codename.package1:
RuntimeWarning: 'codename.package2' found in sys.modules \
after import of package 'codename', but prior to execution of \
'codename.package2'; this may result in unpredictable behaviour
which sort of makes sense...
Running the script via python codename/package1.py functions, but I guess I would probably like both ways to work.
I ran into an answer to a similar question that stated that internal modules should not also be scripts, but I don't understand why we get the -m switch then? Anyway, extracting the mains into an external scripts directory works, but is it the only canonical way of setting all of this up?
1. You'll need to add the parent directory of codename/ to the PYTHONPATH environment variable (or write/use a setup.py file, or modify sys.path at runtime).
2. You'll need to import all names that you want to export in codename/__init__.py.
3. from .package1 import function1 if you write/use a setup.py file, otherwise from codename.package1 import function1.
4. You should use a setup.py file for scripts/executables since it makes everything much cleaner (and you'll need a setup.py file sooner or later anyway).
5. (and 6.) I would suggest using py.test; it will find all tests for you automagically (and can run them in parallel etc.).
7. That should work out-of-the-box, but if you've written a setup.py then you can run them from anywhere (and on any platform) as just package1.
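As a rough illustration of the first few points, the sketch below writes a tiny codename package into a temporary directory, re-exports the names in __init__.py, and then uses them through the package namespace. The bodies of class1 and function1 are invented for the example, and inserting the temporary directory into sys.path stands in for setting PYTHONPATH or installing the package.

```python
import importlib
import sys
import tempfile
from pathlib import Path

PACKAGE1_SOURCE = '''\
class class1:
    pass
'''

PACKAGE3_SOURCE = '''\
from .package1 import class1  # function1 reaches class1 via a relative import


def function1(obj):
    return isinstance(obj, class1)
'''

INIT_SOURCE = '''\
from .package1 import class1
from .package3 import function1
'''

with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / "codename"
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text(INIT_SOURCE)
    (package_dir / "package1.py").write_text(PACKAGE1_SOURCE)
    (package_dir / "package3.py").write_text(PACKAGE3_SOURCE)

    sys.path.insert(0, tmp)  # stand-in for PYTHONPATH or an installed package
    importlib.invalidate_caches()
    try:
        codename = importlib.import_module("codename")
        obj = codename.class1()
        outcome = codename.function1(obj)
    finally:
        sys.path.remove(tmp)

print(outcome)
```

Because __init__.py imports the names, callers never need to know which submodule defines them.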
What exactly is the use of __init__.py? Yes, I know this file makes a directory into an importable package. However, consider the following example:
project/
foo/
__init__.py
a.py
bar/
b.py
If I want to import a into b, I have to add following statement:
sys.path.append('/path_to_foo')
import foo.a
This will run successfully with or without __init__.py. However, if there is no sys.path.append statement, a "no module" error will occur, with or without __init__.py. This makes it seem like only the system path matters, and that __init__.py does not have any effect.
Why would this import work without __init__.py?
__init__.py has nothing to do with whether Python can find your package. You've run your code in such a way that your package isn't on the search path by default, but if you had run it differently or configured your PYTHONPATH differently, the sys.path.append would have been unnecessary.
__init__.py used to be necessary to create a package, and in most cases, you should still provide it. Since Python 3.3, though, a folder without an __init__.py can be considered part of an implicit namespace package, a feature for splitting a package across multiple directories.
During import processing, the import machinery will continue to
iterate over each directory in the parent path as it does in Python
3.2. While looking for a module or package named "foo", for each directory in the parent path:
If <directory>/foo/__init__.py is found, a regular package is imported and returned.
If not, but <directory>/foo.{py,pyc,so,pyd} is found, a module is imported and returned. The exact list of extension varies by platform
and whether the -O flag is specified. The list here is
representative.
If not, but <directory>/foo is found and is a directory, it is recorded and the scan continues with the next directory in the parent
path.
Otherwise the scan continues with the next directory in the parent path.
If the scan completes without returning a module or package, and at
least one directory was recorded, then a namespace package is created.
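The last rule in that quote can be observed directly. The sketch below (assuming Python 3.3 or later) creates a directory with a module but no __init__.py, puts its parent on sys.path, and imports from it. The name nsdemo is hypothetical and chosen only to avoid clashing with anything installed.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / "nsdemo"  # note: no __init__.py is created
    package_dir.mkdir()
    (package_dir / "a.py").write_text("VALUE = 42\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        module_a = importlib.import_module("nsdemo.a")
        package = sys.modules["nsdemo"]
        # A namespace package has a __path__ but no file of its own.
        has_regular_init = getattr(package, "__file__", None) is not None
        search_path = list(package.__path__)
    finally:
        sys.path.remove(tmp)

print(module_a.VALUE, has_regular_init, len(search_path))
```

Without the sys.path.insert line the import fails regardless of whether __init__.py exists, which is the behaviour the question observed.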
If you really want to avoid __init__.py for some reason, you don't need sys.path. Rather, create a module object and set its __path__ to a list of directories.
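A minimal sketch of that last suggestion, assuming the standard import machinery: a module object built by hand and registered in sys.modules acts as the package, and its __path__ tells the importer where to look for submodules. The names handmade and anywhere are made up.

```python
import importlib
import sys
import tempfile
import types
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    source_dir = Path(tmp) / "anywhere"  # not on sys.path, no __init__.py
    source_dir.mkdir()
    (source_dir / "a.py").write_text("VALUE = 'loaded'\n")

    # Build the package object by hand and tell the importer where to search.
    package = types.ModuleType("handmade")
    package.__path__ = [str(source_dir)]
    sys.modules["handmade"] = package
    importlib.invalidate_caches()

    submodule = importlib.import_module("handmade.a")
    loaded_value = submodule.VALUE

print(loaded_value)
```

This is mostly a curiosity; for ordinary projects an __init__.py file is far less surprising to other readers.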
if I want to import a into b, I have to add following statement:
No! You'd just say: import foo.a. All this is provided you run the entire package at once using python -m main.module, where main.module is the entry point to your entire application. It imports all other modules, and the modules that import more modules will look for them from the root of this project. For instance, foo.bar.c would import its sibling b as foo.bar.b.
Then it seems that only the system path matters and __init__.py does not have any effect.
You need to modify sys.path only when you are importing modules from locations that are not in your project, or the places where python looks for libraries. __init__.py not only makes a folder look like a package, it also does a few more things like "export" objects to outside world (__all__)
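To illustrate the __all__ point, here is a small sketch in which an __init__.py lists one name in __all__ and a star import then picks up only that name. The package name exportdemo and both variables are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

INIT_SOURCE = '''\
__all__ = ["public_name"]

public_name = 1
other_name = 2
'''

with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / "exportdemo"
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text(INIT_SOURCE)

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        star_namespace = {}
        # Star imports are only allowed at module level, hence the exec.
        exec("from exportdemo import *", star_namespace)
    finally:
        sys.path.remove(tmp)

# exec also adds __builtins__ to the dict, so filter out dunder names.
exported = sorted(name for name in star_namespace if not name.startswith("__"))
print(exported)
```

Note that other_name is still reachable as exportdemo.other_name; __all__ only affects what a star import copies.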
When you import something it has to either:
Retrieve an already loaded module or
Load the module that was imported
When you do import foo and python finds a folder called foo in a folder on your sys.path then it will look in that folder for an __init__.py to be considered the top level module.
(Note that if the package is not on your sys.path then you would need to append its location to be able to import it.)
If that is not present it will look for a __init__.pyc version possibly in the __pycache__ folder, if that is also missing then that folder foo is not considered a loadable python package. If no other options for foo are found then an ImportError is raised.
If you try deleting the __init__.pyc file as well you will see that the initializer script for a package is indeed necessary.
I just got set up to use pytest with Python 2.6. It has worked well so far with the exception of handling "import" statements: I can't seem to get pytest to respond to imports in the same way that my program does.
My directory structure is as follows:
src/
main.py
util.py
test/
test_util.py
geom/
vector.py
region.py
test/
test_vector.py
test_region.py
To run, I call python main.py from src/.
In main.py, I import both vector and region with
from geom.region import Region
from geom.vector import Vector
In vector.py, I import region with
from geom.region import Region
These all work fine when I run the code in a standard run. However, when I call "py.test" from src/, it consistently exits with import errors.
Some Problems and My Solution Attempts
My first problem was that, when running "test/test_foo.py", py.test could not "import foo.py" directly. I solved this by using the "imp" tool. In "test_util.py":
import imp
util = imp.load_source("util", "util.py")
This works great for many files. It also seems to imply that when pytest is running "path/test/test_foo.py" to test "path/foo.py", it is based in the directory "path".
However, this fails for "test_vector.py". Pytest can find and import the vector module, but it cannot locate any of vector's imports. The following imports (from "vector.py") both fail when using pytest:
from geom.region import *
from region import *
These both give errors of the form
ImportError: No module named [geom.region / region]
I don't know what to do next to solve this problem; my understanding of imports in Python is limited.
What is the proper way to handle imports when using pytest?
Edit: Extremely Hacky Solution
In vector.py, I changed the import statement from
from geom.region import Region
to simply
from region import Region
This makes the import relative to the directory of "vector.py".
Next, in "test/test_vector.py", I add the directory of "vector.py" to the path as follows:
import sys, os
sys.path.append(os.path.realpath(os.path.dirname(__file__)+"/.."))
This enables Python to find "../region.py" from "geom/test/test_vector.py".
This works, but it seems extremely problematic because I am adding a ton of new directories to the path. What I'm looking for is either
1) An import strategy that is compatible with pytest, or
2) An option in pytest that makes it compatible with my import strategy
So I am leaving this question open for answers of these kinds.
The issue here is that Pytest walks the filesystem to discover files that contain tests, but then needs to generate a module name that will cause import to load that file. (Remember, files are not modules.)
Pytest comes up with this test package name by finding the first directory at or above the level of the file that does not include an __init__.py file and declaring that the "basedir" for the module tree containing a module generated from this file. It then adds the basedir to sys.path and imports using the module name that will find that file relative to the basedir.
There are some implications of this of which you should beware:
The basepath may not match your intended basepath in which case the module will have a name that doesn't match what you would normally use. E.g., what you think of as geom.test.test_vector will actually be named just test_vector during the Pytest run because it found no __init__.py in src/geom/test/ and so added that directory to sys.path.
You may run into module naming collisions if two files in different directories have the same name. For example, lacking __init__.py files anywhere, adding geom/test/test_util.py will conflict with test/test_util.py because both are loaded as import test_util, with both test/ and geom/test/ in the path.
The system you're using here, without explicit __init__.py modules, is having Python create implicit namespace packages for your directories. (A package is a module with submodules.) Ideally we'd configure Pytest with a path from which it would also generate this, but it doesn't seem to know how to do that.
The easiest solution here is simply to add empty __init__.py files to all of the subdirectories under src/; this will cause Pytest to import everything using package/module names that start with directory names under src/.
The question How do I Pytest a project using PEP 420 namespace packages? discusses other solutions to this.
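The basedir rule described above can be sketched in a few lines. This is an emulation of the rule as stated in this answer, not pytest's own code, and the function name find_basedir_and_module_name is made up.

```python
import tempfile
from pathlib import Path


def find_basedir_and_module_name(test_file: Path) -> tuple:
    """Climb from the test file until a directory without __init__.py is found."""
    directory = test_file.parent
    parts = [test_file.stem]
    while (directory / "__init__.py").is_file():
        # This directory is a regular package, so it becomes part of the name.
        parts.insert(0, directory.name)
        directory = directory.parent
    return directory, ".".join(parts)


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src"
    test_dir = src / "geom" / "test"
    test_dir.mkdir(parents=True)
    test_file = test_dir / "test_vector.py"
    test_file.write_text("")

    # No __init__.py anywhere: the test directory itself becomes the basedir.
    basedir, name = find_basedir_and_module_name(test_file)
    without_init = (basedir == test_dir, name)

    # With __init__.py in geom/ and geom/test/, the basedir moves up to src/.
    (src / "geom" / "__init__.py").write_text("")
    (test_dir / "__init__.py").write_text("")
    basedir, name = find_basedir_and_module_name(test_file)
    with_init = (basedir == src, name)

print(without_init, with_init)
```

In the second case src/ ends up as the basedir, which is why imports such as from geom.region import Region start resolving.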
import looks in the following directories to find a module:
The home directory of the program. This is the directory of your root script. When you are running pytest your home directory is where it is installed (/usr/local/bin probably). No matter that you are running it from your src directory because the location of your pytest determines your home directory. That is the reason why it doesn't find the modules.
PYTHONPATH. This is an environment variable. You can set it from the command line of your operating system. In Linux/Unix systems you can do this by executing: 'export PYTHONPATH=/your/custom/path' If you wanted Python to find your modules from the test directory you should include the src path in this variable.
The standard libraries directory. This is the directory where all your libraries are installed.
There is a less common option using a pth file.
sys.path is the result of combining the home directory, PYTHONPATH and the standard libraries directory. What you are doing, modifying sys.path is correct. It is something I do regularly. You could try using PYTHONPATH if you don't like messing with sys.path
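For a plain script run, two of these sources can be observed with a short experiment. The sketch below writes a throwaway script that prints sys.path, runs it in a child interpreter with PYTHONPATH set, and checks which directories show up. The file and directory names are placeholders.

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "show_path.py")
    with open(script, "w") as handle:
        handle.write("import sys\nprint('\\n'.join(sys.path))\n")

    extra_dir = os.path.join(tmp, "extra")
    os.mkdir(extra_dir)

    # Same environment as the parent, plus one PYTHONPATH entry.
    env = dict(os.environ, PYTHONPATH=extra_dir)
    output = subprocess.run(
        [sys.executable, script],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    ).stdout

    # Resolve symlinks on both sides before comparing.
    child_path = [os.path.realpath(entry) for entry in output.splitlines() if entry]
    script_dir_present = os.path.realpath(tmp) in child_path
    pythonpath_present = os.path.realpath(extra_dir) in child_path

print(script_dir_present, pythonpath_present)
```

The remaining entries printed by the child are the standard library and site-packages directories of that interpreter.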
If you include an __init__.py file inside your tests directory, then when the program is looking to set a home directory it will walk 'upwards' until it finds one that does not contain an init file. In this case src/.
From here you can import by saying:
from geom.region import *
You must also make sure that you have an __init__.py file in any other subdirectories, such as the other nested test directory.
I was wondering what to do about this problem too. After reading this post, and playing around a bit, I figured out an elegant solution. I created a file called "test_setup.py" and put the following code in it:
import sys, os
sys.path.append(os.path.dirname(os.path.abspath(__file__)))
I put this file in the top-level directory (such as src). When pytest is run from the top-level directory, it will run all test files including this one since the file is prefixed with "test". There are no tests in the file, but it is still run since it begins with "test".
The code will append the current directory name of the test_setup.py file to the system path within the test environment. This will be done only once, so there are not a bunch of things added to the path.
Then, from within any test function, you can import modules relative to that top-level folder (such as import geom.region) and it knows where to find it since the src directory was added to the path.
If you want to run a single test file (such as test_util.py) instead of all the files, you would use:
pytest test_setup.py test\test_util.py
This runs both the test_setup and test_util code so that the test_setup code can still be used.
I'm late to answer this question, but with Python 3.9 or 3.10 you just need to add an __init__.py file to the test folders.
When you add this file, Python treats those folders as packages.
It would look like this:
src/
main.py
util.py
test/
__init__.py
test_util.py
geom/
vector.py
region.py
test/
__init__.py
test_vector.py
test_region.py
Then you just run pytest.
Sorry for my poor English.
Not the best solution, but maybe the fastest one:
cd path/python_folder
python -m pytest python_file.py
I have some problems in structuring my python project. Currently it is a bunch of files in the same folder. I have tried to structure it like
proj/
__init__.py
foo.py
...
bar/
__init__.py
foobar.py
...
tests/
foo_test.py
foobar_test.py
...
The problem is that I'm not able, from inner directories, to import modules from the outer directories. This is particularly annoying with tests.
I have read PEP 328 about relative imports and PEP 366 about relative imports from the main module. But both these methods require the base package to be in my PYTHONPATH. Indeed I obtain the following error
ValueError: Attempted relative import in non-package.
So I added the following boilerplate code on top of the test files
import os, sys
sys.path.append(os.path.join(os.getcwd(), os.path.pardir))
Still I get the same error. What is the correct way to
structure a package, complete with tests, and
add the base directory to the path to allow imports?
EDIT As requested in the comment, I add an example import that fails (in the file foo_test.py)
import os, sys
sys.path.append(os.path.join(os.getcwd(), os.path.pardir))
from ..foo import Foo
When you use the -m switch to run code, the current directory is added to sys.path. So the easiest way to run your tests is from the parent directory of proj, using the command:
python -m proj.tests.foo_test
To make that work, you will need to include an __init__.py file in your tests directory so that the tests are correctly recognised as part of the package.
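Here is a small sketch of that setup, built in a temporary directory so it can be tried in isolation. The contents of foo.py and the test case are invented; the layout, the relative import, and the command are the ones discussed above.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

FOO_SOURCE = '''\
class Foo:
    name = "foo"
'''

FOO_TEST_SOURCE = '''\
import unittest

from ..foo import Foo  # works because proj.tests is part of the proj package


class FooTest(unittest.TestCase):
    def test_name(self):
        self.assertEqual(Foo.name, "foo")


if __name__ == "__main__":
    unittest.main()
'''

FILES = {
    "proj/__init__.py": "",
    "proj/foo.py": FOO_SOURCE,
    "proj/tests/__init__.py": "",
    "proj/tests/foo_test.py": FOO_TEST_SOURCE,
}

with tempfile.TemporaryDirectory() as tmp:
    for relative_path, content in FILES.items():
        target = Path(tmp) / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content)

    # Run from the parent directory of proj, as described above.
    completed = subprocess.run(
        [sys.executable, "-m", "proj.tests.foo_test"],
        cwd=tmp,
        capture_output=True,
        text=True,
    )

print(completed.returncode)
print(completed.stderr)  # unittest writes its report to stderr
```

Running the same file as python proj/tests/foo_test.py would bring back the relative import error, because the file would then be executed as a top-level script with no parent package.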
I like to import modules using the full proj.NAME package prefix whenever possible. This is the approach the Google Python styleguide recommends.
One option to allow you to keep your package structure, use full package paths, and still move forward with development would be to use a virtualenv and put your project in develop mode. Your project's setup.py will need to use setuptools instead of distutils, to get the develop command.
This will let you avoid the sys.path.append stuff above:
% virtualenv ~/virt
% . ~/virt/bin/activate
(virt)~% cd ~/myproject
(virt)~/myproject% python setup.py develop
(virt)~/myproject% python tests/foo_test.py
Where foo_test.py uses:
from proj.foo import Foo
Now when you run python from within your virtualenv your PYTHONPATH will point to all of the packages in your project. You can create a shorter shell alias to enter your virtualenv without having to type . ~/virt/bin/activate every time.