Python project file hierarchy:
parent/
    __init__.py
    one/
        __init__.py
        bar.py
    two/
        __init__.py
        foo.py
foo.py
from one import bar
I tried to run foo.py from a terminal in another directory (e.g. users/user) and got the following error:
No module named one
I guess that when I run foo.py, Python tries to import from the directory the code was executed from. I tried many approaches before finding one that works, but it is not elegant, and I hope there is a better solution.
foo.py
from pathlib import Path
import sys
sys.path.append(str(Path(__file__).parent.parent))
sys.path.append("..")
from one import bar
This solution is not elegant because it prevents me from putting all the imports at the top of the file.
The fact that you have an __init__.py in the parent directory suggests that parent is part of your package structure and that its parent directory, whatever that might be, should be on sys.path. Therefore your import should really be:
from parent.one import bar
It can be useful for an application directory structure to have a single root. Then the __init__.py in that single root package can be used to load modules from subpackages, but this is certainly not a requirement. If that was not your intention, then you should probably delete the __init__.py that is in parent, as it is serving no purpose (and is confusing), and ensure that the directory parent is on sys.path.
HOWEVER: As long as the directory Python puts at the front of sys.path is the parent directory of the root(s) of your package structure, Python should be able to find your packages with no special action on your part. That directory is the current directory when you start the interpreter interactively or with python -m, and the directory containing the script when you run python path/to/script.py. If that is inconvenient, you can set the environment variable PYTHONPATH.
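To check which directory actually has to be searched, a small sketch like the one below may help. It asks importlib.machinery.PathFinder whether a top-level name can be found in a given list of directories. The helper name can_find and the throwaway copy of the layout are made up for illustration; they are not part of the question.

```python
import tempfile
from importlib.machinery import PathFinder
from pathlib import Path


def can_find(top_level_name, search_dirs):
    """Return True if `top_level_name` can be found in `search_dirs`."""
    # PathFinder.find_spec searches only the directories given, not sys.path
    spec = PathFinder.find_spec(top_level_name, [str(d) for d in search_dirs])
    return spec is not None


# Build a throwaway copy of the layout from the question
root = Path(tempfile.mkdtemp())
for pkg in ("parent", "parent/one", "parent/two"):
    (root / pkg).mkdir()
    (root / pkg / "__init__.py").write_text("")
(root / "parent" / "one" / "bar.py").write_text("")

# 'one' is found only when the directory `parent` itself is searched
found_one_from_parent = can_find("one", [root / "parent"])
found_one_from_root = can_find("one", [root])
# 'parent' is found when the directory that contains parent is searched
found_parent_from_root = can_find("parent", [root])
```

So "from one import bar" needs parent/ on the path, while "from parent.one import bar" needs the directory above it.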
So, determine whether you should be changing your import statement or not based on which directories are part of your package structure. Then you should arrange for Python to find your packages either by setting the current directory, PYTHONPATH, or sys.path to the required directory -- but do this once. If you have to set sys.path, I would do this in your main program at startup before it needs to include anything:
If foo.py is your main program, then at the top of the program I would have:
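if __name__ == '__main__':
    from pathlib import Path
    import sys
    # foo.py lives in parent/two/, so .parent is two/ and .parent.parent is parent/
    # if your import statement is: from parent.one import bar, then add the
    # directory that contains parent/:
    sys.path.insert(0, str(Path(__file__).parent.parent.parent))
    """
    # if your import statement is: from one import bar, then add parent/ itself:
    sys.path.insert(0, str(Path(__file__).parent.parent))
    """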
Why don't you let the parent act as a path provider for the child by creating a path dictionary? Like this:
class parent:
    ...
    def createPathDict(self):
        self.path_dict = {}
        self.path_dict['parent'] = self.parentPath
        self.path_dict['one'] = os.path.join(self.parentPath, 'one')
        self.path_dict['two'] = os.path.join(self.parentPath, 'two')
        # self.path_dict['three'] = ...
        # ...
From the child 'two' you use the dictionary like this (I assume you use classes):
class foo:
    def __init__(self, parent):
        self.parent = parent
    def addPathsToPythDirs(self):
        sys.path.insert(1, self.parent.path_dict['one'])  # better
        # sys.path.insert(0, self.parent.path_dict[key])
        ...
In that way you could keep your imports in foo.py
Why use sys.path.append(path) instead of sys.path.insert(1, path)?
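On the append versus insert question, the practical difference is precedence: entries nearer the front of sys.path are searched first. Here is a minimal sketch, assuming two throwaway directories that each contain a module with the made-up name dup_demo and that no other dup_demo exists on your path:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Two throwaway directories, each with its own copy of a module named dup_demo
first_dir = Path(tempfile.mkdtemp())
second_dir = Path(tempfile.mkdtemp())
(first_dir / "dup_demo.py").write_text("WHERE = 'first_dir'\n")
(second_dir / "dup_demo.py").write_text("WHERE = 'second_dir'\n")


def import_fresh(name):
    """Import `name`, ignoring any previously cached copy."""
    sys.modules.pop(name, None)
    importlib.invalidate_caches()
    return importlib.import_module(name)


saved_path = list(sys.path)
try:
    sys.path.append(str(first_dir))      # searched last
    sys.path.insert(1, str(second_dir))  # searched right after sys.path[0]
    winner = import_fresh("dup_demo").WHERE
finally:
    sys.path[:] = saved_path  # leave sys.path as we found it
```

Here winner comes from second_dir, because the inserted entry is searched before the appended one. Using index 1 rather than 0 leaves sys.path[0] (the script directory) untouched.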
Suppose you get a pathlib.Path that points to a *.py file.
And also suppose that this is a resource you could import in another Python file, given the appropriate import path, because your sys.path allows for it.
How do you determine the dotted import path to use in import, from just the python file?
Unlike most of the import-related questions, this is NOT about the path from one Python file to another within a directory hierarchy, it is really more about the import path you could specify in the REPL from anywhere to import that module and that's affected by sys.path contents.
Example:
$test_366_importpath$ tree -I __pycache__
.
└── sub
    └── somemodule.py
somemodule.py
"some module"
class Foo:
    "Foo class"
If I start python at that location, because sys.path gets the current directory, this works:
from sub.somemodule import Foo
sub.somemodule is what I am interested in.
However, if the sys.path gets altered, then I can use a different import path.
import sys
sys.path.insert(0, "/Users/me/explore/test_366_importpath/sub")
from somemodule import Foo
(Note: I wouldn't be doing this "for real", neither the sys.path.insert nor varying the dotted path I'd use; see @CryptoFool's comment. This is just a convenient way to show the impact of sys.path.)
Question:
How do I determine, programmatically, that import sub.somemodule needs to be used as the dotted path? Or import somemodule given different sys.path conditions?
Raising an ImportError or ValueError or some other exceptions if the *.py file is not importable is perfectly OK.
I'm writing a helper using
pa_script = pathlib.Path("somemodule.py").absolute().resolve() and then looking at sys.path. Once I find that a given sys.path entry is the parent for the pa_script, I can use pa_script.relative_to(parent).
From there it's trivial to get the import path by removing the .py extension and replacing os.sep with a dot (.).
Then I can feed that dotted path to importlib. Or paste into my code editor.
It's a bit tricky but not particularly hard. It makes me wonder, however, whether there is a builtin or canonical way.
I can post my code, but if there is a canonical way to do it, posting it would just give the wrong impression that these complicated steps are necessary.
Well here goes then, in case anyone needs something similar
(for Python 3.10+, but removing typehints should make it work down to much earlier 3.x versions)
from pathlib import Path
import sys
import os


def get_dotted_path(path_to_py: str | Path, paths: list[str] | None = None) -> str:
    """
    return a dotted-path import string from a Python filename

    if given, `paths` will be examined as if it was `sys.path` else
    `sys.path` is used to determine import points (this is to compute paths
    assuming a different sys.path context than the current one)

    example:
    .../lib/python3.10/collections/__init__.py => "collections"
    .../lib/python3.10/collections/abc.py => "collections.abc"

    raises ImportError if the Python script is not in sys.path or paths
    """
    parent = None
    pa_target = Path(path_to_py)
    paths = paths or sys.path

    # get the full file path AND resolve if it's a symlink
    pa_script = pa_target.resolve()

    # consider pkg/subpk/__init__.py as pkg/subpk
    if pa_script.name == "__init__.py":
        pa_script = pa_script.parent

    for path in paths:
        # resolve each entry too, so that relative entries such as "" (the
        # current directory) compare correctly against the absolute script path
        pa_path = Path(path).resolve()
        if pa_path in pa_script.parents:
            parent = pa_path
            break
    else:
        newline = "\n"
        raise ImportError(
            f"{pa_script} nowhere in sys.path: {newline.join([''] + list(paths))}"
        )

    pa_relative = pa_script.relative_to(parent)
    res = str(pa_relative).removesuffix(".py").replace(os.sep, ".")
    return res
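For completeness, here is a rough, self-contained sketch of the last step, feeding the dotted path to importlib. It rebuilds the sub/somemodule.py tree from the question in a throwaway directory. The helper dotted_relative_to is a simplified stand-in written for this example only (one path entry, no __init__.py handling), not a replacement for the function above.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Recreate the example tree in a throwaway directory
root = Path(tempfile.mkdtemp()).resolve()
(root / "sub").mkdir()
(root / "sub" / "somemodule.py").write_text('class Foo:\n    "Foo class"\n')
target = root / "sub" / "somemodule.py"


def dotted_relative_to(py_file, path_entry):
    """Dotted module name of `py_file` as seen from one sys.path entry."""
    relative = py_file.relative_to(path_entry).with_suffix("")
    return ".".join(relative.parts)


# The same file gets a different dotted name depending on the path entry
name_from_root = dotted_relative_to(target, root)
name_from_sub = dotted_relative_to(target, root / "sub")

# Import it under the name that matches the entry we put on sys.path
sys.path.insert(0, str(root))
try:
    importlib.invalidate_caches()
    module = importlib.import_module(name_from_root)
finally:
    sys.path.remove(str(root))
```

With root on the path the name is sub.somemodule; with root/sub on the path it would be somemodule.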
The project has the structure shown below. I'm trying to import mod.py from index.py:
from .. import mod
However, it gives the error "ImportError: attempted relative import with no known parent package". If I use this option instead:
from pack1 import mod
then I get the error "ModuleNotFoundError: No module named 'pack1'".
PROJECT/
    pack1/
        __init__.py
        mod.py
    pack2/
        __init__.py
        index.py
What is the problem?
This is a recurring question on StackOverflow. Much of the confusion (in my opinion) comes from the fact that how Python interprets the files and folders it sees depends on where Python is run from. First, some terminology:
module: a file containing Python code.
package: a folder containing files with Python code and other folders.
When you start Python in a directory (folder), it doesn't "know" what the namespace of that directory should be. I.e., if you are working in Z:\path\to_my\project\ when you start Python:
1. it does NOT consider project to be a package.
2. any .py files you want to import from will be in their own namespace as modules.
3. any folders you want to import from will also be in their own namespace as packages.
What about __init__.py? Since version 3.3, Python has implicit namespace packages, which allows importing without needing to create an empty __init__.py file.
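As a quick illustration of the namespace-package point, here is a minimal sketch, assuming Python 3.3+ and a throwaway folder with the made-up name ns_demo_pkg. The folder has no __init__.py and can still be imported:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# A directory with a module inside but no __init__.py (names are made up)
root = Path(tempfile.mkdtemp())
(root / "ns_demo_pkg").mkdir()
(root / "ns_demo_pkg" / "mod.py").write_text("mod_var = 1234\n")

sys.path.insert(0, str(root))
try:
    importlib.invalidate_caches()
    pkg = importlib.import_module("ns_demo_pkg")
    mod = importlib.import_module("ns_demo_pkg.mod")
finally:
    sys.path.remove(str(root))

# A namespace package has a __path__ but no file of its own
has_path = hasattr(pkg, "__path__")
own_file = getattr(pkg, "__file__", None)
```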
Consider #2: if you have two files: first.py and second.py:
path/
    to_my/
        project/
            >>Python is running here<<
            first.py
            second.py
with these contents:
# first.py
first_var = 'hello'
# second.py
from .first import first_var
second_var = first_var + ' world'
if you try to import like this:
>>> import second
Python basically does the following:
"ok, I see second.py"
"Reading that in as a module, chief!"
"Ok, it wants to import .first"
"The . means get the package (folder) that contains first.py"
"Wait, I don't have a parent package for first.py!"
"Better raise an error."
The same rules apply for #3 as well. If we add a few packages to the project like this:
path/
    to_my/
        project/
            >>Python is running here<<
            first.py
            second.py
            pack1/
                mod.py
                other_mod.py
            pack2/
                index.py
with the following contents:
# pack1/mod.py
mod_var = 1234
# pack1/other_mod.py
from .mod import mod_var
other_var = mod_var * 10
# pack2/index.py
from ..pack1 import mod
and when you try to import like this:
>>> from pack2 import index
The import in pack2/index.py is going to fail for the same reason as second.py. Python will work its way up the import chain of dots like this:
"Reading in index.py as a module."
"Looks like it wants to import mod from ..pack1."
"Ok, . is the pack2 parent package namespace of index.py, found that."
"So, .. is the parent package of pack2."
"But, I don't have a parent package for pack2!"
"Better raise an error."
How do we make it work? Two things.
First, move where Python is running up one level so that all of the .py files and subfolders are considered to be part of the same package namespace, which allows the files to reference each other using relative references.
path/
    to_my/
        >>Python is running here now<<
        project/
            first.py
            second.py
            pack1/
                mod.py
                other_mod.py
            pack2/
                index.py
So now Python sees project as a package namespace, and all of the files within can use relative references up to that level.
This changes how you import when you are in the Python interpreter:
>>> from project.pack2 import index
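The difference between the two starting points can be reproduced with a short script. This is only a sketch: it builds first.py and second.py in a throwaway directory and simulates "where Python is running" by putting a directory at the front of sys.path. The helper try_import is a made-up name, and the behaviour assumes Python 3.6 or later, where the failure is an ImportError.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Layout: <to_my>/project/{first.py, second.py}, in a throwaway directory
to_my = Path(tempfile.mkdtemp())
project = to_my / "project"
project.mkdir()
(project / "first.py").write_text("first_var = 'hello'\n")
(project / "second.py").write_text(
    "from .first import first_var\n"
    "second_var = first_var + ' world'\n"
)


def try_import(search_dir, module_name):
    """Import `module_name` with `search_dir` at the front of sys.path.

    Returns the module, or the ImportError if the import fails.
    """
    sys.path.insert(0, str(search_dir))
    try:
        importlib.invalidate_caches()
        return importlib.import_module(module_name)
    except ImportError as exc:
        return exc
    finally:
        sys.path.remove(str(search_dir))


# "Running" inside project/: second is a top-level module, so "." has no parent
from_inside = try_import(project, "second")

# "Running" one level up: second is project.second, so "." means project
from_above = try_import(to_my, "project.second")
```

from_inside ends up holding the "attempted relative import" error, while from_above is the imported module whose __package__ is project.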
Second, you make explicit references instead of relative references. That can make the import statements really long, but if you have several top-level modules that need to pull from one another, this is how you can do it. This is useful when you are defining your functions in one file and writing your script in another.
# first.py
first_var = 'hello'
# second.py
from first import first_var # we dropped the dot
second_var = first_var + ' world'
I hope this helps clear up some of the confusion about relative imports.
I have the following structure:
LICENSE.md
README.md
requirements.txt
src
    routes
        route_a.py
        __init__.py
    util
        __init__.py
        db.py
And in db.py, I have something that looks like this:
import mysql.connector
def get_value():
    # Query database using mysql.connector
    return value
value = get_value()
def query_that_uses_value(value):
    # do stuff with value
    return value2
I want to be able to use value inside of route_a.py and also inside of other functions in db.py. What's the best way to do this?
import sys
sys.path.insert(0, "path")
That's how I did it.
The path is the folder you want to import from. I would choose the main folder so that all your imports have the same starting point.
In your case, the import would look like this:
from util.db import get_value
provided you have this at the start of the program that imports the function:
sys.path.insert(0, "path_to_src/src")
Note that the path to src must be an absolute path starting from your root folder.
In route_a.py, simply import the variable value (and other functions you need) from ..util.db, which is a relative import that will reference src/util/db.py. Here's what the file src/routes/route_a.py should contain:
from ..util.db import value, function1, function2
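One point that may help here: module-level code runs once, on the first import, and later imports reuse the cached module, so every importer sees the same value. The sketch below illustrates this with a stand-in module. The name db_demo, the call counter, and the placeholder return value 42 are all made up, and no database is involved.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Stand-in for util/db.py: the "query" just counts how often it runs
root = Path(tempfile.mkdtemp())
(root / "db_demo.py").write_text(
    "calls = 0\n"
    "def get_value():\n"
    "    global calls\n"
    "    calls += 1\n"
    "    return 42  # placeholder for the real database query\n"
    "value = get_value()\n"
)

sys.path.insert(0, str(root))
try:
    importlib.invalidate_caches()
    first_import = importlib.import_module("db_demo")   # runs get_value()
    second_import = importlib.import_module("db_demo")  # served from sys.modules
finally:
    sys.path.remove(str(root))

same_object = first_import is second_import
query_count = first_import.calls
```

Other functions inside db.py can simply refer to the module-level name value directly.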
Best way to deal with imports is to export PYTHONPATH=$(pwd) in your project root directory where there are src, requirements.txt, etc.
So in your terminal, run export PYTHONPATH=$(pwd) and all your imports should be consistent and start from src.
For example:
from src.util.db import value
from src.routes.route_a import something
Note that every time you open a new terminal you should run export PYTHONPATH=$(pwd) again, because the setting is not permanent. When you close your terminal, PYTHONPATH is reset, which is a good thing and is best practice.
Also don't forget to run everything from the project root, giving the path through src. Like: python src/util/db.py
If you follow this structure, I promise you will never have any import problems.
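If you want to see what PYTHONPATH actually does, the following sketch sets it for a single child interpreter and prints that interpreter's sys.path. The throwaway directory stands in for the project root; nothing here depends on the real project layout.

```python
import os
import subprocess
import sys
import tempfile

# Throwaway directory standing in for the project root
project_root = os.path.realpath(tempfile.mkdtemp())

# Same effect as `export PYTHONPATH=$(pwd)`, but only for this child process
env = dict(os.environ, PYTHONPATH=project_root)
child_code = "import sys; print('\\n'.join(sys.path))"
output = subprocess.run(
    [sys.executable, "-c", child_code],
    env=env, capture_output=True, text=True, check=True,
).stdout

child_sys_path = output.splitlines()
root_on_path = project_root in child_sys_path
```

The directory shows up in the child's sys.path, which is why imports written as from src.util.db import value can then be resolved from the project root.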
I spent some time researching this and I just cannot work this out in my head.
I run a program in its own directory home/program/core/main.py
In main.py I try to import a module called my_module.py that's located in a different directory, say home/program/modules/my_module.py
In main.py this is how I append to sys.path so the program can be run on anyone's machine (hopefully).
import os.path
import sys
# This should give the path to home/program
sys.path.append(os.path.join(os.path.abspath(os.path.dirname(__file__)), '..'))
# Which it does when checking with
print(os.path.join(os.path.abspath(os.path.dirname(__file__)), '..'))
# So now sys.path knows the location of where modules directory is, it should work right?
import modules.my_module # <----RAISES ImportError WHY?
However if I simply do:
sys.path.append('home/program/modules')
import my_module
It all works fine. But this is not ideal as it now depends on the fact that the program must exist under home/program.
That's because modules isn't a valid Python package, probably because it doesn't contain an __init__.py file. (Before Python 3.3 introduced namespace packages, import could not traverse a directory that was not marked with __init__.py.)
So either add an empty __init__.py file, or add the path all the way down to modules so that your first snippet is equivalent to the second one:
sys.path.append(os.path.join(os.path.abspath(os.path.dirname(__file__)), '..', 'modules'))
import my_module
note that you can also import the module by giving the full path to it, using advanced import features: How to import a module given the full path?
Although the answer can be found here, for convenience and completeness here is a quick solution:
import importlib
import os
import sys
dirname, basename = os.path.split(pyfilepath) # pyfilepath: /my/path/mymodule.py
sys.path.append(dirname) # only directories should be added to PYTHONPATH
module_name = os.path.splitext(basename)[0] # /my/path/mymodule.py --> mymodule
module = importlib.import_module(module_name) # name space of defined module (otherwise we would literally look for "module_name")
Now you can directly use the namespace of the imported module, like this:
a = module.myvar
b = module.myfunc(a)
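If you would rather not touch sys.path at all, importlib.util offers another route. The following is a sketch, not the method of the linked answer: it loads a module straight from its file path with spec_from_file_location. The helper name load_from_path and the throwaway mymodule.py contents are assumptions made for the example.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path

# Throwaway module file standing in for /my/path/mymodule.py
pyfilepath = Path(tempfile.mkdtemp()) / "mymodule.py"
pyfilepath.write_text("myvar = 3\ndef myfunc(x):\n    return x * 2\n")


def load_from_path(path):
    """Load the module stored at `path` without modifying sys.path."""
    path = Path(path)
    spec = importlib.util.spec_from_file_location(path.stem, str(path))
    module = importlib.util.module_from_spec(spec)
    sys.modules[spec.name] = module  # optional: lets later imports reuse it
    spec.loader.exec_module(module)
    return module


module = load_from_path(pyfilepath)
a = module.myvar
b = module.myfunc(a)
```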
In a big application I am working, several people import same modules differently e.g.
import x
or
from y import x
A side effect of this is that x is imported twice, which may introduce very subtle bugs if someone is relying on global attributes.
E.g. suppose I have a package mypackage with three files: mymodule.py, main.py and __init__.py
mymodule.py contents
l = []
class A(object): pass
main.py contents
def add(x):
    from mypackage import mymodule
    mymodule.l.append(x)
    print "updated list",mymodule.l
def get():
    import mymodule
    return mymodule.l
add(1)
print "lets check",get()
add(1)
print "lets check again",get()
it prints
updated list [1]
lets check []
updated list [1, 1]
lets check again []
because now there are two lists in two different modules; similarly, class A is different in each.
To me this looks serious enough, because the classes themselves will be treated differently,
e.g. the code below prints False:
def create():
    from mypackage import mymodule
    return mymodule.A()
def check(a):
    import mymodule
    return isinstance(a, mymodule.A)
print check(create())
Question:
Is there any way to avoid this, other than enforcing that a module be imported one way only? Can't this be handled by Python's import mechanism? I have seen several bugs related to this in Django code and elsewhere too.
Each module namespace is imported only once. The issue is that you're importing the same file under two different names. In the first case you're importing through the package, and in the second you're doing a local, non-packaged import. Python sees these as different modules: the first import is cached internally as mypackage.mymodule and the second one as just mymodule.
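The caching behaviour can be checked directly. The sketch below uses a throwaway package with the made-up names dup_pkg and dup_mod, and puts both the package's parent directory and the package directory itself on sys.path, which is roughly the situation when a script inside the package is run directly.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway package; "dup_pkg" and "dup_mod" are made-up names
root = Path(tempfile.mkdtemp())
pkg_dir = root / "dup_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "dup_mod.py").write_text("l = []\nclass A(object): pass\n")

# Both the package's parent AND the package directory itself are searched
sys.path[:0] = [str(root), str(pkg_dir)]
try:
    importlib.invalidate_caches()
    packaged = importlib.import_module("dup_pkg.dup_mod")
    bare = importlib.import_module("dup_mod")
finally:
    sys.path.remove(str(root))
    sys.path.remove(str(pkg_dir))

packaged.l.append(1)               # only changes the packaged copy
same_module = packaged is bare
bare_list = bare.l
same_class = packaged.A is bare.A
cached_names = sorted(n for n in sys.modules if n.endswith("dup_mod"))
```

The one file ends up in sys.modules under two keys, with two separate lists and two separate A classes.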
A way to solve this is to always use absolute imports. That is, always give your module absolute import paths from the top-level package onwards:
def add(x):
    from mypackage import mymodule
    mymodule.l.append(x)
    print "updated list",mymodule.l
def get():
    from mypackage import mymodule
    return mymodule.l
Remember that your entry point (the file you run, main.py) should also be outside the package. When you want the entry point code to be inside the package, you usually run a small script instead. Example:
runme.py, outside the package:
from mypackage.main import main
main()
And in main.py you add:
def main():
    # your code
    pass
I find this document by Jp Calderone to be a great tip on how to (not) structure your python project. Following it you won't have issues. Pay attention to the bin folder - it is outside the package. I'll reproduce the entire text here:
Filesystem structure of a Python project
Do:
name the directory something related to your project. For example, if your project is named "Twisted", name the top-level directory for its source files Twisted. When you do releases, you should include a version number suffix: Twisted-2.5.
create a directory Twisted/bin and put your executables there, if you have any. Don't give them a .py extension, even if they are Python source files. Don't put any code in them except an import of and call to a main function defined somewhere else in your projects.
If your project is expressable as a single Python source file, then put it into the directory and name it something related to your project. For example, Twisted/twisted.py. If you need multiple source files, create a package instead (Twisted/twisted/, with an empty Twisted/twisted/__init__.py) and place your source files in it. For example, Twisted/twisted/internet.py.
put your unit tests in a sub-package of your package (note - this means that the single Python source file option above was a trick - you always need at least one other file for your unit tests). For example, Twisted/twisted/test/. Of course, make it a package with Twisted/twisted/test/__init__.py. Place tests in files like Twisted/twisted/test/test_internet.py.
add Twisted/README and Twisted/setup.py to explain and install your software, respectively, if you're feeling nice.
Don't:
put your source in a directory called src or lib. This makes it hard to run without installing.
put your tests outside of your Python package. This makes it hard to run the tests against an installed version.
create a package that only has a __init__.py and then put all your code into __init__.py. Just make a module instead of a package, it's simpler.
try to come up with magical hacks to make Python able to import your module or package without having the user add the directory containing it to their import path (either via PYTHONPATH or some other mechanism). You will not correctly handle all cases and users will get angry at you when your software doesn't work in their environment.
I can only replicate this if main.py is the file you are actually running. In that case you will get the current directory of main.py on the sys path. But you apparently also have a system path set so that mypackage can be imported.
In that situation Python will not realize that mymodule and mypackage.mymodule are the same module, and you get this effect. This change illustrates it:
def add(x):
    from mypackage import mymodule
    print "mypackage.mymodule path", mymodule
    mymodule.l.append(x)
    print "updated list",mymodule.l
def get():
    import mymodule
    print "mymodule path", mymodule
    return mymodule.l
add(1)
print "lets check",get()
add(1)
print "lets check again",get()
$ export PYTHONPATH=.
$ python mypackage/main.py
mypackage.mymodule path <module 'mypackage.mymodule' from '/tmp/mypackage/mymodule.pyc'>
mymodule path <module 'mymodule' from '/tmp/mypackage/mymodule.pyc'>
But add another main file, in the current directory:
realmain.py:
from mypackage import main
and the result is different:
mypackage.mymodule path <module 'mypackage.mymodule' from '/tmp/mypackage/mymodule.pyc'>
mymodule path <module 'mypackage.mymodule' from '/tmp/mypackage/mymodule.pyc'>
So I suspect that you have your main python file within the package. And in that case the solution is to not do that. :-)