I am structuring my Python application with the following folder architecture (roughly following the outline here):
myapp:
myapp_service1:
myapp_service1_start.py
...
myapp_service2:
myapp_service2_start.py
...
myapp_service3:
myapp_service3_start.py
...
common:
myapp_common1.py
myapp_common2.py
myapp_common3.py
...
scripts:
script1.py
script2.py
script3.py
...
tests:
...
docs:
...
LICENSE.txt
MANIFEST.in
README
This is ideal file/folder hierarchy for me, however, I am confused on how to reference modules from outside folders. For instance, myapp_service1_start.py needs to reference function in myapp_common1.py and myapp_common2.py.
I know I need to somehow add reference to the system path or pythonpath, but I'm not sure the best way to do this in code. Or rather where I would even do this in code.
How can I do this?
I've seen a lot of posts about creating a full Python package to be installed via pip, but this seems like overkill to me.
One way is to make each of your myapp_service*_start.py files add myapp/ directory to sys.path.
For example, drop a file called import_me.py into myapp_service1/ with code that appends the "one up" directory (relative to importing file) to sys.path:
import os
import sys
import inspect
this_dir = os.path.dirname(inspect.getfile(inspect.currentframe()))
src_dir = os.path.join(this_dir, '..')
sys.path.insert(0, src_dir)
Then, in your myapp_service1_start.py you can do something like:
import import_me
from common import myapp_common1
from common import myapp_common2
Of course, be sure to make common directory a Python package by dropping a (possibly empty) __init__.py file into it.
Related
I am trying to improve the project structure while adding to a code base. I found a sample structure here which looks like this:
README.rst
LICENSE
setup.py
requirements.txt
sample/__init__.py
sample/core.py
sample/helpers.py
docs/conf.py
docs/index.rst
tests/test_basic.py
tests/test_advanced.py
I notice in particular that requirements.txt and setup.py are on a higher level than tests/ and sample/
If I add sample/classes.py you need only write from classes import MyClass in sample/core.py to get it in there. It cannot however so easily be imported into tests/test_basic.py, does not seem like python 'looks around the corner' like that when importing.
In my case, there is also a MANIFEST.in on the same level with requirements.txt and some files which are not really python but just set things up for the platform on which this runs.
If classes.py were on the same level as requirements.txt I think it would be easily importable by everything in tests/ and in sample/ and their subdirectories, but it may need a __init__.py That doesn't feel right somehow.
So where should it go if both tests/ and sample/ need to be able to use it?
Let's make it easy.
If I understand correctly, the problem is How to import simple module in test. Which means you want to use something like from simple.classes import MyClass.
That's easy, just add your root path to PYTHONPATH before executing python test/test_basic.py.
That's also what an IDE does for you when you execute tests through it.
Assuming you use a Python >= 3.3, you can simply turn the test folder in a package by adding a __init__.py module in it. Then in that __init__.py (and only there) you add the path of the parent package to sys.path. That if enough for unittest discover to use it for all the modules in tests.
My one is just:
import os
import sys
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
Then if you need to access classes.py from one of the test modules, you can just use:
from sample import classes
or to directly import MyClass:
from sample.classes import MyClass
It just works because sample is already a package, and its parent folder has been added to sys.path when python unittest has loaded the test package.
Of course, this only works in you can have your tests in a package. If for any reason it is not an option, for example because you need to run individually the test modules, then you should put the sys.path modification directly in all the test files.
Write a path_helper.py file in the tests folder:
import os
import sys
core_path = os.path.abspath(os.path.join(os.path.dirname(__file__), '..'))
if core_path not in sys.path: # don't add it if it is already here
sys.path.append(core_path)
You can then import it in all test files:
import path_helper
...
I'm trying to keep a data science project well-organized so I've created a directory inside my src directory called utils that contains a file called helpers.py, which contains some helper functions that will be used in many scripts. What is the best practice for how I should import func_name from src/utils/helpers.py into a file in a totally different directory, such as src/processing/clean_data.py?
I see answers to this question, and I've implemented a solution that works, but this feels ugly:
sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.realpath(__file__))))))
Am I doing this right? Do I need to add this to every script that wants to import func_name, like train_model.py?
My current project folder structure:
myproject
/notebooks
notebook.ipynb
/src
/processing
clean_data.py
/utils
helpers.py
/models
train_model.py
__init__.py
Example files:
# clean_data.py
import os
import sys
sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.realpath(__file__))))))
from src.utils.helpers import func_name
func_name()
# helpers.py
def func_name():
print('I'm a helper function.')
The correct way to do it is to use __init__.py, setup.py and the setuptools Python package:
myPackage/
myPackage/
__init__.py
setup.py
This link has all the steps.
First of all, let me describe you the differences between a Python module & a Python package so that both of us are on the same page. โ
A module is a single .py file (or files) that are imported under one import and used. โ
import aModuleName
# Here 'aModuleName' is just a regular .py file.
Whereas, a package is a collection of modules in directories that give a package hierarchy. A package contains a distinct __init__.py file. โ
from aPackageName import aModuleName
# Here 'aPackageName` is a folder with a `__init__.py` file
# and 'aModuleName', which is just a regular .py file.
Therefore, when we have a project directory named proj-dir of the following structure โคต
proj-dir
--|--__init__.py
--package1
--|--__init__.py
--|--module1.py
--package2
--|--__init__.py
--|--module2.py
๐ Notice that I've also added an empty __init__.py into the proj-dir itself which makes it a package too.
๐ Now, if you want to import any python object from module2 of package2 into module1 of package1, then the import statement in the file module1.py would be
from package2.module2 import object2
# if you were to import the entire module2 then,
from package2 import module2
I hope this simple explanation clarifies your doubts on Python imports' mechanism and solves the problem. If not then do comment here. ๐
First of all let me clarify you that importing an entire module, if you are going to use a part of it, then is not a good idea. Instead of that you can use from to import specific function under a library/package. By doing this, you make your program efficient in terms of memory and performance.
To know more refer these:
'import module' or 'from module import'
difference between import and from
Net let us look into the solution.
Before starting off with the solution, let me clarify you the use of __init__.py file. It just tells the python interpreter that the *.py files present there are importable which means they are modules and are/maybe a part of a package.
So, If you have N no of sub directories you have to put __init__.py file in all those sub directories such that they can also be imported. Inside __init__.py file you can also add some additional information like which path should be included, default functions,variables,scope,..etc. To know about these just google about __init__.py file or take some python library and go through the same __init__.py file to know about it. (Here lies the solution)
More Info:
modules
Be pythonic
So as stated by #Sushant Chaudhary your project structure should be like
proj-dir
--|--__init__.py
--package1
--|--__init__.py
--|--module1.py
--package2
--|--__init__.py
--|--module2.py
So now, If I put __init__.py file under my directory like above, Will
it be importable and work fine?
yes and no.
Yes :
If you are importing the modules within that project/package directory.
for example in your case
you are importing package1.module1 in pakage2.module2 as from package1 import module1.
Here you have to import the base dir inside the sub modules, Why? the project will run fine if you are running the module from the same place. i.e: inside package2 as python module2.py, But will throw ModuleNotFoundError If you run the module from some other directory. i.e: any other path except under package2 for example under proj-dir as python package2/module2.py. This is what happening in your case. You are running the module from project-dir.
So How to fix this?
1- You have to append basedir path to system path in module2.py as
from sys import path
dir_path = "/absolute/path/to/proj-dir"
sys.path.insert(0, dir_path)
So that module2 will be able to find package1 (and module1 inside it).
2- You have to add all the sub module paths in __init__.py file under proj-dir.
For example:
#__init__.py under lxml
# this is a package
def get_include():
"""
Returns a list of header include paths (for lxml itself, libxml2
and libxslt) needed to compile C code against lxml if it was built
with statically linked libraries.
"""
import os
lxml_path = __path__[0]
include_path = os.path.join(lxml_path, 'includes')
includes = [include_path, lxml_path]
for name in os.listdir(include_path):
path = os.path.join(include_path, name)
if os.path.isdir(path):
includes.append(path)
return includes
This is the __init__.py file of lxml (a python library for parsing html,xml data). You can refer any __init__.py file under any python libraries having sub modules.ex (os,sys). Here I've mentioned lxml because I thought it will be easy for you to understand. You can even check __init__.py file under other libraries/packages. Each will have it's own way of defining the path for submodules.
No
If you are trying to import modules outside the directory. Then you have to export the module path such that other modules can find them into environment variables. This can be done directly by appending absolute path of the base dir to PYTHONPATH or to PATH.
To know more:
PATH variables in OS
PYTHONPATH variable
So to solve your problem, include the paths to all the sub modules in __init__.py file under proj-dir and add the /absolute/path/to/proj-dir either to PYTHONPATH or PATH.
Hope the answer explains you about usage of __init__.py and solves your problem.
On Linux, you can just add the path to the parent folder of your src directory to ~/.local/lib/python3.6/site-packages/my_modules.pth. See
Using .pth files. You can then import modules in src from anywhere on your system.
NB1: Replace python3.6 by any version of Python you want to use.
NB2: If you use Python2.7 (don't know for other versions), you will need to create __init__.py (empty) files in src/ and src/utils.
NB3: Any name.pth file is ok for my_modules.pth.
Yes, you can only import code from installed packages or from files in you working directory or subdirectories.
the way I see it, your problem would be solved if you would have your module or package installed, like an yother package one installs and then imports (numpy, xml, json etc.)
I also have a package I constantly use in all my projects, ulitilies, and I know it's a pain with the importing.
here is a description on how to How to package a python application to make it pip-installable:
https://marthall.github.io/blog/how-to-package-a-python-app/
Navigate to your python installation folder
Navigate to lib
Navigate to site-packages
Make a new file called any_thing_you_want.pth
Type .../src/utils/helpers.py inside that file with your favorite text editor
Note: the ellipsis before scr/utils/helpers.py will look something like: C:/Users/blahblahblah/python_folders/scr... <- YOU DO NEED THIS!
This is a cheap way out but it keeps code clean, and is the least complicated. The downside is, for every folder your modules are in, example.pth will need them. Upside: works with Windows all the way up to Windows 10
I'm developing a system in Python that includes a calculation engine and a front end. I've split them up into two projects as the calculation engine can be used for other front ends as well.
I'm using Eclipse and PyDev. Everything works perfectly in Eclipse/PyDev, but as soon as I try to run it outside of PyDev (from command line) I get importing errors. I've done quite a bit of research to find the problem, but I just don't see a solution that works nicely. I believe that PyDev modifies the Python path.
In my project layout below I have two packages (package1 and tests) within one project (Calculations). I can't seem to import anything from package1 in tests. I also have another project (Frontend). Here I also can't import anything from package1.
What I want to understand is the proper way of calling my script/tests files from the command line? Both for two separate projects and two packages in the same project. I assume it would be similar to how PyDev does it. So far I think I have the following options:
Create python code to append to sys.path (seems hacky/not good practice)
Modify the PYTHONPATH when I call the test_some_calc.py like this: PYTHONPATH= python test_some_calc.py (I think this is how PyDev does it, but it seems a bit long - there must be a simpler way?
Make a install package (eventually I might go this method, but not yet.)
I have the following project layout.
CodeSolution/
Calculations/
package1/
__init__.py
subpackage/
__init__.py
some_calc.py
subpackage2/
__init__.py
another_calc.py
tests/
__init__.py
subpackage/
__init__.py
test_some_calc.py # Unable to import from package1
subpackage2/
__init__.py
test_another_calc.py # Unable to import from package1
Frontend/
some_script.py # Unable to import from package1
Comments on my project layout will also be appreciated.
A clean, quick and modular way to include certain python from anywhere on your system is to make a file named mymodule.pth and put it inside the path of site-packages
mymodule.pth should have the path of your project. Project folder must have an __init__.py file.
for example put:
for Linux:
/home/user/myproject
inside
/usr/lib/python2.7/site-packages/mymodule.pth
or
for Windows
C:\\Users\myUsername\My Documents\myproject
inside
C:\PythonXY\Lib\site-packages\mymodule.pth
I wrote a script to load PYTHONPATHs from PyDev's project properties. It's allow you to run your code from console without problems like "ModuleNotFoundError: No module named ...".
import sys
from xml.dom import minidom
import os
print(sys.path)
def loadPathsFromPyDev():
sys_path = sys.path[0]
# Load XML
xmldoc = minidom.parse(sys_path+'\.pydevproject')
# Get paths
xmlpaths = xmldoc.getElementsByTagName('path')
# Get paths' values
paths = list()
for xmlpath in xmlpaths:
paths.append(xmlpath.firstChild.data)
# set path variable
for path in paths:
# Set backslashes to forwardslashes
path = os.path.normpath(path)
# Set string's sys_path
path = path.replace("\${PROJECT_DIR_NAME}", sys_path)
if path not in sys.path:
# Add to system path
sys.path.insert(1,path)
loadPathsFromPyDev()
print(sys.path)
I hope it will help You :)
I am having the following directory structure and I'm new to python. In folder bin I have a single file "script.py" in which I want to import "module.py" from code package. I have seen many solutions on stackoverflow to this problem that involve modifying sys-path or giving full pathname. Is there any way I could import these with an import statement that works relatively, instead of specifying full path?
project/
bin/
script.py
code/
module.py
__init__.py
To make sure that bin/script.py can be run without configuring environment, add this preambule to script.py before from code import module line (
from twisted/bin/_preambule.py):
# insert `code/__init__.py` parent directory into `sys.path`
import sys, os
path = os.path.abspath(sys.argv[0]) # path to bin/script.py
while os.path.dirname(path) != path: # until top-most directory
if os.path.exists(os.path.join(path, 'code', '__init__.py')):
sys.path.insert(0, path)
break
path = os.path.dirname(path)
The while loop is to support running it from bin/some-other-directory/.../script.py
While correct, I think that the "dynamically add my current directory to the path" is a dirty, dirty hack.
Add a (possibly blank) __init__.py to project. Then add a .pth file containing the path to project to your sites-packages directory. Then:
from project.code import module
This has two advantages, in my opinion:
1) If you refactor, you just need to change the from project.code line and avoid messing with anything else.
2) There will be nothing special about your code -- it will behave exactly like any other package that you've installed through PyPi.
It may seem messy to add your project directory to your PYTHONPATH but I think it's a much, much cleaner solution than any of the alternatives. Personally, I've added the parent directory that all of my python code lives in to my PYTHONPATH with a .pth file, so I can deal with all of the code I write just like 3rd party libraries.
I don't think that there is any issue with cluttering up your PYTHONPATH, since only folders with an __init__.py will be importable.
I am running into pathing issues with some scripts that I wrote to test parsers I've written. It would appear that Python (27 and 3) both do not act like IronPython when it comes to using the current working directory as part of sys.path. As a result my inner package pathings do not work.
This is my package layout.
MyPackage
__init__.py
_test
__init__.py
common.py
parsers
__init__.py
_test
my_test.py
I am attempting to call the scripts from within the MyPackage directory.
Command Line Statements:
python ./parsers/_test/my_test.py
ipy ./parsers/_test/my_test.py
The import statement located in my my_test.py file is.
from _test.common import TestClass
In the python scenario I get ONLY the MyPackage/parsers/_test directory appended to sys.path so as a result MyPackage cannot be found. In the IronPython scenario both the MyPackage/parsers/_test directory AND MyPackage/ is in sys.path. I could get the same bad reference if I called the test from within the _test directory (it would have no idea about MyPackage). Even if i consolidated the test files into the _test directory I would still have this pathing issue.
I should note, that I just tested and if i did something like
import sys
import os
sys.path.append(os.getcwd())
It will load correctly. But I would have to do this for every single file.
Is there anyway to fix this kind of pathing issue without doing the following.
A. Appending a relative pathing for sys.path in every file.
B. Appending PATH to include MyPackage (I do not want my package as part of the python "global" system pathing)
Thanks alot for any tips!
Two options spring to mind:
Use the environment variable PYTHONPATH and include .
Add it to your sys.path at the beginning of your program
import sys.path
sys.path.append('.')
If you need to import a relative path dynamically you can always do something like
import somemodule
import sys
dirname = os.path.dirname(os.path.abspath(somemodule.__file__))
sys.path.append(dirname)