Namespacing your modules in Python

I have several repos that I want to namespace. All of the repos follow the standard Python folder structure:
repo1
└── repo1
    └── __init__.py
The outermost repo1 folder is the repo root and the inner repo1 folder is the root of the module. All of these repos will be installed using
pip install -e .
Currently, import statements like the following are used to import these modules.
import repo1
import repo2
import repo3
Is there a way to namespace these modules so that I can have
import mymodule.repo1
import mymodule.repo2
import mymodule.repo3
I have to achieve the namespacing while keeping the repos separate. Merging the repos is not an option at the moment.

Implementation details depend on your needs for version support and distribution, but take a look at setuptools' namespace_packages; this will do the job.
As pointed out above, the packaging site has a useful page on namespace packaging.
Example for native namespace packages (Python >= 3.3). Project layout for the isolated repos:
project_root1
├── finance_namespace    # no __init__.py here, this is important
│   └── repo1
│       ├── __init__.py
│       └── module1.py
└── setup.py
===============================
# setup.py
import setuptools

setuptools.setup(
    name='repo1',
    version='1',
    description='',
    long_description='',
    author='Big bank',
    author_email='john@bank.com',
    license='MIT',
    packages=['finance_namespace.repo1'],
    zip_safe=False,
)
Now, after running cd project_root1 && pip install -e ., you should be able to do
>>> from finance_namespace.repo1 import module1
>>> module1.func()
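To see the namespace merging in action without any packaging machinery, here is a self-contained sketch. The directory names (project_root1, project_root2, repo2) are invented for the demo; it builds two independent project roots on disk that both contribute subpackages to the same finance_namespace namespace:

```python
# Sketch of PEP 420 native namespace packages: two separate roots
# both contribute subpackages to "finance_namespace". All paths and
# names below are hypothetical, created on the fly for the demo.
import os
import sys
import tempfile

base = tempfile.mkdtemp()
for root, repo in [("project_root1", "repo1"), ("project_root2", "repo2")]:
    pkg_dir = os.path.join(base, root, "finance_namespace", repo)
    os.makedirs(pkg_dir)
    # finance_namespace/ itself gets NO __init__.py -- that is what
    # makes it a namespace package that can span multiple roots.
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write(f"NAME = {repo!r}\n")
    sys.path.insert(0, os.path.join(base, root))

# Both subpackages now resolve under the single shared namespace.
from finance_namespace import repo1, repo2

print(repo1.NAME, repo2.NAME)  # repo1 repo2
```

Installing each repo with pip install -e . achieves the same effect as the sys.path entries here: each install directory contributes its part of finance_namespace.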

Using git submodules with python

I've read a lot of blog posts and questions on this site about the usage of git submodules and still have no idea how best to use them with Python.
I mean, what is the easiest way to manage dependencies if I have a package like this:
├── mypkg
│   └── __init__.py
├── setup.py
└── submodules
    ├── subm1
    └── subm2
Then, what if I need to use "mypkg" as a submodule for "top_level_pkg":
├── setup.py
├── submodules
│   └── mypkg
└── top_level_package
    └── __init__.py
I want to run pip install . and have everything resolved correctly (each submodule installed into the venv in the correct order).
What I've tried:
Installing each submodule using pip run in a subprocess. But this seems hacky and hard to manage (Unexpected installation of GIT submodule).
Using install_requires with setuptools.find_packages(), but without success.
Using a requirements.txt file for each submodule, but I can't find a way to automate it so pip could automatically install all requirements for all submodules.
Ideally, I imagine a separate setup.py file for each submodule with install_requires=['submodules/subm1', 'submodules/submn'], but setuptools does not support that.
I'm not saying it's impossible, but it is very hard and very tricky. A safer way is to turn each submodule into an installable Python package (with its own setup.py) and install the submodules from Git.
This link describes how to install packages from Git with setup.py: https://stackoverflow.com/a/32689886/2952185
Thanks to Gijs Wobben and sinoroc I came up with a solution that works for my case:
install_requires=['subm1 @ file://localhost/<CURRENT_DIR>/path/to/subm1']
I have managed to install a Python package from a git submodule together with a main package. These are proprietary and are never published to PyPI. And both pip and tox seem to work just fine.
To set the context, I have a git repo with a single Python package and a single git submodule; the git submodule also contains a single Python package. I think this structure is as generic and simple as it can possibly be, here's a visualization:
main-git-repo-name
├── mainpkg
│   └── __init__.py
├── setup.py
├── tests
└── util-git-repo-name    (this is a git submodule)
    ├── setup.py
    ├── test
    └── utilpkg
        └── __init__.py
I wanted pip to install everything in a single invocation, and utilpkg should be usable in mainpkg via just import utilpkg (not oddly nested).
The answer for me was all in setup.py:
First, specify the packages to install and their locations:
packages=find_packages(exclude=["tests"])
         + find_packages(where="util-git-repo-name/utilpkg", exclude=["test"]),
package_dir={
    "mainpkg": "mainpkg",
    "utilpkg": "util-git-repo-name/utilpkg"
},
Second, copy all the install_requires items from the git submodule package's setup.py file into the top level. In my case the utility package is an API client generated by swagger-codegen, so I had to add:
install_requires=[
    "urllib3 >= 1.15", "six >= 1.10", "certifi", "python-dateutil",
    ...],
Anyhow, when running pip3 install . this config results in exactly what I want in the site-packages area: a directory mainpkg/ and a directory utilpkg/.
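Assembled into one file, the fragments above would give a setup.py roughly like this. It is a sketch, not a drop-in file: the name and version values are placeholders I made up, and the find_packages arguments may need adjusting for your actual tree.

```python
# setup.py -- sketch for the vendored-git-submodule layout above.
# "mainpkg"/"utilpkg" and the requirement pins come from the answer;
# name and version here are invented placeholders.
from setuptools import setup, find_packages

setup(
    name="mainpkg",                      # placeholder project name
    version="0.1.0",                     # placeholder version
    packages=find_packages(exclude=["tests"])
    + find_packages(where="util-git-repo-name/utilpkg", exclude=["test"]),
    package_dir={
        "mainpkg": "mainpkg",
        "utilpkg": "util-git-repo-name/utilpkg",
    },
    # install_requires items copied up from the submodule's own setup.py
    install_requires=[
        "urllib3 >= 1.15",
        "six >= 1.10",
        "certifi",
        "python-dateutil",
    ],
)
```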
HTH

How to properly package a set of callable Python scripts or modules

I've been searching the net for quite some time now, but I can't seem to wrap my head around how to distribute my Python scripts to my end users.
I've been running my scripts from the command line with python samplemodule.py "args1",
and that is also how I want my users to run them from their command line. But my worry is that these modules have dependencies on other libraries or modules.
My scripts work when they are all in the project's root directory, but everything crumbles when I try to package them and put them in subdirectories.
For example, I can no longer run my scripts because an error occurs when importing a module from the data subdirectory.
This is my project structure.
MyProject
├── formatter
│   ├── __init__.py
│   ├── __main__.py
│   ├── formatter.py
│   ├── addfilename.py
│   ├── addscrapertype.py
│   ├── ...
│   └── data
│       ├── __init__.py
│       └── helper.py
├── csv_formatter.py
└── setup.py
The csv_formatter.py file is just a wrapper that calls formatter.main.
Update: I was able to generate a tar.gz package, but the package wasn't callable when installed on my machine.
This is the setup.py:
import setuptools

with open("README.md", "r") as fh:
    long_description = fh.read()

setuptools.setup(
    name="formatter",
    version="1.0.1",
    author="My Name",
    author_email="sample@email.com",
    description="A package for cleaning and reformatting csv data",
    long_description=long_description,
    long_description_content_type="text/markdown",
    url="https://github.com/RhaEL012/Python-Scripts",
    packages=["formatter"],
    include_package_data=True,
    package_data={
        # If any package contains *.txt or *.rst files, include them:
        "": ["*.csv", "*.rst", "*.txt"],
    },
    entry_points={
        "console_scripts": [
            "formatter=formatter.formatter:main"
        ]
    },
    classifiers=[
        "Programming Language :: Python :: 3",
        "License :: OSI Approved :: MIT License",
        "Operating System :: OS Independent",
    ],
    python_requires='>=3.6',
    install_requires=[
        "pandas"
    ]
)
Now, after installing the package on the machine, I wasn't able to call the module, and it results in an error:
Z:\>addfilename "C:\Users\Username\Desktop\Python Scripts\"
Update: I tried installing the setup.py in a virtual environment just to see where the error was coming from.
When I installed it I got the following error: FileNotFoundError: [Errno 2] No such file or directory: 'README.md'
I tried including README.md in MANIFEST.in, but still no luck.
So I made long_description a plain string just to see if the install would proceed.
The install proceeded, but then I encountered an error saying that the package directory 'formatter' does not exist.
Since I am not able to look into your specific files, I will just explain how I usually tackle this issue.
This is how I usually set up command line interface (CLI) tools. The project folder looks like:
Projectname
├── modulename
│   ├── __init__.py    # this one is empty in this example
│   ├── cli
│   │   └── __init__.py    # this is the __init__.py that I refer to hereafter
│   └── other_subfolder_with_scripts
└── setup.py
Where all functionality is within the modulename folder and subfolders.
In my __init__.py I have:
def main():
    # perform the things that need to be done
    # all imports are also within the function call
    print('doing the stuff I should be doing')
but I think you can also import what you want into the __init__.py and still refer to it in the manner I do in setup.py.
In setup.py we have:
import setuptools

setuptools.setup(
    name='modulename',
    version='0.0.0',
    author='author_name',
    packages=setuptools.find_packages(),
    entry_points={
        'console_scripts': ['do_main_thing=modulename.cli:main']  # this directly refers to a function available in __init__.py
    },
)
Now install the package with pip install "path to where setup.py is". Once it is installed you can call:
do_main_thing
>>> doing the stuff I should be doing
For the documentation I use: https://setuptools.readthedocs.io/en/latest/.
My recommendation is to start with this and slowly add the functionality that you want. Then step by step solve your problems, like adding a README.md etc.
I disagree with the other answer. You shouldn't run scripts in __init__.py but in __main__.py instead.
Projectfolder
├── formatter
│   ├── __init__.py
│   └── cli
│       ├── __init__.py           # Import your class module here
│       ├── __main__.py           # Call your class module here, using __name__ == "__main__"
│       └── your_class_module.py
└── setup.py
If you don't want to supply a readme, just remove that code and enter a description manually.
I use find_namespace_packages (https://setuptools.readthedocs.io/en/latest/setuptools.html#find-namespace-packages) instead of manually setting the packages.
You can now install your package by just running pip install ./ like you have been doing before.
After you've done that, run python -m formatter.cli arguments. This runs the __main__.py file you've created in the cli folder (or whatever you've called it).
An important note about packaging modules is that you need to use relative imports. You'd use from .your_class_module import YourClassModule in that __init__.py for example. If you want to import something from an adjacent folder you need two dots, from ..helpers import HelperClass.
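As a quick, self-contained illustration of that relative-import rule, the sketch below generates a tiny package on disk and imports it. The package name demo_pkg is invented for the demo; your_class_module and YourClassModule mirror the hypothetical layout above:

```python
# Builds a small package at runtime to show that
# "from .your_class_module import YourClassModule" in cli/__init__.py
# resolves relative to the package, not the working directory.
# All names and paths here are hypothetical.
import os
import sys
import tempfile

base = tempfile.mkdtemp()
cli_dir = os.path.join(base, "demo_pkg", "cli")
os.makedirs(cli_dir)

open(os.path.join(base, "demo_pkg", "__init__.py"), "w").close()
with open(os.path.join(cli_dir, "your_class_module.py"), "w") as f:
    f.write("class YourClassModule:\n    def run(self):\n        return 'ok'\n")
with open(os.path.join(cli_dir, "__init__.py"), "w") as f:
    # the relative import in question
    f.write("from .your_class_module import YourClassModule\n")

sys.path.insert(0, base)
from demo_pkg.cli import YourClassModule

print(YourClassModule().run())  # ok
```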
I'm not sure if this is helpful, but I usually package my Python scripts using the wheel package:
pip install wheel
python setup.py sdist bdist_wheel
After those two commands, a .whl package is created in a dist folder, which you can then either upload to PyPI and download/install from there, or install offline with pip install ${PackageName}.whl
Here's a useful user guide in case there is something else that I didn't explain:
https://packaging.python.org/tutorials/packaging-projects/

Partially initialized module in minimal pip managed Python package

I have the following structure in package dir:
├── bin
│   └── package.py
├── package
│   ├── __init__.py
│   └── a_file.py
└── setup.py
a_file.py:
def a(): pass
__init__.py:
from .a_file import a
bin/package.py:
#!/usr/bin/env python
from package import a
setup.py:
from setuptools import setup

setup(name='package',
      version='0.1',
      description='',
      url='',
      author='',
      author_email='',
      license='MIT',
      packages=['package'],
      scripts=['bin/package.py'],
      zip_safe=False)
I install the package using:
pip install -e .
When I run $ package.py from the command line, the error is:
ImportError: cannot import name 'a' from partially initialized module 'package' (most likely due to a circular import)
As far as I understand, this is obviously not a circular import. bin/package.py imports package/a_file.py through package/__init__.py, and package/a_file.py does not import anything.
What is the real problem here?
bin/package.py imports package/a_file.py through package/__init__.py
No, bin/package.py imports package, and Python tries to import that name from bin/package.py itself. That's because Python automatically prepends the script's directory (bin in your case) to sys.path, so any import of package resolves to bin/package.py, not to package/.
Never name your scripts the same as existing packages, especially packages from the standard library. Never create scripts named email.py, test.py and so on.
Rename your bin/package.py to just package (no extension) or any other name.
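The shadowing is easy to reproduce without installing anything. This sketch (all paths are temporary and hypothetical; the "site" directory stands in for site-packages) simulates the script's directory landing first on sys.path:

```python
# Reproduces the shadowing: a script bin/package.py hides the real
# "package" package because the script's directory comes first on sys.path.
import os
import sys
import tempfile

base = tempfile.mkdtemp()
bin_dir = os.path.join(base, "bin")
pkg_root = os.path.join(base, "site")  # stands in for site-packages
os.makedirs(bin_dir)
os.makedirs(os.path.join(pkg_root, "package"))

# the real package, defining the name "a"
with open(os.path.join(pkg_root, "package", "__init__.py"), "w") as f:
    f.write("def a():\n    return 'real package'\n")
# the script that shadows it, defining nothing
with open(os.path.join(bin_dir, "package.py"), "w") as f:
    f.write("# I am bin/package.py\n")

sys.path.insert(0, pkg_root)
sys.path.insert(0, bin_dir)  # Python prepends the script's own directory like this

import package

print(hasattr(package, "a"))  # False: bin/package.py won, not package/
```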

Multiple subpackages within one package

I am trying to write a package with the following structure
/package
    setup.py
    /subpackage1
        subpackage1.py
        __init__.py
    /subpackage2
        subpackage2.py
        __init__.py
    /utils
        some_other_files_and_codes
        __init__.py
My setup.py currently looks like this:
from setuptools import setup, find_packages

setup(
    name='subpackage1',
    version='1.0',
    install_requires=['numpy',
                      'scipy'],
    packages=find_packages(),
)
I then install it using pip install -e . from the /package folder.
However, I am not able to import subpackage2, only subpackage1.
I would like to be able to import them as
from package import subpackage1
from package import subpackage2
This is important because subpackage1 and subpackage2 exist as standalone packages too on my system.
Could someone help me with this?
The snippets you are showing do not make sense. It looks like there's a misunderstanding, in particular probable confusion between the name of the Python project and the names of the top-level importable packages.
In the setuptools.setup() function call, the value of the name argument should be the name of the project, not the name of an importable top-level package. They can be the same name, but not necessarily.
The following might make it more explicit:
MyPythonProject
├── my_importable_package_one
│   ├── __init__.py
│   └── my_module_foo.py
├── my_importable_package_two
│   ├── __init__.py
│   └── my_module_bar.py
└── setup.py
setup.py
import setuptools

setuptools.setup(
    name='MyPythonProject',
    version='1.2.3',
    packages=['my_importable_package_one', 'my_importable_package_two'],
    # ...
)
from my_importable_package_one import my_module_foo
from my_importable_package_two import my_module_bar
Maybe this article on the terminology of Python packaging might help.

Failure to import names when custom project is installed in virtual environment

Problem
I have read this post, which provides a way to permanently avoid the sys.path hack when importing names between sibling directories. However, I followed the procedure listed in that post and found that I could not import the installed package (i.e. test).
Here is what I have already done.
Step 1: create a project that looks like the following. Both __init__.py files are empty.
test
├── __init__.py
├── setup.py
├── subfolder1
│   ├── __init__.py
│   └── program1.py
└── subfolder2
    ├── __init__.py
    └── program2.py
# setup.py
from setuptools import setup, find_packages

setup(name="test", version="0.1", packages=find_packages())

# program1
def func1():
    print("I am from func1 in subfolder1/func1")

# program2
from test.subfolder1 import program1

program1.func1()
Step 2: create a virtual environment in the project root directory (i.e. the test directory):
conda create -n test --clone base
Launch a new terminal, run conda activate test, then:
pip install -e .
Running conda list shows the following, which means my test project is indeed installed in the virtual environment:
...
test 0.1 dev_0 <develop>
...
Step 3: go to subfolder2 and run python program2.py, but unexpectedly it returned
ModuleNotFoundError: No module named 'test.subfolder1'
The issue is that I think test should be available as long as I am in the virtual environment. However, that does not seem to be the case here.
Could someone help me? Thank you in advance!
You need to create an empty __init__.py file in subfolder1 to make it a package.
Edit:
You should change the import in program2.py:
from subfolder1 import program1
Or you can move setup.py up one level.
