Accessing resources included in python source distribution package

I'm trying to create a python package, and I've added some files which are needed for the module to function, like this: https://docs.python.org/2/distutils/setupscript.html#distutils-additional-files
Due to circumstances, this is the method I need to use, if at all possible. I also need to end up with a source distributable, so something that works when making other types of python distributables doesn't work for me.
My setup.py looks like this:
from setuptools import setup

setup(name='mypackage',
      version='0.1',
      py_modules=['mypackage'],
      install_requires=['numpy'],
      data_files=[('data', ['data/file0.npz', 'data/file1.npz'])]
      )
The directory structure looks like this:
├── PKG-INFO
├── data
│   ├── data0.npz
│   └── data1.npz
├── dist
│   ├── mypackage-0.1.zip
├── mypackage.egg-info
│   ├── PKG-INFO
│   ├── SOURCES.txt
│   ├── dependency_links.txt
│   ├── requires.txt
│   └── top_level.txt
├── mypackage.py
├── setup.cfg
└── setup.py
I'm trying to load it like this (every function but __init__ removed for simplicity):
import numpy as np

class MyClass():
    def __init__(self):
        self.data0 = np.load("data/file0.npz")
        self.data1 = np.load("data/file1.npz")
And get this error when trying to instantiate the class:
No such file or directory: 'data/file0.npz'
What do I need to change to get this working?

To load package resources, I usually use the pkg_resources module.
Here is an example that gets a resource file relative to the current module:
from pkg_resources import resource_filename

def main():
    print(resource_filename(__name__, 'data/test.txt'))
In your setup.py, you can use package_data to include package data files:
setup(
    name='',
    # (...)
    package_data={
        '': [
            '*.txt',
        ],
    },
)
Note: for this to work, data has to be a Python package (a directory containing an __init__.py file).
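For what it's worth, the error in the question is consistent with the relative path "data/file0.npz" being resolved against the current working directory rather than against the module's location. Below is a minimal sketch of building the path from the module's own file instead. The helper name resource_path is made up for illustration, and the sketch assumes the data files really end up next to the installed module, which data_files does not guarantee.

```python
from pathlib import Path


def resource_path(module_file, relative):
    """Resolve `relative` against the directory that contains `module_file`."""
    return Path(module_file).resolve().parent / relative


# Inside mypackage.py this would be called with __file__; a literal path is
# used here only so the sketch runs on its own.
example = resource_path("/opt/site-packages/mypackage.py", "data/file0.npz")
print(example.name)  # file0.npz
```

In the class from the question, the call would then look like np.load(resource_path(__file__, "data/file0.npz")).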

Related

Error: package directory XYZ does not exist

I have a directory with following structure,
.
├── awesome
│   ├── alice
│   │   ├── conf.py
│   │   └── __init__.py
│   ├── bob
│   │   ├── conf.py
│   │   └── __init__.py
│   ├── conf.py
│   ├── __init__.py
│   └── john
│       ├── conf.py
│       └── __init__.py
├── not_awesome_1
│   ├── __init__.py
│   └── something.py
├── not_awesome_2
│   ├── __init__.py
│   └── something.py
└── setup.py
I want to make the awesome package to be shippable. So, I made the setup.py as below,
from setuptools import find_packages, setup

setup(
    name="my-awesome-package",
    version="0.1.0",
    description="",
    long_description="",
    license="BSD",
    packages=find_packages(where="awesome"),
    include_package_data=True,
    author="JPG",
    author_email="foo#gmail.com",
    install_requires=[],
)
I ran the command python3 setup.py bdist_wheel and it gave me the result
running bdist_wheel
running build
running build_py
error: package directory 'alice' does not exist
What was I trying to achieve?
I wanted to decouple the awesome package and wanted to reuse it in multiple projects as I'm currently using the same in not_awesome_1 or not_awesome_2 packages.
In other words, after successfully installing my-awesome-package, I should be able to use the awesome package as
from awesome.alice.conf import Alice
alice = Alice()
What have I tried so far?
Replaced packages=find_packages(where="awesome") with packages=find_packages(), but then the build also includes the not_awesome_X packages, which is not intended.
Introduced package_dir as well:
setup(
    # other options
    packages=find_packages(where="awesome"),
    package_dir={"": "awesome"},
)
But this only lets me import my packages as from alice.conf import Alice, not as from awesome.alice.conf import Alice (i.e., awesome is missing).
Questions?
What was I doing wrong here?
How to properly configure packages and package_dir?

I encountered a similar error. Try manually defining both the top-level package and the sub-packages:
packages=["awesome", "awesome.alice", "awesome.bob", "awesome.john", "awesome.something.somethingelse"].
Edit:
The issue is that the where kwarg defines the directory to search in, and the package names it returns are relative to that directory. Since you have packages in the root of the project that should not be bundled, you'll likely need to manually add the parent package's name in front of each of its sub-packages.
from setuptools import find_packages

if __name__ == "__main__":
    print(find_packages(where="awesome"))
    # ['bob', 'alice', 'john', 'john.child']
    # the problem here is that 'awesome' is the root, not the current directory containing awesome
    root_package = "awesome"
    print([root_package] + [f"{root_package}.{item}" for item in find_packages(where=root_package)])
    # ['awesome', 'awesome.bob', 'awesome.alice', 'awesome.john', 'awesome.john.child']
Then, in your setup.py:
...
root_package = "awesome"
...
setup(
    # other options
    packages=[root_package] + [f"{root_package}.{item}" for item in find_packages(where=root_package)],
    # package_dir={"": "awesome"}, <- not needed
)
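The prefixing step can be tried without a project on disk. Here is a small sketch, where qualify is a hypothetical helper name and the found list is hard-coded to stand in for what find_packages(where="awesome") would report for the layout in the question:

```python
def qualify(root_package, subpackages):
    """Return the root package plus each sub-package prefixed with the root name."""
    return [root_package] + [f"{root_package}.{name}" for name in subpackages]


# Stand-in for find_packages(where="awesome"); hard-coded so no project is needed.
found = ["alice", "bob", "john"]
packages = qualify("awesome", found)
print(packages)  # ['awesome', 'awesome.alice', 'awesome.bob', 'awesome.john']
```

find_packages also accepts an include argument, so find_packages(include=["awesome", "awesome.*"]) may be a shorter way to arrive at the same list; it is worth checking the result against your own layout before relying on it.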

Python executable script ModuleNotFound

I am trying to better understand importing modules. I read about how to do this from here https://stackoverflow.com/a/14132912/14179793 and I can get it to work using solution 1. There is an additional variable that I need to figure out though.
This is a dummy project I am testing with:
.
├── a_package
│   ├── __init__.py
│   └── lib_a.py
├── b_package
│   ├── __init__.py
│   └── test_script.py
├── main.py
└── src
    └── src_lib
        └── src_lib.py
With this setup I can do:
python -m b_package.test_script
this is lib a function
This is src_lib_function.
test_script.py:
from a_package.lib_a import lib_a_function
from src.src_lib.src_lib import src_lib_function

if __name__ == '__main__':
    lib_a_function()
    src_lib_function()
    pass
The goal is to make b_package/test_script.py directly executable, i.e. to run it as ./test_script.py rather than python test_script.py.
However, adding the shebang #!/usr/bin/env python at the top causes an import error:
$ ./b_package/test_script.py
Traceback (most recent call last):
  File "./b_package/test_script.py", line 2, in <module>
    from a_package.lib_a import lib_a_function
ModuleNotFoundError: No module named 'a_package'
Based on the question linked above, I assume this is because Python is not loading the script as a module, but I am not sure how to resolve it.

I ended up using setuptools as suggested by kosciej16 to achieve the desired results.
New project structure:
.
├── a_package
│   ├── __init__.py
│   └── lib_a.py
├── b_package
│   ├── __init__.py
│   └── test_script.py
├── main.py
├── pyproject.toml
├── setup.cfg
└── src
    ├── __init__.py
    └── src_lib
        ├── __init__.py
        └── src_lib.py
setup.cfg:
[metadata]
name = cli_test
version = 0.0.1

[options]
packages = find:

[options.entry_points]
console_scripts =
    test_script = b_package.test_script:main
This allows the user to clone the repo and run pip install . from the top level; they can then execute the script by just typing test_script.
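Note that the entry point b_package.test_script:main expects a callable named main in test_script.py, which the script shown in the question does not define. Below is a sketch of how the script could be restructured. The two function bodies are stand-ins so the example runs by itself; the real script would keep importing them from a_package and src.

```python
def lib_a_function():
    # Stand-in for a_package.lib_a.lib_a_function (assumed behaviour).
    print("this is lib a function")


def src_lib_function():
    # Stand-in for src.src_lib.src_lib.src_lib_function (assumed behaviour).
    print("This is src_lib_function.")


def main():
    """Callable that console_scripts resolves via b_package.test_script:main."""
    lib_a_function()
    src_lib_function()
    return 0


if __name__ == "__main__":
    # Keeps `python -m b_package.test_script` working alongside the entry point.
    main()
```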

Namespace corrupted when using setup.py and causes AttributeError: module has no attribute

I wrote a small tool (package) that reuses an existing namespace, pki.server. I named my package as pki.server.healthcheck. The old namespace did not use setuptools to install the package, while my package uses it.
Contents of setup.py
from setuptools import setup

setup(
    name='pkihealthcheck',
    version='0.1',
    packages=[
        'pki.server.healthcheck.core',
        'pki.server.healthcheck.meta',
    ],
    entry_points={
        # creates bin/pki-healthcheck
        'console_scripts': [
            'pki-healthcheck = pki.server.healthcheck.core.main:main'
        ]
    },
    classifiers=[
        'Programming Language :: Python :: 3.6',
    ],
    python_requires='!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*',
    setup_requires=['pytest-runner',],
    tests_require=['pytest',],
)
The installation tree (from scenario 1 below) looks like:
# tree /usr/lib/python3.8/site-packages/pki/
├── __init__.py <---- Has methods and classes
├── cli
│   ├── __init__.py <---- Has methods and classes
│   ├── <some files>
├── server
│   ├── cli
│   │   ├── __init__.py <---- Has methods and classes
│   │   ├── <Some files>
│   ├── deployment
│   │   ├── __init__.py <---- Has methods and classes
│   │   ├── <some files>
│   │   └── scriptlets
│   │       ├── __init__.py <---- Has methods and classes
│   │       ├── <some files>
│   ├── healthcheck
│   │   ├── core
│   │   │   ├── __init__.py <---- EMPTY
│   │   │   └── main.py
│   │   └── pki
│   │       ├── __init__.py <---- EMPTY
│   │       ├── certs.py
│   │       └── plugin.py
│   └── instance.py <---- Has class PKIInstance
└── <snip>
# tree /usr/lib/python3.8/site-packages/pkihealthcheck-0.1-py3.8.egg-info/
├── PKG-INFO
├── SOURCES.txt
├── dependency_links.txt
├── entry_points.txt
└── top_level.txt
I read the official documentation and experimented with all 3 suggested methods. I saw the following results:
Scenario 1: Native namespace packages
At first, everything seemed smooth. But:
# This used to work before my package gets installed
>>> import pki.server
>>> instance = pki.server.instance.PKIInstance("pki-tomcat")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'pki.server' has no attribute 'instance'
Now, only this works
>>> import pki.server.instance
>>> instance = pki.server.instance.PKIInstance("pki-tomcat")
>>> instance
pki-tomcat
Scenario 2: pkgutil-style namespace packages
I am restricted from using this method, as my other __init__.py files contain classes and functions.
Scenario 3: pkg_resources-style namespace packages
Though this method is not recommended, I went ahead and experimented with it by adding namespace=pki.server.healthcheck to my setup.py. This made all pki.* modules invisible.
So I am convinced that Scenario 1 seems to be the closest to what I'm trying to achieve. I was reading an old post to understand more on how import in python works.
My question is: Why does a perfectly working snippet break after I install my package?

Your __init__.py files need to import the submodules. You have two options: relative and absolute imports.
Relative Imports
pki/__init__.py:
from . import server
pki/server/__init__.py:
from . import instance
Absolute Imports
pki/__init__.py:
import pki.server
pki/server/__init__.py:
import pki.server.instance
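The underlying behaviour can be reproduced in isolation: importing a package does not load its submodules unless something imports them explicitly. The sketch below builds a throwaway package (demo_pkg is a made-up name, not part of pki) with an empty __init__.py and checks for the submodule attribute before and after importing it.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package `demo_pkg` with one submodule, `instance`.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("")           # empty: imports nothing
(pkg / "instance.py").write_text("VALUE = 42\n")

sys.path.insert(0, str(root))

demo_pkg = importlib.import_module("demo_pkg")
before = hasattr(demo_pkg, "instance")         # submodule not loaded yet

importlib.import_module("demo_pkg.instance")   # explicit submodule import
after = hasattr(demo_pkg, "instance")          # now bound on the parent package

print(before, after)  # False True
```

If pki.server.instance used to be reachable after a plain import pki.server, one plausible explanation is that some other import pulled the submodule in as a side effect; that is a guess and would need to be confirmed against the actual pki sources.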

When package has same name as .py file inside the package, import X imports the module and not the package itself. How to avoid?

So the package I'm constructing looks something like this:
Blur/
├── blur
│   ├── __init__.py
│   ├── blur.py
│   ├── funcs
│   │   ├── __init__.py
│   │   ├── face_funcs.py
│   │   └── funcs.py
│   ├── tests
│   │   ├── __init__.py
│   │   └── test_blur.py
│   └── utils
│       ├── __init__.py
│       └── timer.py
└── setup.py
Doing import blur imports the blur.py module, not the whole package itself. If I change blur.py's name and then do the import, I get the whole package. Is there any way to get the whole package without changing blur.py's name?

If you specify the file path to the "Blur" folder in the import statement, that should force python to import the package, and not the .py file. I believe by default python looks for files ending in .py when importing.

AFAIK it is not possible as a built-in Python mechanism, so changing the name would be your best bet.
There are, however, workarounds available, but in my opinion such solutions are rather ugly.
Here is an example:
import importlib
import os
import sys

def custom_import(repo_path, directory, package=False):
    absolute_path = repo_path if package else os.path.join(repo_path, directory)
    if os.path.exists(absolute_path):
        sys.path.insert(0, absolute_path)
        if package:
            imported = importlib.import_module(f'{directory}')
        else:
            imported = importlib.import_module(f'{directory}.{directory}')
        sys.path.pop(0)
        return imported
    raise ImportError(' No such package/module found')
Example use:
# raw strings keep the backslashes in the Windows paths from being read as escapes
package = custom_import(r'D:\PycharmProjects\python-tools', 'roomba', package=True)
module = custom_import(r'D:\PycharmProjects\python-tools', 'roomba')
module.main()
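It may also be worth checking which directory is on sys.path when the import runs, because that appears to decide whether blur resolves to the package or to blur.py. The experiment below recreates a cut-down version of the layout in a temporary directory; the KIND marker and the import_blur_from helper are made up for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Recreate Blur/blur/__init__.py and Blur/blur/blur.py in a temp directory.
project = Path(tempfile.mkdtemp()) / "Blur"
package_dir = project / "blur"
package_dir.mkdir(parents=True)
(package_dir / "__init__.py").write_text("KIND = 'package'\n")
(package_dir / "blur.py").write_text("KIND = 'module'\n")


def import_blur_from(path_entry):
    """Import `blur` with only `path_entry` prepended to sys.path."""
    sys.modules.pop("blur", None)        # forget any earlier import
    sys.modules.pop("blur.blur", None)
    importlib.invalidate_caches()
    sys.path.insert(0, str(path_entry))
    try:
        return importlib.import_module("blur")
    finally:
        sys.path.pop(0)


from_project = import_blur_from(project).KIND      # Blur/ on sys.path
from_package = import_blur_from(package_dir).KIND  # Blur/blur/ on sys.path
print(from_project, from_package)  # package module
```

So running the interpreter from inside Blur/blur (or having that directory on sys.path) would give the module, while running it from Blur/ would give the package.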

How can a `console_scripts` access file from `package_data`

When creating a console_scripts entry in entry_points, how do you access a data file (package_data) within the package?
setup(
    # other stuff
    entry_points={
        'console_scripts': [
            'runme = mypackage.a_py_file:a_function_within_the_py_file',
        ]
    }
)
Directory structure:
├── mypackage
│   ├── __init__.py
│   └── a_py_file.py
├── requirements.txt
├── setup.py
└── data
   ├── a.data
   └── b.data
Python file to handle console_scripts:
# a_py_file.py
def a_function_within_the_py_file():
    # HOW TO READ a.data FILE HERE
    pass

How about changing the cwd?
import os
os.chdir(os.path.dirname(os.path.abspath(__file__)))  # chdir needs a directory, not the file itself
conftest.py sounds like a good place to do this. Or the file which is attached to your test command.

So this is what I did and it worked:
import os
import pkg_resources
os.chdir(pkg_resources.get_distribution('mypackage').location)
# running as if the script is invoked from project's root
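An alternative that avoids changing the working directory is to read the file through importlib.resources. This is only a sketch under two assumptions: Python 3.9 or newer (for resources.files), and the data directory living inside the package rather than beside it as in the layout above. The package name demo_mypackage and the helper read_packaged_text are made up so the example can run on its own.

```python
import sys
import tempfile
from importlib import resources
from pathlib import Path

# Throwaway package standing in for `mypackage`, with data/ moved inside it.
root = Path(tempfile.mkdtemp())
data_dir = root / "demo_mypackage" / "data"
data_dir.mkdir(parents=True)
(root / "demo_mypackage" / "__init__.py").write_text("")
(data_dir / "a.data").write_text("hello from a.data")

sys.path.insert(0, str(root))


def read_packaged_text(package, *parts):
    """Read a text resource located inside an importable package."""
    resource = resources.files(package)
    for part in parts:
        resource = resource / part   # walk down one path component at a time
    return resource.read_text()


content = read_packaged_text("demo_mypackage", "data", "a.data")
print(content)  # hello from a.data
```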
