How to determine the modules available in a PyPI package

Given a PyPI package name, like PyYAML, how can one programmatically
determine the modules available within the package (distribution package) that could be imported?
Detail
I'm not specifically interested in PyYAML; it's just a good example of a popular PyPI package whose distribution name (PyYAML) differs from its primary module name (yaml), such that you can't easily guess the module name from the package name.
I've seen answers to questions that sound like this one but are different, likely because of a naming collision:
"package" meaning a Python construct allowing for a collection of modules
"package" meaning a "Distribution Package", an archive file that contains Python packages, modules, and other resource files used to distribute a Release
My question is about the relationship between distribution packages and the modules within.
Possible Solution Spaces
Areas that seem like they might be fruitful (though I've not had success with them yet) are:
The pydoc.help function (surfaced as the help built-in) outputs a complete list of all available modules when called as help('modules'). This shows modules that have not been imported but could be. However, it writes human-readable output to stdout, and I've been unable to figure out how the pydoc code enumerates the modules.
I could imagine calling this, gathering the module list, programmatically installing a new distribution package into a virtualenv with pip, calling it again, and diffing the results (see the sketch after this list).
Programmatically installing a distribution package with pip and then iterating through the elements of the Python path to find the modules it added.
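For illustration, the stdlib can already enumerate every importable top-level module, much as help('modules') does internally, which makes the before/after diff idea straightforward (a minimal sketch):
import pkgutil

# Enumerate every top-level module and package importable from sys.path.
# help('modules') surfaces the same information via pkgutil.
available = sorted(info.name for info in pkgutil.iter_modules())
print(len(available), available[:5])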

My project johnnydep provides exactly this feature:
$ johnnydep --fields=import_names PyYAML
name import_names
------ --------------
PyYAML yaml
Note that some distributions export multiple top-level names, some distributions export none at all, and there is not necessarily any obvious relationship between the distribution name (used with a pip install command) and the package name (used with an import statement), though it is a common convention for them to be matched.
For example, the popular project setuptools exposes three top-level names:
$ johnnydep --fields=import_names setuptools
name import_names
---------- ---------------------------------------
setuptools easy_install, pkg_resources, setuptools
API usage is via attribute access:
>>> from johnnydep.lib import JohnnyDist
>>> jdist = JohnnyDist("setuptools")
>>> jdist.import_names
['easy_install', 'pkg_resources', 'setuptools']
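If the distribution in question is already installed, a rough approximation is possible with the stdlib importlib.metadata (Python 3.8+). This is only a sketch, not johnnydep's actual implementation: it trusts the optional top_level.txt metadata file, falling back to scanning the installed file list:
from importlib.metadata import distribution

def import_names(dist_name):
    dist = distribution(dist_name)
    # Many wheels ship a top_level.txt listing their importable names.
    top_level = dist.read_text("top_level.txt")
    if top_level:
        return sorted(top_level.split())
    # Fall back: derive top-level names from the installed files.
    names = set()
    for path in dist.files or []:
        root = path.parts[0]
        if root in ("..", "__pycache__") or root.endswith((".dist-info", ".data")):
            continue
        # 'yaml/parser.py' -> 'yaml'; '_yaml.cpython-312-...so' -> '_yaml'
        names.add(root.split(".")[0])
    return sorted(names)

print(import_names("PyYAML"))  # e.g. ['_yaml', 'yaml']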
If you want to know submodule names rather than top-level names, that's possible with the stdlib pkgutil, for example:
>>> import pkgutil, requests
>>> [name for finder, name, ispkg in pkgutil.walk_packages(requests.__path__)]
['__version__',
'_internal_utils',
'adapters',
'api',
'auth',
'certs',
'compat',
'cookies',
'exceptions',
'help',
'hooks',
'models',
'packages',
'sessions',
'status_codes',
'structures',
'utils']

Related

Specifying an internal PyPI repository for Python Setup

I've created two libraries, foo and bar, for use in proprietary projects. I have an internal PyPI repository that I publish these libraries to. Additionally, bar depends on foo, and I have added it accordingly to my requirements.txt and the install_requires field in setup.py. Here's an example:
setup(
    name='bar',
    ...
    install_requires=['foo~=1.0.0'],
    dependency_links=[url_to_foo],
)
However, when I try to use bar in my other projects (let's call this one foobar), I get this error:
ERROR: No matching distribution found for foo==1.0.0
Unless I specify url_to_foo in foobar's dependency_links as well, like so:
setup(
    name='foobar',
    ...
    install_requires=['bar~=1.0.0'],
    dependency_links=[url_to_bar, url_to_foo],
)
This would be really bad if I had other modules that depend on foobar, as I would have to specify the URLs of all transitive dependencies, and so on.
pip currently has a command line argument, --extra-index-url, with which I can simply specify the URL of the PyPI repository. Is there an equivalent attribute I can specify in setuptools's setup function?
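For reference, the --extra-index-url setting mentioned above can also be made persistent in pip's configuration file rather than passed on every command line (the URL below is hypothetical):
# ~/.config/pip/pip.conf on Linux, %APPDATA%\pip\pip.ini on Windows
[global]
extra-index-url = https://pypi.internal.example.com/simple/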

display __version__ using setuptools.setup values in setup.py

I have a packaged project mytools which uses setuptools' setup to store its version in the setup.py project file, e.g.
import setuptools

setuptools.setup(
    name='mytools',
    version='0.1.0'
)
I'd like to get the common mytools.__version__ feature based on the version value, e.g.
>>> import mytools
>>> mytools.__version__
'0.1.0'
Is there a native / simple way in setuptools to do so? I couldn't find a reference to __version__ in setuptools.
Furthermore, I don't want to store the version in __init__.py, because I'd prefer to keep the version in its current place (setup.py).
The many answers to similar questions do not speak to my specific problem, e.g. How can I get the version defined in setup.py (setuptools) in my package?
Adding __version__ to all top-level modules and packages is a recommendation from PEP 396.
Lately I have seen growing concerns raised about this recommendation and its actual usefulness, for example here:
https://gitlab.com/python-devs/importlib_resources/-/issues/100
https://gitlab.com/python-devs/importlib_metadata/-/merge_requests/125
some more that I can't find right now...
With that said...
Such a thing is often solved like the following:
# my_top_level_module/__init__.py
import importlib.metadata
__version__ = importlib.metadata.version('MyProject')
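importlib.metadata is in the stdlib from Python 3.8 onward; on older interpreters the same pattern works via the importlib-metadata backport (a sketch):
# my_top_level_module/__init__.py
try:
    from importlib.metadata import version  # Python 3.8+
except ImportError:
    from importlib_metadata import version  # backport: pip install importlib-metadata

__version__ = version('MyProject')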
References:
https://docs.python.org/3/library/importlib.metadata.html
https://importlib-metadata.readthedocs.io/en/latest/using.html#distribution-versions

Python - Creating egg file what is the use of description and long description

I am creating an egg file and I am able to do that successfully. However, the values I have provided in description and long_description are not visible.
setup.py
description = "desc"
long_description = "long desc"

setup(
    name="abc",
    version="0.2",
    packages=find_packages(),
    description=description,
    long_description=long_description,
    author='Gaurang Shah',
    author_email='gaurang.shah#abc.com'
)
Build script
rm -rf build dist dataplaform.egg-info
python setup.py bdist_egg
After installing the package, when I run the following commands I don't see anything:
import abc
abc.__doc__
You would see description and/or long_description in pip show abc or on the PyPI repository page, basically in places that refer to the Python project abc.
When you type import abc; print(abc.__doc__), you refer to a Python top-level package (or module) abc that coincidentally has been made available by installing the distribution (in this case a bdist_egg) of a project bearing the same name, abc.
Python projects and Python packages are not the same thing, though. The confusion comes from the fact that a Python project almost always contains a single top-level package of the same name, and so the two terms are used interchangeably, to great confusion. See beautifulsoup4 for a famous counter-example.
In your case, abc.__doc__ actually refers to the docstring of your abc/__init__.py (or possibly a top-level abc.py).
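For example, a docstring at the top of the package's __init__.py is what ends up in abc.__doc__ (a minimal sketch; the wording is hypothetical):
# abc/__init__.py
"""Helpers for the abc data platform."""

# After reinstalling and importing:
#   >>> import abc
#   >>> print(abc.__doc__)
#   Helpers for the abc data platform.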

Python module development workflow - setup and build [duplicate]

I'm developing my own module in Python 2.7. It resides in ~/Development/.../myModule instead of /usr/lib/python2.7/dist-packages or /usr/lib/python2.7/site-packages. The internal structure is:
/project-root-dir
    /server
        __init__.py
        service.py
        http.py
    /client
        __init__.py
        client.py
client/client.py includes the PyCachedClient class. I'm having import problems:
project-root-dir$ python
Python 2.7.2+ (default, Jul 20 2012, 22:12:53)
[GCC 4.6.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from server import http
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "server/http.py", line 9, in <module>
from client import PyCachedClient
ImportError: cannot import name PyCachedClient
I didn't set PYTHONPATH to include my project-root-dir, so when server.http tries to import client.PyCachedClient, it tries to load it from a relative path and fails. My question is: how should I set all paths/settings in a good, Pythonic way? I know I can run export PYTHONPATH=... in the shell each time I open a console and try to run my server, but I guess that's not the best way. If my module were installed via PyPI (or something similar), it would be installed in a /usr/lib/python... path and be loaded automatically.
I'd appreciate tips on best practices in python module development.
My Python development workflow
This is a basic process for developing Python packages that incorporates what I believe to be the best practices in the community. It's basic: if you're really serious about developing Python packages there's still a bit more to it, and everyone has their own preferences, but it should serve as a template to get started with and then learn more about the pieces involved. The basic steps are:
Use virtualenv for isolation
setuptools for creating an installable package and managing dependencies
python setup.py develop to install that package in development mode
virtualenv
First, I would recommend using virtualenv to get an isolated environment to develop your package(s) in. During development, you will need to install, upgrade, downgrade and uninstall dependencies of your package, and you don't want
your development dependencies to pollute your system-wide site-packages
your system-wide site-packages to influence your development environment
version conflicts
Polluting your system-wide site-packages is bad, because any package you install there will be available to all Python applications you installed that use the system Python, even though you just needed that dependency for your small project. And it was just installed in a new version that overrode the one in the system wide site-packages, and is incompatible with ${important_app} that depends on it. You get the idea.
Having your system wide site-packages influence your development environment is bad, because maybe your project depends on a module you already got in the system Python's site-packages. So you forget to properly declare that your project depends on that module, but everything works because it's always there on your local development box. Until you release your package and people try to install it, or push it to production, etc... Developing in a clean environment forces you to properly declare your dependencies.
So, a virtualenv is an isolated environment with its own Python interpreter and module search path. It's based on a Python installation you previously installed, but isolated from it.
To create a virtualenv, first install the virtualenv package into your system-wide Python using easy_install or pip:
sudo pip install virtualenv
Notice this will be the only time you install something as root (using sudo), into your global site-packages. Everything after this will happen inside the virtualenv you're about to create.
Now create a virtualenv for developing your package:
cd ~/pyprojects
virtualenv --no-site-packages foobar-env
This will create a directory tree ~/pyprojects/foobar-env, which is your virtualenv.
To activate the virtualenv, cd into it and source the bin/activate script:
~/pyprojects $ cd foobar-env/
~/pyprojects/foobar-env $ . bin/activate
(foobar-env) ~/pyprojects/foobar-env $
Note the leading dot ., that's shorthand for the source shell command. Also note how the prompt changes: (foobar-env) means you're inside the activated virtualenv (and always will need to be for the isolation to work). So activate your env every time you open a new terminal tab or SSH session etc.
If you now run python in that activated env, it will actually use ~/pyprojects/foobar-env/bin/python as the interpreter, with its own site-packages and isolated module search path.
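A quick way to confirm the isolation is to check which interpreter is on the PATH (the output path shown is illustrative):
(foobar-env) ~/pyprojects/foobar-env $ which python
/home/you/pyprojects/foobar-env/bin/python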
A setuptools package
Now for creating your package. Basically you'll want a setuptools package with a setup.py to properly declare your package's metadata and dependencies. You can do this on your own by following the setuptools documentation, or create a package skeleton using Paster templates. To use Paster templates, install PasteScript into your virtualenv:
pip install PasteScript
Let's create a source directory for our new package to keep things organized (maybe you'll want to split up your project into several packages, or later use dependencies from source):
mkdir src
cd src/
Now for creating your package, do
paster create -t basic_package foobar
and answer all the questions in the interactive interface. Most are optional and can simply be left at the default by pressing ENTER.
This will create a package (or more precisely, a setuptools distribution) called foobar. This is the name that
people will use to install your package using easy_install or pip install foobar
other packages will use to depend on yours in setup.py
your package will be called on PyPI
Inside, you almost always create a Python package (as in "a directory with an __init__.py") that's called the same. That's not required; the name of the top-level Python package can be any valid package name, but it's a common convention to name it the same as the distribution. And that's why it's important, but not always easy, to keep the two apart. The top-level Python package name is what
people (or you) will use to import your package, using import foobar or from foobar import baz
So if you used the paster template, it will already have created that directory for you:
cd foobar/foobar/
Now create your code:
vim models.py
models.py
class Page(object):
    """A dumb object wrapping a webpage.
    """

    def __init__(self, content, url):
        self.content = content
        self.original_url = url

    def __repr__(self):
        return "<Page retrieved from '%s' (%s bytes)>" % (self.original_url, len(self.content))
And a client.py in the same directory that uses models.py:
client.py
import requests
from foobar.models import Page
url = 'http://www.stackoverflow.com'
response = requests.get(url)
page = Page(response.content, url)
print page
Declare the dependency on the requests module in setup.py:
install_requires=[
    # -*- Extra requirements: -*-
    'setuptools',
    'requests',
],
Version control
src/foobar/ is the directory you'll now want to put under version control:
cd src/foobar/
git init
vim .gitignore
.gitignore
*.egg-info
*.py[co]
git add .
git commit -m 'Create initial package structure.'
Installing your package as a development egg
Now it's time to install your package in development mode:
python setup.py develop
This will install the requests dependency and your package as a development egg. So it's linked into your virtualenv's site-packages, but still lives at src/foobar where you can make changes and have them be immediately active in the virtualenv without re-installing your package.
Now for your original question, importing using relative paths: My advice is, don't do it. Now that you've got a proper setuptools package, that's installed and importable, your current working directory shouldn't matter any more. Just do from foobar.models import Page or similar, declaring the fully qualified name where that object lives. That makes your source code much more readable and discoverable, for yourself and other people that read your code.
You can now run your code by doing python client.py from anywhere inside your activated virtualenv. python src/foobar/foobar/client.py works just as fine, your package is properly installed and your working directory doesn't matter any more.
If you want to go one step further, you can even create a setuptools entry point for your CLI scripts. This will create a bin/something script in your virtualenv that you can run from the shell.
setuptools console_scripts entry point
setup.py
entry_points='''
# -*- Entry points: -*-
[console_scripts]
run-foobar = foobar.main:run_foobar
''',
client.py
def run_client():
    # ...
main.py
from foobar.client import run_client

def run_foobar():
    run_client()
Re-install your package to activate the entry point:
python setup.py develop
And there you go, bin/run-foobar.
Once you (or someone else) installs your package for real, outside the virtualenv, the entry point will be in /usr/local/bin/run-foobar or somewhere similar, where it will automatically be on $PATH.
Further steps
Creating a release of your package and uploading it to PyPI, for example using zest.releaser
Keeping a changelog and versioning your package
Learn about declaring dependencies
Learn about Differences between distribute, distutils, setuptools and distutils2
Suggested reading:
The Hitchhiker’s Guide to Packaging
The pip cookbook
So, you have two packages, the first with modules named:
server # server/__init__.py
server.service # server/service.py
server.http # server/http.py
The second with module names:
client # client/__init__.py
client.client # client/client.py
If you want to assume both packages are on your import path (sys.path), and the class you want is in client/client.py, then in your server you have to do:
from client.client import PyCachedClient
You asked for a symbol out of client, not client.client, and from your description, that isn't where that symbol is defined.
I personally would consider making this one package (i.e., putting an __init__.py in the folder one level up and giving it a suitable Python package name), and having client and server be sub-packages of that package. Then (a) you could do relative imports if you wanted to (from ..client.client import something), and (b) your project would be more suitable for redistribution, not putting two very generic package names at the top level of the Python module hierarchy.
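For illustration, such a layout could look like this (the top-level name myapp is hypothetical):
myapp/
    __init__.py
    server/
        __init__.py
        service.py
        http.py      # does: from myapp.client.client import PyCachedClient
    client/
        __init__.py
        client.py    # defines PyCachedClient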

Usage of "provides" keyword-argument in python's setup.py

I am working on a fork of a Python project (Tryton) which uses setuptools for packaging. I am trying to extend the server part of the project, and would like to be able to use the existing modules with my fork.
Those modules are distributed with setuptools packaging, and are requiring the base project for installation.
I need a way to make it so that my fork is considered an acceptable requirement for those modules.
EDIT: Here is what I used in my setup.py:
from setuptools import setup

setup(
    ...
    provides=["trytond (2.8.2)"],
    ...
)
The modules I want to be able to install have those requirements :
from setuptools import setup

setup(
    ...
    install_requires=["trytond>=2.8"]
    ...
)
As it is, with my package installed, trying to install a module triggers the installation of the trytond package.
Don't use provides; it comes from a packaging specification (a metadata PEP) that is not implemented by any tool. The requirements in the install_requires argument map to the name in your other setup.py. In other words, replace your provides with setup(name='trytond', version='2.8.2').
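Concretely, the fork's setup.py would look something like this (a sketch using the version from the question):
from setuptools import setup

# Publishing the fork under the original distribution name means that
# install_requires=["trytond>=2.8"] resolves to the fork.
setup(
    name='trytond',
    version='2.8.2',
)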
If you are building RPMs, it is possible to use the setup.cfg as follows:
[bdist_rpm]
provides = your-package = 0.8
obsoletes = your-package
