I've created a Python script that's intended to be used from the command line. How do I go about packaging it? This is my first Python package, and although I've read a bit about setuptools, I'm still not sure of the best way to do this.
Solution
I ended up using setup.py with the key configurations noted below:
setup(
    ...,
    entry_points="""
        [console_scripts]
        mycommand = mypackage.mymodule:main
    """,
    ...,
)
Here's a good example in context.
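For completeness, here is a minimal sketch of what mypackage/mymodule.py might contain to back that entry point (the argument handling is just a placeholder):

# mypackage/mymodule.py -- a minimal sketch; real argument parsing would go here
import sys

def main():
    # this is the function referenced by the console_scripts declaration above
    args = sys.argv[1:]
    print("mycommand called with:", args)

if __name__ == "__main__":
    main()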
Rather than using setuptools' non-standard mechanism, it is possible to rely directly on distutils' setup function, using the scripts argument, as described here: http://docs.python.org/distutils/setupscript.html#installing-scripts
from distutils.core import setup

setup(
    ...,
    scripts=['path/to/your/script'],
    ...,
)
This lets you stay compatible with all Python versions, and it avoids relying on setuptools as an external dependency.
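For reference, a script distributed this way is just a plain executable file with a shebang; a minimal sketch:

#!/usr/bin/env python
# path/to/your/script -- a minimal sketch of a file shipped via the scripts argument

def main():
    print("hello from the command line")

if __name__ == '__main__':
    main()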
@Zach, given your clarification in your comment to @soulmerge's answer, it looks like what you need is to write a setup.py per the distutils instructions -- here in particular is how you register on PyPI, and here is how to upload to PyPI once you are registered -- and possibly (if you need some extra functionality beyond what distutils supplies on its own) add setuptools, of which easy_install is a part, via the instructions here.
Last month I wrote an article answering exactly this question. You can find it here: http://gehrcke.de/2014/02/distributing-a-python-command-line-application/
There, I use only currently recommended methods (twine, pure setuptools instead of distutils, the console_scripts key in the entry_points dictionary, ...), which work for both Python 2 and 3.
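For illustration, here is a minimal sketch of that entry_points dictionary style (the package and command names are placeholders):

from setuptools import setup, find_packages

setup(
    name='mypackage',
    version='0.1.0',
    packages=find_packages(),
    entry_points={
        'console_scripts': [
            # creates a "mycommand" executable that calls mypackage.mymodule:main
            'mycommand = mypackage.mymodule:main',
        ],
    },
)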
What do you mean by packaging? If it is a single script to be run on a box that already has python installed, you just need to put a shebang into the first line of the file and that's it.
If you want it to be executed under Windows or on a box without python, though, you will need something external, like pyinstaller.
If your question is about where to put configuration/data files, you'll need to work platform-dependently (like writing into the registry or the home folder), as far as I know.
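To illustrate the platform-dependent part, here is a minimal sketch that picks a per-user configuration directory (the fallback paths are common conventions, not a standard API):

import os
import sys

def config_dir(app_name):
    """Return a per-user config directory for app_name, chosen by platform."""
    if sys.platform == 'win32':
        base = os.environ.get('APPDATA', os.path.expanduser('~'))
    elif sys.platform == 'darwin':
        base = os.path.expanduser('~/Library/Application Support')
    else:
        # XDG convention on Linux and other Unix-likes
        base = os.environ.get('XDG_CONFIG_HOME', os.path.expanduser('~/.config'))
    return os.path.join(base, app_name)

print(config_dir('myapp'))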
For those who are beginners in Python Packaging, I suggest going through this Python Packaging Tutorial.
Note about the tutorial:
At this time, this documentation focuses on Python 2.x only, and may not be as applicable to packages targeted to Python 3.x
Related
Previously with setup.py you could just add

setuptools.setup(
    ...,
    scripts=["scripts/myscript.sh"],
)
and the shell script was just copied into the environment's path. But with the new pyproject specification, this seems to no longer be possible. According to the Python specification of entry points and the setuptools specification, only Python functions, which are wrapped later, are allowed. Does anyone know a simple way of doing this like in setup.py? Or at least something simpler than writing a Python function that calls the shell script with subprocess, which is what I think I will do if there's no simpler way (a sketch of that fallback follows).
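For reference, the subprocess fallback mentioned above might look like this (mypackage and the script path are placeholders, the script must be shipped as package data, and importlib.resources.files needs Python 3.9+):

import subprocess
import sys
from importlib import resources

def main():
    # locate the shell script shipped inside the package and forward all arguments
    script = resources.files('mypackage').joinpath('scripts/myscript.sh')
    raise SystemExit(subprocess.call(['sh', str(script), *sys.argv[1:]]))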
Probably using the script-files field of the [tool.setuptools] section should work:
[tool.setuptools]
script-files = ["scripts/myscript.sh"]
It was not standardized in PEP 621, so it belongs in a setuptools-specific section.
Setuptools marks it as deprecated, but personally I would assume that it is safe to use for the next couple of years at least. It seems like such scripts are standardized in the wheel file format, so it is a bit strange that they are not in pyproject.toml's [project] section. Maybe it will be added later, but that is just speculation.
Reference:
https://setuptools.pypa.io/en/latest/userguide/pyproject_config.html#setuptools-specific-configuration
https://packaging.python.org/en/latest/specifications/binary-distribution-format/
https://discuss.python.org/t/whats-the-status-of-scripts-vs-entry-points/18524
I have a single Python file which is supposed to take a bunch of inputs from the command line.
For example: python script.py "string_1" "string_2"
I also have a bunch of dependencies, including pandas and datetime, and the script requires Python 3.
I want to package all this code so that anyone can install the package along with the dependencies (into a directory or so) and then just call the script/module in the above manner, without having to actually go into a Python interpreter.
I tried using the python-packaging resource, but with that I would need to go into the interpreter, right?
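For reference, the standard setuptools approach to this is a console-script entry point plus install_requires; a minimal sketch, assuming script.py defines a main() function (names and versions are placeholders):

from setuptools import setup

setup(
    name='myscript',
    version='0.1.0',
    py_modules=['script'],        # the single script.py file
    python_requires='>=3',
    install_requires=['pandas'],  # third-party dependencies; datetime is stdlib
    entry_points={
        'console_scripts': [
            # after installing, users can run: myscript "string_1" "string_2"
            'myscript = script:main',
        ],
    },
)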
I found a good article today that explains the procedure quite well: https://medium.com/dreamcatcher-its-blog/making-an-stand-alone-executable-from-a-python-script-using-pyinstaller-d1df9170e263
pyinstaller --onefile <script.py> is the tl;dr on Linux. On Windows you also need py2exe.
If you can rely on a base install of Python being present already, then it's worth looking at Python's zipapp module, introduced in Python 3.5: https://docs.python.org/3/library/zipapp.html#creating-standalone-applications-with-zipapp For background info, see PEP 441: https://www.python.org/dev/peps/pep-0441/
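A minimal sketch of building such an archive programmatically (the layout is an assumption: a directory myapp containing cli.py with a main() function):

import zipapp

zipapp.create_archive(
    'myapp',                             # source directory to bundle
    target='myapp.pyz',                  # runnable archive produced
    interpreter='/usr/bin/env python3',  # shebang written into the archive
    main='cli:main',                     # module:function inside the source dir
)

Note that zipapp itself does not bundle third-party dependencies; that is the gap projects like Shiv (below) aim to fill.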
There is also a project called Shiv, which adds some extra abilities on top of the zipapp module bundled with Python 3.5:
https://shiv.readthedocs.io/en/latest/
Have a look at pex (https://pex.readthedocs.io/en/stable/). It wraps up your python scripts, files, dependencies, etc into a single executable. You still need the python interpreter installed, but it includes everything else.
Background
I write small Python packages for a system that uses modules (https://luarocks.org/) to manage packages. For those of you who don't know it: you can run module load x, and a small script runs that modifies various environment variables to make software 'x' work; you can then undo this with module unload x.
This method of software management is nearly ubiquitous in scientific computing and has a lot of value in that arena: you can run ancient unmaintained software alongside packages it would interfere with, you can run multiple versions of the same software (which lets you reproduce your data exactly by going back to old versions), and you can run frankly poorly written, never-updated software with outdated dependencies.
These features are great, but they create an issue with the python 2/3 split:
What if you want to write a package that works with both python 2 and 3 and use it alongside software that requires either python 2 or 3?
The way you make old python2 dependent software work on these large systems is that you make a python/2.7.x module and a python/3.5 module. When you want to run a script that uses python 2, you load that module, etc.
However, I want to write a single python package that can work in either environment, because I want that software to be active regardless of which python interpreter is being used.
This is fundamentally extremely easy: just use a #!/usr/bin/env python shebang line, done. That works. I write all my software to work with either, so no problem.
Question
The issue is that I want to use setuptools to distribute my package to other scientists in the same situation, and setuptools mangles the shebang line.
I don't want to get into a debate about whether mangling the shebang line is a good idea or not; I am sure it is, since it has existed in the same state for years now. I honestly don't care: it doesn't work for me. The default setuptools install causes the software not to run, because when a Python interpreter's module is not loaded, that interpreter does not function; the PYTHONPATH is totally wrong for it.
If all of my users had root access, I could use the data_files option to just copy the scripts to /usr/bin, but this is a bad idea for compatibility, and my users don't have root access anyway so it is a moot point.
Things I tried so far:
I tried setting sys.executable to /usr/bin/env python in the setup.py file, but that doesn't work, because then the shebang becomes #!"/usr/bin/env python", which obviously doesn't work.
I tried the Don't touch my shebang class idea in this question: Don't touch my shebang! (it is the bottom answer with 0 votes). That didn't work either, probably because it is written for distutils and not setuptools. Plus that question is 6 years old.
I also looked at these questions:
Setuptools entry_points/console_scripts have specific Python version in shebang
Changing console_script entry point interpreter for packaging
The methods described there do not work, the shebang line is still altered.
Creating a setup.cfg file with the contents::
[build]
executable = /usr/bin/env python
also does not change the shebang line mangling behavior.
There is an open issue on the setuptools github page that discusses something similar:
https://github.com/pypa/setuptools/issues/494
So I assume this isn't possible to do natively, but I wonder if there is a workaround?
Finally, I don't like any solution that involves asking the user to modify their install flags, e.g. with -e.
Is there any way to modify this behavior, or is there another distribution system I can use instead? Or is this too much of an edge case, and I just need to write some kind of custom installation script?
Thanks all.
Update
I think I was not clear enough in my original question, what I want the user to be able to do is:
Install the package in both python2 and python3 (the modules will go into lib/pythonx/site-packages).
Be able to run the scripts irrespective of which python environment is active.
If there is a way to accomplish this without preventing shebang munging, that would be great.
All my code is already compatible with python 2.7 and python 3.3+ out of the box, the main thing is just making the scripts run irrespective of active python environment.
I accidentally stumbled onto a workaround while trying to write a custom install script.
import os
from setuptools import setup
from setuptools.command.install import install

here = os.path.abspath(os.path.dirname(__file__))

# Generate a list of python scripts from the bin/ directory
scpts = []
scpt_dir = os.listdir(os.path.join(here, 'bin'))
for scpt in scpt_dir:
    scpts.append(os.path.join(here, 'bin', scpt))

class ScriptInstaller(install):

    """Install scripts directly."""

    def run(self):
        """Wrapper for parent run."""
        super(ScriptInstaller, self).run()

setup(
    cmdclass={'install': ScriptInstaller},
    scripts=scpts,
    ...
)
This code doesn't do exactly what I wanted (alter just the shebang line); it actually just copies the whole script to ~/.local/bin, instead of wrapping it in::
__import__('pkg_resources').run_script()
Additionally, and more concerningly, this method makes setuptools create a root module directory plus an egg-info directory like this::
.local/lib/python3.5/site-packages/cluster
.local/lib/python3.5/site-packages/python_cluster-0.6.1-py3.5.egg-info
Instead of a single egg, which is the usual behavior::
.local/lib/python3.5/site-packages/python_cluster-0.6.1-py3.5.egg
As far as I am aware this is the behavior of the old distutils, which makes me worry that this install would fail on some systems or have other unexpected behavior (although please correct me if I am wrong, I really am no expert on this).
However, given that my code is going to be used almost entirely on linux and OS X, this isn't the end of the world. I am more concerned that this behavior will just disappear sometime very soon.
I posted a comment on an open feature request on the setuptools github page:
https://github.com/pypa/setuptools/issues/494
The ideal solution would be if I could add an executable=/usr/bin/env python statement to setup.cfg, hopefully that is reimplemented soon.
This workaround will work for me for now though. Thanks all.
I'm writing a Python program (for use in a Linux environment) and I want it to support versioned updates of the program. I read somewhere that I should add a line like __version__, but I'm not really sure where and how to put it. What I want to achieve is not having to erase my whole program every time I want to install a new version of it.
Thanks!
I highly recommend you use setuptools instead of versioning manually. It is the de facto standard now and very simple to use. All you need to do is create a setup.py in your project's root directory:
from setuptools import setup, find_packages

setup(name='your_programs_name',
      version='major.minor.patch',
      packages=find_packages(),
      )
and then just run:
python setup.py sdist
and a source distribution will appear under your dist folder.
What you actually want to do is make your Python program into a package using distutils.
You would create a setup.py. A simple example would look something like this:
from distutils.core import setup

setup(name='foo',
      version='1.0',
      py_modules=['foo'],
      )
This is the most standard way to version and distribute Python code. If your code is open source, you can even register it with the cheese shop, er, PyPI, and install it on any machine in the world with a simple pip install mypackage.
It depends what you want to be versioned.
Single modules with independent versions, or the whole program/package.
You could theoretically add a __version__ string to every class and do dynamic imports, by testing these variables.
If you want to version the main or whole program, you could add a __version__ string somewhere at the top of your __init__.py file and import this string into your setup.py when generating packages. This way you wouldn't have to manually edit multiple files, and setup.py could be left mostly untouched; see the sketch below.
Also consider using a string, not a number or tuple. See PEP 396.
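A minimal sketch of that single-source pattern (file paths shown as comments; names are placeholders, and this assumes the package can be imported at build time):

# mypackage/__init__.py
__version__ = '1.0.0'

# setup.py -- imports the string so the version is defined in exactly one place
from setuptools import setup, find_packages
from mypackage import __version__

setup(
    name='mypackage',
    version=__version__,
    packages=find_packages(),
)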
If you are only concerned with versions on your local machine, I would suggest becoming familiar with Git. Information about version control in Git can be found here.
If you are concerned with version control on a module that others would use, information can be found here.
With PHP you have phpinfo(), which lists installed modules, and from there you can look up what they do.
Is there a way to see what packages/modules are installed and available to import?
Type help() in the interpreter
then
To get a list of available modules, keywords, or topics, type "modules",
"keywords", or "topics". Each module also comes with a one-line summary
of what it does; to list the modules whose summaries contain a given word
such as "spam", type "modules spam".
help> modules
If you use ipython, which is an improved interactive Python shell (aka "REPL"), you can type import (note the space at the end) followed by a press of the [TAB] key to get a list of importable modules.
As noted in this SO post, you will have to reset its hash of modules after installing (certain?) new ones. You likely don't need to worry about this yet.
If you don't use ipython, and you haven't tried it, it might be worth checking out. It's a lot better than the basic Python shell, or pretty much any other REPL I've used.
ipython Installation
If you're running linux, there is most likely an ipython package that you can install through your system management tools. Others will want to follow these instructions.
If your installation route requires you to use easy_install, you may want to consider instead using pip. pip is a bit smarter than easy_install and does a better job of keeping track of file locations. This is very helpful if you end up wanting to uninstall ipython.
Listing packages
Note that the above tip only lists modules. For a list which also includes packages (which contain modules), you can do from + [TAB]. An explanation of the difference between packages and modules can be found in the Modules chapter of the helpful official Python tutorial.
RTFM
As an added note, if you are very new to Python, your time may be better spent browsing the standard library documentation than just selecting modules based on their names. Python's core documentation is well-written and well-organized. The organizational groups (File and Directory Access, Data Types, etc.) used in the library documentation's table of contents are not readily apparent from the module/package names, and are not really used elsewhere, but they serve as a valuable learning aid.
This was very useful. Here is a script version of this:

# To list all installed packages, just execfile THIS file:
# execfile('list_all_pkgs.py')
for dist in __import__('pkg_resources').working_set:
    print(dist.project_name.replace('Python', ''))
You can list available modules like so:

python -c "for dist in __import__('pkg_resources').working_set: print(dist.project_name.replace('Python', ''))"
As aaronasterling says, any .py or .pyc file on sys.path is a module, because it can be imported. There are scripts that can tell you which external modules are installed in site-packages.
Yolk is a Python command-line tool and library for obtaining information about packages installed by setuptools, easy_install, and distutils, and it can also query PyPI packages.
http://tools.assembla.com/yolk/
You may use the pip module:

from pip._internal.operations.freeze import freeze

for line in freeze():
    print(line.split('=='))
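If you would rather avoid pip's private internals, a similar sketch using only the standard library (Python 3.8+):

from importlib.metadata import distributions

# iterate over every installed distribution and show its name and version
for dist in distributions():
    print(dist.metadata['Name'], dist.version)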