Python submodule as independently installable package

I have a Python 3 (>=3.7) project that contains a complex web service. At the same time, I want to put a portion of the project (e.g. the API client) on PyPI to allow external applications to interface with it.
I would like to publish morphocluster while morphocluster.server (containing all service-related functionality) should be excluded. (That means that pip install morphocluster and pip install git+https://github.com/morphocluster/morphocluster/ should install morphocluster with all submodules except morphocluster.server.)
Moreover, morphocluster.server must be installable via a separate setup.py for installation in the service docker container.
Is that achievable in any way without splitting the project into distinct morphocluster and morphocluster_server packages?
I already had a look at namespace packages, but it seems that they don't allow functionality in the namespace itself.
Also setuptools.find_packages looked helpful, but wouldn't help with making morphocluster.server installable separately.
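For illustration only, here is a rough sketch of one direction this could take: exclude the server subpackage in the main setup.py and ship it from a second setup file built inside the Docker image. All version numbers, the second file's name, and the dependency between the two distributions are assumptions, not something confirmed by the question.
# setup.py (for the "morphocluster" distribution, excluding the server part)
from setuptools import setup, find_packages

setup(
    name="morphocluster",
    version="0.1.0",  # hypothetical version
    packages=find_packages(exclude=["morphocluster.server", "morphocluster.server.*"]),
)

# setup_server.py (hypothetical second file, built separately for the service container)
from setuptools import setup

setup(
    name="morphocluster.server",
    version="0.1.0",  # hypothetical version
    packages=["morphocluster.server"],
    install_requires=["morphocluster"],  # assumption: the server depends on the core package
)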

Related

How to have python libraries already installed in python project?

I am working on a python project that requires a few libraries. This project will further be shared with other people.
The problem I have is that I can't use the usual pip install 'library' as part of my code because the project could be shared with offline computers and the work proxy could block the download.
So what I first thought of was installing .whl files and running pip install 'my_file.whl', but this is limited since some .whl files work on some computers but not on others, so this couldn't be the solution to my problem.
I tried sharing my project with another person and I had an error with a .whl file working on one computer but not the other.
What I am looking for is to have all the libraries I need to be already downloaded before sharing my project. So that when the project is shared, the peers can launch it without needing to download the libraries.
Is this possible or is there something else that can solve my problem?
There are different approaches to the issue here, depending on what the constraints are:
1. Defined Online Dependencies
It is a good practice to define the dependencies of your project (not only when shared). Python offers different methods for this.
In this scenario, every developer has access to a PyPI repository via the network, usually the official mirrors (i.e. via the internet). New packages need to be pulled from there whenever there are changes.
Repository (internet) access is only needed when pulling new packages.
Below are the most common ones:
1.1 requirements.txt
The requirements.txt is a plain text list of required packages and versions, e.g.
# requirements.txt
matplotlib==3.6.2
numpy==1.23.5
scipy==1.9.3
When you check this in along with your source code, users can freely decide how to install it. The simplest (and most conflict-prone) way is to install it into the base Python environment via
pip install -r requirements.txt
You can even generate such a file automatically with pipreqs if you have lost track of your dependencies. The result is usually very good; however, a manual cleanup afterwards is recommended.
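As a rough sketch of that workflow (pipreqs is a third-party tool, so double-check the exact invocation against its documentation):
pip install pipreqs
pipreqs /path/to/your/project   # writes a requirements.txt next to the scanned code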
Benefits:
Package dependency is clear
Installation is a one line task
Downsides:
Possible conflicts with multiple projects
No guarantee that everyone ends up with the exact same versions if version flexibility is allowed (the default)
1.2 Pipenv
There is a nice and almost complete answer about Pipenv already; the Pipenv documentation itself is also very good.
In a nutshell: Pipenv allows you to have virtual environments, so version conflicts between different projects are gone for good. Also, the Pipfile used to define such an environment allows separation of production and development dependencies.
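A minimal Pipfile might look roughly like this (package names and versions are just placeholders):
# Pipfile
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
numpy = "*"
matplotlib = "==3.6.2"

[dev-packages]
pytest = "*"

[requires]
python_version = "3.10"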
Users now only need to run the following commands in the folder with the source code:
pip install pipenv # only needed first time
pipenv install
And then, to activate the virtual environment:
pipenv shell
Benefits:
Separation between projects
Separation of development/testing and production packages
Everyone uses the exact same version of the packages
Configuration is flexible but easy
Downsides:
Users need to activate the environment
1.3 conda environment
If you are using Anaconda, a conda environment definition can also be shared as a configuration file. See this SO answer for details.
This scenario works like the Pipenv one, but with Anaconda as the package manager. It is recommended not to mix pip and conda.
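For illustration, a minimal environment file and the commands to export and recreate it could look like this (the environment name and packages are placeholders):
# environment.yml
name: myproject
channels:
  - defaults
dependencies:
  - python=3.10
  - numpy
  - scipy

# export the current environment, or recreate it on another machine
conda env export > environment.yml
conda env create -f environment.yml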
1.4 setup.py
If you are implementing a library anyway, you should have a look at how to declare the dependencies via the setup.py file.
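A minimal sketch of declaring dependencies in setup.py (all names and version ranges are placeholders):
# setup.py
from setuptools import setup, find_packages

setup(
    name="mylibrary",            # hypothetical package name
    version="0.1.0",
    packages=find_packages(),
    install_requires=[
        "numpy>=1.20",           # runtime dependencies, ideally with loose pins
        "scipy",
    ],
)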
2. Defined local dependencies
In this scenario the developers do not have access to the internet (e.g. they are "air-gapped" in a special network where they cannot communicate with the outside world). In this case all the scenarios from 1. can still be used, but now we need to set up our own mirror/proxy. There are good guides (and even complete off-the-shelf software) out there, depending on which of the scenarios above you want to use; a minimal offline workflow is also sketched after the examples below. Examples are:
Local PyPI mirror [commercial solution]
Anaconda behind company proxy
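A simpler, proxy-less variant (a sketch only, assuming both machines use a compatible OS and Python version) is to pre-download the wheels on a machine with internet access and install from the copied folder on the air-gapped one:
# on a machine with internet access
pip download -r requirements.txt -d ./wheelhouse

# on the offline machine, after copying ./wheelhouse over
pip install --no-index --find-links ./wheelhouse -r requirements.txt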
Benefits:
Users don't need internet access
Packages on the local proxy can be trusted (cannot be corrupted / deleted anymore)
The clean and flexible scenarios from above can be used for setup
Downsides:
Network connection to the proxy is still required
Maintenance of the proxy is extra effort
3. Turn key environments
Last, but not least, there are solutions to share the complete and installed environment between users/computers.
3.1 Copy virtual-env folders
If (and only if) all users (are forced to) use an identical setup (OS, install paths, user paths, libraries, locales, ...), then one can copy the virtual environments for Pipenv (1.2) or conda (1.3) between PCs.
These "pre-compiled" environments are very fragile, as a small change can cause the setup to malfunction. So this is really not recommended.
Benefits:
Can be shared between users without network (e.g. USB stick)
Downsides:
Very fragile
3.2 Virtualisation
The cleanest way to support this is some kind of virtualisation technique (virtual machine, Docker container, etc.).
Install Python and the needed dependencies inside it and share the complete container.
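As a rough sketch, a Dockerfile for such a container might look like this (base image, file names, and entry point are assumptions):
# Dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "main.py"]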
Benefits:
Users can just use the provided container
Downsides:
Complex setup
Complex maintenance
Virtualisation layer needed
Code and environment may become convoluted
Note: This answer is compiled from the summary of (mostly my) comments

Dependency management in Python packages

I'm currently developing a CI process for a micro-services Python application. It is built as such: each micro-service is packed as a Docker image, and there are several of those. In addition, there's some common code which is packed as a PyPI package and is consumed by the services.
For the sake of the discussion, let's say we have a service called foo and the common code is called lib.
In the day-to-day development, we want foo to consume the latest versions of lib. But once we want to release a version of foo, we want to merge the code to the main branch, and record the exact version of lib in the requirements.txt of foo.
The idea that came up is to work in the following manner: in foo, we'll have develop and master branches. On each push to develop, we'll build an image with the latest version of lib. When the developers merge the code to master, we run pip freeze > requirements.txt and git push it to main again, so that when we want to come back to this version, we'll have the requirements.txt file pinned to a specific version of lib (and the rest of the dependencies).
That sounds OK overall, but let's add a complexity to this:
Our lib, in turn, depends on another PyPI package, let's call it utils. The setup.py of lib contains an install_requires field which specifies utils. Again, in the day-to-day we want it to consume the latest version of utils, but when we merge the code to main (in lib), we want to pin a specific version.
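For concreteness, the difference is roughly this in lib's setup.py (the pinned version number is a placeholder):
# lib/setup.py, day-to-day development: track the latest utils
from setuptools import setup, find_packages
setup(name="lib", packages=find_packages(), install_requires=["utils"])

# lib/setup.py, at release time: pin an exact version
from setuptools import setup, find_packages
setup(name="lib", packages=find_packages(), install_requires=["utils==1.2.3"])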
The question is, is there a way to automatically update the setup.py install_requires section as there is to automatically update the requirements.txt with pip freeze?
But actually my question is - does this process make sense? Maybe we're missing something here?

Python how to share package between multiple projects

I have a package that contains protocols for specific hardware from multiple vendors. It is basically a collection of different protocols that share a similar interface, which I ported to Python. Now I have two projects that handle communication with this equipment using different transport protocols: one over TCP, the other over RS-485/232. How can I share the protocol package between these two projects so that I don't need to copy it every time I add new functionality or fix bugs?
You can build a Python package (more info here: https://python-packaging.readthedocs.io/en/latest/).
Once you have created your package containing a setup.py file, you can install it on your machine using the:
pip install -e myPackage
command (where "myPackage" is the folder containing the setup.py file). The advantage of using the -e flag is that changes made to your package are directly reflected in the installed version.
Once your package is installed, you can share it across multiple projects, simply by importing your package in Python, e.g.
import myPackage
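A minimal sketch of what such a package could look like (all names are hypothetical, continuing the myPackage example from above):
myPackage/            # folder you point pip install -e at
    setup.py
    myPackage/
        __init__.py
        protocols.py

# setup.py
from setuptools import setup, find_packages

setup(
    name="myPackage",
    version="0.1.0",
    packages=find_packages(),
)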

How can I distribute custom Python scripts, including dependencies and other resources?

My project contains:
My own custom Python files
Unique package-specific generated Python code
Resources (e.g. binaries)
Dependencies on 3rd party modules (e.g. numpy)
The generated Python code makes things tricky, and separates this use case from a typical Python package where everyone gets the same code. I may create several packages to be distributed to different clients. Each package will have different/unique generated Python code, but use identical versions of my custom Python scripts and 3rd party dependencies. For example I may make a "package builder" script, which generates the unique Python code and bundles the dependencies together, depending on the builder arguments.
I want to distribute my Python scripts, including the resources and dependencies. The receiver of this package cannot download the 3rd party dependencies using a requirements.txt and pip; all dependencies and binaries must be included in this package.
The way I envision the client using this package is that they simply unzip the archive I provide, set their PYTHONPATH to the unzipped directory, and invoke my custom Python file to start the process.
If I'm going about this the wrong way I'd appreciate suggestions.
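One possible sketch of such a "package builder" step, assuming the clients run a compatible OS and Python version so the vendored dependencies actually work (directory and script names below are hypothetical), is to let pip install the third-party dependencies into a folder inside the bundle, which the client then adds to PYTHONPATH together with the rest:
# on the build machine
pip install --target ./bundle/vendor -r requirements.txt
cp -r my_scripts/ generated_code/ resources/ ./bundle/
zip -r bundle.zip bundle/

# on the client machine, after unzipping
export PYTHONPATH=/path/to/bundle:/path/to/bundle/vendor
python /path/to/bundle/my_scripts/main.py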

Python Libraries

I'm in desperate need of a cross-platform framework as I have vast numbers of .NET products that I'm trying to port to Linux. I have started to work with Python/PyQt and the standard library and all was going well until I tried to import non-standard libraries. I'm hearing about pip and easy_install and I'm completely confused about this.
My products need to ship with everything required to execute them, so in the .NET world I simply package my DLLs (or licensed DLLs) with my product.
As a test bed I'm trying to import this library called requests: https://github.com/kennethreitz/requests
I've got an __init__.py file and the library source in my program directory but it isn't working. Please tell me that there is a simple way to include libraries without needing any kind of extra package installer.
I would suggest you start by familiarizing yourself with Python packages (see the distutils docs). Pip is simply a manager that installs packages directly from the internet repository, so that you don't need to go and download them manually. So, for example, as stated under "Installing" on the requests homepage, you simply run pip install requests in a terminal, without manually downloading anything.
Packaging your product is a different story, and the way you do it depends on the target system. On Windows, the easiest might be to create an installer using NSIS which will install all dependencies. You might also want to use cx_Freeze to pull all the dependencies (including the Python interpreter) into a single package.
On Linux, many of the dependencies will already be included in most distributions, so you should just list them as requirements when creating your package (e.g. a deb for Ubuntu). Other dependencies might not be included in the distro's repo, but you can still list them as requirements in setup.py.
I can't really comment on Mac, since I've never used Python on one, but I think it would be similar to the Linux approach.
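For reference, a rough sketch of the cx_Freeze route mentioned above (script and application names are placeholders; check the cx_Freeze documentation for the options your version supports):
# setup_freeze.py
from cx_Freeze import setup, Executable

setup(
    name="myapp",
    version="0.1",
    description="Self-contained build including the Python interpreter",
    executables=[Executable("main.py")],
)

# build the frozen application into a build/ directory
python setup_freeze.py build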
