I'm unsure whether to use the CherryPy download from the official site, or the version found in my distro's package manager.
If I use the official download, portability will be less of an issue if I need to move between a dev environment and a live environment, and I'm guaranteed the same version on both systems. On the other hand, if I let my distro's package manager handle it, then I won't have to worry about keeping CherryPy updated and I also won't need to keep it in source control. Another potential downside of allowing my package manager to handle updates is that there is generally quite a delay between an official software release and the software finding its way into the repos.
What is the accepted practice for this?
for each python project that i work on i create a file called setup-env.sh which builds a local virtual environment. this is included with the source. for example, in a recent project:
#!/bin/bash
# build a project-local virtual environment in ./env
virtualenv --python=python3.2 env
source env/bin/activate
# install the project's dependencies into that environment
easy_install cherrypy
easy_install sqlalchemy
easy_install stagger
easy_install nose
easy_install pystache
this creates an environment that is unique to the project, which contains the latest stable releases, and which is easy to reproduce.
before working on the project do:
source env/bin/activate
to modify your PATH and PYTHONPATH correctly.
if you do not have easy_install available you need to install the setuptools package (easy_install is part of setuptools, not distutils).
this is the best solution because:
you get to use recent stable versions rather than whatever the distro packaged
you use a minimal, documented set of packages (you don't 'accidentally' use a package installed for another project)
it's easy to recreate on another machine
it's easy to recreate if you update your distro
it's easy to extend to specify particular version numbers (easy_install cherrypy==3.2.0)
it's easy to specify a particular python version
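The same idea can be sketched in Python itself, using the standard library's venv module instead of the virtualenv command. This is only an illustration under assumptions: the package list, the env directory name, and the helper names are made up for the example, and build_env needs network access when it is actually called.

```python
import os
import subprocess
import venv
from pathlib import Path

# Hypothetical package list; the pinned version mirrors the example above.
PACKAGES = ["cherrypy==3.2.0", "nose", "pystache"]


def env_python(env_dir):
    """Return the interpreter path inside the virtual environment."""
    if os.name == "nt":
        return Path(env_dir) / "Scripts" / "python.exe"
    return Path(env_dir) / "bin" / "python"


def install_command(env_dir, packages):
    """Build the pip command that installs packages into env_dir."""
    return [str(env_python(env_dir)), "-m", "pip", "install", *packages]


def build_env(env_dir="env", packages=PACKAGES):
    """Create the environment, then install the packages (needs network)."""
    venv.create(env_dir, with_pip=True)
    subprocess.check_call(install_command(env_dir, packages))
```

Calling build_env() from the project root would play the role of setup-env.sh. Running pip through the environment's own interpreter means the environment does not have to be activated first.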
As you use gentoo, I would use a custom overlay that is shared between the servers and your workstation. (see http://www.gentoo.org/doc/en/handbook/handbook-x86.xml?part=3&chap=5)
In your overlay, if you need a more recent version of cherrypy, you can always bump it there and install it on your workstation. Then, when it is time to upgrade, unmask it on your servers.
As it is your overlay you don't need to wait for an official packaging. Either you adapt the ebuild yourself (usually 90% of the time it is just a matter of renaming it to bump the version) or you can copy it from more advanced overlays on your subject like the python overlay (which is now named progress) http://code.google.com/p/gentoo-progress/ for your example.
Related
I am working on a python project that requires a few libraries. This project will further be shared with other people.
The problem I have is that I can't use the usual pip install 'library' as part of my code because the project could be shared with offline computers and the work proxy could block the download.
So what I first thought of was installing .whl files and running pip install 'my_file.whl' but this is limited since some .whl files work on some computers but not on others, so this couldn't be the solution of my problem.
I tried sharing my project with someone else and got an error: a .whl file that worked on one computer did not work on the other.
What I am looking for is to have all the libraries I need to be already downloaded before sharing my project. So that when the project is shared, the peers can launch it without needing to download the libraries.
Is this possible or is there something else that can solve my problem ?
There are different approaches to the issue here, depending on what the constraints are:
1. Defined Online Dependencies
It is a good practice to define the dependencies of your project (not only when shared). Python offers different methods for this.
In this scenario every developer has access to a pypi repository via the network. Usually the official main mirrors (i.e. via internet). New packages need to be pulled individually from here, whenever there are changes.
Repository (internet) access is only needed when pulling new packages.
Below are the most common ones:
1.1 requirements.txt
The requirements.txt is a plain text list of required packages and versions, e.g.
# requirements.txt
matplotlib==3.6.2
numpy==1.23.5
scipy==1.9.3
When you check this in along with your source code, users can freely decide how to install it. The simplest way (and the one that clutters the system most) is to install it into the base Python environment via
pip install -r requirements.txt
If you have lost track of your dependencies, you can even generate such a file automatically with pipreqs. The result is usually very good. However, a manual cleanup afterwards is recommended.
Benefits:
Package dependency is clear
Installation is a one line task
Downsides:
Possible conflicts with multiple projects
No guarantee that everyone has the exact same versions if version flexibility is allowed (the default when versions are not pinned)
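To see whether a requirements.txt leaves room for version drift, one could scan it for entries that are not pinned with ==. The following is a minimal sketch, not pip's actual parser: the function name is hypothetical, and it only handles plain requirement lines and comments, not URLs or environment markers.

```python
def unpinned_requirements(text):
    """Return the requirement lines that do not pin an exact version."""
    loose = []
    for raw in text.splitlines():
        # Drop comments and surrounding whitespace (simplified handling).
        line = raw.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r other.txt".
        if not line or line.startswith("-"):
            continue
        if "==" not in line:
            loose.append(line)
    return loose


example = """\
# requirements.txt
matplotlib==3.6.2
numpy>=1.23
scipy
"""
print(unpinned_requirements(example))
```

For the example text this reports the numpy and scipy lines, since only matplotlib is pinned to one exact version.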
1.2 Pipenv
There is a nice and almost complete Answer to Pipenv. Also the Pipenv documentation itself is very good.
In a nutshell: Pipenv gives you a virtual environment per project, so version conflicts between different projects are gone for good. Also, the Pipfile used to define such an environment allows separation of production and development dependencies.
Users now only need to run the following commands in the folder with the source code:
pip install pipenv # only needed first time
pipenv install
And then, to activate the virtual environment:
pipenv shell
Benefits:
Separation between projects
Separation of development/testing and production packages
Everyone uses the exact same version of the packages
Configuration is flexible but easy
Downsides:
Users need to activate the environment
1.3 conda environment
If you are using anaconda, a conda environment definition can be also shared as a configuration file. See this SO answer for details.
This scenario is as the pipenv one, but with anaconda as package manager. It is recommended not to mix pip and conda.
1.4 setup.py
If you are implementing a library anyway, have a look at how to declare its dependencies in the setup.py file.
2. Defined local dependencies
In this scenario the developers do not have access to the internet (e.g. they are "air-gapped" in a special network where they cannot communicate with the outside world). In this case all the scenarios from 1. can still be used, but now we need to set up our own mirror/proxy. There are good guides (and even complete off-the-shelf software) out there, depending on the scenario (above) you want to use. Examples are:
Local Pypi mirror [Commercial solution]
Anaconda behind company proxy
Benefits:
Users don't need internet access
Packages on the local proxy can be trusted (cannot be corrupted / deleted anymore)
The clean and flexible scenarios from above can be used for setup
Downsides:
Network connection to the proxy is still required
Maintenance of the proxy is extra effort
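For a small team, a shared folder of pre-downloaded packages can stand in for a full mirror. The sketch below only builds the two pip command lines involved; the function names and the wheelhouse directory name are assumptions for illustration, while the pip options used (download -r/-d, install --no-index/--find-links) are standard ones.

```python
import sys


def download_command(requirements="requirements.txt", dest="wheelhouse"):
    """On a machine with network access: fetch all packages into dest."""
    return [sys.executable, "-m", "pip", "download",
            "-r", requirements, "-d", dest]


def offline_install_command(requirements="requirements.txt", source="wheelhouse"):
    """On the offline machine: install from the folder only, never from PyPI."""
    return [sys.executable, "-m", "pip", "install",
            "--no-index", "--find-links", source, "-r", requirements]


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing is downloaded here.
    print(" ".join(download_command()))
    print(" ".join(offline_install_command()))
```

Note that the downloaded wheels match the platform and Python version of the machine that ran the download, which is the same limitation the question ran into when copying .whl files between different computers.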
3. Turn key environments
Last, but not least, there are solutions to share the complete and installed environment between users/computers.
3.1 Copy virtual-env folders
If (and only if) all users (are forced to) use an identical setup (OS, install paths, user paths, libraries, LOCALS, ...) then one can copy the virtual environments for pipenv (1.2) or conda (1.3) between PCs.
These "pre-compiled" environments are very fragile, as a small change can cause the setup to malfunction. So this is really not recommended.
Benefits:
Can be shared between users without network (e.g. USB stick)
Downsides:
Very fragile
3.2 Virtualisation
The cleanest way to support this is some kind of virtualisation technique (virtual machine, docker container, etc.).
Install python and the dependencies needed and share the complete container.
Benefits:
Users can just use the provided container
Downsides:
Complex setup
Complex maintenance
Virtualisation layer needed
Code and environment may become convoluted
Note: This answer is compiled from the summary of (mostly my) comments
I am on macOS, using brew, pyenv, and virtualenv.
I have a Python project that depends on bokeh and gdal (both python packages were installed with pip inside a virtual environment). Both bokeh and gdal depend on a system version of libopenssl, but they depend on different versions (1.0 and 1.1).
I have had this project working at various points in the past, with some combination of libraries (using pip for all python packages and brew for system packages) but when I change python versions and environments (using pyenv) to work on other projects, and then come back to this project, it no longer works. Usually something along these lines with a problem finding a shared library for openssl:
$ ./my_python_program.py
...
ImportError: dlopen(/Users/userBob/.pyenv/versions/3.7.0/lib/python3.7/lib-dynload/_ssl.cpython-37m-darwin.so, 2):
Library not loaded: /usr/local/opt/openssl@1.1/lib/libssl.1.1.dylib
Referenced from: /Users/userBob/.pyenv/versions/3.7.0/lib/python3.7/lib-dynload/_ssl.cpython-37m-darwin.so
Reason: image not found
I feel like I am eventually able to get things to work by trying random combinations of installing and uninstalling various package versions using pip and brew. But this is a fragile and inefficient way to maintain my projects.
In general what is the best way to handle this kind of situation? Do I need to simply record the exact brew and pip install/uninstall commands to get it working? Am I missing the concept of version "pinning"? Are there additional options with brew and pyenv that I am missing that might make this process easier?
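When switching between pyenv versions, a quick check of which OpenSSL each interpreter is linked against can help narrow this down. This is only a diagnostic sketch with a hypothetical helper name; it reports the problem, it does not fix it.

```python
import sys


def ssl_report():
    """Describe the OpenSSL build used by the running interpreter."""
    try:
        import ssl  # fails with ImportError when the shared library is missing
    except ImportError as exc:
        return f"{sys.executable}: ssl unavailable ({exc})"
    return f"{sys.executable}: {ssl.OPENSSL_VERSION}"


print(ssl_report())
```

Running it once per pyenv version shows which interpreters still load their OpenSSL library and which ones need to be rebuilt.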
I'm not sure this is the best way to do it, but I can tell you what I do usually.
First of all, I'm using Anaconda.
When I'm on a project, I switch to the relevant virtual environment.
Before switching out, when I commit/push my modifications, I also create an export file of my environment like you can find it there.
I also track this file with git, this way, if I make any modification when working on the environment, it's stored in the .yml file.
This way, I can reinstall all dependencies needed for the project if I format my machine or get a new one, etc. And the reference for every dependency I need is stored in the cloud with my sources. So in case I start getting weird behaviour, I just restore my environment with this reference file from the time when it was working.
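With conda the export itself is done by conda env export. For a plain pip environment, a comparable snapshot can be sketched with the standard library. The function name below is hypothetical, and the output only resembles pip freeze for ordinary installed distributions.

```python
from importlib import metadata


def snapshot():
    """Return sorted 'name==version' lines for the installed distributions."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that carry no name
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    # Redirect this output to a file and commit it next to the sources.
    print("\n".join(snapshot()))
```

Committing the resulting file alongside the code gives the same "restore to the last working state" reference as the .yml file described above.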
I'm not switching between projects often enough to justify automating this process, but I'm sure that's feasible if you want to.
I would like to pack a python program and ship it in a deb package.
For reasons (I know in 99% it is bad practice) I want to ship the program in a python virtual environment within a debian package.
I know I can do this using dh-virtualenv. This works great - generally no problem.
But the problem arises when I want to upload this to launchpad. Uploading to launchpad means uploading a source package. In terms of dh-virtualenv a source package is the package description, where the virtualenv has not been created, yet.
What happens when I upload this to launchpad is, that the package will not build, since the dh-virtualenv which is executed during the build process on launchpad will try to install python modules into the virtualenv, which means installing these from the PyPI, which will not work, since launchpad does not allow external network access.
So basically there are two possible solutions:
Approach A
Prepare the virtualenv, somehow incorporate it into the source package, and have the dh build process simply "move" this prepared virtualenv to its final location. This could work with virtualenv --relocatable. BUT the relocation strips the utf-8 marker at the beginning of all python scripts, rendering all python scripts in the virtualenv broken.
Approach B
Somehow cache all necessary python packages in the source package and have dh_virtualenv install from the cache instead of from PyPI.
This seems to be doable with pip2pi, but some experiments show that it will not install the packages, although they are located in the local package index.
Both approaches seem a bit clumsy and prone to errors.
What do you think of this?
What are your experiences?
What would you recommend?
I'm in desperate need of a cross platform framework as I have vast numbers of .NET products that I'm trying to port to Linux. I have started to work with Python/pyQt and the standard library and all was going well until I try to import non-standard libraries. I'm hearing about pip and easy_install and I'm completely confused about this.
My products need to ship with everything required to execute them, so in the .NET world I simply package my DLLs (or licensed DLLs) with my product.
As a test bed I'm trying to import this library called requests: https://github.com/kennethreitz/requests
I've got an __init__.py file and the library source in my program directory but it isn't working. Please tell me that there is a simple way to include libraries without needing any kind of extra package installer.
I would suggest you start by familiarizing yourself with python packages (see the distutils docs). Pip is simply a manager that installs packages directly from the internet repository, so that you don't need to go and download them manually. So, for example, as stated under "Installing" on the requests homepage, you simply run pip install requests in a terminal, without manually downloading anything.
Packaging your product is a different story, and the way you do it depends on the target system. On windows, the easiest might be to create an installer using NSIS which will install all dependencies. You might also want to use cx-freeze to pull all the dependencies (including the python interpreter) into a single package.
On linux, many of the dependencies will already be included in most distributions, so you should just list them as requirements when creating your package (e.g. deb for ubuntu). Other dependencies might not be included in the distro's repo, but you can still list them as requirements in setup.py.
I can't really comment on Mac, since I've never used python on one, but I think that it would be similar to the linux approach.
I am trying to define a process for migrating django projects from my development server to my production server using git, and it's driving me crazy that distutils installs python modules system-wide. I've read the documentation but unless I'm missing something it seems to be mostly about how to change the installation directory. I need to be able to use different versions of the same module in different projects running on the same server, and deploy projects from git without having to download and install dependencies.
tl;dr: I need to know how to install python modules, using distutils, into my project's source tree for version control without compromising other projects using different versions of the same module.
I'm new to python, so I apologize in advance if this is common knowledge.
Besides the already mentioned virtualenv which is a good option but has the potential drawback of requiring a third-party module, Distutils itself has options to install modules into arbitrary locations. In particular, there is the home scheme which allows you to "build and maintain a personal stash of Python modules". It's described in the Python documentation set here.
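A related way to keep modules inside the project tree is to install them into a project-local directory (for example with pip install --target vendor some_package) and put that directory on sys.path at startup. The sketch below assumes a directory named vendor next to the code; both that name and the helper function are hypothetical.

```python
import sys
from pathlib import Path


def add_vendor_dir(project_root, name="vendor"):
    """Put <project_root>/<name> at the front of sys.path if it exists."""
    vendor = Path(project_root) / name
    if vendor.is_dir() and str(vendor) not in sys.path:
        # Front of the list, so vendored modules win over system-wide ones.
        sys.path.insert(0, str(vendor))
    return vendor
```

Calling add_vendor_dir(Path(__file__).parent) at the top of the project's entry script lets each project carry its own versions without touching the system-wide site-packages.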
Perhaps you are looking for virtualenv. It will allow you to install packages into a separate virtual Python "root".
For completeness' sake, virtualenvwrapper makes everyday work with virtualenv a lot quicker and simpler once you are working on multiple projects and/or on multiple development platforms at the same time.
If you are looking for something akin to npm or yarn of the JavaScript world or composer of the PHP world, then you may want to look at pipenv (not to be confused with pip). Here's a guide to get you started.
Alternatively there is also Poetry, which some people say is even better, but I haven't used it yet.