Dependency control for a Python project - python

I am working with Python on a personal project, and whenever someone else needs to run it, they get confused about which version of Python and which library versions to use.
So, is there something like Maven (Java) for Python dependency management?

If you just want to declare required packages and versions, you can use requirements files. Virtualenv is used to create isolated environments for separate projects: say you use different dependency stacks or versions for different projects, then you can use a different virtual environment per project instead of changing your library versions each time. And of course you can combine requirements files with virtualenv or virtualenvwrapper; a minimal sketch follows below.
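For illustration, a minimal version of that workflow might look like this (the package name and version here are made up, not from the question):
$ virtualenv env                        # create an isolated environment
$ source env/bin/activate               # activate it
(env)$ pip install Django==1.6.5        # install a pinned dependency
(env)$ pip freeze > requirements.txt    # record exactly what is installed
(env)$ cat requirements.txt
Django==1.6.5
(env)$ pip install -r requirements.txt  # later, reproduce on another machine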
Version control usually refers to something like Git or SVN, which tracks versions of your code, not your libraries/dependencies, so I don't think that's what you mean.

Related

How to manage python project that depends on multiple versions of a shared library?

I am on macOS, using brew, pyenv, and virtualenv.
I have a Python project that depends on bokeh and gdal (both python packages were installed with pip inside a virtual environment). Both bokeh and gdal depend on a system version of libopenssl, but they depend on different versions (1.0 and 1.1).
I have had this project working at various points in the past with some combination of libraries (using pip for all Python packages and brew for system packages), but when I change Python versions and environments (using pyenv) to work on other projects and then come back to this project, it no longer works. The failure is usually along these lines, with a problem finding a shared library for openssl:
$ ./my_python_program.py
...
ImportError: dlopen(/Users/userBob/.pyenv/versions/3.7.0/lib/python3.7/lib-dynload/_ssl.cpython-37m-darwin.so, 2):
Library not loaded: /usr/local/opt/openssl@1.1/lib/libssl.1.1.dylib
Referenced from: /Users/userBob/.pyenv/versions/3.7.0/lib/python3.7/lib-dynload/_ssl.cpython-37m-darwin.so
Reason: image not found
I feel like I am eventually able to get things to work by trying random combinations of installing and uninstalling various package versions using pip and brew. But this is a fragile and inefficient way to maintain my projects.
In general what is the best way to handle this kind of situation? Do I need to simply record the exact brew and pip install/uninstall commands to get it working? Am I missing the concept of version "pinning"? Are there additional options with brew and pyenv that I am missing that might make this process easier?
I'm not sure this is the best way to do it, but I can tell you what I do usually.
First of all, I'm using Anaconda.
When I'm on a project, I switch to the relevant virtual environment.
Before switching away, when I commit/push my modifications, I also export my environment to a file (with conda env export).
I also track this file with git, this way, if I make any modification when working on the environment, it's stored in the .yml file.
This way I can reinstall all the dependencies the project needs if I format my machine or get a new one, etc., and the reference for every dependency I need is stored in the cloud alongside my sources. So if I start seeing weird behaviour, I just restore my environment from that reference file, from a time when everything was working.
I'm not switching between projects fast enough to justify automating this process, but I'm sure that's feasible if you want to.
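As a rough sketch of that routine (the environment name myproject is hypothetical), the export and restore steps look like this:
$ conda activate myproject
$ conda env export > environment.yml    # snapshot packages and versions
$ git add environment.yml               # track the snapshot with the sources
# later, on a fresh machine or after things break:
$ conda env create -f environment.yml   # rebuild the environment from the file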

Why is virtualenv necessary?

I am a beginner in Python.
I read virtualenv is preferred during Python project development.
I couldn't understand this point at all. Why is virtualenv preferred?
Virtualenv keeps your Python packages in a virtual environment localized to your project, instead of forcing you to install your packages system-wide.
There are a number of benefits to this:
The first and principal one is that you can have multiple virtualenvs, so you can have multiple sets of packages for different projects, even if those sets of packages would normally conflict with one another. For instance, if one project you are working on runs on Django 1.4 and another runs on Django 1.6, virtualenvs can keep those projects fully separate so you can satisfy both requirements at once (a sketch follows below).
The second is that it makes it easy to release your project along with its own dependent modules, and thus makes it easy to create your requirements.txt file.
The third is that it allows you to switch that project to another installed Python interpreter. Very useful (think old 2.x scripts), but sadly not available in the now built-in venv.
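To make that Django example concrete, a minimal sketch (the environment names are invented):
$ virtualenv proj14-env
$ source proj14-env/bin/activate
(proj14-env)$ pip install Django==1.4   # one project pinned to Django 1.4
(proj14-env)$ deactivate
$ virtualenv proj16-env
$ source proj16-env/bin/activate
(proj16-env)$ pip install Django==1.6   # the other pinned to Django 1.6
Both versions now coexist on the same machine without conflicting.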
Note that virtualenv is about "virtual environments" but is not the same as "virtualization" or "virtual machines" (this is confusing to some). For instance, VMWare is totally different from virtualenv.
A Virtual Environment, put simply, is an isolated working copy of Python which allows you to work on a specific project without worry of affecting other projects.
For example, you can work on a project which requires Django 1.3 while also maintaining a project which requires Django 1.0.
VirtualEnv helps you create a local environment (not system-wide) specific to the project you are working on.
Hence, as you start working on multiple projects, your projects will have different dependencies (e.g. different Django versions), so you will need a different virtual environment for each project. VirtualEnv does this for you.
As you are using VirtualEnv, try VirtualEnvWrapper: https://pypi.python.org/pypi/virtualenvwrapper
It provides some utilities to create, switch between, and remove virtualenvs easily, e.g.:
mkvirtualenv <name>: create a new virtualenv
workon <name>: switch to the specified virtualenv
and some others
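A typical session might look like this (the environment names blog and api are invented):
$ mkvirtualenv blog          # creates and activates the new virtualenv
(blog)$ deactivate
$ workon api                 # switches to another existing virtualenv
(api)$ rmvirtualenv blog     # removes an environment you no longer need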
Suppose you are working on multiple projects: one project requires a certain version of Python and another project requires some other version. If you are not working in virtual environments, both projects will use the same version installed locally on your machine, which can break one of them.
With virtual environments, you instead create an isolated environment per project, where its libraries and versions are stored separately. Each time, you can create a new virtual environment and work in it as a fresh one.
There is no real point to them in 2022; they are a mechanism to accomplish what C#, Java, Node, and many other ecosystems have done for years without virtual environments.
Projects need to be able to specify their package and interpreter dependencies in isolation from other projects. Virtual environments are a fine but legacy solution to that issue (vs. a config file that specifies the interpreter version plus a local __pypackages__ directory).
PEP 582 aims to address this lack of functionality in the Python ecosystem; see the sketch below.
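For reference, the project layout PEP 582 proposes looks roughly like this (the project name, script name, and Python version are illustrative); the interpreter would pick up __pypackages__ automatically, with no activation step:
myproject/
    myscript.py
    __pypackages__/
        3.10/
            lib/
                requests/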

Moving a Python environment over to a new OS install

I have reinstalled my operating system (moved from Windows XP to Windows 7).
I have reinstalled Python 2.7.
But I had a lot of packages installed in my old environment
(Django, sciPy, jinja2, matplotlib, numpy, networkx, to name just a few).
I still have my old Python installation lying around on a data partition, so I wondered: can I just copy-paste the old Python library folders onto the new installation?
Or do I need to reinstall every package?
Do the packages keep any information in the registry, system variables, or similar?
Does it depend on the package?
This is the point at which you must be able to lay out your project reproducibly, and there are dedicated tools for that.
Normally, Python packages do not do such weird things as dealing with the registry (unless they are packaged via an MSI installer). The problems start with packages that contain C extensions: moving to another OS version, or from a 32-bit to a 64-bit architecture, will require recompiling/rebuilding those. So it is much better to reinstall all packages on the new system, as described below.
Your needs may vary, but you definitely must choose a way of building your environment. If you don't have, and don't plan to have, a large variety of projects, you may consider the first approach below; the second approach is better suited to setting up development environments for different projects, or for different versions of the same project.
Global environment (your Python installation in your system along with installed packages).
Here you can consider using pip. In this case your project can have a requirements file listing all the packages it needs. Basically, a requirements file is a text file containing package names (as published on PyPI) and their versions.
Isolated environment. It can be achieved using special tools or specially organized path.
This is where pip can be gracefully combined with virtualenv. This approach is highly recommended by a lot of developers (I should note that Python 3.3, which will soon be released, ships a virtualenv equivalent, venv, as part of the standard library). The approach consists of creating a virtual shell with its own instance of the Python interpreter and its own installed packages.
Another popular tool for achieving an isolated environment is buildout. It lays out your project source and dependencies under one path, so you achieve the same effect that virtualenv creates. The great advantage of buildout is that it's built on the idea of pluggable recipes (pieces of code implementing common project deployment tasks), and there are hundreds of stable and reliable recipes on the Internet; a sketch follows below.
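A minimal buildout sketch, using the older bootstrap-based workflow, might look like this (the project name myproject is invented; zc.recipe.egg is a commonly used real recipe):
$ cat buildout.cfg
[buildout]
parts = app

[app]
recipe = zc.recipe.egg
eggs = myproject
$ python bootstrap.py       # fetch buildout itself
$ ./bin/buildout            # lay out the project and its dependencies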
Both virtualenv and buildout remove the headache of installing dependencies and solve the problem of keeping different versions of the same package on a single machine.
Choose your destiny...
The short answer to this question is "no", since packages can execute arbitrary code on installation and do whatever the heck they want wherever they want on your system.
Just reinstall all of them.

How to install python modules on a per project basis, rather than system wide?

I am trying to define a process for migrating django projects from my development server to my production server using git, and it's driving me crazy that distutils installs python modules system-wide. I've read the documentation but unless I'm missing something it seems to be mostly about how to change the installation directory. I need to be able to use different versions of the same module in different projects running on the same server, and deploy projects from git without having to download and install dependencies.
tl;dr: I need to know how to install python modules, using distutils, into my project's source tree for version control without compromising other projects using different versions of the same module.
I'm new to python, so I apologize in advance if this is common knowledge.
Besides the already mentioned virtualenv, which is a good option but has the potential drawback of requiring a third-party module, distutils itself has options to install modules into arbitrary locations. In particular, there is the home scheme, which allows you to "build and maintain a personal stash of Python modules". It's described in the Python documentation set here.
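A hedged sketch of the home scheme (the vendor path is invented): pure-Python modules are installed under <home>/lib/python, which you then put on PYTHONPATH:
$ python setup.py install --home=/path/to/myproject/vendor
$ export PYTHONPATH=/path/to/myproject/vendor/lib/python:$PYTHONPATH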
Perhaps you are looking for virtualenv. It will allow you to install packages into a separate virtual Python "root".
For completeness' sake: virtualenvwrapper makes everyday work with virtualenv a lot quicker and simpler once you are working on multiple projects and/or on multiple development platforms at the same time.
If you are looking for something akin to npm or yarn of the JavaScript world or composer of the PHP world, then you may want to look at pipenv (not to be confused with pip). Here's a guide to get you started.
Alternatively there is also Poetry, which some people say is even better, but I haven't used it yet.
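A minimal pipenv session, to show the npm-like feel (the package choice is arbitrary):
$ pip install pipenv
$ cd myproject
$ pipenv install requests    # records it in Pipfile, pins versions in Pipfile.lock
$ pipenv shell               # spawns a shell inside the project's virtualenv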

Tool (or combination of tools) for reproducible environments in Python

I used to be a java developer and we used tools like ant or maven to manage our development/testing/UAT environments in a standardized way. This allowed us to handle library dependencies, setting OS variables, compiling, deploying, running unit tests, and all the required tasks. Also, the scripts generated guaranteed that all the environments were almost equally configured, and all the task were performed in the same way by all the members of the team.
I'm starting to work in Python now and I'd like your advice on which tools I should use to accomplish the same as described for Java.
virtualenv to create a contained virtual environment (prevent different versions of Python or Python packages from stomping on each other). There is increasing buzz from people moving to this tool. Its author is the same as that of the older working-env.py mentioned by Aaron.
pip to install packages inside a virtualenv. The traditional choice is easy_install, as answered by S. Lott, but pip works better with virtualenv. easy_install still has features not found in pip, though.
scons as a build tool, although you won't need this if you stay purely Python.
Fabric, paste, or paver for deployment (see the sketch after this list).
buildbot for continuous integration.
Bazaar, mercurial, or git for version control.
Nose as an extension for unit testing.
PyFit for FIT testing.
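As an example of the deployment piece, a minimal Fabric (1.x API) task might look like this (the host and paths are invented):
$ cat fabfile.py
from fabric.api import cd, run

def deploy():
    # pull the latest code and sync dependencies on the remote host
    with cd("/srv/myproject"):
        run("git pull")
        run("pip install -r requirements.txt")
$ fab -H deploy@myserver deploy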
I also work with both Java and Python.
For Python development, the Maven equivalent is setuptools (http://peak.telecommunity.com/DevCenter/setuptools). For web application development I use this in combination with paster (http://pythonpaste.org/) for the deployment process.
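For instance, paster can generate a project skeleton from a template (basic_package ships with PasteScript; the project name is invented):
$ paster create -t basic_package MyProject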
Other than easy_install?
For our Linux servers, we use easy_install and yum.
For our Windows development laptops, we use easy_install and a few MSI's for some projects.
Most of the Python libraries we use are source-only, so we can use the same distribution on all boxes. If we had a network-shared device, we'd put them all there. Sadly, our infrastructure is somewhat scattered, so we have to either move .tar files around or redo the installs to rebuild the environments.
In a few cases (e.g., PIL), we have to recompile and check the version numbers.
You will want easy_install (from setuptools) to get the eggs (an egg is roughly what Maven calls an artifact).
For setting up your environment, have a look at working-env.py
Python is not compiled, but you can put all the files for a project in an egg. This is done with setuptools.
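A minimal setuptools sketch for building an egg (the names are invented):
$ cat setup.py
from setuptools import setup, find_packages

setup(name="myproject", version="0.1", packages=find_packages())
$ python setup.py bdist_egg   # writes the egg into dist/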
For CI, check this answer.
We would be remiss not to also mention Paver, which was created by Kevin Dangoor of TurboGears fame. The project is still in alpha, but it appears very promising. A snippet from the project page:
Paver is a Python-based build/distribution/deployment scripting tool along the lines of Make or Rake. What makes Paver unique is its integration with commonly used Python libraries. Common tasks that were easy before remain easy. More importantly, dealing with your application's specific needs and requirements is now much easier.
I do exactly this with a combination of setuptools and Hudson. I know Hudson is a java app, but it can run Python stuff just fine.
You might want to check out our Devenv. It allows you to standardize the build environments for development, QA, and UAT. It's free as in "free beer".
HTH
