How do I access different python module versions? - python

so I am working on a shared computer. It is a workhorse for computations across the department. The problem we have run into is controlling versions of imported modules. Take for example Biopython. Some people have requirements of an older version of biopython - 1.58. And yet others have requirements for the latest biopython - 1.61. How would I have both versions of the module installed side by side, and how does one specifically access a particular version. I ask because sometimes these apis change and break old scripts for other people (or they expect certain functionality that is no longer there).
I understand that one could locally (i.e. per user) install the module and specifically direct python to that module. Is there another way to handle this? Or would everyone have to create an export PYTHONPATH before using?

I'm not sure if it's possible to change the active installed versions of a given module. Given my understanding of how imports and site-packages work, I'm leaning towards no.
Have you considered using virtualenv though ?
With virtualenv, you could create multiple shared environments -- one for biopython 1.58 , another for 1.61 , another for whatever other special situations you need. They don't need to be locked down to a particular user, so while it would take more space than what you desired, it could take less space than everyone having their own python environment.

It sounds like you're doing scientific computing. You should use Anaconda, and make particular note of the conda tool, documented here.
Conda uses hard links whenever possible to avoid copies of the same files. It also manages non-python binary modules in a much better way than virtualenv (virtualenv chokes on VTK, for example).

Related

How to install multiple versions of Python in Windows?

up until recently I have only worked with one version of Python and used virtual environments every now and then. Now, I am working with some libraries that require older version of Python. So, I am very confused. Could anyone please clear up some of my confusion?
How do I install multiple Python versions?
I initially had Python version 3.8.x but upgraded to 3.10.x last month. There is currently only that one version on my PC now.
I wanted to install one of the Python 3.8.x version and went to https://www.python.org/downloads/. It lists a lot of versions and subversions like 3.6, 3.7, 3.8 etc. etc. with 3.8.1, 3.8.2 till 3.8.13. Which one should I pick?
I actually went ahead with 3.8.12 and downloaded the Tarball on the page: https://www.python.org/downloads/release/python-3812/
I extracted the tarball (23.6MB) and it created a folder with a setup.py file.
Is Python 3.8.12 now installed? Clicking on the setup.py file simply flashes the terminal for a second.
I have a few more questions. Hopefully, they won't get me downvoted. I am just confused and couldn't find proper answers for them.
Why does Python have such heavy dependency on the exact versions of libraries and packages etc?
For example, this question
How can I run Mozilla TTS/Coqui TTS training with CUDA on a Windows system?. This seems very beginner unfriendly. Slightly mismatched package
version can prevent any program from running.
Do virtual environments copy all the files from the main Python installation to create a virtual environment and then install specific packages inside it? Isn't that a lot of wasted resources in duplication because almost all projects require there own virtual environment.
Your questions depend a bit on "all the other software". For example, as #leiyang indicated, the answer will be different if you use conda vs just pip on vanilla CPython (the standard Windows Python).
I'm also going to assume you're actually on Windows, because on Linux I would recommend looking at pyenv. There is a pyenv-win, which may be worth looking into, but I don't use it myself because it doesn't play as nice if you also want (mini)conda environments.
1. (a) How do I install multiple Python versions?
Simply download the various installers and install them in sensible locations. E.g. "C:\Program Files\Python39" for Python 3.9, or some other location where you're allowed to install software.
Don't have Python add itself to the PATH though, since that'll only find the last version to do so and can really confuse things.
Also, you probably want to use virtual environments consistently, as this ties a specific project very clearly to a specific Python version, avoiding future confusion or problems.
1. (b) "3.8.1, 3.8.2 till 3.8.13" which should I pick?
Always pick the latest 3.x.y, so if there's a 3.8.13 for Windows, but no 3.8.14, pick that. Check if the version is actually available for your operating system, sometimes there are later versions for one OS, but not for another.
The reason is that between a verion like 3.6 and 3.7, there may be major changes that change how Python works. Generally, there will be backwards compatibility, but some changes may break how some of your packages work. However, when going up a minor version, there won't be any such breaking changes, just fixes and additions that don't get in the way of what was already there. A change from 2.x to 3.x only happens if the language itself goes through a major change, and rarely happens (and perhaps never will again, depending on who you ask).
An exception to the "no minor version change problems" is of course if you run some script that very specifically relies on something that was broken in 3.8.6, but no fixed in 3.8.7+ (as an example). However, that's very bad coding, to rely on what's broken and not fixing it later, so only go along with that if you have no other recourse. Otherwise, just the latest minor version of any version you're after.
Also: make sure you pick the correct architecture. If there's no specific requirement, just pick 64-bit, but if your script needs to interact with other installed software at the binary level, it may require you to install 32-bit Python (and 32-bit packages as well). If you have no such requirement, 64-bit allows more memory access and has some other benefits on modern computers.
2. Why does Python have such heavy dependency on the exact versions of libraries and packages etc?
It's not just Python, this is true for many languages. It's just more visible to the end user for Python, because you run it as an interpreted language. It's only compiled at the very last moment, on the computer it's running on.
This has the advantage that the code can run on a variety of computers and operating systems, but the downside that you need the right environment where you're running it. For people who code in languages like C++, they have to deal with this problem when they're coding, but target a much smaller number of environments (although there's still runtimes to contend with, and DirectX versions, etc.). Other languages just roll everything up into the program that's being distributed, while a Python script by itself can be tiny. It's a design choice.
There are a lot of tools to help you automate the process though and well-written packages will make the process quite painless. If you feel Python is very shakey when it comes to this, that's probable to blame on the packages or scripts you're using, not really the language. The only fault of the language is that it makes it very easy for developers to make such a mess for you and make your life hard with getting specific requirements.
Look for alternatives, but if you can't avoid using a specific script or package, once you figure out how to install or use it, document it or better yet, automate it so you don't have to think about it again.
3. Do virtual environments copy all the files from the main Python installation to create a virtual environment and then install specific packages inside it? Isn't that a lot of wasted resources in duplication because almost all projects require there own virtual environment.
Not all of them, but quite a few of them. However, you still need the original installation to be present on the system. Also, you can't pick up a virtual environment and put it somewhere else, not even on the same PC without some careful changes (often better to just recreate it).
You're right that this is a bit wasteful - but this is a difficult choice.
Either Python would be even more complicated, having to manage many different version of packages in a single environment (Java developers will be able to tell you war stories about this, with their dependency management - or wax lyrically about it, once they get it themselves).
Or you get what we have: a bit wasteful, but in the end diskspace is a lot cheaper than your time. And unlike your time, diskspace is almost infinitely expandable.
You can share virtual environments between very similar projects though, but especially if you get your code from someone else, it's best to not have to worry and just give up a few dozen MB for the project. On the upside: you can just delete a virtual environment directory and that pretty much gets rid of the whole things. Some applications like PyCharm may remember that it was once there, but other than that, that's the virtual environment gone.
Just install them. You can have any number of Python installations side by side. Unless you need to have 2 different minor versions, for example 3.10.1 and 3.10.2, there is no need to do anything special. (And if you do need that then you don't need any advice.) Just set up separate shortcuts for each one.
Remember you have to install any 3rd-party libraries you need in each version. To do this, navigate to the Scripts folder in the version you want to do the install in, and run pip from that folder.
Python's 3rd-party libraries are open-source and come from projects that have release schedules that don't necessarily coincide with Python's. So they will not always have a version available that coincides with the latest version of Python.
Often you can get around this by downloading unofficial binaries from Christoph Gohlke's site. Google Python Gohlke.
Install Python using the windows executable installers from python.org. If the version is 3.x.y, use the highest y that has a windows executable installer. Unless your machine is very old, use the 64-bit versions. Do not have them add python to your PATH environment variable, but in only one of the installs have it install the python launcher py. That will help you in using multiple versions. See e.g. here.
Python itself does not. But some modules/libraries do. Especially those that are not purely written in Python but contain extensions written in C(++). The reason for this is that compiling programs on ms-windows can be a real PITA. Unlike UNIX-like operating systems with Linux, ms-windows doesn't come with development tools as standard. Nor does it have decent package management. Since the official Python installers are built with microsoft tools, you need to use those with C(++) extensions as well. Before 2015, you even had to use exactly the same version of the compiler that Python was built with. That is still a good idea, but no longer strictly necessary. So it is a signigicant amount of work for developers to release binary packages for each supported Python version. It is much easier for them to say "requires Python 3.x".

Is it possible to have users not pip install modules and instead include the modules used in a different folder and then import that?

I want to know if I can create a python script with a folder in the same directory with all the assets of a python module, so when someone wants to use it, they would not have to pip install module, because it would import from the directory.
Yes, you can, but it doesn't mean that you should.
First, ask yourself who is suposed to use that code.
If you plan to give it to consumers, it would be a good idea to use a tool like py2exe and create executable file which would include all modules and not allow for code to be changed.
If you plan to share it with another developer, you might want to look into virtual environments and requirements.txt file.
There are multiple reasons why sharing modules is bad idea:
It is harder to update modules later, at least without upgrading whole project.
It uses more space on version control, which can create issues on huge projects with hundreds of modules and branches
It might be illegal as some licenses specifically forbid including their code in your source code.
The pip install of some module might do different things depending on operating system version or installed packages. The modules on your machine might be suboptimal on someone else's machine, and in some instances might not even work.
And probably more that I can't think of right now.
The only situation where I saw this being unavoidable was when the module didn't support python implementation the application was running on. The module was changed, and its source was put under lib folder with the rest of the libraries.
I think you can add the directory with python modules into PYTHONPATH. Then people want to use those modules just need has this envvar set.
https://docs.python.org/3/using/cmdline.html#envvar-PYTHONPATH

Python: How do I install packages within my package or repository?

My program requires specific versions of several python packages. I don't want to have to require the user to specifically install the specific version, so I feel that the best solution is to simply install the package within the source repository, and to distribute it along with my package.
What is the simplest way to do this?
(Please be detailed - I'm familiar with pip and easy_install, but they don't seem to do this, at least not by default).
Go for virtualenv. Life will be much easier. MUCH easier. Basically, it allows you to create specific python environments as needed.
There are indeed two ways to get this done.
I usually use buildout (see a post by Jacob from Django: http://jacobian.org/writing/django-apps-with-buildout/) - and have everything from django up installed locally at the project's environment, with pydev and django support. It's very easy since I have projects that use latest versions of open source software and others that use specific versions of the same packages.
Another alternative is, as Charlie says, the virtualenv,which is designed to do just that. Many people recommend it, I've never used it myself as I'm happy with buildout.

Organizing Python projects with shared packages

What is the best way to organize and develop a project composed of many small scripts sharing one (or more) larger Python libraries?
We have a bunch of programs in our repository that all use the same libraries stored in the same repository. So in other words, a layout like
trunk
libs
python
utilities
projects
projA
projB
When the official runs of our programs are done, we want to record what version of the code was used. For our C++ executables, things are simple because as long as the working copy is clean at compile time, everything is fine. (And since we get the version number programmatically, it must be a working copy, not an export.) For Python scripts, things are more complicated.
The problem is that, often one project (e.g. projA) will be running, and projB will need to be updated. This could cause the working copy revision to appear mixed to projA during runtime. (The code takes hours to run, and can be used as inputs for processes that take days to run, hence the strong traceability goal.)
My current workaround is, if necessary, check out another copy of the trunk to a different location, and run off there. But then I need to remember to change my PYTHONPATH to point to the second version of lib/python, not the one in the first tree.
There's not likely to be a perfect answer. But there must be a better way.
Should we be using subversion keywords to store the revision number, which would allow the data user to export files? Should we be using virtualenv? Should we be going more towards a packaging and installation mechanism? Setuptools is the standard, but I've read mixed things about it, and it seems designed for non-developer end users (of which we have none).
The much better solution involves not storing all your projects and their shared dependencies in the same repository.
Use one repository for each project, and externals for the shared libraries.
Make use of tags in the shared library repositories, so consumer projects may use exactly the version they need in their external.
Edit: (just copying this from my comment) use virtualenv if you need to provide isolated runtime environments for the different apps on the same server. Then each environment can contain a unique version of the library it needs.
If I'm understanding your question properly, then you definitely want virtualenv. Add in some virtualenvwrapper goodness to make it that much better.

How might I handle development versions of Python packages without relying on SCM?

One issue that comes up during Pinax development is dealing with development versions of external apps. I am trying to come up with a solution that doesn't involve bringing in the version control systems. Reason being I'd rather not have to install all the possible version control systems on my system (or force that upon contributors) and deal the problems that might arise during environment creation.
Take this situation (knowing how Pinax works will be beneficial to understanding):
We are beginning development on a new version of Pinax. The previous version has a pip requirements file with explicit versions set. A bug comes in for an external app that we'd like to get resolved. To get that bug fix in Pinax the current process is to simply make a minor release of the app assuming we have control of the app. Apps we don't have control we just deal with the release cycle of the app author or force them to make releases ;-) I am not too fond of constantly making minor releases for bug fixes as in some cases I'd like to be working on new features for apps as well. Of course branching the older version is what we do and then do backports as we need.
I'd love to hear some thoughts on this.
Could you handle this using the "==dev" version specifier? If the distribution's page on PyPI includes a link to a .tgz of the current dev version (such as both github and bitbucket provide automatically) and you append "#egg=project_name-dev" to the link, both easy_install and pip will use that .tgz if ==dev is requested.
This doesn't allow you to pin to anything more specific than "most recent tip/head", but in a lot of cases that might be good enough?
I meant to mention that the solution I had considered before asking was to put up a Pinax PyPI and make development releases on it. We could put up an instance of chishop. We are already using pip's --find-links to point at pypi.pinaxproject.com for packages we've had to release ourselves.
Most open source distributors (the Debians, Ubuntu's, MacPorts, et al) use some sort of patch management mechanism. So something like: import the base source code for each package as released, as a tar ball, or as a SCM snapshot. Then manage any necessary modifications on top of it using a patch manager, like quilt or Mercurial's Queues. Then bundle up each external package with any applied patches in a consistent format. Or have URLs to the base packages and URLs to the individual patches and have them applied during installation. That's essentially what MacPorts does.
EDIT: To take it one step further, you could then version control the set of patches across all of the external packages and make that available as a unit. That's quite easy to do with Mercurial Queues. Then you've simplified the problem to just publishing one set of patches using one SCM system, with the patches applied locally as above or available for developers to pull and apply to their copies of the base release packages.
EDIT: I am not sure I am reading your question correctly so the following may not answer your question directly.
Something I've considered, but haven't tested, is using pip's freeze bundle feature. Perhaps using that and distributing the bundle with Pinax would work? My only concern would be how different OS's are handled. For example, I've never used pip on Windows, so I wouldn't know how a bundle would interact there.
The full idea I hope to try is creating a paver script that controls management of the bundles, making it easy for users to upgrade to newer versions. This would require a bit of scaffolding though.
One other option may be you keeping a mirror of the apps you don't control, in a consistent vcs, and then distributing your mirrored versions. This would take away the need for "everyone" to have many different programs installed.
Other than that, it seems the only real solution is what you guys are doing, there isn't a hassle-free way that I've been able to find.

Categories

Resources