Hi, I want to clone a Python virtualenv to a server that's not connected to the internet. I searched different forums but didn't find a clear answer. Here are the methods I found and the problems I have with each:
Method 1 (safest but most time-consuming):
Save all the libraries via pip freeze > requirements.txt, then download each one manually and store them in a directory. Copy this directory to the offline server, create a new virtualenv there, and install all the requirements from the downloaded files.
To avoid downloading each one by hand I used pip download -r requirements.txt -d wheelfiles on the source machine, but I couldn't find a way to install all the packages in one command. I could use a script with a loop to go through each one, though. The real problem is when even the source server has no internet connection to download these packages with.
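For example, a rough sketch of such a loop (assuming the downloaded files sit in wheelfiles as above; pointing --find-links at the same directory lets pip resolve dependencies locally):

# on the offline server, with the target virtualenv activated
for f in wheelfiles/*; do
    pip install --no-index --find-links=wheelfiles "$f"
done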
Method 2 (less recommended, but I didn't understand why):
Simply copy the virtualenv directory with all its files to the offline machine. Both machines apparently need the same Python version, and you have to manually fix some hardcoded paths, for example modifying all files containing sourceserver\user1\dev\virtualenv to use targetserver\user4\dev\virtualenv instead. Usually the files to modify start with activate* or pip*.
This method is said to be not recommended, but I don't understand why. Also, if it works without problems, can I copy the virtualenv folder from a Linux server to a Windows server and vice versa?
You can install all the requirements using
pip install -r requirements.txt
which means the options are:
pip freeze > requirements.txt
pip download -r requirements.txt -d wheelfiles
pip install -r requirements.txt --no-index --find-links path/to/wheels
or
Ensure target machine is the same architecture, OS, and Python version
Copy virtual environment
Modify various hardcoded paths in files
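For that last step, a hedged sketch on Linux (the two paths are placeholders; the activate scripts and pip wrappers are the usual offenders):

# rewrite the old environment path wherever it is hardcoded
sed -i 's|/home/user1/dev/virtualenv|/home/user4/dev/virtualenv|g' \
    virtualenv/bin/activate* virtualenv/bin/pip*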
It should be clear why the former is preferred, especially as it is completely independent of Python version, machine architecture, OS, etc.
Additionally, the former means that the requirements.txt can be committed to source control in order to recreate the environment on demand on any machine, including by other people and when the original machine or copy of the virtual environment is not available. In terms of size, the requirements.txt file is also significantly smaller than an entire virtual environment.
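For illustration, the requirements.txt that pip freeze writes is just a list of pinned versions, one per line (names and versions here are hypothetical):

numpy==1.24.2
requests==2.28.2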
Related
I am working on a Python project on both my work computer and my home computer. GitHub has made the experience pretty seamless.
But I'm having a problem with the pyvenv.cfg file in my venv folder. Because my Python SDK has a different file path on my work computer than on my home computer, I have to manually go into pyvenv.cfg and change the home = C:\Users\myName\... file path each time I pull the updated version of my project from my other computer, or else the interpreter doesn't work.
Does anyone know a solution to this problem?
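For context, a typical pyvenv.cfg written by python -m venv looks something like this (paths and version are hypothetical); the home key is the one that breaks when the base interpreter lives somewhere else:

home = C:\Users\myName\AppData\Local\Programs\Python\Python39
include-system-site-packages = false
version = 3.9.7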
As confirmed in the comments, you've added the virtual environment folder to your project and included it in the files that you put on GitHub.
That's generally a bad idea, since it defeats part of the purpose of having a virtual environment in the first place. Your virtual environment will contain packages specific to the platform and configuration of the machine it is on - you could be developing on Linux on one machine and Windows on another and you'd have bigger problems than just one line in a configuration file.
What you should do:
create a virtual environment in a folder outside your project / source folder.
assuming you're using pip, you can run pip freeze > requirements.txt to create a requirements.txt file which you can then use on the other system with pip install -r requirements.txt to recreate the exact same virtual environment.
All you have to do is keep that requirements.txt up to date and update the virtual environments on either computer whenever it changes, instead of pulling it through GitHub.
In more detail, a simple example (for Windows, very similar for Linux):
create a project folder, e.g. C:\projects\my_project
create a virtual environment for the project, e.g. python -m venv C:\projects\venv\my_project
activate the environment, i.e. C:\projects\venv\my_project\Scripts\activate.bat
install packages, e.g. pip install numpy
save what packages were installed to a file in C:\projects\my_project, called requirements.txt with pip freeze > requirements.txt
store the project in a Git repo, including that file
on another development machine, clone or pull the project, e.g. git clone https://github.com/my_project D:\workwork\projects\my_project
on that machine, create a new virtual environment, e.g. python -m venv D:\workwork\venv\my_project
activate the environment, i.e. D:\workwork\venv\my_project\Scripts\activate.bat
install the packages that are required with pip install -r D:\workwork\projects\my_project\requirements.txt
Since you say you're using PyCharm, it's a lot easier still: just make sure that the environment created by PyCharm sits outside your project folder. I like to keep all my virtual environments in one folder, with venv names that match the project names.
You can still create a requirements.txt in your project and when you pull the project to another PC with PyCharm, just do the same: create a venv outside the project folder. PyCharm will recognise that it's missing packages from the requirements file and offer to install them for you.
You shouldn't keep the full virtualenv in source control, since more often than not it's much larger than your code, and there may be platform-and-interpreter-version specific bits and bobs in there.
Instead, you should save the packages required -- the tried and tested way is a requirements.txt file (but there are plenty of alternatives such as Pipenv, Poetry, Dephell, ...) -- and recreate the virtualenv on each machine you need to run the project on.
To save a requirements file for a pre-existing project, you can
pip freeze > requirements.txt
when the virtualenv is active.
Then, you can use
pip install -r requirements.txt
to install exactly those packages.
In general, I like to use pip-tools, so I only manage a requirements.in file with package requirement names, and the pip-compile utility then locks those requirements with exact versions into requirements.txt.
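A minimal sketch of that workflow (the package name is just an example):

# requirements.in lists only the top-level, unpinned requirements,
# e.g. a single line reading: requests
pip-compile requirements.in        # locks exact versions into requirements.txt
pip install -r requirements.txt    # installs the locked set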
Normally when I develop a Python package for personal use, I use python3 setup.py develop, and then perform pip3 install -e <path_to_package> within another virtualenv, allowing me to hack around with both at the same time. When I do gpip3 freeze I see the path to the package on my local machine:
-e /Users/myName/Documents/testpackage
When I store that package on GitHub and clone it back onto a local machine, I expect to be able to use setup.py develop the same way and keep developing the package on my local machine, regardless of whether or when I push back to GitHub. However, when I do gpip3 freeze, I see:
-e git+git@github.com:github_username/repo_name@-----latest_commit's_sha_code-----#egg=repo_name&subdirectory=xx/xx/testpackage
I would like my system to keep track of the local version instead of git's remote.
Note: I know how to commit and push local changes to GitHub and install the egg in local environments. My goal is to quickly test ideas with a development version of the package without continuously integrating.
Note 2: The GitHub address given in gpip3 freeze fails when I try it in an environment (FileNotFoundError: [Errno 2] No such file or directory: '/Users/myName/Documents/testenvironment/src/testpackage/setup.py')
But if I wanted pip3 to install the latest GitHub commit, I wouldn't be bothering with setup.py develop anyway.
Is there a way to signal to setup.py that I want it to ignore the remote in the cloned repo and pay attention only to the local path? Or is always referencing a remote when present the expected behavior of setup tools?
Update:
The wording of the gpip3 freeze output after python3 setup.py develop when a remote isn't present (below) leads me to think that tracking a remote whenever possible may be the intended behavior:
# Editable Git install with no remote (testpackage ==0.0.1)
-e /Users/myName/Documents/testpackage
I have been working around this with git remote remove origin when I want my local changes to be reflected in local environments without pushing a new commit, though this is not ideal for me.
My question was rooted in a misunderstanding of how to implement python3 setup.py develop.
My original method was :
1) python3 setup.py develop from within the package directory itself, which would install/link the egg globally
2) gpip3 freeze to get (I thought) the link to the egg (seeing all the extra git remote info here was confusing to me)
3) cd to another virtual environment, source bin/activate, then call pip3 install -e <link_copied_from_global_pip_freeze>
In fact there is no need to call python3 setup.py develop from within the package under development, or to use gpip3 freeze to get the egg link.
I can go directly to the virtual env and activate it, then use pip3 install -e <system_path_to_package_directory_containing_setup.py>. This will create an egg link in the package directory if it doesn't already exist. Edits within the package are reflected in the virtual environment as expected, and I can use Git version control freely within the package according to my needs without interference.
I assume there may be times when calling python3 setup.py develop directly makes sense (setup.py develop --user also exists), but by not doing so I avoid littering my global environment with extra packages.
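Put together, the corrected workflow is a short sketch like this (reusing the example paths from above):

# activate the target virtual environment ...
cd /Users/myName/Documents/testenvironment
source bin/activate
# ... and link the package under development into it
pip3 install -e /Users/myName/Documents/testpackage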
Related info from a 2014 question in the Python Distutils thread:
Questioner writes:
For years, I've been recommending:
$ python setup.py develop
[...]
Having said that, I also notice that:
$ pip install -e .
does the same thing.
Should I be recommending one over the other?
Noah answers:
You should recommend using pip for it, mostly because as you said that will work even with packages that don't use setuptools :-) It also is required when doing a develop install with extras, though that requires a slightly more verbose syntax due to a bug in pip.
When I need to work on one of my pet projects, I simply clone the repository as usual (git clone <url>), edit what I need, run the tests, update the setup.py version, commit, push, build the packages and upload them to PyPI.
What is the advantage of using pip install -e? Should I be using it? How would it improve my workflow?
I find pip install -e extremely useful when simultaneously developing a product and a dependency, which I do a lot.
Example:
You build websites using Django for numerous clients, and have also developed an in-house Django app called locations which you reuse across many projects, so you make it available on PyPI and version it.
When you work on a project, you install the requirements as usual, which installs locations into site packages.
But you soon discover that locations could do with some improvements.
So you grab a copy of the locations repository and start making changes. Of course, you need to test these changes in the context of a Django project.
Simply go into your project and type:
pip install -e /path/to/locations/repo
This will overwrite the directory in site-packages with a symbolic link to the locations repository, meaning any changes to code in there will automatically be reflected - just reload the page (so long as you're using the development server).
The symbolic link looks at the current files in the directory, meaning you can switch branches to see changes or try different things etc...
The alternative would be to create a new version, publish it to PyPI, and hope you've not forgotten anything. If you have many such in-house apps, this quickly becomes untenable.
For those who don't have time:
If you install your project with an -e flag (e.g. pip install -e mynumpy) and use it in your code (e.g. from mynumpy import some_function), when you make any change to some_function, you should be able to use the updated function without reinstalling it.
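To make that concrete, a quick sketch using the hypothetical names from above:

pip install -e ./mynumpy                                   # editable install
python -c "from mynumpy import some_function; some_function()"
# edit some_function in the mynumpy source, then rerun the same
# command: the change is picked up without reinstalling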
pip install -e is how setuptools dependencies are handled via pip.
What you typically do is to install the dependencies:
git clone URL
cd project
run pip install -e . or pip install -e .[dev]*
And now all the dependencies should be installed.
*[dev] is the name of the requirements group from setup.py
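For reference, a minimal sketch of how such a group might be declared in setup.py (all names and versions here are hypothetical):

from setuptools import setup, find_packages

setup(
    name="project",
    version="0.1.0",
    packages=find_packages(),
    install_requires=["requests"],        # always installed
    extras_require={"dev": ["pytest"]},   # pulled in by pip install -e .[dev]
)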
Other than setuptools (eggs), there is also the wheel system of Python installation. Both systems are based on the promise that no building or compilation is performed at install time.
When would the -e, or --editable option be useful with pip install?
For some projects the last line in requirements.txt is -e .. What does it do exactly?
As the man page puts it:
-e, --editable <path/url>
Install a project in editable mode (i.e. setuptools "develop mode") from a local project path or a VCS url.
So you would use this when trying to install a package locally, most often in the case when you are developing it on your system. It will just link the package to the original location, basically meaning any changes to the original package would reflect directly in your environment.
An example run can be:
pip install -e .
or
pip install -e ~/ultimate-utils/ultimate-utils-proj-src/
Note that the second form is the full path to the directory where the setup.py would be.
Concrete example of using --editable in development
If you play with this test package as in:
cd ~
git clone https://github.com/cirosantilli/vcdvcd
cd vcdvcd
git checkout 5dd4205c37ed0244ecaf443d8106fadb2f9cfbb8
python -m pip install --editable . --user
it outputs:
Obtaining file:///home/ciro/bak/git/vcdvcd
Installing collected packages: vcdvcd
Attempting uninstall: vcdvcd
Found existing installation: vcdvcd 1.0.6
Can't uninstall 'vcdvcd'. No files were found to uninstall.
Running setup.py develop for vcdvcd
Successfully installed vcdvcd-1.0.6
The Can't uninstall 'vcdvcd' is normal: it tried to uninstall any existing vcdvcd in order to replace it with the "symlink-like mechanism" produced in the following steps, but failed because there was no previous installation.
Then it generates a file:
~/.local/lib/python3.8/site-packages/vcdvcd.egg-link
which contains:
/home/ciro/vcdvcd
.
and acts as a "symlink" to the Python interpreter.
So now, if I make any changes to the git source code under /home/ciro/vcdvcd, it reflects automatically on importers who can from any directory do:
python -c 'import vcdvcd'
Note however, at my pip version at least, that binary files installed with --editable, such as the vcdcat script provided by that package via scripts= in setup.py, do not get symlinked, just copied to:
~/.local/bin/vcdcat
just like for regular installs, and therefore updates to the git repository won't directly affect them.
By comparison, a regular non --editable install from the git source:
python -m pip uninstall vcdvcd
python -m pip install --user .
produces a copy of the installed files under:
~/.local/lib/python3.8/site-packages/vcdvcd
Uninstall of an editable package as done above requires a new enough pip as mentioned at: How to uninstall editable packages with pip (installed with -e)
Tested in Python 3.8, pip 20.0.2, Ubuntu 20.04.
Recommendation: develop directly in-tree whenever possible
The editable setup is useful when you are testing your patch to a package through another project.
If however you can fully test your change in-tree, just do that instead of generating an editable install which is more complex.
E.g., the vcdvcd package above is set up in a way that you can just cd into the source and run ./vcdcat without pip installing the package itself (in general, you might need to install dependencies from requirements.txt, though), and the import vcdvcd that that executable does (or possibly your own custom test) just finds the package correctly in the same directory it lives in.
From Working in "development" mode:
Although not required, it’s common to locally install your project in
“editable” or “develop” mode while you’re working on it. This allows
your project to be both installed and editable in project form.
Assuming you’re in the root of your project directory, then run:
pip install -e .
Although somewhat cryptic, -e is short for
--editable, and . refers to the current working directory, so together, it means to install the current directory (i.e. your
project) in editable mode.
Some additional insights into the internals of setuptools and distutils from “Development Mode”:
Under normal circumstances, the distutils assume that you are going to
build a distribution of your project, not use it in its “raw” or
“unbuilt” form. If you were to use the distutils that way, you would
have to rebuild and reinstall your project every time you made a
change to it during development.
Another problem that sometimes comes up with the distutils is that you
may need to do development on two related projects at the same time.
You may need to put both projects’ packages in the same directory to
run them, but need to keep them separate for revision control
purposes. How can you do this?
Setuptools allows you to deploy your projects for use in a common
directory or staging area, but without copying any files. Thus, you
can edit each project’s code in its checkout directory, and only need
to run build commands when you change a project’s C extensions or
similarly compiled files. You can even deploy a project into another
project’s checkout directory, if that’s your preferred way of working
(as opposed to using a common independent staging area or the
site-packages directory).
To do this, use the setup.py develop command. It works very similarly
to setup.py install, except that it doesn’t actually install anything.
Instead, it creates a special .egg-link file in the deployment
directory, that links to your project’s source code. And, if your
deployment directory is Python’s site-packages directory, it will also
update the easy-install.pth file to include your project’s source
code, thereby making it available on sys.path for all programs using
that Python installation.
It is important to note that pip uninstall cannot uninstall a module that has been installed with pip install -e. So if you go down this route, be prepared for things to get very messy if you ever need to uninstall. A partial solution is to (1) reinstall, keeping a record of the files created, as in sudo python3 setup.py install --record installed_files.txt, and then (2) manually delete all the files listed, as in e.g. sudo rm -r /usr/local/lib/python3.7/dist-packages/tdc7201-0.1a2-py3.7.egg/ (for release 0.1a2 of module tdc7201). Even this does not clean everything up 100%, however: afterwards, importing the (removed!) local library may still succeed, and attempting to install the same version from a remote server may fail to do anything (because pip thinks your (deleted!) local version is already up to date).
As suggested in previous answers, no actual symlinks get created.
How does the -e option work? It just updates the file PYTHONDIR/site-packages/easy-install.pth with the project path specified in the pip install -e command.
So each time Python searches for a package it will check this directory as well; any changes to the files in this directory are instantly reflected.
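For illustration, after pip install -e /home/user/dev/myproject (a hypothetical path), easy-install.pth simply gains a line like:

/home/user/dev/myproject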
I'm new to virtualenv, but I'm writing a Django app and eventually I will have to deploy it somehow.
So let's assume I have my app working in my local virtualenv, where I installed all the required libraries. What I want to do now is run some kind of script that will take my virtualenv, check what's installed inside, and produce a script that will install all these libraries in a fresh virtualenv on another machine. How can this be done? Please help.
You don't copy-paste your virtualenv. You export the list of all installed packages like so:
pip freeze > requirements.txt
Then push the requirements.txt file to wherever you want to deploy the code, and do what you did on the dev machine:
$ virtualenv <env_name>
$ source <env_name>/bin/activate
(<env_name>)$ pip install -r path/to/requirements.txt
And there you have all your packages installed with the exact version.
You can also look into Fabric to automate this task, with a function like this:
from fabric.api import cd, env, prefix, run  # Fabric 1.x

def pip_install():
    # on the remote host: cd into the project, activate its venv,
    # and install the pinned requirements
    with cd(env.path):
        with prefix('source venv/bin/activate'):
            run('pip install -r requirements.txt')
You can install virtualenvwrapper and try cpvirtualenv, but the developers advise caution here:
Warning
Copying virtual environments is not well supported. Each virtualenv
has path information hard-coded into it, and there may be cases where
the copy code does not know it needs to update a particular file. Use
with caution.
If it is going to live at the same path, you can tar it and extract it on the other machine. If all the same dependencies, libraries, etc. are available on the target machine, it will work.
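A minimal sketch of that approach (venv is a placeholder name; the extraction path must match the original):

# on the source machine
tar -czf venv.tar.gz venv/
# copy venv.tar.gz across, then on the target machine extract it
# into the same parent directory it lived in on the source
tar -xzf venv.tar.gz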