Why does venv break after rsync?

I am trying to build a CI/CD process for my Python scripts and applications. I am able to build my venv within the testing container but when I rsync it over to the target server, the version of Python seems to break. This is what I am trying:
- cp -a ./. $APP_DIR
- cd $APP_DIR
- python3 -m venv venv
- source venv/bin/activate
- pip3 install -r requirements.txt
...
- rsync...
All environments involved are running Python 3.6.8
When I activate the venv on the target server and run which python3 I get /usr/bin/python3 which is incorrect.
Why? Why does venv break when deployed to a server via rsync?
I'm new to Python development and the virtual environment process. Should venv's only be created on the server (or container) that they need to run on? Sometimes my target servers don't have python3-venv installed on them. Is it possible to deploy a venv with the code and use it to run my scripts?

When you create an environment with venv, the absolute path of the environment is written into bin/activate. In addition, some symlinks are created in the new environment that point to the existing Python installation.
As a consequence, the environment is only valid on the host and at the path where venv was run. This is also stated in the documentation (some parts omitted):
Running this command creates the target directory [...] and places a pyvenv.cfg file in it with a home key pointing to the Python installation from which the command was run. It also creates a bin [...] subdirectory containing a copy/symlink of the Python binary/binaries (as appropriate for the platform or arguments used at environment creation time).
You can easily check this fact by these commands:
mkdir /tmp/example_dir_for_stackoverflow
cd /tmp/example_dir_for_stackoverflow
python3 -m venv venv
grep stackoverflow venv/bin/activate
It will output:
VIRTUAL_ENV="/tmp/example_dir_for_stackoverflow/venv"
If you rsync this environment to another system, onto a different path and/or a different Python installation, the settings in bin/activate no longer match and the environment doesn't work.
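The same check can be sketched from Python itself. This is a minimal illustration, assuming a POSIX layout (a bin/ directory rather than Scripts\); the function name inspect_venv is made up for this example. It creates a throwaway environment and reports the home key from pyvenv.cfg plus every line of bin/activate that contains the environment's absolute path.

```python
import tempfile
import venv
from pathlib import Path


def inspect_venv(env_dir: Path) -> dict:
    """Create a venv at env_dir and return the absolute paths baked into it."""
    # with_pip=False keeps creation fast; symlinks=True mirrors the
    # default of `python3 -m venv` on POSIX systems.
    venv.EnvBuilder(with_pip=False, symlinks=True).create(str(env_dir))

    # pyvenv.cfg holds "key = value" lines; "home" is the directory of the
    # interpreter that created the environment.
    cfg = {}
    for line in (env_dir / "pyvenv.cfg").read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep:
            cfg[key.strip()] = value.strip()

    # Collect every line of bin/activate that mentions the environment path.
    activate = (env_dir / "bin" / "activate").read_text()
    hardcoded = [ln.strip() for ln in activate.splitlines() if str(env_dir) in ln]
    return {"home": cfg.get("home"), "activate_lines": hardcoded}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        info = inspect_venv(Path(tmp).resolve() / "venv")
        print("home:", info["home"])
        for line in info["activate_lines"]:
            print("activate:", line)
```

Both values are absolute paths on the machine that built the environment, which is why a copy on another host can end up pointing at the wrong place.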

In my opinion your best bet is to exclude the venv folder from rsync with
rsync --exclude 'venv' source/ destination/
The requirements.txt file is your best friend for keeping your dependencies satisfied everywhere.
I also suggest installing the python3-venv package from your Linux distribution if the Python version it provides suits you. Otherwise, install a different Python version (instructions for your distribution are easy to find online).
For example:
Host 1 (This is where you develop and you may add something to your venv)
cd /tmp/
mkdir app_base # base folder for venv/ and app_code/
cd app_base/
mkdir app_code # base folder for code only
# LOCAL virtual environment creation and activation
python3 -m venv venv
source venv/bin/activate
# Just an example of whatever you may need
pip install numpy
# Let's say that it could be enough for your app to work.
# Create requirements.txt
pip3 freeze >requirements.txt
Server, Container, whatever remote..
SETUP
This should run once (or at least before rsync). It's the same first 5 lines from the above snippet.
cd /tmp/
mkdir app_base
cd app_base/
mkdir app_code
python3 -m venv venv
Now that you've done the setup on the remote host, let's return to Host 1, where you develop.
You need to rsync your app_code and requirements.txt (and maybe some other stuff), but not the venv folder
Host 1
You can wrap this in a cron job
rsync -xav -e ssh --exclude 'venv' /tmp/app_base/ user@X.X.X.X:/tmp/app_base/
Then, finally, you can keep the server's virtual environment up to date by running this directly on the server.
Server, Container, whatever remote..
cd /tmp/app_base
source venv/bin/activate
pip3 install -r requirements.txt
Now, on the remote host, you should be able to run (the unit test of) your code.
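To confirm on the remote host that the code really runs inside the environment (rather than under /usr/bin/python3, as in the question), a small check along these lines can help. It is only a sketch; in_virtualenv is a name chosen for this example, and it relies on sys.base_prefix, which environments created by the venv module provide.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("prefix:     ", sys.prefix)
    print("base prefix:", sys.base_prefix)
    print("inside venv:", in_virtualenv())
```

Run it with the interpreter you expect to be active; if the last line reports False, the activation did not take effect for that process.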
The 'strict' answer to your bolded question
Why? Why does venv break when deployed to a server via rsync?
is: some Python packages (like the numpy I've used in the example) provide binary routines, for performance reasons. Copying the virtual environment folder will only work in the very same Linux distribution or Windows version, with the very same architecture and Python version. And it's not the purpose virtual environments were created for.
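One way to see whether two hosts are even candidates for sharing compiled packages is to compare the values that binary extensions depend on. The sketch below is an illustration only; environment_fingerprint is a hypothetical helper, and matching output on both hosts is a necessary condition rather than a guarantee that a copied environment will work.

```python
import platform
import sys
import sysconfig


def environment_fingerprint() -> dict:
    """Collect the interpreter and platform details compiled packages depend on."""
    return {
        # Major.minor version: extension modules are built per Python version.
        "python": "{}.{}".format(*sys.version_info[:2]),
        "implementation": platform.python_implementation(),
        # CPU architecture, e.g. x86_64 or aarch64.
        "machine": platform.machine(),
        # Platform string used when building distributions.
        "platform": sysconfig.get_platform(),
        # File suffix the interpreter expects for compiled extension modules.
        "ext_suffix": sysconfig.get_config_var("EXT_SUFFIX"),
    }


if __name__ == "__main__":
    for key, value in environment_fingerprint().items():
        print(f"{key}: {value}")
```

Running this on the build container and on the target server and diffing the output shows quickly whether the two differ in version, architecture or platform.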

Related

Maintain Same Python Environment through different Makefile goals

I'm running some commands through a Makefile and I have to create a new python virtualenv, activate it and install requirements in one Make recipe/goal and then reuse it before running other recipes/goals, but the problem is, I have to activate that env on every subsequent Make Goal like so
SHELL := bash
# For Lack of a better mechanism will have to activate the venv on every Make recipe because every step is in its own shell
.ONESHELL:
install:
	python -m venv .venv
	source .venv/bin/activate
	pip install -r requirements.txt
test: install
	source .venv/bin/activate
	pytest
synth: install
	source .venv/bin/activate
	cdk synth --no-color
diff: install
	source .venv/bin/activate
	cdk diff --no-color
bootstrap: install
	source .venv/bin/activate
	cdk bootstrap --no-color
deploy: install
	source .venv/bin/activate
	cdk deploy --no-color
.PHONY: install test synth diff bootstrap deploy
Is there a better way to do this, i.e. one that saves me from running source .venv/bin/activate in every single goal?
In other words, can I run all make goals in a Makefile in the same shell?

You cannot run all goals in the same shell (that cannot work, since the shell must exit after each recipe is complete else make cannot know whether the commands in the recipe were successful or not, and hence whether the rule was successful or not).
You can of course put the sourcing in a variable so you don't have to type it out. You could also put the entire thing in a function, like this:
run = . .venv/bin/activate && $1
then:
install:
	python -m venv .venv
	$(call run,pip install -r requirements.txt)
test: install
	$(call run,pytest)
etc.
If you don't want to do any of that, your only choice is to use recursive make invocations. However, this is going to be tricky because you only want to do it after the install rule has run. I'm not sure it would actually lead to a more understandable makefile.
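For comparison, here is what the run function arranges, expressed as a Python sketch rather than make syntax. This is not part of the answer's method: run_in_venv is a hypothetical helper that approximates activation by setting VIRTUAL_ENV and putting the environment's bin/ directory first on PATH for a single child process. It assumes a POSIX layout.

```python
import os
import subprocess
import tempfile
import venv
from pathlib import Path


def run_in_venv(env_dir: Path, *cmd: str) -> subprocess.CompletedProcess:
    """Run cmd with env_dir/bin first on PATH, roughly what `activate` sets up."""
    env = os.environ.copy()
    env["VIRTUAL_ENV"] = str(env_dir)
    env["PATH"] = str(env_dir / "bin") + os.pathsep + env.get("PATH", "")
    # The activate script also unsets PYTHONHOME so it cannot override the venv.
    env.pop("PYTHONHOME", None)
    return subprocess.run(cmd, env=env, check=True, capture_output=True, text=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = Path(tmp).resolve() / ".venv"
        venv.EnvBuilder(with_pip=False, symlinks=True).create(str(env_dir))
        result = run_in_venv(env_dir, "python", "-c", "import sys; print(sys.prefix)")
        print("child prefix:", result.stdout.strip())
```

As with the make function, every command gets its own short-lived process, so nothing needs to stay activated between steps.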

Activating a python virtual environment within a bash script fails with "sudo: source: command not found"

I'm trying to automate the deployment of my Python-Flask app on Ubuntu 18.04 using Bash by going through the motion of preparing all the necessary files/directories and cloning the source code from Github followed by creating the virtual environment, installing the pre-requisite modules and etc.
Now because I have to execute my Bash script using sudo, this means that the entire script will be executed as root except where I specify otherwise using sudo -u myuser and when it comes to activating my virtual environment, I get the following output: sudo: source: command not found and my subsequent pip installs are all installed outside of the virtual environment. Excerpts of my code below:
#!/bin/bash
...
sudo -u "$user" python3 -m venv .env
sudo -u $SUDO_USER source /srv/www/www.mydomain.com/.env/bin/activate
sudo -u "$user" pip install wheel
sudo -u "$user" pip install uwsgi
sudo -u "$user" pip install -r requirements.txt
...
Now for the life of me, I can't figure out how to activate the virtual environment in the context of the virtual environment if this makes any sense.
I've scoured the web and most of the questions/answers I found revolves around how to activate the virtual environment in a Bash script but not how to activate the virtual environment as a separate user within a Bash script that was executed as sudo.

That's because source is not an executable file, but a built-in bash command. It won't work with sudo, since the latter accepts a program name (i.e. executable file) as argument.
P.S. It's not clear why you have to execute the whole script as root. If you only need to execute a few commands as root (e.g. for starting/stopping a service) and can run the remaining majority as a regular user, you can use sudo just for those commands. E.g. the following script
#!/bin/bash
# The `whoami` command outputs the current username. Unlike `source`, this is
# a full-fledged executable file, not a built-in command
whoami
sudo whoami
sudo -u postgres whoami
on my machine outputs
trolley813
root
postgres
P.P.S. You probably don't need to activate an environment as root.
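The distinction between a built-in and an executable file can be checked with a short Python sketch. is_program is a name invented for this example; it uses shutil.which, which searches PATH for an executable file in a similar way to how sudo has to locate the command it is given.

```python
import shutil


def is_program(name: str) -> bool:
    """Return True if name resolves to an executable file on PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    # whoami is an ordinary executable; source exists only inside the shell.
    for name in ("whoami", "source"):
        print(name, "->", shutil.which(name))
```

On a typical Linux system the first lookup returns a path and the second returns None, which matches the "command not found" error from sudo.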

How to copy modules of Python between different machines

I have two machines, one of which does not have internet access. I want to install modules with Anaconda on the computer that has internet access and copy them to the offline computer.
I tried looking up the dependencies, installing the tar files manually one by one, and sending them to the offline machine, but it is very time-consuming.
What is the easiest way? Would Miniconda help?
P.S.: I forgot to mention that I am using Anaconda on both machines. So I guess I need to create an env, install the packages, then export it for the offline computer. Is there any other way to install a number of packages on the offline computer from a copy of a <dir> on the online computer?
Edit: I tried conda install --file C:\Users\myName\Desktop\OfflineInstall\packagelist.txt --channel file://C:\Users\myName\Desktop\OfflineInstall\pkgs2 but the offline machine still tried to connect to the internet. I also used --no-deps
Edit2: For anyone stuck on the same problem, I solved it using conda install --file C:\Users\myName\Desktop\OfflineInstall\packagelist.txt --channel file:///C:\Users\myName\Desktop\OfflineInstall\pkgs2 --override-channels The tricky part is the file:/// prefix: you need all three slashes. Also remember to add the --override-channels flag to prevent connections to the default channels.
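The three-slash form is simply how an absolute local path is written as a file URI, and Python's pathlib can build it. The sketch below is an illustration; channel_url is a hypothetical helper, and the path is the one from the question. Note that as_uri emits forward slashes, the standard URI form.

```python
from pathlib import PureWindowsPath


def channel_url(folder: str) -> str:
    """Convert an absolute Windows folder path into a file:/// URI."""
    # PureWindowsPath parses Windows paths on any operating system;
    # as_uri raises ValueError if the path is not absolute.
    return PureWindowsPath(folder).as_uri()


if __name__ == "__main__":
    print(channel_url(r"C:\Users\myName\Desktop\OfflineInstall\pkgs2"))
    # prints file:///C:/Users/myName/Desktop/OfflineInstall/pkgs2
```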

It sounds like Conda-pack is what you are looking for.
Installing:
$ conda install conda-pack
On the source machine:
# Pack environment my_env into my_env.tar.gz
$ conda pack -n my_env
On the target machine:
# Unpack environment into directory `my_env`
$ mkdir -p my_env
$ tar -xzf my_env.tar.gz -C my_env
# Use python without activating or fixing the prefixes. Most python
# libraries will work fine, but things that require prefix cleanups
# will fail.
$ ./my_env/bin/python
# Activate the environment. This adds `my_env/bin` to your path
$ source my_env/bin/activate
# Run python from in the environment
(my_env) $ python
# Cleanup prefixes from in the active environment.
# Note that this command can also be run without activating the environment
# as long as some version of python is already installed on the machine.
(my_env) $ conda-unpack
The caveat being that conda-pack will take the whole environment.

Had this problem the other day, very simple implementation.
First make a .txt file which contains all your python libraries. Now you can just pass this .txt file to whatever machine you want the solution to be installed under and issue the following command :
pip install -r packages.txt
Where "packages" is the name of your .txt file. Hope this helps!
Edit using Conda :
while read requirement; do conda install --yes $requirement; done < requirements.txt
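The answer does not say how to produce the .txt file. One possible way, sketched here under the assumption of Python 3.8 or newer (for importlib.metadata), is to list the distributions visible to the current interpreter in the name==version form that pip install -r accepts. freeze_lines is a name made up for this example; pip freeze remains the usual tool for this job.

```python
from importlib import metadata


def freeze_lines() -> list:
    """Return 'name==version' pins for distributions visible to this interpreter."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        version = dist.metadata["Version"]
        # Skip entries with incomplete metadata instead of writing bad pins.
        if name and version:
            pins.add(f"{name}=={version}")
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    # Redirect the output to a file, e.g. packages.txt, to reuse it elsewhere.
    for line in freeze_lines():
        print(line)
```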

how to launch the new virtualEnv

I am trying to setup a new virtual env to work with django and flask.
installed
sudo pip install virtualenv
sudo pip install virtualenvwrapper
For some reason, this overlaps with Anaconda.
This command doesn't work:
virtualenv newThing
While this command works:
virtualenv -p /usr/bin/python2.7 newThing
What should I add to .bash_profile to make it work regularly?

That's probably the wrong question, as running a venv by default largely defeats the benefit of creating one.
To answer your question, though, you can enter a venv this way:
source newThing/bin/activate
Once you deploy this code to a server, you'll likely specify the venv to use in your WSGI conf.

If you have installed virtualenvwrapper as you say then you need to add some bits to your bash config:
# Virtualenv
source /usr/local/bin/virtualenvwrapper.sh
export WORKON_HOME="$HOME/.virtualenvs"
This ensures you source the bash script for the wrapper commands to call in bash and sets the location to use to store and access your virtual envs.
Now to create a virtualenv you can run the wrapper command mkvirtualenv followed by the name of your desired env.
Then to switch to that env to work on your project run workon and then the name of that env.
There are a bunch of other useful wrapper commands, like setting your project directory for example. This is useful when you are switching between projects which use different envs.
For this, try activating a venv using workon, then cd to the working directory for the project and run setvirtualenvproject. This then remembers that directory to switch to whenever you run workon for that venv.

How to install portia, a python application from Github (Mac)

I am attempting to install Portia, a python app from Github: https://github.com/scrapinghub/portia
I use the following steps at the command line:
set up new virtualenv 'portia' in Mac terminal
git clone https://github.com/scrapinghub/portia.git
follow readme instructions:
cd slyd
pip install -r requirements.txt
run Portia
cd slyd
twistd -n slyd
But every time I attempt the last step to run the program, I get the following error:
ImportError: No module named scrapy
Any idea why this error is occurring? All previous steps seem to install correctly. Is it an error earlier in my install process?
Thanks!

I don't have the rep to upvote Alagappan's answer but he's correct. Also, if you're as inexperienced as I am, you may need further clarity on this.
You have to create, activate and navigate into the virtualenv before installing anything (including cloning portia from github). Here's the whole thing working from start to finish:
1: cd to wherever you’d like to store your project...
and Install virtualenv:
$ pip install virtualenv
2: Create the virtual environment. (I called mine “portia” but this can be anything.):
$ virtualenv portia
3: Activate the virtual environment you created (change the path to reflect the name you used here if not “portia”.):
$ source portia/bin/activate
At this point your terminal should display the virtualenv name in parentheses before the standard directory path prompt:  (name-of-virtualenv) [your-machine]:[current-directory]: [user]$
...and if you list the files within your pwd you’ll see the name of your virtualenv there.
4: cd into your virtualenv (“portia” for me):
$ cd portia
5: Now you can clone portia from github into your virtualenv...
$ git clone https://github.com/scrapinghub/portia
6: cd into the cloned portia/slyd...
$ cd portia/slyd
7/8: pip install twisted and Scrapy...
$ pip install twisted
$ pip install Scrapy
Your virtualenv should still be activated and you should still be in [virtualenv-name]/portia/slyd
9: Install the requirements.txt:
$ pip install -r requirements.txt
10: Run slyd:
$ twistd -n slyd
--- No more scrapy error! ---

Another Installation Method For Portia: Using Vagrant
Here is the method that let me install Portia with ease. It works on Mac, Windows and Linux. With a few commands and clicks, you'll get a fully functional web scraper.
Things Needed:
VirtualBox
Vagrant
Clone the repo for Portia or download the zip file.
Additional Steps To Take:
Install VirtualBox.
Install Vagrant
Open your terminal and navigate to where you cloned the Portia repo or where you've extracted it (in case of a zip file).
Then run vagrant up. This downloads and sets up a VirtualBox guest VM for you, installs all the necessary requirements for Portia, and installs Portia from start to finish.
After the above process, you may now open your browser and navigate to
http://the-virtualbox-ip:8000/static/main.html
And you're set up.

It's quite simple, you just need to install the python module scrapy in the same way that the Twitter API requires setuptools
pip install scrapy

I suppose the issue you are facing is because of the virtualenv. Once you set up a new virtual environment you need to run the activate script in order to start using it. In your case you'll have to run the following command:
$ source portia/bin/activate
On successful activation, your prompt will look like:
(portia) $
Can you check if you activated your virtual environment before you installed the packages using pip? I believe doing so will fix your issue.
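A quick way to check this is to ask the interpreter that runs twistd which executable it is and whether it can find the module at all. The sketch below is only an illustration; locate_module is a hypothetical helper built on importlib.util.find_spec from the standard library.

```python
import importlib.util
import sys


def locate_module(name: str):
    """Return (found, origin) for a top-level module as seen by this interpreter."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return (False, None)
    # origin is the file the module would be loaded from, when there is one.
    return (True, spec.origin)


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("scrapy:", locate_module("scrapy"))
```

If the interpreter path is outside the virtualenv, or scrapy is reported as not found, the packages were most likely installed into a different environment from the one running the program.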
