I have a Python package that I have installed and modified to my needs, stored in a venv folder. I thought using:
RUN source venv/bin/activate
in my Dockerfile (after copying it into the container, of course) would solve my problems, but the comments to this answer revealed that it doesn't. Afterwards, I came across this article, which shows how to set up a new venv inside a Docker container but doesn't answer my question. Many other answers sent me on a never-ending wild goose chase, so I decided to ask here. Hopefully a good answer will solve my problem and serve those who face this problem in the future with custom Python packages in Docker containers.
My question:
How do I use a venv copied into a Docker container?
In general you can't copy virtual environments anywhere, Docker or otherwise. They tend to be tied to a very specific filesystem path and a pretty specific Python installation. If you knew you had the exact same Python binary, and you copied it to the exact same filesystem path, you could probably COPY it in as-is, but the build system would be extremely fragile.
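You can see that path dependence directly: the scripts a venv generates in its bin/ directory have the absolute path of the environment's interpreter baked into their shebang lines at creation time (the path shown below is just illustrative):
$ head -1 venv/bin/pip
#!/home/user/project/venv/bin/python3
Move the environment, or copy it into an image whose Python lives somewhere else, and that interpreter path no longer exists.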
It's also the case that you usually don't need virtual environments in Docker. A Docker image provides the same sort of isolated Python installation that you'd use a virtual environment for in a non-Docker context. If you'd ordinarily set up a virtual environment by running
python3 -m venv vpy
. vpy/bin/activate
pip install -r requirements.txt
then you can get an equivalent installation with a Dockerfile like
FROM python:3
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
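For completeness, the rest of such a Dockerfile would typically copy in the application itself and set a default command; this is only a sketch, assuming your code sits in the build context and starts from a hypothetical main.py:
COPY . .
CMD ["python", "main.py"]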
In comments you hint at hand-modifying installed packages. This isn't usually a best practice (what if there's a critical security update in a package you've changed? what if your colleague needs to work on your project but not on your computer?). You can use a tool like diff(1) to make a patch file describing what's changed, comparing your modified file with the original. If you have that, then you can do something like
COPY local.patch /app/
RUN cd $(python3 -c 'import sysconfig; print(sysconfig.get_path("platlib"))') \
 && patch -p0 < /app/local.patch
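One way to produce local.patch in the first place is with diff -u, comparing a pristine copy of the package against your edited files; the paths below are hypothetical, and depending on the file names recorded in the patch header you may need a different -p level when applying it:
cd $(python3 -c 'import sysconfig; print(sysconfig.get_path("platlib"))')
diff -u /tmp/pristine/somepkg/module.py somepkg/module.py > local.patch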
It's important to note that each RUN command starts a new shell in a new container. So the cd command in this last example only affects this RUN command and nothing later. In your proposed RUN source ... command, the environment variables that the activate script sets will be lost at the end of that RUN command. (Also note that source is not a standard shell command and won't work on, for instance, Alpine-based images, but . is equivalent and is standard.)
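A tiny illustration of that isolation, using a made-up variable:
RUN export FOO=bar       # FOO only exists for the rest of this single shell
RUN echo "FOO is: $FOO"  # a new shell in a new container: FOO is empty here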
Related
Here I'm going to use flake8 within a docker container. I installed flake8 using the following command and everything installed successfully.
$ sudo -H pip3 install flake8   # worked fine
and its install location is:
/usr/local/lib/python3.8/dist-packages
Then I executed the following command, but the result was not what I expected.
$ sudo docker-compose run --rm app sh -c "flake8"
It says,
sh: flake8: not found
Maybe the flake8 package is not installed in the correct location. Please help me with this issue.
When you docker-compose run a container, it starts a new container with a new, isolated filesystem. Anything you pip install on the host system, or in a debugging shell in another container, won't be visible there.
As a general rule, never install software into a running container, unless it's for very short-term debugging. This will get lost as soon as the container exits, and it's very routine to destroy and recreate containers.
If you do need a tool like this, you should install it in your Dockerfile instead:
FROM python:3.10
...
RUN pip3 install flake8
However, for purely developer-oriented tools like linters, it may be better to not install them in Docker at all. A Docker image contains a fixed copy of your application code and is intended to be totally separate from anything on your host system. You can do day-to-day development, including unit testing and style checking, in a non-Docker virtual environment, and use docker-compose up --build to get a container-based integration-test setup.
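As a sketch of that split, with illustrative names: keep the linter in a local virtual environment for day-to-day work, and let Compose build the image only for integration testing.
# local development: lint and test in a plain virtual environment
python3 -m venv venv
. venv/bin/activate
pip install -r requirements.txt flake8
flake8
# container-based integration test
docker-compose up --build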
I'm a newbie to python(3), but not to programming in general.
I'd like to distribute a git repo with myprogram consisting of these files:
requirements.txt
myprogram.py
lib/modulea.py
lib/moduleb.py
My question is: What is the best-practice and least surprising way to let users run myprogram.py using the dependencies in requirements.txt? So that after git clone, and some idiomatic installation command(s), ./myprogram.py or /some/path/to/myprogram.py "just works" without having to first set magical venv or python3 environment variables?
I want to be able to run it using the #! shebang, so that both running /path/to/myprogram.py and double-clicking it in the file manager GUI do the correct thing.
I already know I can create a wrapper.sh or make a clever shebang line. But I'm looking for the best-practice approach, since I'm new to python.
More details
I'm guessing that users would
git clone $url workdir
cd workdir
python3 -m venv .
./bin/pip install -r requirements.txt
And from now on this uses the modules from requirements.txt:
./myprogram.py
If I knew that the project directory was always /home/peter/workdir, I could start the myprogram.py with:
#!/home/peter/workdir/bin/python3
but I'd like to avoid hard-coding the project directory in myprogram.py.
This also seems to work in my tiny demo; it's clearly brittle and not best practice, but it illustrates what I'm trying to do:
#!/usr/bin/env python3
import os
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'lib', 'python3.10', 'site-packages'))
I'm sure I could come up with some home-grown shebang line that works, but what is the idiomatic way to do this in python3?
Again: After pip install, I absolutely refuse to have to set any environment variables or call any setup code in future shells before running myprogram.py. (Unless that strongly conflicts with "idiomatic", which I hope isn't the case)...
Expanding @sinoroc's comment into an answer:
I've looked at https://packaging.python.org/en/latest/tutorials/packaging-projects/ and also at "entrypoints", and this is the smallest example I can think of. Create an empty directory with these two files:
pyproject.toml:
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[project]
name = "example_module_pmorch"
version = "0.0.1"
[project.scripts]
runme = "example_module_pmorch:cli_main"
src/example_module_pmorch/__init__.py:
def cli_main():
print("I'm the entrypoint")
Now if I run this:
$ python3 -m venv .
# Adding -e during development is optional
$ ./bin/pip install .
Then ./bin/runme does the right thing and prints I'm the entrypoint.
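Because runme is listed under [project.scripts], pip also installs a runme script into the environment's bin/ directory (the ./bin/runme used above), so with the venv activated you can call it by name:
$ . bin/activate
$ runme
I'm the entrypoint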
I do not see why you would need to hardcode anything. From your last snippet it looks like you are forcing the Python import system to include the target directory of the virtual environment you first create.
Based on your explanation, it seems you are using venv as your virtual environment manager. As long as your users install the dependencies in the virtual environment and then activate it before running the script, the dependencies will be available to your module/script.
The line you share, ./bin/pip install -r requirements.txt, manually uses the pip that belongs to the virtual environment you create with python3 -m venv .. Instead, you would want your user to create the environment (python3 -m venv example-env), activate it (source example-env/bin/activate), and then run the install command: python3 -m pip install -r requirements.txt.
The user of the package has to make sure that the environment is active before running the script.
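Put together, the flow described above would look something like this (example-env is just an illustrative name):
python3 -m venv example-env
source example-env/bin/activate
python3 -m pip install -r requirements.txt
./myprogram.py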
Hi, I want to clone a Python virtualenv to a server that's not connected to the internet. I searched different forums but didn't find a clear answer. Here are the methods I found and the problems I have with each:
Method 1 (safest but most time-consuming):
Save all the libraries via pip freeze > requirements.txt, then download each one manually and store them in a directory. Copy this directory to the offline server, then create a new virtualenv on the offline server and install all requirements from the downloaded files.
To avoid downloading each one by hand I used pip download -r requirements.txt -d wheelfiles on the source machine, but I couldn't find a way to install all the packages in one command; I could use a script with a loop to go through each one. The bigger problem is when even the source server doesn't have an internet connection to download these packages.
Method 2 (less recommended, but I didn't understand why):
Simply copy the virtualenv directory with all its files to the offline machine. Both machines apparently need the same Python version, and you'll have to manually modify some hardcoded paths, for example replacing sourceserver\user1\dev\virtualenv with targetserver\user4\dev\virtualenv in every file that contains it. Usually the files to modify start with activate* or pip*.
This method is said to be not recommended, but I don't understand why.
Also, if this method works without problems, can I copy the virtualenv folder from a Linux server to a Windows server and vice versa?
You can install all the requirements using
pip install -r requirements.txt
which means the options are:
pip freeze > requirements.txt   # on the source machine
pip download -r requirements.txt -d wheelfiles   # on any machine with internet access
pip install -r requirements.txt --no-index --find-links path/to/wheels   # on the offline machine, pointing at the copied wheel directory
or
Ensure the target machine has the same architecture, OS, and Python version
Copy virtual environment
Modify various hardcoded paths in files
It should be clear why the former is preferred, especially as it is completely independent of Python version, machine architecture, OS, etc.
Additionally, the former means that the requirements.txt can be committed to source control in order to recreate the environment on demand on any machine, including by other people and when the original machine or copy of the virtual environment is not available. In terms of size, the requirements.txt file is also significantly smaller than an entire virtual environment.
What is the purpose of a virtualenv inside a Docker Django application? Python and other dependencies are already installed, but at the same time it's necessary to install lots of packages using pip, so the possibility of conflicts still seems unclear to me.
Could you please explain the concept?
EDIT: Also, for example: I've created a virtualenv inside the Docker Django app, installed djangorestframework with pip (it shows up in pip freeze), and added it to INSTALLED_APPS in settings.py, but docker-compose up raises the error No module named rest_framework. I've checked and everything looks correct. Could it be a Docker/virtualenv conflict?
Docker and containerization might inspire the illusion that you do not need a virtual environment. distutils' Glyph makes a very compelling argument against this misconception in this PyCon talk.
The same fundamental advantages of a virtualenv apply to a container just as they do to a non-containerized application, because fundamentally you're still running a Linux distribution.
Debian and Red Hat are fantastically complex engineering projects, integrating billions of lines of C code. For example, you can just apt install libavcodec, or yum install ffmpeg. Writing a working build system for one of those things is a PhD thesis. They integrate thousands of Python packages simultaneously into one working environment. They don't always tell you whether their tools use Python or not.
And so, you might want to docker exec some tools inside a container; they might be written in Python, and if you sudo pip install your application in there, now it's all broken.
So even in containers, isolate your application code from the system's.
Regardless of whether you're using Docker or not, you should always run your application in a virtual environment.
Now, in Docker in particular, using a virtualenv is a little trickier than it should be. Inside Docker each RUN command runs in isolation, and no state other than filesystem changes is kept from line to line. To install into a virtualenv you have to prepend the activation command on every line:
RUN apt-get install -y python-virtualenv
RUN virtualenv /appenv
RUN . /appenv/bin/activate; \
    pip install -r requirements.txt
ENTRYPOINT . /appenv/bin/activate; \
    run-the-app
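A commonly used alternative, sketched here, is to put the virtualenv's bin directory at the front of PATH once; that reproduces the main effect of activation for every subsequent RUN, the ENTRYPOINT, and anything you docker exec in the container, without repeating the activate call:
RUN virtualenv /appenv
ENV VIRTUAL_ENV=/appenv
ENV PATH="/appenv/bin:$PATH"
RUN pip install -r requirements.txt
ENTRYPOINT ["run-the-app"]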
A virtualenv is there to isolate packages to a specific environment. Docker is also there to isolate settings to a specific environment. So in essence, if you use Docker there isn't much benefit to using a virtualenv too.
Just pip install things into the Docker environment directly; it'll do no harm. To install the requirements with pip, use the Dockerfile, where you can execute commands.
You can find a pseudo-code example below.
FROM <your-base-image>
COPY requirements.txt .
RUN pip install -r requirements.txt
I'm new to virtualenv, but I'm writing a Django app and eventually I will have to deploy it somehow.
So let's assume I have my app working in my local virtualenv, where I installed all the required libraries. What I want to do now is run some kind of script that will take my virtualenv, check what's installed inside, and produce a script that will install all these libraries in a fresh virtualenv on another machine. How can this be done? Please help.
You don't copy-paste your virtualenv. You export the list of all the installed packages, like so:
pip freeze > requirements.txt
Then push the requirements.txt file to wherever you want to deploy the code, and just do what you did on the dev machine:
$ virtualenv <env_name>
$ source <env_name>/bin/activate
(<env_name>)$ pip install -r path/to/requirements.txt
And there you have all your packages installed with the exact versions.
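The generated requirements.txt is just a plain list of pinned packages, along these lines (the names and versions here are purely illustrative):
Django==4.2.7
djangorestframework==3.14.0
gunicorn==21.2.0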
You can also look into Fabric to automate this task, with a function like this:
# Fabric 1.x API
from fabric.api import cd, env, prefix, run

def pip_install():
    with cd(env.path):
        with prefix('source venv/bin/activate'):
            run('pip install -r requirements.txt')
You can install virtualenvwrapper and try cpvirtualenv, but the developers advise caution here:
Warning: Copying virtual environments is not well supported. Each virtualenv has path information hard-coded into it, and there may be cases where the copy code does not know it needs to update a particular file. Use with caution.
If it is going to be on the same path, you can tar it and extract it on another machine. If all the same dependencies, libraries, etc. are available on the target machine, it will work.