Activate conda env within running Docker container? - python

Is there a way to spin up a Docker container and then activate a given conda environment within the container using a Python script? I don't have access to the Dockerfile of the image I'm using.

conda run
Conda includes a conda run command for running arbitrary commands within an environment.
docker run <image> "conda run -n <your_env> python <script.py>"
If the script requires interaction, you may need the --live-stream and/or --no-capture-output arguments (see conda run -h).
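For instance, a minimal sketch of such an invocation (the image, environment, and script names are placeholders):
# -it attaches a TTY; --live-stream keeps conda from buffering the script's output
docker run -it <image> conda run --live-stream -n <your_env> python <script.py>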

I think if you use the right Docker image, there is no need to activate conda in your Dockerfile. Try the image below:
FROM continuumio/miniconda3
RUN conda info
This worked for me.
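For example, conda is already on PATH in that image, so you can call it directly without any activation step (a quick check, not a full setup):
# Lists the environments available in the stock continuumio/miniconda3 image
docker run --rm continuumio/miniconda3 conda info --envs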

Related

How to install a Python package inside a docker image?

Is there a way to install a Python package without rebuilding the Docker image? I have tried it this way:
docker compose run --rm web sh -c "pip install requests"
but if I list the packages using
docker-compose run --rm web sh -c "pip freeze"
I don't get the new one.
It looks like the package is installed in the container but not in the image.
My question is what is the best way to install a new python package after building the docker image?
Thanks in advance
docker-compose is used to run multi-container applications with Docker.
It seems that in your case you use a Docker image with Python installed as an entrypoint to do some further work.
After building docker image you can run it:
$ docker run -dit --name my_container_name image_name
And then run:
$ docker exec -ti my_container_name bash
or, in case there is no bash in the Docker image:
$ docker exec -ti my_container_name sh
This will give you shell access to the container you just created. Then, if pip is installed inside your container, you can install whatever Python package you need, just like you would on your OS.
Take note that everything you install is only persisted inside the container you created. If you delete this container, all the things you installed manually will be gone.
I don't know too much about Docker, but when you execute your command, the Docker engine spins up a new container based on your web image and runs pip install requests in it. After executing the command, the container has nothing more to do and stops. Since you specified the --rm flag, the Docker engine removes the new container once it has stopped, so the whole container, and thus also the installed packages, is removed.
AFAIK you cannot add packages without rebuilding the image.
I know that you can run the command without removing the container and that you can also make images from your containers. (Those images should include the packages then).
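If you do want to keep a manually installed package without rebuilding, one option is to commit the modified container as a new image. A sketch, with hypothetical container and image names:
# Install into the running container (this changes the container, not the image)
docker exec my_container pip install requests
# Snapshot the container's filesystem as a new image
docker commit my_container web_with_requests
# Point docker run / docker-compose at web_with_requests from now on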

Activate conda environment in singularity container from dockerfile

I'm trying to set up a Singularity container from an existing Docker image in which a conda environment named "tensorflow" is activated as soon as I run the container. I've found some answers on this topic here. Unfortunately, in that post they only explain how they would set up the Singularity .def file to activate the conda environment by default. However, I want to modify my existing Dockerfile only and then build a Singularity image from it.
What I've tried so far is setting up the Dockerfile like this:
FROM opensuse/tumbleweed
ENV PATH /opt/conda/bin:$PATH
ENV PATH /opt/conda/envs/tensorflow/bin:$PATH
# Add conda environment files (.yml)
COPY ["./conda_environments/", "."]
# Install with zypper
RUN zypper install -y sudo wget bzip2 vim tree which util-linux
# Get installation file
RUN wget --quiet https://repo.anaconda.com/archive/Anaconda3-2019.07-Linux-x86_64.sh -O ~/anaconda.sh
# Install anaconda at /opt/conda
RUN /bin/bash ~/anaconda.sh -b -p "/opt/conda"
# Remove installation file
RUN rm ~/anaconda.sh
# Make conda command available to all users
RUN ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh
# Create tensorflow environment
RUN conda env create -f tensorflow.yml
# Activate conda environment with interactive bash session
RUN echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc
RUN echo "conda activate tensorflow" >> ~/.bashrc
# Default command
CMD ["/bin/bash"]
After building the docker image I run the docker container with:
docker run -t -d --rm --name=my_container opensuse_conda:latest
and enter the container with:
docker exec -it my_container bash
The result is as expected. The shell session is started directly with the "tensorflow" environment being active which is indicated by the (tensorflow) prefix.
To build a singularity image from this docker image I use:
sudo singularity build opensuse_conda.sif docker-daemon://opensuse_conda:latest
and run the container with:
sudo singularity run opensuse_conda.sif
This is where the problem occurs. Instead of the "tensorflow" environment the "base" environment is activated by default. However, I would rather have the "tensorflow" environment being activated when I run the singularity container.
How can I modify my Dockerfile so that when running both the docker container and the singularity container the default environment is "tensorflow"?
Thank you very much for your help!
Your problem is that .bashrc will only be read when you start an interactive shell, but not when the container is running with the default command. See this answer for background information.
There are a bunch of bash startup files where you could put the conda activate tensorflow command instead. I recommend defining a file of your own and putting its filename into the BASH_ENV environment variable. Both can easily be done from the Dockerfile.
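A minimal sketch of that idea (the file name /etc/conda_env.sh is arbitrary; the /opt/conda path and the tensorflow environment name are taken from the Dockerfile above):
# Write the activation commands to a startup file of our own
RUN echo ". /opt/conda/etc/profile.d/conda.sh && conda activate tensorflow" > /etc/conda_env.sh
# Non-interactive bash shells will source this file on startup
ENV BASH_ENV /etc/conda_env.sh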

Miniconda with dockerfile, how to use the conda environment

Goal: create a Docker image from miniconda that will install all my dependencies and then run some commands for Django and other packages. Also, every time someone runs bin/bash into the container, it should start with those packages available, without me adding an entrypoint and doing env hacks there.
Dockerfile:
FROM continuumio/miniconda3
ADD environment.yml /code/
WORKDIR /code/
RUN conda env create -f environment.yml # successful
RUN python test/manage.py 8000 # fails, no dependencies like pandas installed
But now I'm stuck, say I want to run some commands in the created environment:
RUN python manage.py runserver
it doesn't run it in my environment.
Some ugly hacks here: https://github.com/ContinuumIO/docker-images/issues/89 that don't actually work because you're using a new shell session when you enter a container or do another RUN command so you have to concatenate the commands with && (ugly).
Ideally I want to install all my conda packages globally from environment.yml but apparently I can't do that.
You have to tell Docker where the Python managed by conda is, and inside Docker this is done with conda run:
conda run --no-capture-output -n myenv python run.py
Source: https://pythonspeed.com/articles/activate-conda-dockerfile/
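Putting that article's approach into a Dockerfile, a minimal sketch (the environment name myenv and the files environment.yml and run.py are placeholders):
FROM continuumio/miniconda3
COPY environment.yml .
# Create the environment from the yml file
RUN conda env create -f environment.yml
# Make every following RUN line execute inside the environment
SHELL ["conda", "run", "-n", "myenv", "/bin/bash", "-c"]
COPY run.py .
# The container's main process also runs inside the environment
ENTRYPOINT ["conda", "run", "--no-capture-output", "-n", "myenv", "python", "run.py"]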

run a Python script with Conda dependencies on a Docker container [duplicate]

Scenario
I'm trying to set up a simple Docker image (I'm quite new to Docker, so please correct my possible misconceptions) based on the public continuumio/anaconda3 container.
The Dockerfile:
FROM continuumio/anaconda3:latest
# update conda and setup environment
RUN conda update conda -y \
&& conda env list \
&& conda create -n testenv pip -y \
&& source activate testenv \
&& conda env list
Building an image from this with docker build -t test . ends with the error:
/bin/sh: 1: source: not found
when activating the new virtual environment.
Suggestion 1:
Following this answer I tried:
FROM continuumio/anaconda3:latest
# update conda and setup environment
RUN conda update conda -y \
&& conda env list \
&& conda create -y -n testenv pip \
&& /bin/bash -c "source activate testenv" \
&& conda env list
This seems to work at first, as it outputs prepending /opt/conda/envs/testenv/bin to PATH, but conda env list as well as echo $PATH clearly show that it doesn't:
[...]
# conda environments:
#
testenv /opt/conda/envs/testenv
root * /opt/conda
---> 80a77e55a11f
Removing intermediate container 33982c006f94
Step 3 : RUN echo $PATH
---> Running in a30bb3706731
/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
The Dockerfiles work out of the box as an MWE.
I appreciate any ideas. Thanks!
Using the Docker ENV instruction, it is possible to add the virtual environment path persistently to PATH, although this does not change the environment selected under conda env list.
See the MWE:
FROM continuumio/anaconda3:latest
# update conda and setup environment
RUN conda update conda -y \
&& conda create -y -n testenv pip
ENV PATH /opt/conda/envs/testenv/bin:$PATH
RUN echo $PATH
RUN conda env list
Method 1: use SHELL with a custom entrypoint script
EDIT: I have developed a new, improved approach which is better than the "conda", "run" syntax.
Sample dockerfile available at this gist. It works by leveraging a custom entrypoint script to set up the environment before execing the arguments of the RUN stanza.
Why does this work?
A shell is (put very simply) a process which can act as an entrypoint for arbitrary programs. exec "$@" allows us to launch a new process, inheriting all of the environment of the parent process. In this case, this means we activate conda (which basically mangles a bunch of environment variables), then run /bin/bash -c CONTENTS_OF_DOCKER_RUN.
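A rough sketch of that idea (the script path /usr/local/bin/entrypoint.sh and the environment name testenv are assumptions, not the exact contents of the gist), assuming the environment has already been created earlier in the Dockerfile:
entrypoint.sh:
#!/bin/bash
set -e
# Make "conda activate" available, then activate the target environment
source /opt/conda/etc/profile.d/conda.sh
conda activate testenv
# Replace this process with whatever command was passed in
exec "$@"
Dockerfile:
COPY entrypoint.sh /usr/local/bin/entrypoint.sh
RUN chmod +x /usr/local/bin/entrypoint.sh
# Every RUN line and the final command now run inside the activated environment
SHELL ["/usr/local/bin/entrypoint.sh", "/bin/bash", "-c"]
ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]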
Method 2: SHELL with arguments
Here is my previous approach, courtesy of Itamar Turner-Trauring; many thanks to them!
# Create the environment:
COPY environment.yml .
RUN conda env create -f environment.yml
# Set the default docker build shell to run as the conda wrapped process
SHELL ["conda", "run", "-n", "vigilant_detect", "/bin/bash", "-c"]
# Set your entrypoint to use the conda environment as well
ENTRYPOINT ["conda", "run", "-n", "myenv", "python", "run.py"]
Modifying ENV may not be the best approach since conda likes to take control of environment variables itself. Additionally, your custom conda env may activate other scripts to further modulate the environment.
Why does this work?
This leverages conda run to "add entries to PATH for the environment and run any activation scripts that the environment may contain" before starting the new bash shell.
Using conda inside Docker can be a frustrating experience, since both tools effectively want to monopolize the environment, and in theory you shouldn't ever need conda inside a container. But deadlines and technical debt being a thing, sometimes you just gotta get it done, and sometimes conda is the easiest way to provision dependencies (looking at you, GDAL).
Piggybacking on ccauet's answer (which I couldn't get to work), and Charles Duffey's comment about there being more to it than just PATH, here's what will take care of the issue.
When activating an environment, conda sets the following variables, as well as a few that back up default values to be restored when the environment is deactivated. The backup variables have been omitted from the Dockerfile, as the root conda environment need never be used again; for reference, they are CONDA_PATH_BACKUP, CONDA_PS1_BACKUP, and _CONDA_SET_PROJ_LIB. It also sets PS1 in order to show (testenv) at the left of the terminal prompt line, which was also omitted. The following statements will do what you want.
ENV PATH /opt/conda/envs/testenv/bin:$PATH
ENV CONDA_DEFAULT_ENV testenv
ENV CONDA_PREFIX /opt/conda/envs/testenv
In order to shrink the number of layers created, you can combine these commands into a single ENV command setting all the variables at once as well.
There may be some other variables that need to be set, based on the package. For example,
ENV GDAL_DATA /opt/conda/envs/testenv/share/gdal
ENV CPL_ZIP_ENCODING UTF-8
ENV PROJ_LIB /opt/conda/envs/testenv/share/proj
The easy way to get this information is to call printenv > root_env.txt in the root environment, activate testenv, then call printenv > test_env.txt, and examine
diff root_env.txt test_env.txt.
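As a sketch (sort is added only to make the diff cleaner; source activate is the older conda syntax used in the answers above):
# Capture the environment variables of the root environment
printenv | sort > root_env.txt
# Switch to the target environment and capture again
source activate testenv
printenv | sort > test_env.txt
# The difference is the set of variables conda sets for testenv
diff root_env.txt test_env.txt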

How to install a python package with all the dependencies into a Docker image?

I'm working in Ubuntu 15.10 with the Docker container for PySpark, jupyter/pyspark-notebook. I need to install folium with all its dependencies and run a PySpark script in the container. I successfully installed Docker, pulled the image, and ran it with the command
docker run -d -p 8888:8888 -p 4040:4040 -v /home/$MYUSER/$MYPROJECT:/home/jovyan/work jupyter/pyspark-notebook
Then, I execute the code example without any issues
import pyspark
sc = pyspark.SparkContext('local[*]')
# do something to prove it works
rdd = sc.parallelize(range(1000))
rdd.takeSample(False, 5)
I looked for the conda environment in /opt/conda (as it says in the documentation) but there is no conda in my /opt folder. Then, I installed miniconda3 and folium with all the dependencies as a normal Python package (no Docker involved).
It doesn't work. When I run the image and try to import the package with import folium it doesn't find the folium package:
ImportErrorTraceback (most recent call last)
<ipython-input-1-af6e4f19ef00> in <module>()
----> 1 import folium
ImportError: No module named 'folium'
So the problem can be reduced to two questions:
Where is the container's conda?
How can I install the Python package I need into the container?
To answer the first question (Where is the conda environment?), we just need to execute in a console: $ docker exec my_containers_name ls /opt/conda.
The second question has two options:
We can open the container's console by executing the command
$ docker exec -it my_containers_name /bin/bash
and install the package like a normal conda package
conda install --channel https://conda.anaconda.org/conda-forge folium
We can modify the Dockerfile of the Docker image or create a new one extending the previous one. To extend it, create a new Dockerfile and add the lines
FROM jupyter/minimal-notebook
USER jovyan
RUN conda install --quiet --yes --channel https://conda.anaconda.org/conda-forge folium && conda clean -tipsy
Then build the new image. If we want to modify the original Dockerfile instead, we must skip the first line.
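A sketch of building and running the extended image (the tag pyspark-folium is arbitrary; the port and volume flags are copied from the question):
docker build -t pyspark-folium .
docker run -d -p 8888:8888 -p 4040:4040 -v /home/$MYUSER/$MYPROJECT:/home/jovyan/work pyspark-folium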
I created my own Dockerfile by forking the original project.
Thanks warmoverflow and ShanShan for your comments
