Activate conda environment in singularity container from dockerfile - python

I'm trying to set up a singularity container from an existing docker image in which a conda environment named "tensorflow" is activated as soon as I run the container. I've found some answers on this topic here. Unfortunately, that post only explains how to set up the singularity .def file so that the conda environment is activated by default. However, I want to modify only my existing Dockerfile and then build a singularity image from it.
What I've tried so far is setting up the Dockerfile like this:
FROM opensuse/tumbleweed
ENV PATH /opt/conda/bin:$PATH
ENV PATH /opt/conda/envs/tensorflow/bin:$PATH
# Add conda environment files (.yml)
COPY ["./conda_environments/", "."]
# Install with zypper
RUN zypper install -y sudo wget bzip2 vim tree which util-linux
# Get installation file
RUN wget --quiet https://repo.anaconda.com/archive/Anaconda3-2019.07-Linux-x86_64.sh -O ~/anaconda.sh
# Install anaconda at /opt/conda
RUN /bin/bash ~/anaconda.sh -b -p "/opt/conda"
# Remove installation file
RUN rm ~/anaconda.sh
# Make conda command available to all users
RUN ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh
# Create tensorflow environment
RUN conda env create -f tensorflow.yml
# Activate conda environment with interactive bash session
RUN echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc
RUN echo "conda activate tensorflow" >> ~/.bashrc
# Default command
CMD ["/bin/bash"]
After building the docker image I run the docker container with:
docker run -t -d --rm --name=my_container opensuse_conda:latest
and enter the container with:
docker exec -it my_container bash
The result is as expected: the shell session starts with the "tensorflow" environment active, as indicated by the (tensorflow) prefix.
To build a singularity image from this docker image I use:
sudo singularity build opensuse_conda.sif docker-daemon://opensuse_conda:latest
and run the container with:
sudo singularity run opensuse_conda.sif
This is where the problem occurs. Instead of the "tensorflow" environment, the "base" environment is activated by default. However, I would rather have the "tensorflow" environment activated when I run the singularity container.
How can I modify my Dockerfile so that when running both the docker container and the singularity container the default environment is "tensorflow"?
Thank you very much for your help!

Your problem is that .bashrc will only be read when you start an interactive shell, but not when the container is running with the default command. See this answer for background information.
There are a bunch of bash startup files where you could put the conda activate tensorflow command instead. I recommend defining a file of your own and putting its filename into the BASH_ENV environment variable. Both can easily be done from the Dockerfile.
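For example, a minimal sketch of what that could look like in the Dockerfile (the file name /opt/etc/conda_env.sh is an arbitrary choice):
RUN mkdir -p /opt/etc && \
    echo ". /opt/conda/etc/profile.d/conda.sh" >> /opt/etc/conda_env.sh && \
    echo "conda activate tensorflow" >> /opt/etc/conda_env.sh
ENV BASH_ENV /opt/etc/conda_env.sh
Bash reads the file named by BASH_ENV when it starts non-interactively, so the activation also happens for shells that never read ~/.bashrc; the existing ~/.bashrc lines keep interactive docker exec sessions working as before.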

Related

Activate conda env within running Docker container?

Is there a way to spin up a Docker container and then activate a given conda environment within the container using a Python script? I don't have access to the Dockerfile of the image I'm using.
conda run
Conda includes a conda run command for running arbitrary commands within an environment.
docker run <image> conda run -n <your_env> python <script.py>
If the script requires interaction, you may need the --live-stream and/or --no-capture-output arguments (see conda run -h).
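For an interactive script, the call could look like this (a sketch; the image, environment, and script names are placeholders):
docker run -it <image> conda run --live-stream -n <your_env> python <script.py>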
I think if you use the right docker image there is no need to activate conda in your Dockerfile. Try the image below:
FROM continuumio/miniconda3
RUN conda info
This worked for me.

Your shell has not been properly configured to use 'conda activate' on dockerfile

I am building an anaconda3 environment with Docker.
However, it shows the error below.
I guess it is related to some shell problem, but I haven't been able to fix it yet.
CommandNotFoundError: Your shell has not been properly configured to use 'conda activate'.
To initialize your shell, run
$ conda init <SHELL_NAME>
Currently supported shells are:
- bash
- fish
- tcsh
- xonsh
- zsh
- powershell
See 'conda init --help' for more information and options.
IMPORTANT: You may need to close and restart your shell after running 'conda init'.
My dockerfile is here.
FROM ubuntu:18.04
RUN apt-get -y update
RUN apt-get -y install emacs wget
RUN wget https://repo.continuum.io/archive/Anaconda3-2019.07-Linux-x86_64.sh
RUN /bin/bash Anaconda3-2019.07-Linux-x86_64.sh -b -p $HOME/anaconda3
RUN echo 'export PATH=/root/anaconda3/bin:$PATH' >> /root/.bashrc
#RUN source /root/.bashrc
RUN . /root/.bashrc
RUN /root/anaconda3/bin/conda init bash
RUN /root/anaconda3/bin/conda create -n py37 python=3.7 anaconda
RUN /root/anaconda3/bin/conda activate py37
I believe your issue may be that you are sourcing your .bashrc on a separate line from the commands that rely on it. From the Dockerfile documentation:
The RUN instruction will execute any commands in a new layer on top of the current image and commit the results. The resulting committed image will be used for the next step in the Dockerfile.
This means that you are sourcing your .bashrc in one layer (the first RUN line), then RUNning the conda command in a new layer, one which doesn't know anything about the environment in the previous layer.
Try something like this:
RUN . /root/.bashrc && \
/root/anaconda3/bin/conda init bash && \
/root/anaconda3/bin/conda create -n py37 python=3.7 anaconda && \
/root/anaconda3/bin/conda activate py37
By running them all on one line, you're running them in a single layer.
You can also do this in a much easier way, if you place just one SHELL ... instruction in your Dockerfile after creating your venv, like this:
FROM continuumio/anaconda3
WORKDIR /usr/src/app
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1
COPY ./environment.yml .
RUN conda env create -f environment.yml
SHELL ["conda", "run", "-n", "venv", "/bin/bash", "-c"]
COPY . .
This approach also helps if you want to use conda with the docker-compose utility.
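For instance, after the SHELL line every subsequent RUN executes inside the venv environment, and the runtime command can be wrapped the same way (a sketch; the import check and the runserver command are only illustrations):
RUN python -c "import django"
ENTRYPOINT ["conda", "run", "--no-capture-output", "-n", "venv", "python", "manage.py", "runserver", "0.0.0.0:8000"]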

Miniconda with dockerfile, how to use the conda environment

Goal: create a docker image from miniconda that will install all my dependencies and then run some commands for django and other packages. Also, every time someone runs bin/bash into the container, it should start with those packages available, without me adding an entrypoint and doing env hacks there.
Dockerfile:
FROM continuumio/miniconda3
ADD environment.yml /code/
WORKDIR /code/
RUN conda env create -f environment.yml # successful
RUN python test/manage.py 8000 # fails, no dependencies like pandas installed
But now I'm stuck, say I want to run some commands in the created environment:
RUN python manage.py runserver
it doesn't run it in my environment.
There are some ugly hacks here: https://github.com/ContinuumIO/docker-images/issues/89, but they don't actually work because you get a new shell session each time you enter the container or add another RUN command, so you have to concatenate the commands with && (ugly).
Ideally I want to install all my conda packages globally from environment.yml but apparently I can't do that.
You have to tell Docker where the Python version managed by conda lives, and inside Docker this is done with conda run:
conda run --no-capture-output -n myenv python run.py
Source
https://pythonspeed.com/articles/activate-conda-dockerfile/
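A minimal Dockerfile sketch of how that line is typically wired up, following the linked article (the environment name myenv and the script run.py are placeholders):
FROM continuumio/miniconda3
COPY environment.yml .
RUN conda env create -f environment.yml
COPY run.py .
ENTRYPOINT ["conda", "run", "--no-capture-output", "-n", "myenv", "python", "run.py"]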

run a Python script with Conda dependencies on a Docker container [duplicate]

Scenario
I'm trying to set up a simple docker image (I'm quite new to docker, so please correct my possible misconceptions) based on the public continuumio/anaconda3 container.
The Dockerfile:
FROM continuumio/anaconda3:latest
# update conda and setup environment
RUN conda update conda -y \
&& conda env list \
&& conda create -n testenv pip -y \
&& source activate testenv \
&& conda env list
Building an image from this with docker build -t test . ends with the error:
/bin/sh: 1: source: not found
when activating the new virtual environment.
Suggestion 1:
Following this answer I tried:
FROM continuumio/anaconda3:latest
# update conda and setup environment
RUN conda update conda -y \
&& conda env list \
&& conda create -y -n testenv pip \
&& /bin/bash -c "source activate testenv" \
&& conda env list
This seems to work at first, as it outputs: prepending /opt/conda/envs/testenv/bin to PATH, but conda env list as well as echo $PATH clearly show that it doesn't:
[...]
# conda environments:
#
testenv /opt/conda/envs/testenv
root * /opt/conda
---> 80a77e55a11f
Removing intermediate container 33982c006f94
Step 3 : RUN echo $PATH
---> Running in a30bb3706731
/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
The Dockerfiles work out of the box as an MWE.
I appreciate any ideas. Thanks!
Using the docker ENV instruction it is possible to add the virtual environment path persistently to PATH, although this does not change the environment selected under conda env list.
See the MWE:
FROM continuumio/anaconda3:latest
# update conda and setup environment
RUN conda update conda -y \
&& conda create -y -n testenv pip
ENV PATH /opt/conda/envs/testenv/bin:$PATH
RUN echo $PATH
RUN conda env list
Method 1: use SHELL with a custom entrypoint script
EDIT: I have developed a new, improved approach which is better than the "conda", "run" syntax.
A sample Dockerfile is available at this gist. It works by leveraging a custom entrypoint script to set up the environment before exec-ing the arguments of the RUN stanza.
Why does this work?
A shell is (put very simply) a process which can act as an entrypoint for arbitrary programs. exec "$@" allows us to launch a new process, inheriting all of the environment of the parent process. In this case, this means we activate conda (which basically mangles a bunch of environment variables), then run /bin/bash -c CONTENTS_OF_DOCKER_RUN.
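For reference, a minimal sketch of what such an entrypoint script could look like (the file name entrypoint.sh and the environment name myenv are placeholders; see the gist for the full version):
#!/bin/bash --login
set -e
# make "conda activate" available in this non-interactive shell, then switch to the target env
. /opt/conda/etc/profile.d/conda.sh
conda activate myenv
# replace this process with whatever command was passed in, inheriting the activated environment
exec "$@"
And the corresponding Dockerfile lines:
COPY entrypoint.sh /usr/local/bin/entrypoint.sh
RUN chmod +x /usr/local/bin/entrypoint.sh
# route both build-time RUN commands and the runtime command through the wrapper
SHELL ["/usr/local/bin/entrypoint.sh", "/bin/bash", "-c"]
ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]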
Method 2: SHELL with arguments
Here is my previous approach, courtesy of Itamar Turner-Trauring; many thanks to them!
# Create the environment:
COPY environment.yml .
RUN conda env create -f environment.yml
# Set the default docker build shell to run as the conda wrapped process
SHELL ["conda", "run", "-n", "vigilant_detect", "/bin/bash", "-c"]
# Set your entrypoint to use the conda environment as well
ENTRYPOINT ["conda", "run", "-n", "myenv", "python", "run.py"]
Modifying ENV may not be the best approach since conda likes to take control of environment variables itself. Additionally, your custom conda env may activate other scripts to further modulate the environment.
Why does this work?
This leverages conda run to "add entries to PATH for the environment and run any activation scripts that the environment may contain" before starting the new bash shell.
Using conda inside Docker can be a frustrating experience, since both tools effectively want to monopolize the environment, and theoretically you shouldn't ever need conda inside a container. But deadlines and technical debt being a thing, sometimes you just gotta get it done, and sometimes conda is the easiest way to provision dependencies (looking at you, GDAL).
Piggybacking on ccauet's answer (which I couldn't get to work), and Charles Duffey's comment about there being more to it than just PATH, here's what will take care of the issue.
When activating an environment, conda sets the following variables, as well as a few that back up default values so they can be restored when the environment is deactivated. The backup variables have been omitted from the Dockerfile, as the root conda environment never needs to be used again; for reference, they are CONDA_PATH_BACKUP, CONDA_PS1_BACKUP, and _CONDA_SET_PROJ_LIB. conda also sets PS1 in order to show (testenv) at the left of the terminal prompt line, which was also omitted. The following statements will do what you want.
ENV PATH /opt/conda/envs/testenv/bin:$PATH
ENV CONDA_DEFAULT_ENV testenv
ENV CONDA_PREFIX /opt/conda/envs/testenv
To shrink the number of layers created, you can also combine these statements into a single ENV instruction setting all the variables at once.
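For example (a sketch using the same testenv paths as above):
ENV PATH=/opt/conda/envs/testenv/bin:$PATH \
    CONDA_DEFAULT_ENV=testenv \
    CONDA_PREFIX=/opt/conda/envs/testenv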
There may be some other variables that need to be set, based on the package. For example,
ENV GDAL_DATA /opt/conda/envs/testenv/share/gdal
ENV CPL_ZIP_ENCODING UTF-8
ENV PROJ_LIB /opt/conda/envs/testenv/share/proj
The easy way to get this information is to call printenv > root_env.txt in the root environment, activate testenv, then call printenv > test_env.txt, and examine diff root_env.txt test_env.txt.
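Put together, the comparison could look like this (a sketch; sorting just makes the diff easier to read):
printenv | sort > root_env.txt
conda activate testenv
printenv | sort > test_env.txt
diff root_env.txt test_env.txt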

Store modules in project folder

Sometimes I need to use modules which are not part of the default Python installation, and sometimes even distributions like Anaconda or Canopy do not include them. So every time I move my project to another machine or just reinstall Python I need to download them again. So my question is: is there a way to store the necessary modules in the project folder and use them from there, without moving them to the default Python installation folder?
You can use a virtual environment or Docker to install the required modules in your project dir so they are isolated from your system Python installation. In fact, you don't need Python installed on your machine at all when using Docker.
Here is my workflow when developing a Django web app with Docker. If your project dir is /Projects/sampleapp, change the current working directory to the project dir and run the following.
Run a docker container from your terminal:
docker run \
-it --rm \
--name django_app \
-v ${PWD}:/app \
-w /app \
-e PYTHONUSERBASE=/app/.vendors \
-p 8000:8000 \
python:3.5 \
bash -c "export PATH=\$PATH:/app/.vendors/bin && bash"
# Command explanation:
#
# docker run Run a docker container
# -it Set interactive and allocate a pseudo-TTY
# --rm Remove the container on exit
# --name django_app Set the container name
# -v ${PWD}:/app Mount current dir as /app in the container
# -w /app Set the current working directory to /app
# -e PYTHONUSERBASE=/app/.vendors pip will install packages to /app/.vendors
# -p 8000:8000 Open port 8000
# python:3.5 Use the Python:3.5 docker image
# bash -c "..." Add /app/.vendors/bin to PATH and open the shell
On the container's shell, install the required packages:
pip install django celery django-allauth --user
pip freeze > requirements.txt
The --user option, along with the PYTHONUSERBASE environment variable, makes pip install the packages in /app/.vendors.
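To double-check where pip will put the packages, you can ask Python for the user site directory (a quick sketch run inside the container shell):
python -c "import site; print(site.getusersitepackages())"
# with PYTHONUSERBASE=/app/.vendors this prints a path under /app/.vendors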
Create the django project and develop the app as usual:
django-admin startproject sampleapp
cd sampleapp
python manage.py runserver 0.0.0.0:8000
The directory structure will look like this:
Projects/
sampleapp/
requirements.txt
.vendors/ # Note: don't add this dir to your VCS
sampleapp/
manage.py
...
This configuration enables you to install the packages in your project dir, isolated from your system. Note that you need to add requirements.txt to your VCS, but remember to exclude the .vendors/ dir.
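If you use git, excluding that directory is a one-line entry in the project's .gitignore (a sketch):
.vendors/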
When you need to move and run the project on another machine, run the docker command above and reinstall the required packages on the container's shell:
pip install -r requirements.txt --user
