I have two Python scripts and one R script, but the main scripts to run are the Python scripts (I call the R script from one of the Python scripts). I have to dockerize all of these scripts. To do so I have made a Dockerfile, which is here:
FROM python:3.7
WORKDIR /opt/app/
ADD ./ ./
RUN pip3.7 install -r ./requirements.txt
CMD python3.7 qc.py
CMD python3.7 cano.py
So, I have 2 questions:
1- Shall I include the R script (myscript.r) in the Dockerfile?
2- Before running the Docker image I need to build it. If I had only one script (qc.py) to run, I would use the following command to build the image:
sudo docker build -t qc .
But what would the build command be for a Dockerfile with more than one script?
The docker image produced when calling docker build should stay separate from the execution of the scripts.
To execute something that's inside of an image, you can use docker run.
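In its most general form (image-name and command are placeholders here, not anything from your project):
# Build an image from the Dockerfile in the current directory and tag it
$ docker build -t <image-name> .
# Run an arbitrary command inside a fresh container created from that image
$ docker run --rm <image-name> <command>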
Using your example:
This is the directory with your Dockerfile in it:
$ tree .
├── Dockerfile
├── cano.py
├── myscript.r
├── qc.py
└── requirements.txt
0 directories, 5 files
We want to build a docker image that has all of the R and Python scripts in it, and all of the dependencies to execute those scripts, but we don't necessarily want to run them yet.
In your Dockerfile, you don't have the dependencies needed to run myscript.r because the base image (FROM python:3.7) doesn't have the required packages installed. I looked up what was required to run an R script in the r-base repo on Docker Hub and in the repo on GitHub, and then added it to the Dockerfile.
FROM python:3.7
# Install the dependencies for R
RUN apt-get update && apt-get install -y r-base r-base-dev r-recommended
# Add all of the scripts to the /opt/app/ path inside of the image
ADD . /opt/app/
# Change the working directory inside of the image to /opt/app/
WORKDIR /opt/app/
# Install the python dependencies in /opt/app/requirements.txt using pip
RUN pip3.7 install -r ./requirements.txt
# This command just shows info about the contents of the image. It doesn't run any
# scripts, since that will be done _AFTER_ the image is built.
CMD pwd && ls -AlhF ./
Notice that the default CMD doesn't run any of the scripts. (Also note that only the last CMD in a Dockerfile takes effect, so the two CMD lines in your original file would never both run.) Instead we can run the scripts using the docker run command from the terminal:
# The --rm removes the container after executing, and the -it makes the container interactive
$ docker run --rm -it qc python cano.py
Hello world! (from cano.py)
Now, putting it all together:
# Starting in the directory with your Dockerfile in it
$ ls .
Dockerfile cano.py myscript.r qc.py requirements.txt
# Build the docker image, and tag it as "qc"
$ docker build -t qc .
Sending build context to Docker daemon 6.656kB
Step 1/6 : FROM python:3.7
---> fbf9f709ca9f
Step 2/6 : RUN apt-get update && apt-get install -y r-base r-base-dev r-recommended
# ...lots of output...
Successfully tagged qc:latest
# Run the scripts
$ docker run --rm -it qc python cano.py
Hello world! (from cano.py)
$ docker run --rm -it qc python qc.py
Hello world! (from qc.py)
$ docker run --rm -it qc Rscript myscript.r
[1] "Hello world! (from myscript.r)"
I've collected all of the example code in this github gist to make it easier to see everything in one place.
Related
I have a Dockerfile which looks like this:
FROM python:3.7-slim-stretch
ENV PIP pip
RUN \
$PIP install --upgrade pip && \
$PIP install scikit-learn && \
$PIP install scikit-image && \
$PIP install rasterio && \
$PIP install geopandas && \
$PIP install matplotlib
COPY sentools sentools
COPY data data
COPY vegetation.py .
Now, in my project I have two Python files, vegetation.py and forest.py, each kept in a separate folder. How can I create separate Docker images for the two Python files and run their containers separately?
If the base code is the same and the container is only supposed to start up with a different Python script, then I suggest using a single Docker image, so you don't have to worry about managing two images.
Set vegetation.py as the default: when the container starts without any ENV passed, it will run vegetation.py, and if the ENV FILE_TO_RUN is overridden at run time, the specified file will be run instead.
FROM python:3.7-alpine3.9
ENV FILE_TO_RUN="/vegetation.py"
COPY vegetation.py /vegetation.py
CMD ["sh", "-c", "python $FILE_TO_RUN"]
Now, if you want to run forest.py then you can just pass the path file to ENV.
docker run -it -e FILE_TO_RUN="/forest.py" --rm my_image
or
docker run -it -e FILE_TO_RUN="/anyfile_to_run.py" --rm my_image
Updated: you can manage this with ARG + ENV in your Docker image.
FROM python:3.7-alpine3.9
ARG APP="default_script.py"
ENV APP=$APP
COPY $APP /$APP
CMD ["sh", "-c", "python /$APP"]
Now build with --build-arg:
docker build --build-arg APP="vegetation.py" -t app_vegetation .
or
docker build --build-arg APP="forest.py" -t app_forest .
Now it's ready to run:
docker run --rm -it app_forest
Or copy both scripts into a single image:
FROM python:3.7-alpine3.9
# default to one of the scripts copied below
ARG APP="vegetation.py"
ENV APP=$APP
COPY vegetation.py /vegetation.py
COPY forest.py /forest.py
CMD ["sh", "-c", "python /$APP"]
If you insist on creating separate images, you can always use an ARG instruction.
FROM python:3.7-slim-stretch
ARG file_to_copy
ENV PIP pip
RUN \
$PIP install --upgrade pip && \
$PIP install scikit-learn && \
$PIP install scikit-image && \
$PIP install rasterio && \
$PIP install geopandas && \
$PIP install matplotlib
COPY sentools sentools
COPY data data
COPY $file_to_copy .
And then build the image like this:
docker build --build-arg file_to_copy=vegetation.py .
or like this:
docker build --build-arg file_to_copy=forest.py .
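Tagging each image with -t makes it runnable by name afterwards; the vegetation tag here is just illustrative, and since the Dockerfile above sets no CMD, the script is named explicitly:
docker build --build-arg file_to_copy=vegetation.py -t vegetation .
docker run --rm vegetation python vegetation.py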
When you start a Docker container, you can specify what command to run at the end of the docker run command. So you can build a single image that contains both scripts and pick which one runs when you start the container.
The scripts should be "normally" executable: they need to have the executable permission bit set, and they need to start with a line like
#!/usr/bin/env python3
and you should be able to locally (outside of Docker) run
. some_virtual_environment/bin/activate
./vegetation.py
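If the executable permission bit isn't set yet, add it with chmod:
chmod +x vegetation.py forest.py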
Once you've gotten through this, you can copy the content into a Docker image
FROM python:3.7-slim-stretch
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY sentools sentools
COPY data data
COPY vegetation.py forest.py ./
CMD ["./vegetation.py"]
Then you can build and run this image with either script.
docker build -t trees .
docker run --rm trees ./vegetation.py
docker run --rm trees ./forest.py
If you actually want this to be two separate images, you can create two separate Dockerfiles that differ only in their final COPY and CMD lines, and use the docker build -f option to pick which one to use.
$ tail -2 Dockerfile.vegetation
COPY vegetation.py ./
CMD ["./vegetation.py"]
$ docker build -t vegetation -f Dockerfile.vegetation .
$ docker run --rm vegetation
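The forest variant would be the mirror image of that file, sketched here under the same assumptions:
$ tail -2 Dockerfile.forest
COPY forest.py ./
CMD ["./forest.py"]
$ docker build -t forest -f Dockerfile.forest .
$ docker run --rm forest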
I have created a docker container for my pure python program and have set python main.py to be executed when the container is run. Running the container works as expected on my local machine. However, I want to run the container on my institution's high-performance cluster. The cluster machines use Singularity, which I am using to pull my docker image hosted on Dockerhub (the repo is darshank11/ga_paci_final). However, when I try to run the Singularity container, I get the following error: python3: can't open file 'main.py': [Errno 2] No such file or directory.
I've tried to change the base image in the Dockerfile, for example from FROM python:latest to FROM ubuntu:latest. I've made sure the docker container worked on my local machine, and then got one of my co-workers to pull the container from Dockerhub and run it too. Everything works fine until I get to Singularity.
Here is my docker file:
FROM ubuntu:16.04
RUN apt-get update -y && \
apt-get install -y python3-pip python3-dev
RUN mkdir src
WORKDIR /src
COPY . /src
RUN pip3 install --upgrade pip
RUN pip3 install -r requirements.txt
CMD ["python3", "-u", "main.py"]
You're getting that error because the execution context is not what you're expecting. The working directory in Singularity is the current directory on the host OS (e.g., ~/ga_paci_final), which is mounted into the Singularity image.
As mentioned in the comments, one solution is to give the full path to the Python file in the Docker CMD statement. Another option is to modify the %runscript block of the Singularity definition file to something like:
%runscript
cd /src
python3 -u main.py
That way you ensure the run environment is identical between Docker and Singularity.
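For completeness, the first option (an absolute path in the Dockerfile's CMD, using the /src path from the Dockerfile above) would be:
CMD ["python3", "-u", "/src/main.py"]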
I am trying to set up my Python environment in Docker.
My Dockerfile looks like this:
FROM python:2.7
# updating repository
RUN apt-get update
RUN mkdir /usr/src/app
WORKDIR /usr/src/app
COPY requirements.txt requirements.txt
RUN pip install --no-cache -r requirements.txt
EXPOSE 8888
COPY . .
CMD ["python", "test.py"]
with this build command:
docker build -t ml-python-2.7 .
After the image is built, I ran:
docker run -it --name ml-container -v ${PWD}:/usr/src/app ml-python-2.7 python test.py
My sample test.py
print('test here')
It works the first time I run this command, and the output updates every time I change test.py.
The problem is that if I want to keep the container and drop the --rm option, the container exits and I can't run
docker run -it --name ml-container -v ${PWD}:/usr/src/app ml-python-2.7 python test.py
anymore, because it says there is a container name conflict. How do I keep the container and rerun test.py after the file is updated? Thanks!
After the container has exited, you can start it again using docker start. More information here: How to continue a docker which is exited
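For example, with the container name from the question (-a attaches to the container's output, -i keeps stdin open):
docker start -ai ml-container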
I am trying to set up my Python environment in Docker.
My Dockerfile looks like this:
FROM python:2.7
# updating repository
RUN apt-get update
RUN mkdir /usr/src/app
WORKDIR /usr/src/app
COPY requirements.txt requirements.txt
RUN pip install --no-cache -r requirements.txt
EXPOSE 8888
COPY . .
CMD ["python", "test.py"]
with this build command:
docker build -t ml-python-2.7 .
After the image is built, I ran:
docker run -it --rm --name ml-container ml-python-2.7 python test.py
My sample test.py
print('test here')
It works when I first run this command.
docker run -it --rm --name ml-container ml-python-2.7 python test.py
But after I change test.py to print('second test')
and run the above command again, it still outputs test here.
How do I make sure it updates automatically, or is there a more elegant way to do this?
Thanks!
Docker does not persist the changes you make to files inside a container into the image unless you commit the container. If you want to do that, use docker commit:
docker commit <CONTAINER NAME HERE>
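For example, using the names from the question (the :updated tag is just an illustration):
docker commit ml-container ml-python-2.7:updated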
Or you could mount a local folder into the container like this:
docker run -ti -v ~/folder_in_host:/var/log/folder_in_container <IMAGE NAME HERE>
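Applied to the setup in the question, mounting the project directory over /usr/src/app means each run sees the current test.py from the host:
docker run -it --rm --name ml-container -v ${PWD}:/usr/src/app ml-python-2.7 python test.py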
I have some files which I want to move into a Docker container.
But in the end Docker can't find a file.
The folder with the files on local machine are at /home/katalonne/flask4
The file structure, in case it matters (screenshot omitted).
The Dockerfile:
#
# First Flask App Dockerfile
#
#
# Pull base image.
FROM centos:7.0.1406
# Build commands
RUN yum install -y python-setuptools mysql-connector mysql-devel gcc python-devel
RUN easy_install pip
RUN mkdir /opt/flask4
WORKDIR /opt/flask4
ADD requirements.txt /opt/flask4
RUN pip install -r requirements.txt
ADD . /opt/flask4
# Define default command.
CMD ["python","hello.py"]
# Expose ports.
EXPOSE 5000
So I built the image with this command:
docker build -t flask4 .
I ran the container with a volume:
docker run -d -p 5000:5000 -v /home/Katalonne/flask4:/opt/flask4 --name web flask4
And when I check the container's output with:
docker logs -f web
I get this error saying it cannot find my hello.py file:
python: can't open file 'hello.py': [Errno 2] No such file or directory
What did I do wrong?
P.S.: I'm partially a noob at Docker and Linux.
The files and directories located alongside your Dockerfile are indeed available (temporarily) to docker build. But unless you use ADD or COPY to copy those files permanently into the image, they will not be available to your container after the build is done. The build context exists only for the build; you have to copy files from it into the image.
You can add the following command:
...
ADD . /opt/flask4
ADD . .
# Define default command.
CMD ["python","hello.py"]
The line ADD . . copies everything in your temporary build context into the image. These files land wherever your WORKDIR points (/opt/flask4).
If you only wanted to add hello.py to your container, then use
ADD hello.py hello.py
So when CMD ["python","hello.py"] runs, the working directory is /opt/flask4, hello.py should be in there, and python hello.py should work from that directory.
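One more thing worth checking: the docker run command in the question bind-mounts /home/Katalonne/flask4 over /opt/flask4, and a mount like that hides whatever was ADDed to that path at build time. If the host path is empty or misspelled (note Katalonne vs. katalonne), hello.py disappears. Rebuilding and running without the mount avoids that:
docker build -t flask4 .
docker run -d -p 5000:5000 --name web flask4
docker logs -f web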
HTH.