How to use multiple Dockerfiles for a single application - Python

I am using a CentOS base image and installing Python 3 with the following Dockerfile:
FROM centos:7
ENV container docker
ARG USER=dsadmin
ARG HOMEDIR=/home/${USER}
RUN yum clean all \
&& yum update -q -y -t \
&& yum install file -q -y
RUN useradd -s /bin/bash -d ${HOMEDIR} ${USER}
ENV LC_ALL=en_US.UTF-8
# install Development Tools to get gcc
RUN yum groupinstall -y "Development Tools"
# install python development so that pip can compile packages
RUN yum -y install epel-release && yum clean all \
&& yum install -y python34-setuptools \
&& yum install -y python34-devel
# install pip
RUN easy_install-3.4 pip
# install virtualenv or virtualenvwrapper
RUN pip3 install virtualenv \
&& pip3 install virtualenvwrapper \
&& pip3 install pandas
# # install django
# RUN pip3 install django
USER ${USER}
WORKDIR ${HOMEDIR}
I build and tag the above as follows:
docker build . --label validation --tag validation
I then need to add a .tar.gz file to the home directory. This file contains all the Python scripts I maintain and will change frequently. If I add it to the Dockerfile above, Python is reinstalled every time I change the .gz file, which adds a lot of time to development. As a workaround, I tried creating a second Dockerfile that uses the above image as its base and just adds the .tar.gz file on top of it.
FROM validation:latest
ARG USER=dsadmin
ARG HOMEDIR=/home/${USER}
ADD code/validation_utility.tar.gz ${HOMEDIR}/.
USER ${USER}
WORKDIR ${HOMEDIR}
After that, if I run the image and do an ls, all the files in the folder have an owner of games.
-rw-r--r-- 1 501 games 35785 Nov 2 21:24 Validation_utility.py
To fix the above, I added the following lines to the second Dockerfile:
ADD code/validation_utility.tar.gz ${HOMEDIR}/.
RUN chown -R ${USER}:${USER} ${HOMEDIR} \
&& chmod +x ${HOMEDIR}/Validation_utility.py
but I get the error:
chown: changing ownership of '/home/dsadmin/Validation_utility.py': Operation not permitted
The goal is to have two Dockerfiles. Users will build the first Dockerfile to install CentOS and the Python dependencies. The second Dockerfile will install the custom Python scripts. If the scripts change, they will just rebuild the second Dockerfile. Is that the right way to think about Docker? Thank you.

Is that the right way to think about docker?
This is the easy part of your question. Yes. You're thinking about the proper way to structure your Dockerfiles, reuse them, and keep your image builds efficient: put the slow-changing, expensive layers (OS and Python dependencies) in a base image, and layer the fast-changing code on top. Good job.
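As a minimal sketch of that workflow (the file names Dockerfile.base and Dockerfile.app are my own; adjust them to your layout):
# Build the slow-changing base image once
docker build -f Dockerfile.base -t validation .
# Rebuild only the thin code layer whenever the scripts change
docker build -f Dockerfile.app -t validation-app .
Only the second build needs to rerun during day-to-day development, and it finishes quickly because everything beneath it is cached.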
As for the error you're receiving: the culprit is that ADD preserves the numeric UID and GID recorded inside the tar archive rather than applying the current USER. A tarball created on macOS typically carries UID 501 and GID 20, and on CentOS GID 20 maps to the games group, which is why the extracted files show up owned by games. That's the start of the problem: dsadmin, as a regular non-privileged user, can't change ownership of files he doesn't own. Since the un-tarballed script is owned by games, the chown command fails.
I used your Dockerfiles and reproduced the same issue on macOS.
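You can see where that ownership comes from by listing the archive on the host, before Docker is involved at all (the numeric IDs below are what a macOS-created tarball typically records; yours may differ):
tar -tvf code/validation_utility.tar.gz
# e.g. -rw-r--r-- 501/20 ... Validation_utility.py  -> UID 501, GID 20; GID 20 is 'games' on CentOS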
You can get around this by, well, not using ADD. Which is funny because local tarball extraction is the one use case where Docker thinks you should prefer ADD over COPY.
COPY code/validation_utility.tar.gz ${HOMEDIR}/.
RUN tar -xvf validation_utility.tar.gz
This properly extracts the tarball, and since dsadmin was the active user at the time, the extracted contents come out owned by dsadmin.
(An uglier route might be to switch the USER to root to set permissions, then set it back to dsadmin. I think this is icky, but it's an option.)
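For completeness, a rough sketch of that root-switch route (and, assuming Docker 17.09 or newer, the --chown flag on ADD is a cleaner third option):
FROM validation:latest
ARG USER=dsadmin
ARG HOMEDIR=/home/${USER}
USER root
ADD code/validation_utility.tar.gz ${HOMEDIR}/.
# root is allowed to chown files it does not own; dsadmin is not
RUN chown -R ${USER}:${USER} ${HOMEDIR} \
&& chmod +x ${HOMEDIR}/Validation_utility.py
USER ${USER}
WORKDIR ${HOMEDIR}
# Alternatively, on Docker 17.09+, a single instruction does the same:
# ADD --chown=dsadmin:dsadmin code/validation_utility.tar.gz ${HOMEDIR}/.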

Related

Python script with numpy in Dockerfile on Raspberry Pi - libf77blas.so.3 error

I want to run a Docker container on my Raspberry Pi 2 with a Python script that uses numpy. For this I have the following Dockerfile:
FROM python:3.7
COPY numpy_script.py /
RUN pip install numpy
CMD ["python", "numpy_script.py"]
But when I want to import numpy, I get the error message that libf77blas.so.3 was not found.
I have also tried to install numpy with a wheel from www.piwheels.org, but the same error occurs.
A Google search revealed that I need to install liblapack3. How do I need to modify my Dockerfile for this?
Inspired by om-ha's answer, this worked for me:
FROM python:3.7
COPY numpy_script.py /
RUN apt-get update \
&& apt-get -y install libatlas-base-dev \
&& pip install numpy
CMD ["python", "numpy_script.py"]
Working Dockerfile
# Python image (debian-based)
FROM python:3.7
# Create working directory
WORKDIR /app
# Copy project files
COPY numpy_script.py numpy_script.py
# RUN command to update packages & install dependencies for the project
RUN apt-get update \
&& apt-get install -y libatlas-base-dev \
&& pip install numpy
# Commands to run within the container
CMD ["python", "numpy_script.py"]
Explanation
You had an extra trailing \ in your Dockerfile; the backslash is the shell's line-continuation character for multi-line commands. You can see it used in action in the Dockerfile above. Beware that the last command in the chain (in this case pip) must not end with a trailing \, which was the mishap in the code you showed.
You should probably use a working directory via WORKDIR /app
Run apt-get update just to be sure everything is up-to-date.
It's recommended to group multiple shell commands within ONE RUN directive using &&, so they share a single layer and the build cache. See the best practices and this discussion; a short before/after sketch follows below.
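For example, using the package from this question (the ungrouped form is shown only to illustrate the layering cost):
# Ungrouped: three layers, and files fetched by apt-get update persist in the image
RUN apt-get update
RUN apt-get install -y libatlas-base-dev
RUN rm -rf /var/lib/apt/lists/*
# Grouped: one layer, and the apt cache never lands in the final image
RUN apt-get update \
&& apt-get install -y libatlas-base-dev \
&& rm -rf /var/lib/apt/lists/*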
Resources, sorted by order of use in this Dockerfile:
FROM
WORKDIR
COPY
RUN
CMD

Can I copy a directory from some location outside the Docker build area into my image?

I have installed a library called fastai==1.0.59 via the requirements.txt file in my Dockerfile.
But the Django app fails to run because of one error. To solve that error, I need to manually edit the files /site-packages/fastai/torch_core.py and site-packages/fastai/basic_train.py inside this library folder, which I don't want to do by hand inside the image.
Therefore I'm trying to copy the fastai folder itself from my host machine to a location inside the Docker image.
source location: /Users/AjayB/anaconda3/envs/MyDjangoEnv/lib/python3.6/site-packages/fastai/
destination location: ../venv/lib/python3.6/site-packages/ (inside my Docker image)
Being new to Docker, I tried this using the COPY command:
COPY /Users/AjayB/anaconda3/envs/MyDjangoEnv/lib/python3.6/site-packages/fastai/ ../venv/lib/python3.6/site-packages/
which gave me an error:
ERROR: Service 'app' failed to build: COPY failed: stat /var/lib/docker/tmp/docker-builder583041406/Users/AjayB/anaconda3/envs/MyDjangoEnv/lib/python3.6/site-packages/fastai: no such file or directory.
I tried referring to this: How to include files outside of Docker's build context?
but it seems like it bounced off my head a bit.
Please help me tackle this. Thanks.
Dockerfile:
FROM python:3.6-slim-buster AS build
MAINTAINER model1
ENV PYTHONUNBUFFERED 1
RUN python3 -m venv /venv
RUN apt-get update && \
apt-get upgrade -y && \
apt-get install -y git && \
apt-get install -y build-essential && \
apt-get install -y awscli && \
apt-get install -y unzip && \
apt-get install -y nano && \
apt-get install -y libsm6 libxext6 libxrender-dev
RUN apt-cache search mysql-server
RUN apt-cache search libmysqlclient-dev
RUN apt-get install -y libpq-dev
RUN apt-get install -y postgresql
RUN apt-cache search postgresql-server-dev-9.5
RUN apt-get install -y libglib2.0-0
RUN mkdir -p /model/
COPY . /model/
WORKDIR /model/
RUN pip install --upgrade awscli==1.14.5 s3cmd==2.0.1 python-magic
RUN pip install -r ./requirements.txt
EXPOSE 8001
RUN chmod -R 777 /model/
COPY /Users/AjayB/anaconda3/envs/MyDjangoEnv/lib/python3.6/site-packages/fastai/ ../venv/lib/python3.6/site-packages/
CMD python3 -m /venv/activate
CMD /model/my_setup.sh development
CMD export API_ENV = development
CMD cd server && \
python manage.py migrate && \
python manage.py runserver 0.0.0.0:8001
Short Answer
No
Long Answer
When you run docker build, the current directory and all of its contents (subdirectories and all) are copied into a staging area called the 'build context'. When you issue a COPY instruction in the Dockerfile, Docker copies from that staging area into a layer in the image's filesystem.
As you can see, this precludes copying files from directories outside the build context.
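For instance (hypothetical paths, purely to illustrate the rule):
docker build -t app .
# '.' becomes the build context, so inside the Dockerfile:
# COPY requirements.txt /model/         works: the path is inside the context
# COPY /Users/AjayB/... /venv/          fails: the path is outside the context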
Workaround
Either download the files you want from their golden source directly into the image during the build process (this is why you often see a lot of curl statements in Dockerfiles), or copy the files (dirs) you need into the build tree and check them into source control as part of your project. Which method you choose depends entirely on the nature of your project and the files you need.
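A sketch of the second option, assuming you vendor the patched fastai package into the project as vendor/fastai/ (the directory name is my own choice):
# One-time, on the host, from the project root:
# cp -R /Users/AjayB/anaconda3/envs/MyDjangoEnv/lib/python3.6/site-packages/fastai vendor/fastai
# Then in the Dockerfile, after pip install -r ./requirements.txt:
COPY vendor/fastai/ /venv/lib/python3.6/site-packages/fastai/
Because vendor/ lives inside the build context, the COPY is legal, and the build stays reproducible on any machine that clones the project.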
Notes
There are other workarounds documented for this, but all of them, without exception, break the 'portability' of your build. The only quality solutions are those documented here (though I'm happy to add to this list if I've missed any that preserve portability).

'python3.5': No such file or directory

I am trying to set up my machine learning environment (installing all the libraries I need) using a Dockerfile.
Here is the Dockerfile:
# Build an image that can do training and inference in SageMaker
# This is a Python 2 image that uses the nginx, gunicorn, flask stack
# for serving inferences in a stable way.
FROM ubuntu:16.04
MAINTAINER Amazon AI <sage-learner@amazon.com>
RUN apt-get -y update && apt-get install -y --no-install-recommends \
wget \
python \
nginx \
ca-certificates \
&& rm -rf /var/lib/apt/lists/*
# Here we get all python packages.
# There's substantial overlap between scipy and numpy that we eliminate by
# linking them together. Likewise, pip leaves the install caches populated which uses
# a significant amount of space. These optimizations save a fair amount of space in the
# image, which reduces start up time.
RUN wget https://bootstrap.pypa.io/get-pip.py && python get-pip.py && \
pip install numpy scipy scikit-learn pandas flask gevent gunicorn && \
(cd /usr/local/lib/python2.7/dist-packages/scipy/.libs; rm *; ln ../../numpy/.libs/* .) && \
rm -rf /root/.cache
# Set some environment variables. PYTHONUNBUFFERED keeps Python from buffering our standard
# output stream, which means that logs can be delivered to the user quickly. PYTHONDONTWRITEBYTECODE
# keeps Python from writing the .pyc files which are unnecessary in this case. We also update
# PATH so that the train and serve programs are found when the container is invoked.
ENV PYTHONUNBUFFERED=TRUE
ENV PYTHONDONTWRITEBYTECODE=TRUE
ENV PATH="/opt/program:${PATH}"
# Set up the program in the image
COPY xgboost /opt/program
WORKDIR /opt/program
But I get this error:
/usr/bin/env: 'python3.5': No such file or directory
Can you help me solve this problem, please?
Thank you.
apt-get install python will always install Python 2. If you want Python 3, you need to apt-get install python3.
(You might want to specify a versioned dependency if you require e.g. Python 3.5 specifically.)
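A sketch of that change applied to this Dockerfile (on ubuntu:16.04, python3 resolves to Python 3.5, which also provides the /usr/bin/python3.5 binary the failing shebang is looking for):
RUN apt-get -y update && apt-get install -y --no-install-recommends \
wget \
python3 \
nginx \
ca-certificates \
&& rm -rf /var/lib/apt/lists/*
You would then fetch pip with python3 get-pip.py instead of python get-pip.py.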
In the RUN command, install Python 3.5 explicitly.
If this does not work, note that the issue can also stem from environment variables not being set correctly. Try setting them explicitly with ENV, or hard-code the command the container executes using CMD.

How to build Docker images quicker

I'm currently building a Docker image and running the container to run some tests for a Python application I'm working on. Currently the Dockerfile copies the files over from the host machine, sets the working directory to the copied files, runs apt-get to install pip, and finally runs the tests from setup.py. The Dockerfile can be seen below.
FROM ubuntu
ADD . /home/dev/ProjectName
WORKDIR /home/dev/ProjectName
RUN apt-get update && \
apt-get install -y python3-pip && \
python3 setup.py test
I was curious whether there is a more conventional way to avoid running apt-get update and apt-get install pip every time I'd like to run a test. The main idea I had was to build an image with pip already on it, and then build this image from that one.
Docker builds reuse cached layers when it can. By ADDing files you have changed, you invalidate the cache for all subsequent instructions. Put the apt commands first, before the ADD, and those will only be run the first time you build. See this blog for more info; a reordered sketch follows.
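For example, reordering the Dockerfile from the question (same instructions, different order; the test run is split out so the dependency layer caches cleanly):
FROM ubuntu
# Dependencies first: this layer is cached and reused across builds
RUN apt-get update && \
apt-get install -y python3-pip
# Code last: changing it only invalidates the layers from here down
ADD . /home/dev/ProjectName
WORKDIR /home/dev/ProjectName
RUN python3 setup.py test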

Create a base Docker CentOS image with Python 2.7.8

I've found this, which walks you through creating a bare-metal base CentOS image. However, I want to install some additional yum packages, and download and build Python 2.7.8.
I already have this working in a Dockerfile:
# Set the base image to CentOS
FROM centos:7
# File Author / Maintainer
MAINTAINER Sam Mohamed
# Update the sources list
RUN yum -y update
RUN yum install -y zlib-devel openssl-devel sqlite-devel bzip2-devel xz-libs gcc gcc-c++ make
# Install Python 2.7.8
RUN curl -o /root/Python-2.7.9.tar.xz https://www.python.org/ftp/python/2.7.9/Python-2.7.9.tar.xz
RUN tar -xf /root/Python-2.7.9.tar.xz -C /root
RUN cd /root/Python-2.7.9 && ./configure --prefix=/usr/local && make && make altinstall
# Copy the application folder inside the container
ADD . /opt/iws_project
# Download Setuptools and install pip and virtualenv
RUN wget https://bootstrap.pypa.io/ez_setup.py -O - | /usr/local/bin/python2.7
RUN /usr/local/bin/easy_install-2.7 pip
RUN /usr/local/bin/pip2.7 install virtualenv
# Create virtualenv and install requirements:
RUN /usr/local/bin/virtualenv /opt/iws_project/venv && source /opt/iws_project/venv/bin/activate && pip install -r /opt/iws_project/requirements.txt
How can I convert the above into a base image?
You are probably better off building the given Dockerfile and using the resulting image as the base for future images. This is much easier to maintain and doesn't really cost anything in terms of resource use.
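A minimal sketch of that approach (the image name my-python-base is my own):
docker build -t my-python-base .
Then future Dockerfiles simply start from it:
FROM my-python-base
# ...application-specific instructions only...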
But if you really want to create a single-layer "base image", the steps are as follows:
Install everything you want into some directory (docker-centos-65/ in the linked tutorial).
You can modify the febootstrap command from the tutorial you linked to install additional yum packages by specifying more -i flags.
You can perform any other custom installs (e.g. Python) manually, just make sure everything ends up in the same root directory
Create a tar archive of the directory where everything is installed, and pipe this to the docker import command:
tar c -C docker-centos-65/ . | docker import - my-base-image
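The imported image can then serve as a base like any other. Note that docker import produces a bare filesystem image, so CMD, ENV, and similar settings must be declared by the images built on top of it:
FROM my-base-image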
