Docker: create volumes in the Dockerfile and install/run the app on them - python

In my Docker Django project I need, for read/write purposes, to create a volume in my Dockerfile and install/run the app on it.
I found this article: DockerFile on StackOverflow, but honestly I don't understand much of it.
Here is my Dockerfile:
FROM python:3.6-alpine
EXPOSE 8000
RUN apk update
RUN apk add --no-cache make linux-headers libffi-dev jpeg-dev zlib-dev
RUN apk add postgresql-dev gcc python3-dev musl-dev
RUN mkdir /Code
VOLUME /var/lib/cathstudio/data
WORKDIR /Code
COPY ./requirements.txt .
RUN pip install --upgrade pip
RUN pip install -r requirements.txt
ENV PYTHONUNBUFFERED 1
COPY . /Code/
ENTRYPOINT python /Code/core/manage.py runserver 0.0.0.0:8000
To my original file I added the VOLUME /var/lib/cathstudio/data instruction, but after that, how can I tell the rest of my Dockerfile to use that volume as the WORKDIR, install requirements.txt there, copy the code there, and run the app from it?
I don't want to specify it with the -v flag of docker run after the build; I would like to integrate the volume creation and management directly in the Dockerfile.
So many thanks in advance.

For anything except pip you may specify the workdir once:
WORKDIR /var/lib/cathstudio/data
For pip, use -t or --target:
pip install -t /var/lib/cathstudio/data -r requirements.txt
-t, --target <dir>
Install packages into <dir>. By default this will not replace existing files/folders in <dir>. Use --upgrade to replace existing packages in <dir> with new versions.
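Putting the two pieces together, a minimal sketch of the adjusted Dockerfile (paths taken from the question). One caveat: per the Docker documentation, if a build step changes data inside a path after it has been declared with VOLUME, those changes are discarded, so the VOLUME instruction is declared last here:
FROM python:3.6-alpine
EXPOSE 8000
RUN apk add --no-cache make linux-headers libffi-dev jpeg-dev zlib-dev \
    postgresql-dev gcc python3-dev musl-dev
ENV PYTHONUNBUFFERED 1
# everything except pip: set the workdir once
WORKDIR /var/lib/cathstudio/data
COPY ./requirements.txt .
RUN pip install --upgrade pip && \
    pip install -t /var/lib/cathstudio/data -r requirements.txt
COPY . .
# declare the volume only after the files are in place
VOLUME /var/lib/cathstudio/data
ENTRYPOINT ["python", "core/manage.py", "runserver", "0.0.0.0:8000"]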

Related

how to copy flask dependencies from one stage to the next one in a Dockerfile?

I am learning about Docker and I have a Dockerfile with a simple app such as this:
FROM python:3.8-alpine
WORKDIR /code
ENV FLASK_APP App.py
ENV FLASK_RUN_HOST 0.0.0.0
ENV FLASK_RUN_PORT 3001
RUN apk update \
&& apk add --virtual build-deps gcc python3-dev musl-dev \
&& apk add --no-cache mariadb-dev
COPY ./myapp/requirements.txt requirements.txt
RUN pip install --no-cache-dir -vv -r requirements.txt
ADD ./myapp .
EXPOSE 3001
CMD ["flask", "run"]
I want to use a multi-stage build to get a smaller image, so following this https://pythonspeed.com/articles/multi-stage-docker-python/ I have changed my Dockerfile to this:
FROM python:3.8-alpine as builder
COPY ./myapp/requirements.txt requirements.txt
RUN apk update \
&& apk add --virtual build-deps gcc python3-dev musl-dev \
&& apk add --no-cache mariadb-dev
RUN pip install --user -r requirements.txt
FROM python:3.8-alpine
ADD ./myapp .
COPY --from=builder /root/.local /root/.local
ENV PATH=/root/.local:$PATH
ENV FLASK_APP App.py
ENV FLASK_RUN_HOST 0.0.0.0
ENV FLASK_RUN_PORT 3000
CMD ["python", "-m", "flask", "run"]
But when running the container I get an error telling me the MySQL db dependency is not installed, even though it is in the requirements.txt file and the first Dockerfile works. So I do not know what I am missing: if I get it right, the COPY step in the second stage should copy the dependencies installed in the first stage, right? This is the output I get when trying to spin up the container:
Traceback (most recent call last):
File "/root/.local/lib/python3.8/site-packages/MySQLdb/__init__.py", line 18, in <module>
from . import _mysql
ImportError: Error loading shared library libmariadb.so.3: No such file or directory (needed by /root/.local/lib/python3.8/site-packages/MySQLdb/_mysql.cpython-38-x86_64-linux-gnu.so)
apk add --no-cache mariadb-dev also installs the MariaDB shared libraries, which you don't install in the final image. Their absence is the cause of the errors you get.
Is MySQL getting installed from requirements.txt, or is it installed by apk via mariadb-dev? If the latter, then that's what is missing in the second image: it's not pip-installed with --user under .local, it's installed system-wide in the first image but not in the second.
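A sketch of a corrected final stage, assuming the Alpine package mariadb-connector-c is what provides the missing libmariadb.so.3 (package names can vary between Alpine releases; plain mariadb-dev would also work but drags in build headers):
FROM python:3.8-alpine
# install only the runtime client library, not the -dev package
RUN apk add --no-cache mariadb-connector-c
COPY --from=builder /root/.local /root/.local
# pip install --user puts console scripts under ~/.local/bin
ENV PATH=/root/.local/bin:$PATH
ADD ./myapp .
ENV FLASK_APP App.py
ENV FLASK_RUN_HOST 0.0.0.0
ENV FLASK_RUN_PORT 3000
CMD ["python", "-m", "flask", "run"]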

Problem with multi-stage Dockerfile (Python - venv)

I'm trying to create a Python webapp Docker image using multi-stage builds to shrink the image size... right now it's around 300 MB... it's also using a virtual environment.
The Docker image builds and runs fine up until the point where I add the multi-stage part, so I know something is going wrong after that... Could you help me identify what's wrong?
FROM python:3.8.3-alpine AS origin
RUN apk update && apk add git
RUN apk --no-cache add py3-pip build-base
RUN pip install -U pip
RUN pip install virtualenv
RUN virtualenv venv
RUN source venv/bin/activate
WORKDIR /opt/app
COPY . .
RUN pip install -r requirements.txt
## Works fine until this point
FROM alpine:latest
WORKDIR /opt/app
COPY --from=origin /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH" VIRTUAL_ENV="/opt/venv"
COPY . /opt/app/
CMD [ "file.py" ]
ENTRYPOINT ["python"]
Without the venv it looks something like this (still throwing the error "sh: python: not found"):
FROM python:3.8.3-alpine AS origin
WORKDIR /opt/app
RUN apk update && apk add git
RUN apk --no-cache add py3-pip build-base
RUN pip install -U pip
COPY . .
RUN pip install -r requirements.txt
FROM alpine:latest
WORKDIR /home
COPY --from=origin /opt/app .
CMD sh -c 'python file.py'
You still need Python in your runtime container; since you changed your last image to plain alpine, it won't work. Just a tip: combine your CMD and ENTRYPOINT into one of them; there is generally no need to have both. Try to use only ENTRYPOINT, since you can easily pass a CMD at runtime, for example to activate debug mode.
EDIT: please stay away from Alpine for Python apps, as you can hit some weird issues with it. You can use the "python:<version>-slim-buster" images; they are small enough.
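For reference, a sketch of a multi-stage variant that keeps the venv, under two assumptions: the venv lives at /opt/venv, and the final stage reuses python:3.8.3-alpine so an interpreter is actually present. Note that RUN source venv/bin/activate in the original has no lasting effect, because every RUN starts a fresh shell; putting the venv's bin directory on PATH is the usual substitute:
FROM python:3.8.3-alpine AS origin
RUN apk add --no-cache git build-base
RUN python -m venv /opt/venv
# "activates" the venv for every following instruction
ENV PATH="/opt/venv/bin:$PATH"
WORKDIR /opt/app
COPY requirements.txt .
RUN pip install -U pip && pip install -r requirements.txt
FROM python:3.8.3-alpine
WORKDIR /opt/app
COPY --from=origin /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH" VIRTUAL_ENV="/opt/venv"
COPY . .
ENTRYPOINT ["python", "file.py"]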

How to minimize a python3.7 application docker image

I would like to dockerize a python program with this Dockerfile:
FROM python:3.7-alpine
COPY requirements.pip ./requirements.pip
RUN python3 -m pip install --upgrade pip
RUN pip install -U setuptools
RUN apk update
RUN apk add --no-cache --virtual .build-deps gcc python3-dev musl-dev openssl-dev libffi-dev g++ && \
python3 -m pip install -r requirements.pip --no-cache-dir && \
apk --purge del .build-deps
ARG APP_DIR=/app
RUN mkdir -p ${APP_DIR}
WORKDIR ${APP_DIR}
COPY app .
ENTRYPOINT [ "python3", "run.py" ]
and this is my requirements.pip file:
pysher~=0.5.0
redis~=2.10.6
flake8~=3.5.0
pandas==0.23.4
Because of pandas, the Docker image is 461 MB; without pandas, 131 MB.
I was thinking about how to make it smaller, so I built a binary file from my application using:
pyinstaller run.py --onefile
It builds a 38 MB binary file. When I run it, it works fine. So I built a Docker image from this Dockerfile:
FROM alpine:3.4
ARG APP_DIR=/app
RUN mkdir -p ${APP_DIR}
WORKDIR ${APP_DIR}
COPY app/dist/run run
ENTRYPOINT [ "/bin/sh", "/app/run" ]
Basically, I just copied my run binary file into the /app directory. It looks fine; the image is just 48.8 MB. But when I run the container, I receive an error:
$ docker run --rm --name myapp myminimalimage:latest
/app/run: line 1: syntax error: unexpected "("
Then I was thinking that maybe there is a problem with sh, so I installed bash by adding 3 lines to the Dockerfile:
RUN apk update
RUN apk upgrade
RUN apk add bash
The image was built, but when I run it there is an error again:
$ docker run --rm --name myapp myminimalimage:latest
/app/run: /app/run: cannot execute binary file
My questions:
Why is the image in the first step so big? Can I minimize the size somehow, e.g. choose what to install from the pandas package?
Why does my binary file work fine on my system (Kubuntu 18.10) but I can't run it from alpine:3.4? Should I use another image or install something to run it?
What is the best way to build a minimalistic image with my app? One of those mentioned above, or is there another way?
On sizes: make sure you always pass --no-cache-dir when using pip (you use it once, but not in the other cases). Similarly, combine your apk invocations and make sure the last step clears the apk cache so it never gets frozen into an image layer. For example, replace your three separate RUNs with RUN apk update && apk upgrade && apk add bash && rm -rf /var/cache/apk/*; this achieves the same effect in a single layer that doesn't keep the apk cache around.
Example:
FROM python:3.7-alpine
COPY requirements.pip ./requirements.pip
# Avoid pip cache, use consistent command line with other uses, and merge simple layers
RUN python3 -m pip install --upgrade --no-cache-dir pip && \
python3 -m pip install --upgrade --no-cache-dir setuptools
# Combine update and add into same layer, clear cache explicitly at end
RUN apk update && apk add --no-cache --virtual .build-deps gcc python3-dev musl-dev openssl-dev libffi-dev g++ && \
python3 -m pip install -r requirements.pip --no-cache-dir && \
apk --purge del .build-deps && rm -rf /var/cache/apk/*
Don't expect it to do much (you already used --no-cache-dir on the big pip operation), but it's something. pandas is a huge monolithic package, dependent on other huge monolithic packages; there is a limit to what you can accomplish here.
Keep in mind that if you don't use Alpine, you won't need a compiler, since you can just use wheels. This makes everything simpler... e.g. you don't need to install and then uninstall compilers. Slightly bigger, but only slightly.
(See here for more about why I'm not a fan of Alpine Linux: https://pythonspeed.com/articles/base-image-python-docker-images/)
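To make that last point concrete, a sketch of the same build on a slim (glibc) base, where pandas and its dependencies install from prebuilt manylinux wheels and no compiler toolchain is needed (the exact tag is an assumption):
FROM python:3.7-slim
COPY requirements.pip ./requirements.pip
# no gcc/musl-dev needed: wheels are used on glibc images
RUN python3 -m pip install --no-cache-dir --upgrade pip && \
    python3 -m pip install --no-cache-dir -r requirements.pip
WORKDIR /app
COPY app .
ENTRYPOINT [ "python3", "run.py" ]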

Integrating Python Poetry with Docker

Can you give me an example of a Dockerfile in which I can install all the packages I need from poetry.lock and pyproject.toml into my image/container from Docker?
There are several things to keep in mind when using poetry together with docker.
Installation
The official way to install poetry is via:
curl -sSL https://install.python-poetry.org | python3 -
This way poetry and its dependencies are isolated from your dependencies. But, from my point of view, it is not a very good thing, for two reasons:
the poetry version might get an update and break your build. In this case you can specify the POETRY_VERSION environment variable; the installer will respect it
I do not like the idea of piping things from the internet into my containers without any protection from possible file modifications
So, I use pip install 'poetry==$POETRY_VERSION'. As you can see, I still recommend pinning your version.
Also, pin this version in your pyproject.toml:
[build-system]
# Should be the same as `$POETRY_VERSION`:
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
It will protect you from version mismatch between your local and docker environments.
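If you also want the pin to be overridable at build time, here is a small sketch using a build argument (the default value is just a placeholder):
ARG POETRY_VERSION=1.0.0
ENV POETRY_VERSION=${POETRY_VERSION}
RUN pip install "poetry==$POETRY_VERSION"
# override when building:
#   docker build --build-arg POETRY_VERSION=1.0.5 .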
Caching dependencies
We want to cache our requirements and only reinstall them when pyproject.toml or poetry.lock change. Otherwise builds will be slow. To get a working cache layer we should put:
COPY poetry.lock pyproject.toml /code/
after poetry is installed, but before any other files are added.
Virtualenv
The next thing to keep in mind is virtualenv creation. We do not need it in Docker: the container is already isolated. So, we use the poetry config virtualenvs.create false setting to turn it off.
Development vs Production
If you use the same Dockerfile for both development and production as I do, you will need to install different sets of dependencies based on some environment variable:
poetry install $(test "$YOUR_ENV" == production && echo "--no-dev")
This way $YOUR_ENV will control which dependency set is installed: all (default) or production only, with the --no-dev flag.
You may also want to add some more options for better experience:
--no-interaction not to ask any interactive questions
--no-ansi flag to make your output more log friendly
Result
You will end up with something similar to:
FROM python:3.6.6-alpine3.7
ARG YOUR_ENV
ENV YOUR_ENV=${YOUR_ENV} \
PYTHONFAULTHANDLER=1 \
PYTHONUNBUFFERED=1 \
PYTHONHASHSEED=random \
PIP_NO_CACHE_DIR=off \
PIP_DISABLE_PIP_VERSION_CHECK=on \
PIP_DEFAULT_TIMEOUT=100 \
POETRY_VERSION=1.0.0
# System deps:
RUN pip install "poetry==$POETRY_VERSION"
# Copy only requirements to cache them in docker layer
WORKDIR /code
COPY poetry.lock pyproject.toml /code/
# Project initialization:
RUN poetry config virtualenvs.create false \
&& poetry install $(test "$YOUR_ENV" == production && echo "--no-dev") --no-interaction --no-ansi
# Creating folders, and files for a project:
COPY . /code
You can find a fully working real-life example here: wemake-django-template
Update on 2019-12-17
Update poetry to 1.0
Update on 2022-11-24
Update curl command to use modern poetry installation script
Multi-stage Docker build with Poetry and venv
Do not disable virtualenv creation. Virtualenvs serve a purpose in Docker builds, because they provide an elegant way to leverage multi-stage builds. In a nutshell, your build stage installs everything into the virtualenv, and the final stage just copies the virtualenv over into a small image.
Use poetry export and install your pinned requirements first, before copying your code. This will allow you to use the Docker build cache, and never reinstall dependencies just because you changed a line in your code.
Do not use poetry install to install your code, because it will perform an editable install. Instead, use poetry build to build a wheel, and then pip-install that into your virtualenv. (Thanks to PEP 517, this whole process could also be performed with a simple pip install ., but due to build isolation you would end up installing another copy of Poetry.)
Here's an example Dockerfile installing a Flask app into an Alpine image, with a dependency on Postgres. This example uses an entrypoint script to activate the virtualenv. But generally, you should be fine without an entrypoint script because you can simply reference the Python binary at /venv/bin/python in your CMD instruction.
Dockerfile
FROM python:3.7.6-alpine3.11 as base
ENV PYTHONFAULTHANDLER=1 \
PYTHONHASHSEED=random \
PYTHONUNBUFFERED=1
WORKDIR /app
FROM base as builder
ENV PIP_DEFAULT_TIMEOUT=100 \
PIP_DISABLE_PIP_VERSION_CHECK=1 \
PIP_NO_CACHE_DIR=1 \
POETRY_VERSION=1.0.5
RUN apk add --no-cache gcc libffi-dev musl-dev postgresql-dev
RUN pip install "poetry==$POETRY_VERSION"
RUN python -m venv /venv
COPY pyproject.toml poetry.lock ./
RUN poetry export -f requirements.txt | /venv/bin/pip install -r /dev/stdin
COPY . .
RUN poetry build && /venv/bin/pip install dist/*.whl
FROM base as final
RUN apk add --no-cache libffi libpq
COPY --from=builder /venv /venv
COPY docker-entrypoint.sh wsgi.py ./
CMD ["./docker-entrypoint.sh"]
docker-entrypoint.sh
#!/bin/sh
set -e
. /venv/bin/activate
while ! flask db upgrade
do
echo "Retry..."
sleep 1
done
exec gunicorn --bind 0.0.0.0:5000 --forwarded-allow-ips='*' wsgi:app
wsgi.py
import your_app
app = your_app.create_app()
This is a minor revision to the answer provided by @Claudio, which uses the new poetry install --no-root feature as described by @sobolevn in his answer.
In order to force poetry to install dependencies into a specific virtualenv, one needs to first enable it.
. /path/to/virtualenv/bin/activate && poetry install
Adding these into @Claudio's answer, we have:
FROM python:3.10-slim as base
ENV PYTHONFAULTHANDLER=1 \
PYTHONHASHSEED=random \
PYTHONUNBUFFERED=1
WORKDIR /app
FROM base as builder
ENV PIP_DEFAULT_TIMEOUT=100 \
PIP_DISABLE_PIP_VERSION_CHECK=1 \
PIP_NO_CACHE_DIR=1 \
POETRY_VERSION=1.3.1
RUN pip install "poetry==$POETRY_VERSION"
COPY pyproject.toml poetry.lock README.md ./
# if your project is stored in src, uncomment line below
# COPY src ./src
# or this if your file is stored in $PROJECT_NAME, assuming `myproject`
# COPY myproject ./myproject
RUN poetry config virtualenvs.in-project true && \
poetry install --only=main --no-root && \
poetry build
FROM base as final
COPY --from=builder /app/.venv ./.venv
COPY --from=builder /app/dist .
COPY docker-entrypoint.sh .
RUN ./.venv/bin/pip install *.whl
CMD ["./docker-entrypoint.sh"]
If you need to use this for development purposes, you add or remove --no-dev by replacing this line:
RUN . /venv/bin/activate && poetry install --no-dev --no-root
with something like this, as shown in @sobolevn's answer:
RUN . /venv/bin/activate && poetry install --no-root $(test "$YOUR_ENV" == production && echo "--no-dev")
after adding the appropriate environment variable declaration.
The example uses a Debian slim image as base; however, adapting this to an alpine-based image should be a trivial task.
TL;DR
I have been able to set up poetry for a Django project using postgres. After doing some research, I ended up with the following Dockerfile:
FROM python:slim
# Keeps Python from generating .pyc files in the container
ENV PYTHONDONTWRITEBYTECODE 1
# Turns off buffering for easier container logging
ENV PYTHONUNBUFFERED 1
# Install and setup poetry
RUN pip install -U pip \
&& apt-get update \
&& apt install -y curl netcat \
&& curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python -
ENV PATH="${PATH}:/root/.poetry/bin"
WORKDIR /usr/src/app
COPY . .
RUN poetry config virtualenvs.create false \
&& poetry install --no-interaction --no-ansi
# run entrypoint.sh
ENTRYPOINT ["/usr/src/app/entrypoint.sh"]
This is the content of entrypoint.sh:
#!/bin/sh
if [ "$DATABASE" = "postgres" ]
then
echo "Waiting for postgres..."
while ! nc -z $SQL_HOST $SQL_PORT; do
sleep 0.1
done
echo "PostgreSQL started"
fi
python manage.py migrate
exec "$@"
Detailed Explanation
Some points to notice:
I have decided to use slim instead of alpine as the tag for the python image because even though alpine images are supposed to reduce the size of Docker images and speed up builds, with Python you can actually end up with a somewhat larger image that takes a while to build (read this article for more info).
Using this configuration builds containers faster than using the alpine image because I do not need to add some extra packages to install Python packages properly.
I am installing poetry directly from the URL provided in the documentation. I am aware of the warnings provided by sobolevn. However, I consider it better in the long term to use the latest version of poetry by default than to rely on an environment variable that I should update periodically.
Updating the environment variable PATH is crucial. Otherwise, you will get an error saying that poetry was not found.
Dependencies are installed directly into the container's Python interpreter. Poetry does not create a virtual environment before installing the dependencies.
In case you need the alpine version of this Dockerfile:
FROM python:alpine
# Keeps Python from generating .pyc files in the container
ENV PYTHONDONTWRITEBYTECODE 1
# Turns off buffering for easier container logging
ENV PYTHONUNBUFFERED 1
# Install dev dependencies
RUN apk update \
&& apk add curl postgresql-dev gcc python3-dev musl-dev openssl-dev libffi-dev
# Install poetry
RUN pip install -U pip \
&& curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python -
ENV PATH="${PATH}:/root/.poetry/bin"
WORKDIR /usr/src/app
COPY . .
RUN poetry config virtualenvs.create false \
&& poetry install --no-interaction --no-ansi
# run entrypoint.sh
ENTRYPOINT ["/usr/src/app/entrypoint.sh"]
Notice that the alpine version needs some extra dependencies (postgresql-dev gcc python3-dev musl-dev openssl-dev libffi-dev) to work properly.
That's the minimal configuration that works for me:
FROM python:3.7
ENV PIP_DISABLE_PIP_VERSION_CHECK=on
RUN pip install poetry
WORKDIR /app
COPY poetry.lock pyproject.toml /app/
RUN poetry config virtualenvs.create false
RUN poetry install --no-interaction
COPY . /app
Note that it is not as safe as @sobolevn's configuration.
As a bit of trivia, I'll add that once editable installs are possible for pyproject.toml projects, a line or two could be deleted:
FROM python:3.7
ENV PIP_DISABLE_PIP_VERSION_CHECK=on
WORKDIR /app
COPY poetry.lock pyproject.toml /app/
RUN pip install -e .
COPY . /app
Here's a stripped-down example where first a layer with the dependencies (rebuilt only when these change) and then one with the full source code is added to the image. Setting poetry to install into the global site-packages leaves a configuration artifact that could also be removed.
FROM python:alpine
WORKDIR /app
COPY poetry.lock pyproject.toml ./
RUN pip install --no-cache-dir --upgrade pip \
&& pip install --no-cache-dir poetry \
\
&& poetry config settings.virtualenvs.create false \
&& poetry install --no-dev \
\
&& pip uninstall --yes poetry
COPY . ./
Use a Docker multi-stage build and the python slim image; export the poetry lock file to requirements.txt, then install via pip inside a virtualenv.
This gives the smallest size, does not require poetry in the runtime image, and pins the versions of everything.
FROM python:3.9.7 as base
ENV PIP_DISABLE_PIP_VERSION_CHECK=1
WORKDIR /app
FROM base as poetry
RUN pip install poetry==1.1.12
COPY poetry.lock pyproject.toml /app/
RUN poetry export -o requirements.txt
FROM base as build
COPY --from=poetry /app/requirements.txt /tmp/requirements.txt
RUN python -m venv .venv && \
.venv/bin/pip install 'wheel==0.36.2' && \
.venv/bin/pip install -r /tmp/requirements.txt
FROM python:3.9.7-slim as runtime
ENV PIP_DISABLE_PIP_VERSION_CHECK=1
WORKDIR /app
ENV PATH=/app/.venv/bin:$PATH
COPY --from=build /app/.venv /app/.venv
COPY . /app
My Dockerfile is based on @lmiguelvargasf's answer. Do refer to his post for a more detailed explanation. The only significant changes I have are the following:
I am now using the latest official installer install-poetry.py instead of the deprecated get-poetry.py, as recommended in the official documentation. I'm also installing a specific version using the --version flag, but you can alternatively use the environment variable POETRY_VERSION. More info in their official docs!
The PATH I use is /root/.local/bin:$PATH instead of ${PATH}:/root/.poetry/bin from the OP's Dockerfile.
FROM python:3.10.4-slim-buster
ENV PYTHONDONTWRITEBYTECODE=1 \
PYTHONUNBUFFERED=1
RUN apt-get update \
&& apt-get install curl -y \
&& curl -sSL https://install.python-poetry.org | python - --version 1.1.13
ENV PATH="/root/.local/bin:$PATH"
WORKDIR /usr/app
COPY pyproject.toml poetry.lock ./
RUN poetry config virtualenvs.create false \
&& poetry install --no-dev --no-interaction --no-ansi
COPY ./src ./
EXPOSE 5000
CMD [ "poetry", "run", "gunicorn", "-b", "0.0.0.0:5000", "test_poetry.app:create_app()" ]
I've created a solution using a lock package (a package which depends on all the versions in the lock file). This results in a clean pip-only install without requirements files.
The steps are: build the package, build the lock package, copy both wheels into your container, install both wheels with pip.
Installation is: poetry add --dev poetry-lock-package
Steps outside of docker build are:
poetry build
poetry run poetry-lock-package --build
Then your Dockerfile should contain:
FROM python:3-slim
COPY dist/*.whl /
RUN pip install --no-cache-dir /*.whl \
&& rm -rf /*.whl
CMD ["python", "-m", "entry_module"]
I see all the answers here use pip to install Poetry, to avoid version issues.
The official installer reads the POETRY_VERSION environment variable, if defined, to install the most appropriate version.
There is an issue on GitHub about this, and I think the solution from that ticket is quite interesting:
# `python-base` sets up all our shared environment variables
FROM python:3.8.1-slim as python-base
# python
ENV PYTHONUNBUFFERED=1 \
# prevents python creating .pyc files
PYTHONDONTWRITEBYTECODE=1 \
\
# pip
PIP_NO_CACHE_DIR=off \
PIP_DISABLE_PIP_VERSION_CHECK=on \
PIP_DEFAULT_TIMEOUT=100 \
\
# poetry
# https://python-poetry.org/docs/configuration/#using-environment-variables
POETRY_VERSION=1.0.3 \
# make poetry install to this location
POETRY_HOME="/opt/poetry" \
# make poetry create the virtual environment in the project's root
# it gets named `.venv`
POETRY_VIRTUALENVS_IN_PROJECT=true \
# do not ask any interactive question
POETRY_NO_INTERACTION=1 \
\
# paths
# this is where our requirements + virtual environment will live
PYSETUP_PATH="/opt/pysetup" \
VENV_PATH="/opt/pysetup/.venv"
# prepend poetry and venv to path
ENV PATH="$POETRY_HOME/bin:$VENV_PATH/bin:$PATH"
# `builder-base` stage is used to build deps + create our virtual environment
FROM python-base as builder-base
RUN apt-get update \
&& apt-get install --no-install-recommends -y \
# deps for installing poetry
curl \
# deps for building python deps
build-essential
# install poetry - respects $POETRY_VERSION & $POETRY_HOME
RUN curl -sSL https://raw.githubusercontent.com/sdispater/poetry/master/get-poetry.py | python
# copy project requirement files here to ensure they will be cached.
WORKDIR $PYSETUP_PATH
COPY poetry.lock pyproject.toml ./
# install runtime deps - uses $POETRY_VIRTUALENVS_IN_PROJECT internally
RUN poetry install --no-dev
# `development` image is used during development / testing
FROM python-base as development
ENV FASTAPI_ENV=development
WORKDIR $PYSETUP_PATH
# copy in our built poetry + venv
COPY --from=builder-base $POETRY_HOME $POETRY_HOME
COPY --from=builder-base $PYSETUP_PATH $PYSETUP_PATH
# quicker install as runtime deps are already installed
RUN poetry install
# will become mountpoint of our code
WORKDIR /app
EXPOSE 8000
CMD ["uvicorn", "--reload", "main:app"]
# `production` image used for runtime
FROM python-base as production
ENV FASTAPI_ENV=production
COPY --from=builder-base $PYSETUP_PATH $PYSETUP_PATH
COPY ./app /app/
WORKDIR /app
CMD ["gunicorn", "-k", "uvicorn.workers.UvicornWorker", "main:app"]
There are two projects where you can see how to do it properly, or you can build your own images on top of them, as they are just base images:
https://github.com/max-pfeiffer/uvicorn-poetry
https://github.com/max-pfeiffer/uvicorn-gunicorn-poetry
Dockerfile of base image: https://github.com/max-pfeiffer/uvicorn-poetry/blob/main/build/Dockerfile
ARG OFFICIAL_PYTHON_IMAGE
FROM ${OFFICIAL_PYTHON_IMAGE}
ENV PYTHONUNBUFFERED=1 \
PYTHONDONTWRITEBYTECODE=1 \
PIP_NO_CACHE_DIR=off \
PIP_DISABLE_PIP_VERSION_CHECK=on \
PIP_DEFAULT_TIMEOUT=100 \
POETRY_VERSION=1.1.11 \
POETRY_HOME="/opt/poetry" \
POETRY_VIRTUALENVS_IN_PROJECT=true \
PYTHONPATH=/application_root \
VIRTUAL_ENVIRONMENT_PATH="/application_root/.venv"
ENV PATH="$POETRY_HOME/bin:$VIRTUAL_ENVIRONMENT_PATH/bin:$PATH"
# https://python-poetry.org/docs/#osx--linux--bashonwindows-install-instructions
RUN apt-get update \
&& apt-get install --no-install-recommends -y \
build-essential \
curl \
&& curl -sSL https://raw.githubusercontent.com/sdispater/poetry/master/get-poetry.py | python - \
&& apt-get purge --auto-remove -y \
build-essential \
curl
COPY ./scripts/start_uvicorn.sh /application_server/
RUN chmod +x /application_server/start_uvicorn.sh
COPY ./scripts/pytest_entrypoint.sh ./scripts/black_entrypoint.sh /entrypoints/
RUN chmod +x /entrypoints/pytest_entrypoint.sh
RUN chmod +x /entrypoints/black_entrypoint.sh
EXPOSE 80
CMD ["/application_server/start_uvicorn.sh"]
Dockerfile of sample project image: https://github.com/max-pfeiffer/uvicorn-poetry/blob/main/examples/fast_api_multistage_build/Dockerfile
ARG BASE_IMAGE_NAME_AND_TAG=pfeiffermax/uvicorn-poetry:1.0.1-python3.9.8-slim-bullseye
FROM ${BASE_IMAGE_NAME_AND_TAG} as base-image
WORKDIR /application_root
# install [tool.poetry.dependencies]
# this will install virtual environment into /.venv because of POETRY_VIRTUALENVS_IN_PROJECT=true
# see: https://python-poetry.org/docs/configuration/#virtualenvsin-project
COPY ./poetry.lock ./pyproject.toml /application_root/
RUN poetry install --no-interaction --no-root --no-dev
FROM base-image as test-base-image
ENV LOG_LEVEL="debug"
COPY --from=base-image $VIRTUAL_ENVIRONMENT_PATH $VIRTUAL_ENVIRONMENT_PATH
# install [tool.poetry.dev-dependencies]
RUN poetry install --no-interaction --no-root
COPY /app /application_root/app/
COPY /tests /application_root/tests/
# image for running pep8 checks
FROM test-base-image as black-test-image
ENTRYPOINT /entrypoints/black_entrypoint.sh $0 $@
CMD ["--target-version", "py39", "--check", "--line-length", "80", "app"]
# image for running unit tests
FROM test-base-image as unit-test-image
ENTRYPOINT /entrypoints/pytest_entrypoint.sh $0 $@
# You need to use pytest-cov as pytest plugin. Makes life very simple.
# tests directory is configured in pyproject.toml
# https://github.com/pytest-dev/pytest-cov
CMD ["--cov=app", "--cov-report=xml:/test_coverage_reports/unit_tests_coverage.xml"]
FROM base-image as development-image
ENV RELOAD="true" \
LOG_LEVEL="debug"
COPY --from=base-image $VIRTUAL_ENVIRONMENT_PATH $VIRTUAL_ENVIRONMENT_PATH
# install [tool.poetry.dev-dependencies]
RUN poetry install --no-interaction --no-root
COPY . /application_root/
FROM base-image as production-image
COPY --from=base-image $VIRTUAL_ENVIRONMENT_PATH $VIRTUAL_ENVIRONMENT_PATH
# This RUN statement fixes an issue while running the tests with GitHub Actions.
# Tests work reliable locally on my machine or running GitHub Actions using act.
# There is a bug with multistage builds in GitHub Actions which I can also reliable reproduce
# see: https://github.com/moby/moby/issues/37965
# Will also check if I can fix that annoying issue with some tweaks to docker build args
# see: https://gist.github.com/UrsaDK/f90c9632997a70cfe2a6df2797731ac8
RUN true
COPY /app /application_root/app/
Here's a different approach that leaves Poetry intact so you can still use poetry add etc. This is good if you're using a VS Code devcontainer.
In short, install Poetry, let Poetry create the virtual environment, then enter the virtual environment every time you start a new shell by modifying .bashrc.
FROM ubuntu:20.04
RUN apt-get update && apt-get install -y python3 python3-pip curl
# Use Python 3 for `python`, `pip`
RUN update-alternatives --install /usr/bin/python python /usr/bin/python3 1 \
&& update-alternatives --install /usr/bin/pip pip /usr/bin/pip3 1
# Install Poetry
RUN curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/install-poetry.py | python3 -
ENV PATH "$PATH:/root/.local/bin/"
# Install Poetry packages (maybe remove the poetry.lock line if you don't want/have a lock file)
COPY pyproject.toml ./
COPY poetry.lock ./
RUN poetry install --no-interaction
# Provide a known path for the virtual environment by creating a symlink
RUN ln -s $(poetry env info --path) /var/my-venv
# Clean up project files. You can add them with a Docker mount later.
RUN rm pyproject.toml poetry.lock
# Hide virtual env prompt
ENV VIRTUAL_ENV_DISABLE_PROMPT 1
# Start virtual env when bash starts
RUN echo 'source /var/my-venv/bin/activate' >> ~/.bashrc
A reminder that there's no need to avoid the virtualenv. It doesn't affect performance, and Poetry isn't really designed to work without one.
EDIT: @Davos points out that this doesn't work unless you already have a pyproject.toml and poetry.lock file. If you need to handle that case, you might be able to use the following workaround, which should work whether or not those files exist:
COPY pyproject.toml* ./
COPY poetry.lock* ./
RUN poetry init --no-interaction; (exit 0) # Does nothing if pyproject.toml exists
RUN poetry install --no-interaction
The Dockerfile for my python apps looks like this:
FROM python:3.10-alpine
RUN apk update && apk upgrade
RUN pip install -U pip poetry==1.1.13
WORKDIR /app
COPY . .
RUN poetry export --without-hashes --format=requirements.txt > requirements.txt
RUN pip install -r requirements.txt
EXPOSE 8000
ENTRYPOINT [ "python" ]
CMD ["main.py"]

Docker re-build time

We are trying to create a Docker container for a Python application. The Dockerfile installs the dependencies using "pip install". The Dockerfile looks like this:
FROM ubuntu:latest
RUN apt-get update -y
RUN apt-get install -y git wget python3-pip
RUN mkdir /app
COPY . /app
RUN pip3 install asn1crypto
RUN pip3 install cffi==1.10.0
RUN pip3 install click==6.7
RUN pip3 install conda==4.3.16
RUN pip3 install Flask==0.12.2
RUN pip3 install Flask-SSLify==0.1.5
RUN pip3 install Flask-SSLify==0.1.5
RUN pip3 install flask-restful==0.3.6
WORKDIR /app
ENTRYPOINT ["python3"]
CMD [ "X.py", "/app/Y.yml" ]
The Docker image gets created successfully; the issue is the rebuild time.
If nothing is changed in the Dockerfile above, the rebuild is quick because the cache is used.
If a line that comes after the pip installs is changed, the Docker daemon still runs all the pip install commands, downloading all the packages again, though not installing them.
Is there a way to optimize the rebuild?
Thx
Below is what I would do for now with the Dockerfile for optimization:
FROM ubuntu:latest
RUN apt-get update -y && apt-get install -y \
git \
wget \
python3-pip \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY ./requirements.txt .
RUN pip3 install -r requirements.txt
COPY . /app
ENTRYPOINT ["python3"]
CMD [ "X.py", "/app/Y.yml" ]
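The requirements.txt referenced above would simply collect the pinned packages from the original RUN lines:
asn1crypto
cffi==1.10.0
click==6.7
conda==4.3.16
Flask==0.12.2
Flask-SSLify==0.1.5
flask-restful==0.3.6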
Reduce the layers by integrating multiple commands into a single one, specifically when they are interdependent. This helps reduce the image size.
Always try to put the COPY of the source code at the end, since a routine source code change invalidates the caching of all the following layers.
Use a single requirements.txt file for installation through pip. Also define separate steps in case you have lots of packages to install; don't let a normal source code change force package installation on every build.
Always clean up intermediate things which are not required in the final image.
Ref- https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/
