How to minimize a Python 3.7 application Docker image - python

I would like to dockerize a Python program with this Dockerfile:
FROM python:3.7-alpine
COPY requirements.pip ./requirements.pip
RUN python3 -m pip install --upgrade pip
RUN pip install -U setuptools
RUN apk update
RUN apk add --no-cache --virtual .build-deps gcc python3-dev musl-dev openssl-dev libffi-dev g++ && \
python3 -m pip install -r requirements.pip --no-cache-dir && \
apk --purge del .build-deps
ARG APP_DIR=/app
RUN mkdir -p ${APP_DIR}
WORKDIR ${APP_DIR}
COPY app .
ENTRYPOINT [ "python3", "run.py" ]
and this is my requirements.pip file:
pysher~=0.5.0
redis~=2.10.6
flake8~=3.5.0
pandas==0.23.4
Because of pandas, the Docker image is 461 MB; without pandas it is 131 MB.
I was thinking about how to make it smaller, so I built a binary file from my application using:
pyinstaller run.py --onefile
It built a 38 MB binary file. When I run it, it works fine. So I built a Docker image from this Dockerfile:
FROM alpine:3.4
ARG APP_DIR=/app
RUN mkdir -p ${APP_DIR}
WORKDIR ${APP_DIR}
COPY app/dist/run run
ENTRYPOINT [ "/bin/sh", "/app/run" ]
Basically, I just copied my run binary into the /app directory. It looks fine, and the image is just 48.8 MB. When I run the container, I receive this error:
$ docker run --rm --name myapp myminimalimage:latest
/app/run: line 1: syntax error: unexpected "("
Then I thought there might be a problem with sh, so I installed bash by adding 3 lines to the Dockerfile:
RUN apk update
RUN apk upgrade
RUN apk add bash
The image was built, but when I run it there is an error again:
$ docker run --rm --name myapp myminimalimage:latest
/app/run: /app/run: cannot execute binary file
My questions:
Why is the image in the first step so big? Can I minimize the size somehow, for example by choosing what to install from the pandas package?
Why does my binary file work fine on my system (Kubuntu 18.10) but not run on alpine:3.4? Should I use another image or install something to run it?
What is the best way to build a minimal image with my app? One of the approaches mentioned above, or is there another way?

On sizes, make sure you always pass --no-cache-dir when using pip (you use it once, but not in the other cases). Similarly, combine the apk invocations and make sure the last step clears the apk cache so it never gets frozen in an image layer. For example, replacing your three separate RUNs with RUN apk update && apk upgrade && apk add bash && rm -rf /var/cache/apk/* achieves the same effect in a single layer that doesn't keep the apk cache around.
Example:
FROM python:3.7-alpine
COPY requirements.pip ./requirements.pip
# Avoid pip cache, use consistent command line with other uses, and merge simple layers
RUN python3 -m pip install --upgrade --no-cache-dir pip && \
python3 -m pip install --upgrade --no-cache-dir setuptools
# Combine update and add into same layer, clear cache explicitly at end
RUN apk update && apk add --no-cache --virtual .build-deps gcc python3-dev musl-dev openssl-dev libffi-dev g++ && \
python3 -m pip install -r requirements.pip --no-cache-dir && \
apk --purge del .build-deps && rm -rf /var/cache/apk/*
Don't expect it to do much (you already used --no-cache-dir on the big pip operation), but it's something. pandas is a huge monolithic package, dependent on other huge monolithic packages; there is a limit to what you can accomplish here.

Keep in mind that if you don't use Alpine, you won't need a compiler, since you can just use wheels. This makes everything simpler... e.g. you don't need to install and then uninstall compilers. Slightly bigger, but only slightly.
(See here for more about why I'm not a fan of Alpine Linux: https://pythonspeed.com/articles/base-image-python-docker-images/)
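As a rough sketch of the non-Alpine approach, assuming the python:3.7-slim tag and that pandas and its compiled dependencies publish prebuilt Linux wheels for Python 3.7 (worth verifying, since without a wheel pip would fall back to a source build and fail with no compiler present):

```dockerfile
# Sketch only: Debian-based slim image, so pip can use prebuilt wheels
FROM python:3.7-slim
ARG APP_DIR=/app
# WORKDIR creates the directory if it does not exist
WORKDIR ${APP_DIR}
COPY requirements.pip ./requirements.pip
# No build dependencies to install and remove; just avoid the pip cache
RUN python3 -m pip install --no-cache-dir -r requirements.pip
COPY app .
ENTRYPOINT [ "python3", "run.py" ]
```

Copying requirements.pip before the application code means the dependency layer is only rebuilt when the requirements change.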

Related

Installing python in Dockerfile without using python image as base

I have a Python script that uses DigitalOcean tools (doctl and kubectl), and I want to containerize it. This means my container will need Python, doctl, and kubectl installed. The trouble is, I can't figure out how to install both Python and the DigitalOcean tools in the Dockerfile.
I can install python using the base image "python:3" and I can also install the DigitalOcean tools using the base image "alpine/doctl". However, the rule is you can only use one base image in a dockerfile.
So I can include the python base image and install the DigitalOcean tools another way:
FROM python:3
RUN <somehow install doctl and kubectl>
RUN pip install firebase-admin
COPY script.py
CMD ["python", "script.py"]
Or I can include the alpine/doctl base image and install python3 another way.
FROM alpine/doctl
RUN <somehow install python>
RUN pip install firebase-admin
COPY script.py
CMD ["python", "script.py"]
Unfortunately, I'm not sure how I would do this. Any help in how I can get all these tools installed would be great!
Just add this along with anything else you want to apt-get install:
RUN apt-get update && apt-get install -y \
python3.6 \
python3-pip
On Alpine it should be something like:
RUN apk add --update --no-cache python3 && ln -sf python3 /usr/bin/python && \
python3 -m ensurepip && \
pip3 install --no-cache --upgrade pip setuptools
This Dockerfile worked for me:
FROM alpine/doctl
ENV PYTHONUNBUFFERED=1
RUN apk add --update --no-cache python3 && ln -sf python3 /usr/bin/python
RUN python3 -m ensurepip
RUN pip3 install --no-cache --upgrade pip setuptools
This answer comes from here: https://stackoverflow.com/a/62555259/7479816 (I don't have enough street cred to comment).
You can try a multi-stage build as shown below.
Also check your COPY statement: you need to give the destination for script.py as the second parameter. "." copies it to the current working directory, which is the root directory by default.
FROM alpine/doctl
FROM python:3.6-slim-buster
ENV PYTHONUNBUFFERED 1
RUN pip install firebase-admin
COPY script.py .
CMD ["python", "script.py"]
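Note that a second FROM starts a fresh image, so nothing from the alpine/doctl stage is carried over unless it is copied explicitly with COPY --from. A minimal sketch of that copy step follows. It assumes the doctl binary sits at /app/doctl inside alpine/doctl (a hypothetical path; check the real location in that image first) and that the binary is statically linked, so it also runs on a Debian-based image. kubectl is not covered here and would need its own install step.

```dockerfile
# Stage 1: used only as a source for the doctl binary
FROM alpine/doctl AS doctl

# Stage 2: the image that actually ships
FROM python:3.6-slim-buster
ENV PYTHONUNBUFFERED=1
# ASSUMPTION: /app/doctl is a placeholder; adjust to where doctl really lives
COPY --from=doctl /app/doctl /usr/local/bin/doctl
RUN pip install --no-cache-dir firebase-admin
COPY script.py .
CMD ["python", "script.py"]
```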

How to copy Flask dependencies from one stage to the next one in a Dockerfile?

I am learning about Docker and I have a Dockerfile with a simple app such as this:
FROM python:3.8-alpine
WORKDIR /code
ENV FLASK_APP App.py
ENV FLASK_RUN_HOST 0.0.0.0
ENV FLASK_RUN_PORT :3001
RUN apk update \
&& apk add --virtual build-deps gcc python3-dev musl-dev \
&& apk add --no-cache mariadb-dev
COPY ./myapp/requirements.txt requirements.txt
RUN pip install --no-cache-dir -vv -r requirements.txt
ADD ./myapp .
EXPOSE 3001
CMD ["flask", "run"]
I want to use a multi-stage build to get a smaller image, so after checking https://pythonspeed.com/articles/multi-stage-docker-python/ I changed my Dockerfile to this:
FROM python:3.8-alpine as builder
COPY ./myapp/requirements.txt requirements.txt
RUN apk update \
&& apk add --virtual build-deps gcc python3-dev musl-dev \
&& apk add --no-cache mariadb-dev
RUN pip install --user -r requirements.txt
FROM python:3.8-alpine
ADD ./myapp .
COPY --from=builder /root/.local /root/.local
ENV PATH=/root/.local:$PATH
ENV FLASK_APP App.py
ENV FLASK_RUN_HOST 0.0.0.0
ENV FLASK_RUN_PORT 3000
CMD ["python", "-m", "flask", "run"]
But when running the container I get an error saying the MySQL DB dependency is not installed, even though it is in the requirements.txt file and works with the first Dockerfile. I do not know what I am missing: if I understand it correctly, the COPY step in the second stage should copy the dependencies installed in the first stage, right? This is the output I get when trying to spin up the container:
Traceback (most recent call last):
File "/root/.local/lib/python3.8/site-packages/MySQLdb/__init__.py", line 18, in <module>
from . import _mysql
ImportError: Error loading shared library libmariadb.so.3: No such file or directory (needed by /root/.local/lib/python3.8/site-packages/MySQLdb/_mysql.cpython-38-x86_64-linux-gnu.so)
apk add --no-cache mariadb-dev also installs the MariaDB libraries, which you don't install in the final image. Their absence is the cause of the error you get.
Is MySQL getting installed from requirements.txt, or is it installed by apk (MariaDB)? If the latter, then that's what is missing in the second image: it's not pip installed with --user under .local, it's installed system-wide in the first image but not in the second.
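A sketch of the fix both answers point to, assuming the Alpine package mariadb-connector-c is what provides libmariadb.so.3 (worth confirming on pkgs.alpinelinux.org for your Alpine release). The final stage installs only the runtime library, and it puts /root/.local/bin on PATH, which is where pip install --user places console scripts such as flask:

```dockerfile
FROM python:3.8-alpine AS builder
COPY ./myapp/requirements.txt requirements.txt
# Compiler and headers are only needed in the build stage
RUN apk add --no-cache --virtual build-deps gcc python3-dev musl-dev \
 && apk add --no-cache mariadb-dev
RUN pip install --user --no-cache-dir -r requirements.txt

FROM python:3.8-alpine
# Runtime shared library only; ASSUMPTION: this package ships libmariadb.so.3
RUN apk add --no-cache mariadb-connector-c
WORKDIR /code
COPY --from=builder /root/.local /root/.local
COPY ./myapp .
ENV PATH=/root/.local/bin:$PATH
ENV FLASK_APP=App.py
ENV FLASK_RUN_HOST=0.0.0.0
ENV FLASK_RUN_PORT=3000
CMD ["python", "-m", "flask", "run"]
```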

Problem with multi-stage Dockerfile (Python - venv)

I'm trying to create a Python webapp Docker image using a multi-stage build to shrink the image size (right now it's around 300 MB). It also uses a virtual environment.
The Docker image builds and runs fine up until the point where I add the multi-stage build, so I know something goes wrong after that. Could you help me identify what's wrong?
FROM python:3.8.3-alpine AS origin
RUN apk update && apk add git
RUN apk --no-cache add py3-pip build-base
RUN pip install -U pip
RUN pip install virtualenv
RUN virtualenv venv
RUN source venv/bin/activate
WORKDIR /opt/app
COPY . .
RUN pip install -r requirements.txt
## Works fine until this point ""
FROM alpine:latest
WORKDIR /opt/app
COPY --from=origin /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH" VIRTUAL_ENV="/opt/venv"
COPY . /opt/app/
CMD [ "file.py" ]
ENTRYPOINT ["python"]
Without the VENV it looks something like this (still throwing error "sh: python: not found"):
FROM python:3.8.3-alpine AS origin
WORKDIR /opt/app
RUN apk update && apk add git
RUN apk --no-cache add py3-pip build-base
RUN pip install -U pip
COPY . .
RUN pip install -r requirements.txt
FROM alpine:latest
WORKDIR /home
COPY --from=origin /opt/app .
CMD sh -c 'python file.py'
You still need Python in your runtime container; since you changed your final image to plain alpine, it won't work. As a tip, combine your CMD and ENTRYPOINT into one of them, as there is generally no need to have both. Try to use only ENTRYPOINT, since you can then easily pass arguments at runtime, for example to activate a debug mode.
EDIT: Please stay away from Alpine for Python apps, as you can run into some weird issues with it. You can use the "python_version-slim-buster" images; they are small enough.
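To make this advice concrete, here is a minimal sketch of a multi-stage build that keeps Python in the runtime stage. It assumes the python:3.8-buster and python:3.8-slim-buster tags, the built-in venv module in place of virtualenv, and that file.py is the entry script as in the question. Note that RUN source venv/bin/activate has no lasting effect, because each RUN starts a new shell; the venv is "activated" by putting its bin directory on PATH instead.

```dockerfile
FROM python:3.8-buster AS origin
# Create the venv at a fixed path and activate it via PATH
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
WORKDIR /opt/app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Same Python version and base distribution as the build stage,
# so the interpreter symlinks inside the copied venv still resolve
FROM python:3.8-slim-buster
COPY --from=origin /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH" VIRTUAL_ENV="/opt/venv"
WORKDIR /opt/app
COPY . .
ENTRYPOINT ["python", "file.py"]
```

The venv is copied to the same absolute path it was created at, because scripts inside a venv hard-code that path.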

How do I install python on alpine linux?

How do I install python3 and python3-pip on an alpine based image (without using a python image)?
$ apk add --update python3.8 python3-pip
ERROR: unsatisfiable constraints:
python3-pip (missing):
required by: world[python3-pip]
python3.8 (missing):
required by: world[python3.8]
This is what I use in a Dockerfile for an alpine image:
# Install python/pip
ENV PYTHONUNBUFFERED=1
RUN apk add --update --no-cache python3 && ln -sf python3 /usr/bin/python
RUN python3 -m ensurepip
RUN pip3 install --no-cache --upgrade pip setuptools
Take a look at the alpine package repo: https://pkgs.alpinelinux.org/packages
So what you are looking for are the python3 and py3-pip packages.
A suitable command to use inside a dockerfile/etc. would be:
apk add --no-cache python3 py3-pip
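The --no-cache flag makes apk fetch the package index on the fly instead of caching it locally, so there is no cache to clean up afterwards.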
Note however, that you need to add the community repository since py3-pip is not present on main.
Instead of python3-pip, install py3-pip:
apk add --update python3 py3-pip
You can try this command:
apk add python3
An additional option is to build Python during the image build:
FROM alpine:latest
# you can specify python version during image build
ARG PYTHON_VERSION=3.9.9
# install build dependencies and needed tools
RUN apk add \
wget \
gcc \
make \
zlib-dev \
libffi-dev \
openssl-dev \
musl-dev
# download and extract python sources
RUN cd /opt \
&& wget https://www.python.org/ftp/python/${PYTHON_VERSION}/Python-${PYTHON_VERSION}.tgz \
&& tar xzf Python-${PYTHON_VERSION}.tgz
# build python and remove left-over sources
RUN cd /opt/Python-${PYTHON_VERSION} \
&& ./configure --prefix=/usr --enable-optimizations --with-ensurepip=install \
&& make install \
&& rm /opt/Python-${PYTHON_VERSION}.tgz /opt/Python-${PYTHON_VERSION} -rf
# rest of the image, python3 and pip3 commands will be available
This snippet downloads and builds the specified version of Python from source (together with pip). It may be overkill, but sometimes it comes in handy.
You may use the python official image which offers alpine tags as well. You will probably get the most state-of-the-art python install:
e.g.:
FROM python:3-alpine
It looks like you're trying to install a specific minor version of Python 3 (3.8). You can do this in Alpine with a semver-style constraint like the following, which installs a version of python3 >=3.8.0 and <3.9.0-0:
apk add python3=~3.8
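In a Dockerfile this could look like the sketch below. It assumes an Alpine release whose repositories actually package Python 3.8 (alpine:3.12 is used here as an assumption; check pkgs.alpinelinux.org), because the constraint can only select versions that release ships:

```dockerfile
# ASSUMPTION: alpine:3.12 packages a Python 3.8.x build
FROM alpine:3.12
ENV PYTHONUNBUFFERED=1
# Quote the constraint so the shell does not interpret the ~ or =
RUN apk add --no-cache "python3=~3.8" py3-pip
```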

Getting an 'apt-get upgrade' command failed error while building a python:3.6-buster container

There was no problem yesterday when I built my Python Flask application on the python:3.6-buster image. But today I am getting this error.
Calculating upgrade...
The following packages will be upgraded: libgnutls-dane0 libgnutls-openssl27 libgnutls28-dev libgnutls30 libgnutlsxx28
5 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Need to get 2859 kB of archives.
After this operation, 0 B of additional disk space will be used.
Do you want to continue? [Y/n] Abort.
ERROR: Service 'gateway' failed to build: The command '/bin/sh -c apt-get upgrade' returned a non-zero code: 1
My Dockerfile:
FROM python:3.6-buster
ENV LANG=C.UTF-8 LC_ALL=C.UTF-8
RUN echo $TZ > /etc/timezone
RUN apt-get update
RUN apt-get upgrade
RUN apt-get -y install gcc musl-dev libffi-dev
COPY requirements.txt requirements.txt
RUN python3 -m pip install -r requirements.txt
COPY . /application
WORKDIR /application
EXPOSE 7000
I couldn't find any related question. I guess this is caused by a new update, but I don't actually know. Is there any advice or a solution for this problem?
I guess that apt is waiting for user input to confirm the upgrade. The Docker builder can't deal with these interactive prompts without hacky solutions, so the build fails.
The most straightforward solution is to add the -y flag to your commands, as you already do for the install command.
FROM python:3.6-buster
ENV LANG=C.UTF-8 LC_ALL=C.UTF-8
RUN echo $TZ > /etc/timezone
RUN apt-get update
RUN apt-get upgrade -y
RUN apt-get -y install gcc musl-dev libffi-dev
COPY requirements.txt requirements.txt
RUN python3 -m pip install -r requirements.txt
COPY . /application
WORKDIR /application
EXPOSE 7000
However, do you actually need to upgrade your existing packages? That might not be required in your case. In addition, I recommend checking the Docker Best Practices for writing statements that include apt commands. To keep your image size small, you should consider squashing these commands into a single RUN statement. You should also delete the apt cache afterwards, within that same statement, so it does not end up in the layer:
FROM python:3.6-buster
ENV LANG=C.UTF-8 LC_ALL=C.UTF-8
RUN echo $TZ > /etc/timezone
RUN apt-get update \
&& apt-get -y install gcc musl-dev libffi-dev \
&& rm -rf /var/lib/apt/lists/*
COPY requirements.txt requirements.txt
RUN python3 -m pip install -r requirements.txt
COPY . /application
WORKDIR /application
EXPOSE 7000
