PyPI / pip blocks while downloading grpcio-reflection - python

Installing grpcio-reflection with pip takes a really long time.
It is strange because the package is only 8 KB on PyPI, yet downloading it takes more than a minute, while other packages that are several megabytes in size download really fast.
UPDATE:
It was not the download; there is a lot of compilation going on. It seems that the feature is still in alpha, so the package is not precompiled like the standard grpcio package.
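A quick way to confirm that the time is spent compiling rather than downloading is to compare pip's download-only mode with a verbose install (a rough sketch; the exact output and timings will differ):
pip download --no-deps --dest /tmp/pkgs grpcio grpcio-tools grpcio-reflection  # download only, no build
pip install -v grpcio grpcio-tools grpcio-reflection  # verbose install shows any long "Building wheel" / compile steps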
UPDATE 2: Repro steps
I have just opened an issue here: https://github.com/grpc/grpc/issues/12992 and I am copying the repro steps here for completeness.
It seems that the grpcio-reflection package installation freezes depending on which other packages are in the same command line.
This can easily be reproduced with these two different Docker containers:
Dockerfile.fast - Container creation time ~1m 23s
#Download base ubuntu image
FROM ubuntu:16.04
RUN apt-get update && \
apt-get -y install ca-certificates curl
# Prepare pip
RUN apt-get -y install python-pip
RUN pip install -U pip
RUN pip install grpcio grpcio-tools
RUN pip install grpcio-reflection # Two lines is FAST
Dockerfile.slow - Container creation time 5m 20s
#Download base ubuntu image
FROM ubuntu:16.04
RUN apt-get update && \
apt-get -y install ca-certificates curl
# Prepare pip
RUN apt-get -y install python-pip
RUN pip install -U pip
RUN pip install grpcio grpcio-tools grpcio-reflection # Single line is SLOW
Timing the container build times:
time docker build --rm --no-cache -f Dockerfile.fast -t repro_reflbug_fast:latest .
......
real 1m22.295s
user 0m0.060s
sys 0m0.040s
time docker build --rm --no-cache -f Dockerfile.slow -t repro_reflbug_slow:latest .
.....
real 6m28.290s
user 0m0.052s
sys 0m0.052s
.....
I haven't had time to investigate yet, but the second case blocks for a long time while the first one doesn't.

It turns out that this issue was accepted as valid in the corresponding GitHub repo. It is now being discussed here:
https://github.com/grpc/grpc/issues/12992

Related

Is there an option to install Python3.8.13 slim version on an RHEL8 based Docker?

I want the exact version of Python 3.8.13 to run on my container, but so far when I used the below, it generated a very large Docker image:
RUN yum update -y && yum install -y python3.8 python38-pip && yum clean all
The command "yum install python3.8" installs 3.8.13 and this is fine, but as mentioned, the end result (with other required elements) is a bit above 2 GB once built. I would like to make the image smaller and I am wondering if I can use a slim or alpine version of Python 3.8.13.
I was trying with the following commands:
yum install -y python3.8.13-slim
yum install -y python3.8.13-slim-buster
yum install -y python3.8-slim
These did not succeed; yum does not recognize them as valid packages.
Is there a workaround for this?
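For what it is worth, "slim" and "alpine" are tags of the official python Docker images rather than yum package names, which is why yum cannot find them. If an RHEL8 base is not a hard requirement, one option is to start from the slim image directly; a minimal sketch (the requirements.txt here is just a placeholder for your own dependencies):
FROM python:3.8.13-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .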

How can we use opencv in a multistage docker image?

I recently learned about the concept of building docker images based on a multi-staged Dockerfile.
I have been trying simple examples of multi-staged Dockerfiles, and they were working fine. However, when I tried implementing the concept for my own application, I was facing some issues.
My application is about object detection in videos, so I use python and Tensorflow.
Here is my Dockerfile:
FROM python:3-slim AS base
WORKDIR /objectDetector
COPY detect_objects.py .
COPY detector.py .
COPY requirements.txt .
ADD data /objectDetector/data/
ADD models /objectDetector/models/
RUN apt-get update && \
apt-get install protobuf-compiler -y && \
apt-get install ffmpeg libsm6 libxext6 -y && \
apt-get install gcc -y
RUN pip3 install update && python3 -m pip install --upgrade pip
RUN pip3 install tensorflow-cpu==2.9.1
RUN pip3 install opencv-python==4.6.0.66
RUN pip3 install opencv-contrib-python
WORKDIR /objectDetector/models/research
RUN protoc object_detection/protos/*.proto --python_out=.
RUN cp object_detection/packages/tf2/setup.py .
RUN python -m pip install .
RUN python object_detection/builders/model_builder_tf2_test.py
WORKDIR /objectDetector/models/research
RUN pip3 install wheel && pip3 wheel . --wheel-dir=./wheels
FROM python:3-slim
RUN pip3 install update && python3 -m pip install --upgrade pip
COPY --from=base /objectDetector /objectDetector
WORKDIR /objectDetector
RUN pip3 install --no-index --find-links=/objectDetector/models/research/wheels -r requirements.txt
When I try to run my application in the final stage of the container, I receive the following error:
root@3f062f9a5d64:/objectDetector# python detect_objects.py
Traceback (most recent call last):
File "/objectDetector/detect_objects.py", line 3, in <module>
import cv2
ModuleNotFoundError: No module named 'cv2'
So per my understanding, it seems that opencv-python is not successfully moved from the 1st stage to the 2nd.
I have been searching around, and I found some good blogs and questions tackling the issue of multi-stage Dockerfiles, specifically for Python libraries. However, it seems I am missing something here.
Here are some references that I have been following to solve the issue:
How do I reduce a python (docker) image size using a multi-stage build?
Multi-stage build usage for cuda,cudnn,opencv and ffmpeg #806
So my question is: How can we use opencv in a multistage docker image?
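A frequent cause of this in two-stage builds like the one above is that the final python:3-slim stage gets neither the OpenCV wheels (if requirements.txt does not list opencv-python) nor the system libraries OpenCV needs at runtime. A rough sketch of a final stage addressing both, assuming the first stage also builds wheels for opencv-python and opencv-contrib-python into the same wheel directory:
FROM python:3-slim
# shared libraries opencv-python typically needs at runtime (assumed minimal set)
RUN apt-get update && \
    apt-get install -y --no-install-recommends libgl1 libglib2.0-0 && \
    rm -rf /var/lib/apt/lists/*
COPY --from=base /objectDetector /objectDetector
WORKDIR /objectDetector
# install opencv in this stage too, from the wheels built in the first stage
RUN pip3 install --no-index --find-links=/objectDetector/models/research/wheels \
    -r requirements.txt opencv-python opencv-contrib-python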

How do I install dateinfer inside my Docker image

Some background: I'm new to understanding Docker images and containers and how to write a Dockerfile. I currently have a Dockerfile which installs all the dependencies that I want through the pip install command, so it was very simple to build and deploy images.
But I now have a new requirement to use the dateinfer module, and that cannot be installed through the pip install command.
The repo has to be cloned first and then installed, and I'm having difficulty achieving this through a Dockerfile. The workaround I've been following for now is to run the container, install it manually in the directory with all the other dependencies, and commit the changes with dateinfer installed. But this is a very tedious and time-consuming process, and I want to achieve the same by just mentioning it in the Dockerfile along with all my other dependencies.
This is what my Dockerfile looks like:
FROM ubuntu:20.04
RUN apt update
RUN apt upgrade -y
RUN apt-get install -y python3
RUN apt-get install -y python3-pip
RUN DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC apt-get -y install tzdata
RUN apt-get install -y libenchant1c2a
RUN apt install git -y
RUN pip3 install argparse
RUN pip3 install boto3
RUN pip3 install numpy==1.19.1
RUN pip3 install scipy
RUN pip3 install pandas
RUN pip3 install scikit-learn
RUN pip3 install matplotlib
RUN pip3 install plotly
RUN pip3 install kaleido
RUN pip3 install fpdf
RUN pip3 install regex
RUN pip3 install pyenchant
RUN pip3 install openpyxl
ADD core.py /
ENTRYPOINT [ "/usr/bin/python3.8", "/core.py”]
So when I try to install Dateinfer like this:
RUN git clone https://github.com/nedap/dateinfer.git
RUN cd dateinfer
RUN pip3 install .
It throws the following error:
ERROR: Directory '.' is not installable. Neither 'setup.py' nor 'pyproject.toml' found.
The command '/bin/sh -c pip3 install .' returned a non-zero code: 1
How do I solve this?
Each RUN directive in a Dockerfile runs in its own subshell. If you write something like this:
RUN cd dateinfer
That is a no-op: it starts a new shell, changes directory, and then the shell exits. When the next RUN command executes, you're back in the / directory.
The easiest way of resolving this is to include your commands in a single RUN statement:
RUN git clone https://github.com/nedap/dateinfer.git && \
cd dateinfer && \
pip3 install .
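If you prefer to keep the steps separate, note that WORKDIR (unlike cd inside a RUN) does persist across instructions, so an equivalent sketch would be:
RUN git clone https://github.com/nedap/dateinfer.git
WORKDIR /dateinfer
RUN pip3 install .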
In fact, you would benefit from combining your other pip install commands into a single RUN as well; rather than a bunch of individual RUN commands, consider instead:
RUN pip3 install \
argparse \
boto3 \
numpy==1.19.1 \
scipy \
pandas \
scikit-learn \
matplotlib \
plotly \
kaleido \
fpdf \
regex \
pyenchant \
openpyxl
That will generally be faster because pip only needs to resolve dependencies once.
Rather than specifying all the packages individually on the command line, you could also put them into a requirements.txt file and then use pip install -r requirements.txt.
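For example (a sketch; the file name and location are up to you):
COPY requirements.txt .
RUN pip3 install -r requirements.txt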

How to run SSL_library_init() from Python 3.7 docker image

Up until recently I've been using the openssl library within the python:3.6.6-jessie docker image and things worked as intended.
I'm using a very basic Dockerfile configuration to install all necessary dependencies:
FROM python:3.6.6-jessie
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1
WORKDIR /code
RUN apt-get -qq update
RUN apt-get install openssl
RUN apt-get upgrade -y openssl
ADD requirements.txt /code/
RUN pip install --upgrade pip
RUN pip install -r requirements.txt
I access and initialize the library itself with these two lines:
openssl = cdll.LoadLibrary("libssl.so")
openssl.SSL_library_init()
Things were working great with this approach.
This week I was upgrading Python and the libraries, and as a result I switched to a newer docker image:
FROM python:3.7.5
...
This immediately caused openssl to stop working because of this exception:
AttributeError: /usr/lib/x86_64-linux-gnu/libssl.so.1.1: undefined symbol: SSL_library_init
From this error I understand that libssl no longer provides the SSL_library_init method (or so it seems), which is a rather weird issue because the initializer name in the openssl documentation is the same.
I also tried to resolve this using -stretch and -buster distributions but the issue remains.
What is the correct approach to run SSL_library_init in those newer distributions? Maybe some additional dockerfile configuration is required?
I think you need to install libssl1.0-dev. In OpenSSL 1.1.x, SSL_library_init was replaced by OPENSSL_init_ssl (and initialization now happens automatically), so libssl.so.1.1 no longer exports the SSL_library_init symbol; the 1.0 library still does.
RUN apt-get install -y libssl1.0-dev

Docker re-build time

We are trying to create a Docker container for a Python application. The Dockerfile installs dependencies using pip install. The Dockerfile looks like this:
FROM ubuntu:latest
RUN apt-get update -y
RUN apt-get install -y git wget python3-pip
RUN mkdir /app
COPY . /app
RUN pip3 install asn1crypto
RUN pip3 install cffi==1.10.0
RUN pip3 install click==6.7
RUN pip3 install conda==4.3.16
RUN pip3 install Flask==0.12.2
RUN pip3 install Flask-SSLify==0.1.5
RUN pip3 install Flask-SSLify==0.1.5
RUN pip3 install flask-restful==0.3.6
WORKDIR /app
ENTRYPOINT ["python3"]
CMD [ "X.py", "/app/Y.yml" ]
The Docker image gets created successfully; the issue is the rebuild time.
If nothing is changed in the dockerfile above
If a line is changed in the Dockerfile which comes after the pip install commands, the Docker daemon still runs all the pip install commands, downloading all the packages though not installing them.
Is there a way to optimize the rebuild?
Thx
Below is what I would do for now with the Dockerfile for optimization:
FROM ubuntu:latest
RUN apt-get update -y && apt-get install -y \
git \
wget \
python3-pip \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY ./requirements.txt .
RUN pip3 install -r requirements.txt
COPY . /app
ENTRYPOINT ["python3"]
CMD [ "X.py", "/app/Y.yml" ]
Reduce the number of layers by combining multiple commands into a single one, especially when they are interdependent. This helps reduce the image size.
Always try to do the COPY of the source code near the end, since a regular source code change invalidates the caching of the following layers.
Use a single requirements.txt file for installation through pip. Also define separate steps in case you have lots of packages to install, so that a normal source code change does not force package installation on every build.
Always clean up intermediate things which are not required in the final image.
Ref: https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/
