can't write dockerfile with installations of ubuntu, numpy, opencv - python

the plan is to deploy pretrained face recog-n model. But before i need to install some libs.
The idea behind docker is that it brings all the needed libs and builds entire 'env' without much overhead. One can just start dockerfile and it runs all other scripts in turn.
libs to install:
Ubuntu 16.04.6 LTS
Python 3.6.10 (3.5.x should be fine also)
OpenCV 3.3.
NumPy
imutils https://github.com/jrosebr1/imutils
dlib http://dlib.net/
face_recognition https://github.com/ageitgey/face_recognition
i m trying to use curl to download pkgs from URLs, but it's not working.
my dockerfile:
FROM ubuntu:16.04.6
RUN apt-get update && apt-get install -y curl bzip2
curl -o numpy
&& sudo apt-get install numpy
&& curl install imutils https://github.com/jrosebr1/imutils
&& curl install dlib https://dlib.net
&& sudo git clone https://github.com/ageitgey/face_recognition.git
&& curl python-opencv https://opencv.org/
&& echo 'export PATH="~/anaconda3/bin:$PATH"' >> ~/.bashrc \
&& ~/anaconda3/bin/conda update -n base conda \
&& rm miniconda_install.sh \
&& rm -rf /var/lib/apt/lists/* \
&& /bin/bash -c "source ~/.bashrc"
ENV PATH="~/anaconda3/bin:${PATH}"
##################################################
# Setup env for current project:
##################################################
EXPOSE 8000
RUN /bin/bash -c "conda create -y -n PYMODEL3.6"
ADD requirements.txt /tmp/setup/requirements.txt
RUN /bin/bash -c "source activate PYMODEL3.6 && pip install -r /tmp/setup/requirements.txt"
WORKDIR /Service
ADD Service /Service
ENTRYPOINT ["/bin/bash", "-c", "source activate PYMODEL3.6 && ./run.sh"]
the face model is pretrained.
there are 2 python files that do actual detection, 128d encoding and recognition.
the usage is like this:
#detect face, if there is face - encode it, return pickle
python3 encode.py --dataset dataset_id --encodings encodings.pickle
--confidence 0.9
#recognize using pickle
python3 face_recognizer.py --encodings encodings.pickle --image
dataset_webcam/3_1.jpg --confidence 0.9 --tolerance 0.5
should I include them in the dockerfile?

I would propose you to use a Dockerfile like the following, assuming you have all your requirements (numpy, imutils, etc...) inside your requirements.txt file, and your encode.py and face_recognizer.py files in your Service folder:
FROM python:3.6.10
RUN mkdir /tmp/setup
ADD requirements.txt /tmp/setup/requirements.txt
RUN pip install --no-cache-dir --upgrade setuptools && \
pip install --no-cache-dir --upgrade pip && \
pip install --no-cache-dir -r /tmp/setup/requirements.txt
WORKDIR /Service
ADD Service /Service/
CMD ["./run.sh"]
EXPOSE 8000

Related

pip - is it possible to preinstall a lib, so that any requirements.txt that is run containing that lib won't need to re-build it?

So my scenario is that I'm trying to create a Dockerfile that I can build on my Mac for running Spacy in production. The production server contains a Nvidia GPU with CUDA. To get Spacy to use GPU, I need the lib cupy-cuda117. That lib won't build on my Mac because it can't find the CUDA GPU. So what I'm trying to do is create an image from the Linux server that has the CUDA GPU, that's already pre-build cupy-cuda117 on it. I'll then use that as the parent image for Docker, as all other libs in my requirements.txt will build on my Mac.
My goal at the moment is to build that lib into the server, but I'm not sure the right path forward. Is it sudo pip3 intall cupy-cuda117? Or should I create a venv, and pip3 install cupy-cuda117? Basically my goal is later to add all the other app code and full requirements.txt, and when pip3 install -r requirements.txt is run by Docker, it'll download/build/install everything, but not cupy-cuda117, because hopefully it'll see that it's already been built.
FYI the handling of using GPU on the prod server and CPU on the dev computer i've already got sorted, it's just the building of that one package I'm stuck on. I basically just need it not to try and rebuild on my Mac. Thanks!
FROM "debian:bullseye-20210902-slim" as builder
# install build dependencies
RUN apt-get update -y && apt-get install --no-install-recommends -y build-essential git locales \
&& apt-get clean && rm -f /var/lib/apt/lists/*_*
# Set the locale
RUN sed -i '/en_US.UTF-8/s/^# //g' /etc/locale.gen && locale-gen
ENV LANG en_US.UTF-8
ENV LANGUAGE en_US:en
ENV LC_ALL en_US.UTF-8
WORKDIR "/app"
RUN apt update -y && apt upgrade -y && apt install -y sudo
# Install Python 3.9 reqs
RUN sudo apt install -y --no-install-recommends wget libxml2 libstdc++6 zlib1g-dev libncurses5-dev libgdbm-dev libnss3-dev libssl-dev libsqlite3-dev libreadline-dev libffi-dev curl libbz2-dev
# Install Python 3.9
RUN wget --no-check-certificate https://www.python.org/ftp/python/3.9.1/Python-3.9.1.tgz && \
tar -xf Python-3.9.1.tgz && \
cd Python-3.9.1 && \
./configure --enable-optimizations && \
make -j $(nproc) && \
sudo make altinstall && \
cd .. && \
sudo rm -rf Python-3.9.1 && \
sudo rm -rf Python-3.9.1.tgz
# Install CUDA
RUN wget --no-check-certificate https://developer.download.nvidia.com/compute/cuda/11.7.1/local_installers/cuda_11.7.1_515.65.01_linux.run && \
sudo chmod +x cuda_11.7.1_515.65.01_linux.run && \
sudo ./cuda_11.7.1_515.65.01_linux.run --silent --override --toolkit --samples --toolkitpath=/usr/local/cuda-11.7 --samplespath=/usr/local/cuda --no-opengl-libs && \
sudo ln -s /usr/local/cuda-11.7 /usr/local/cuda && \
sudo rm -rf cuda_11.7.1_515.65.01_linux.run
## Add NVIDIA CUDA to PATH and LD_LIBRARY_PATH ##
RUN echo 'case ":${PATH}:" in\n\
*:"/usr/local/cuda-11.7/lib64":*)\n\
;;\n\
*)\n\
if [ -z "${PATH}" ] ; then\n\
PATH=/usr/local/cuda-11.7/bin\n\
else\n\
PATH=/usr/local/cuda-11.7/bin:$PATH\n\
fi\n\
esac\n\
case ":${LD_LIBRARY_PATH}:" in\n\
*:"/usr/local/cuda-11.7/lib64":*)\n\
;;\n\
*)\n\
if [ -z "${LD_LIBRARY_PATH}" ] ; then\n\
LD_LIBRARY_PATH=/usr/local/cuda-11.7/lib64\n\
else\n\
LD_LIBRARY_PATH=/usr/local/cuda-11.7/lib64:$LD_LIBRARY_PATH\n\
fi\n\
esac\n\
export PATH LD_LIBRARY_PATH\n\
export GLPATH=/usr/lib/x86_64-linux-gnu\n\
export GLLINK=-L/usr/lib/x86_64-linux-gnu\n\
export DFLT_PATH=/usr/lib\n'\
>> ~/.bashrc
ENV PATH="$PATH:/usr/local/cuda-11.7/bin"
ENV LD_LIBRARY_PATH="/usr/local/cuda-11.7/lib64"
ENV GLPATH="/usr/lib/x86_64-linux-gnu"
ENV GLLINK="-L/usr/lib/x86_64-linux-gnu"
ENV DFLT_PATH="/usr/lib"
RUN python3.9 -m pip install -U wheel setuptools
RUN sudo pip3.9 install torch torchvision torchaudio
RUN sudo pip3.9 install -U 'spacy[cuda117,transformers]'
# set runner ENV
ENV ENV="prod"
CMD ["bash"]
My local Dockerfile is this:
FROM myacct/myimg:latest
ENV ENV=prod
WORKDIR /code
COPY ./requirements.txt /code/requirements.txt
COPY ./requirements /code/requirements
RUN pip3 install --no-cache-dir -r /code/requirements.txt
COPY ./app /code/app
ENV ENV=prod
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "80"]

Dockerfile installs newer python insteam of FROM source

I'm trying to make a new image for python dockerfile, but keeps installing Python 3.10 instead of Python 3.8.
My Dockerfile looks like this:
FROM python:3.8.16
COPY requirements.txt requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
## follow installation from xtb-python github:
ENV CONDA_DIR /opt/conda
RUN apt-get update && apt-get install -y \
wget \
&& rm -rf /var/lib/apt/lists/*
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh \
&& /bin/bash ~/miniconda.sh -b -p /opt/conda
ENV PATH=$CONDA_DIR/bin:$PATH
RUN conda config --add channels conda-forge \
&& conda install -y -c conda-forge ...
I don't know much about conda (but have to use it).
It is Conda messing with the my Python or I did it something wrong?

How to use GITHUB_TOKEN in pip's requirements.txt without setting it as env variable in Dockerfile?

I have a private repos that can be installable via python's pip:
requirements.txt
git+https://${GITHUB_TOKEN}#github.com/MY_ACCOUNT/MY_REPO.git
And a Dockerfile:
Dockerfile
FROM python:3.8.11
RUN apt-get update && \
apt-get -y install gcc curl && \
rm -rf /var/lib/apt/lists/*
ARG GITHUB_TOKEN
COPY ./requirements.txt /tmp/requirements.txt
RUN pip install -r /tmp/requirements.txt
It worked perfectly when i build up an image:
$ docker build . --build-arg GITHUB_TOKEN=THIS_IS_MY_GITHUB_TOKEN -t wow/my_app:latest
But when I inspected image, it shows GITHUB_TOKEN in Cmd section:
$ docker image inspect wow/my_app:latest
...
"ContainerConfig": {
...
"Cmd": [
"|1",
"GITHUB_TOKEN=THIS_IS_MY_GITHUB_TOKEN", # Here!
"/bin/sh",
"-c",
"pip install -r /tmp/requirements.txt"
],
...
},
...
I think this could lead to a security problem. How can I solve this so that anything credential info not appear in docker inspect?
If you build your image using BuildKit, you can take advantage of Docker build secrets.
You would structure your Dockerfile something like this:
FROM python:3.8.11
RUN apt-get update && \
apt-get -y install gcc curl && \
rm -rf /var/lib/apt/lists/*
COPY ./requirements.txt /tmp/requirements.txt
RUN --mount=type=secret,id=GITHUB_TOKEN \
GITHUB_TOKEN=$(cat /run/secrets/GITHUB_TOKEN) \
pip install -r /tmp/requirements.txt
And then if you have a GITHUB_TOKEN environment variable in your local environment, you could run:
docker buildx build --secret id=GITHUB_TOKEN -t myimage .
Or if you have the value in a file, you could run:
docker buildx build \
--secret id=GITHUB_TOKEN,src=github_token.txt \
-t myimage .
In either case, the setting will not be baked into the resulting image. See the linked documentation for more information.

How can I setup a persistence Python virtual environment in a Dockerfile?

I'm building Python 3.7.4 (It's a hard requirement for other software) on a base Ubuntu 20.04 image using a Dockerfile. I'm following this guide.
Everything works fine if I run the image and follow the guide, but I want to setup my virtual environment in the Dockerfile and have the pip requirements persistent when running the image.
Here's the relevant part of my Dockerfile:
...
RUN echo =============== Building and Install Python =============== \
&& cd /tmp \
&& wget https://www.python.org/ftp/python/3.7.4/Python-3.7.4.tgz \
&& tar xvf ./Python-3.7.4.tgz \
&& cd Python-3.7.4 \
&& ./configure --enable-optimizations --with-ensurepip=install \
&& make -j 8 \
&& sudo make install
ENV VIRTUAL_ENV=/opt/python-3.7.4
ENV PATH="$VIRTUAL_ENV:$PATH"
COPY "./hourequirements.txt" /usr/local/
RUN echo =============== Setting up Python Virtual Environment =============== \
&& python3 -m venv $VIRTUAL_ENV \
&& source $VIRTUAL_ENV/bin/activate \
&& pip install --upgrade pip \
&& pip install --no-input -r /usr/local/hourequirements.txt
...
The Dockerfile builds without errors, but when I run the image the environment doesn't exist and python 3.7.4 doesn't show any of the installed requirements.
How can I install Python modules in the virtual environment using PIP in the Dockerfile and have them persist when the docker image runs?
Usual find answer just after post.
I changed:
ENV PATH="$VIRTUAL_ENV:$PATH"
to:
ENV PATH="$VIRTUAL_ENV/bin:$PATH"
in Dockerfile and started working correctly.

How can I activate conda venv in Dockerfile? (pip not found)

I'm trying to build a docker image like
FROM ubuntu:latest
RUN apt update && apt upgrade -y && \
apt install -y git wget libsuitesparse-dev gcc g++ swig && \
cd ~ && wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh && \
sh Miniconda3-latest-Linux-x86_64.sh -b && rm Miniconda3-latest-Linux-x86_64.sh && \
PATH=$PATH:~/miniconda3/condabin && \
conda init bash && conda upgrade -y conda && /bin/bash -c "source ~/.bashrc" && \
pip install numpy scipy matplotlib scikit_umfpack
However, /bin/bash -c "source ~/.bashrc" does not work... so I got /bin/sh: 1: pip: not found
How can I build a docker image installing miniconda and python requirements using pip at the same time?
I would recommend using a pre-existing Docker image that already has Anaconda installed. For example, this link has a Docker image endorsed by Anaconda itself. There may be others on Dockerhub that also have Anaconda installed already. In the case you already tried an image with Anaconda and it didn't meet your needs, let me know.

Categories

Resources