Dockerized Python (Streamlit) app uses wrong folder for Python libraries

I'm trying to dockerize a Streamlit app. Building the image works, but when I start the app, Python seems to look in the wrong path for my packages.
The app should run on OpenShift with Python 3.6.
FROM registry.access.redhat.com/redhat-openjdk-18/openjdk18-openshift
USER root
ADD content /
RUN yum -y update \
&& yum -y --enablerepo "*" install bzip2 \
python36-pip \
python36 \
python36-devel \
openssl \
&& yum clean all -y
RUN mkdir -p /usr/local/lib/python3.6/site-packages \
&& python3 -m ensurepip
ENV PIP_CONFIG_FILE=/opt/pip/pip.conf
RUN python3 -m pip install --upgrade pip
RUN python3 -m pip install -r /opt/pip/requirements.txt
ENV LC_ALL=en_US.utf-8
ENV LANG=en_US.utf-8
RUN useradd -rm -d /home/usdlmod -s /bin/bash -g root -u 1001 usdlmod -p "$(openssl passwd -1 usdlmod)"
RUN chgrp root /etc/passwd && chmod ug+rw /etc/passwd
USER usdlmod
CMD ["python", "-m", "streamlit.cli", "run", "main.py", "--server.port=8080"]
EXPOSE 8080
On OpenShift I get the following error: /usr/bin/python: No module named streamlit
How can I solve this error?

It might be that you have both Python 2 and Python 3 installed, in which case the plain python in your CMD resolves to Python 2 while streamlit was installed for Python 3. Changing the CMD to use python3 should solve it.
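For example, only the CMD line needs to change (a minimal sketch; the rest of the Dockerfile stays as in the question):
CMD ["python3", "-m", "streamlit.cli", "run", "main.py", "--server.port=8080"]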

Related

How can I set up a persistent Python virtual environment in a Dockerfile?

I'm building Python 3.7.4 (it's a hard requirement for other software) on a base Ubuntu 20.04 image using a Dockerfile. I'm following this guide.
Everything works fine if I run the image and follow the guide manually, but I want to set up my virtual environment in the Dockerfile and have the pip requirements persist when running the image.
Here's the relevant part of my Dockerfile:
...
RUN echo =============== Building and Install Python =============== \
&& cd /tmp \
&& wget https://www.python.org/ftp/python/3.7.4/Python-3.7.4.tgz \
&& tar xvf ./Python-3.7.4.tgz \
&& cd Python-3.7.4 \
&& ./configure --enable-optimizations --with-ensurepip=install \
&& make -j 8 \
&& sudo make install
ENV VIRTUAL_ENV=/opt/python-3.7.4
ENV PATH="$VIRTUAL_ENV:$PATH"
COPY "./hourequirements.txt" /usr/local/
RUN echo =============== Setting up Python Virtual Environment =============== \
&& python3 -m venv $VIRTUAL_ENV \
&& source $VIRTUAL_ENV/bin/activate \
&& pip install --upgrade pip \
&& pip install --no-input -r /usr/local/hourequirements.txt
...
The Dockerfile builds without errors, but when I run the image the environment doesn't exist and Python 3.7.4 doesn't show any of the installed requirements.
How can I install Python modules into the virtual environment with pip in the Dockerfile and have them persist when the Docker image runs?
As usual, I found the answer just after posting.
I changed:
ENV PATH="$VIRTUAL_ENV:$PATH"
to:
ENV PATH="$VIRTUAL_ENV/bin:$PATH"
in the Dockerfile and it started working correctly. The venv's python and pip executables live in $VIRTUAL_ENV/bin, so prepending the venv root itself to PATH has no effect.
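For reference, a minimal sketch of the usual pattern: source activate inside a RUN step does not persist to later layers or to the running container, so putting the venv's bin directory first on PATH is the durable equivalent of activation:
ENV VIRTUAL_ENV=/opt/python-3.7.4
RUN python3 -m venv $VIRTUAL_ENV
# from here on, pip and python resolve to the venv in every later layer and at runtime
ENV PATH="$VIRTUAL_ENV/bin:$PATH"
RUN pip install --upgrade pip \
&& pip install --no-input -r /usr/local/hourequirements.txt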

ModuleNotFoundError: No module named 'pyspark' while running PySpark in Docker

Getting the following error while running PySpark in Docker:
Traceback (most recent call last):
  File "/opt/application/main.py", line 6, in <module>
    from pyspark import SparkConf, SparkContext
ModuleNotFoundError: No module named 'pyspark'
My Dockerfile is as follows:
FROM centos
ENV DAEMON_RUN=true
ENV SPARK_VERSION=2.4.7
ENV HADOOP_VERSION=2.7
WORKDIR /opt/application
RUN yum -y install python36
RUN yum -y install wget
ENV PYSPARK_PYTHON python3.6
ENV PYSPARK_DRIVER_PYTHON python3.6
RUN ln -s /usr/bin/python3.6 /usr/local/bin/python
RUN wget https://bootstrap.pypa.io/get-pip.py
RUN python get-pip.py
RUN pip3.6 install numpy
RUN pip3.6 install pandas
RUN wget --no-verbose http://apache.mirror.iphh.net/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz && tar -xvzf spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz \
&& mv spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION} spark \
&& rm spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz
ENV SPARK_HOME=/usr/local/bin/spark
RUN yum -y install java-1.8.0-openjdk
ENV JAVA_HOME /usr/lib/jvm/jre
COPY main.py .
RUN chmod +x /opt/application/main.py
CMD ["/opt/application/main.py"]
You forgot to install pyspark in your Dockerfile.
FROM centos
ENV DAEMON_RUN=true
ENV SPARK_VERSION=2.4.7
ENV HADOOP_VERSION=2.7
WORKDIR /opt/application
RUN yum -y install python36
RUN yum -y install wget
ENV PYSPARK_PYTHON python3.6
ENV PYSPARK_DRIVER_PYTHON python3.6
RUN ln -s /usr/bin/python3.6 /usr/local/bin/python
RUN wget https://bootstrap.pypa.io/get-pip.py
RUN python get-pip.py
RUN pip3.6 install numpy
RUN pip3.6 install pandas
RUN pip3.6 install pyspark # add this line.
RUN wget --no-verbose http://apache.mirror.iphh.net/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz && tar -xvzf spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz \
&& mv spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION} spark \
&& rm spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz
ENV SPARK_HOME=/usr/local/bin/spark
RUN yum -y install java-1.8.0-openjdk
ENV JAVA_HOME /usr/lib/jvm/jre
COPY main.py .
RUN chmod +x /opt/application/main.py
CMD ["/opt/application/main.py"]
Edit: Dockerfile improvement:
FROM centos
ENV DAEMON_RUN=true
ENV SPARK_VERSION=2.4.7
ENV HADOOP_VERSION=2.7
WORKDIR /opt/application
RUN yum -y install python36 wget java-1.8.0-openjdk # you can install python36, wget and java in one step
ENV PYSPARK_PYTHON python3.6
ENV PYSPARK_DRIVER_PYTHON python3.6
RUN ln -s /usr/bin/python3.6 /usr/local/bin/python
RUN wget https://bootstrap.pypa.io/get-pip.py \
&& python get-pip.py \
&& pip3.6 install numpy==1.19 pandas==1.1.5 pyspark==3.0.2 # you should also pin the versions you need; pandas 1.2.x does not support Python 3.6
RUN wget --no-verbose http://apache.mirror.iphh.net/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz && tar -xvzf spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz \
&& mv spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION} spark \
&& rm spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz
# note: the tarball is extracted under WORKDIR, so spark actually lives in /opt/application/spark
ENV SPARK_HOME=/opt/application/spark
ENV JAVA_HOME /usr/lib/jvm/jre
COPY main.py .
RUN chmod +x /opt/application/main.py
CMD ["/opt/application/main.py"]

Installation of R libraries in the Conda environment

I need to create a Conda environment and install dependencies (Python, R) in this environment.
All libraries - Python and R - install fine, as far as I can see in the logs. No errors or warnings.
But it looks like the R dependencies from the file r_requirements.R are not installed into the same environment (myenvpython).
When I build and use the Docker image, I can use the installed Python libraries in the environment, but loading the R libraries fails.
How can I fix it?
FROM conda/miniconda3
COPY code/ci_dependencies.yml /setup/
COPY code/r_requirements.R /setup/
# activate environment
ENV PATH /usr/local/envs/myenvpython/bin:$PATH
RUN apt-get update && \
apt-get -y install sudo
# RUN useradd -m docker && echo "docker:docker" | chpasswd && adduser docker sudo
RUN conda update -n base -c defaults conda && \
conda install python=3.7.5 && \
conda env create -f /setup/ci_dependencies.yml && \
/bin/bash -c "source activate myenvpython" && \
az --version && \
chmod -R 777 /usr/local/envs/myenvpython/lib/python3.7
RUN apt-get install -y libssl-dev libsasl2-dev
RUN Rscript /setup/r_requirements.R
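One thing worth checking (an assumption, since ci_dependencies.yml is not shown): each RUN starts a fresh shell, so the source activate step does not persist, and the final Rscript resolves through PATH. That only points inside myenvpython if the environment ships its own R (e.g. r-base among the conda dependencies); otherwise a system R is used and the libraries land outside the env. Calling the env's interpreter explicitly removes the ambiguity:
RUN which Rscript && Rscript --version
RUN /usr/local/envs/myenvpython/bin/Rscript /setup/r_requirements.R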

can't write dockerfile with installations of ubuntu, numpy, opencv

The plan is to deploy a pretrained face recognition model, but before that I need to install some libraries.
The idea behind Docker is that it brings all the needed libraries and builds the entire environment without much overhead: one can just build the Dockerfile and it runs all the other steps in turn.
Libraries to install:
Ubuntu 16.04.6 LTS
Python 3.6.10 (3.5.x should be fine also)
OpenCV 3.3.
NumPy
imutils https://github.com/jrosebr1/imutils
dlib http://dlib.net/
face_recognition https://github.com/ageitgey/face_recognition
I'm trying to use curl to download packages from URLs, but it's not working.
My Dockerfile:
FROM ubuntu:16.04.6
RUN apt-get update && apt-get install -y curl bzip2
curl -o numpy
&& sudo apt-get install numpy
&& curl install imutils https://github.com/jrosebr1/imutils
&& curl install dlib https://dlib.net
&& sudo git clone https://github.com/ageitgey/face_recognition.git
&& curl python-opencv https://opencv.org/
&& echo 'export PATH="~/anaconda3/bin:$PATH"' >> ~/.bashrc \
&& ~/anaconda3/bin/conda update -n base conda \
&& rm miniconda_install.sh \
&& rm -rf /var/lib/apt/lists/* \
&& /bin/bash -c "source ~/.bashrc"
ENV PATH="~/anaconda3/bin:${PATH}"
##################################################
# Setup env for current project:
##################################################
EXPOSE 8000
RUN /bin/bash -c "conda create -y -n PYMODEL3.6"
ADD requirements.txt /tmp/setup/requirements.txt
RUN /bin/bash -c "source activate PYMODEL3.6 && pip install -r /tmp/setup/requirements.txt"
WORKDIR /Service
ADD Service /Service
ENTRYPOINT ["/bin/bash", "-c", "source activate PYMODEL3.6 && ./run.sh"]
The face model is pretrained.
There are two Python files that do the actual detection, 128-d encoding, and recognition.
The usage is like this:
# detect a face; if there is a face, encode it and return a pickle
python3 encode.py --dataset dataset_id --encodings encodings.pickle --confidence 0.9
# recognize using the pickle
python3 face_recognizer.py --encodings encodings.pickle --image dataset_webcam/3_1.jpg --confidence 0.9 --tolerance 0.5
Should I include them in the Dockerfile?
I would propose that you use a Dockerfile like the following, assuming you have all your requirements (numpy, imutils, etc.) inside your requirements.txt file, and your encode.py and face_recognizer.py files in your Service folder:
FROM python:3.6.10
RUN mkdir /tmp/setup
ADD requirements.txt /tmp/setup/requirements.txt
RUN pip install --no-cache-dir --upgrade setuptools && \
pip install --no-cache-dir --upgrade pip && \
pip install --no-cache-dir -r /tmp/setup/requirements.txt
WORKDIR /Service
ADD Service /Service/
CMD ["./run.sh"]
EXPOSE 8000
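One caveat (an assumption about the requirements, since requirements.txt is not shown): dlib compiles from source when installed via pip and needs CMake on top of the C++ toolchain the full python:3.6.10 image already ships, so a line like this may be needed before the pip install:
RUN apt-get update && apt-get install -y --no-install-recommends cmake \
&& rm -rf /var/lib/apt/lists/*
The image can then be built and started with, for example:
docker build -t pymodel .
docker run --rm -p 8000:8000 pymodel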

How to run a bash script through PyCharm?

I have the following set of bash commands
docker pull mcr.microsoft.com/mssql/server:2017-latest
docker run -e 'ACCEPT_EULA=Y' -e 'MSSQL_SA_PASSWORD=my_password' \
--name 'sql1' -p 1401:1433 \
-v "my_space":/opt/project \
-d mcr.microsoft.com/mssql/server:2017-latest
winpty docker exec -it sql1 bash
mkdir -p /var/opt/mssql/backup
cp my_db /var/opt/mssql/backup/
/opt/mssql-tools/bin/sqlcmd -S localhost -U SA -P "my_password" -i /opt/project/scripts/database_import/sql_script.sql
apt-get update -y
apt-get install python3-pip -y
python3 -m pip install pymssql
python3 -m pip install pandas==0.19.2
python3 -m pip install time
python3 -m pip install sqlalchemy
python3 -m pip install sqlalchemy_utils
cd /opt/project/
python3 scripts/database_import/import_database.py
Essentially, this set of commands pulls the MSSQL Server image, restores a database, installs some Python packages, and runs a Python script inside the MSSQL Docker container.
Is there a way to run this bash script from PyCharm?
Sure thing. If you are using the latest version, then Shell Script support should be available: https://www.jetbrains.com/help/idea/shell-scripts.html
So, for example, when I open a test.sh file in the editor, I can just click the green run button in the gutter and PyCharm will run it, or create a Run Configuration for it (see the link above).
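A minimal wrapper, assuming the commands from the question are saved as run_all.sh (hypothetical name) in the project root:
#!/usr/bin/env bash
set -euo pipefail  # stop on the first failing command
docker pull mcr.microsoft.com/mssql/server:2017-latest
# ... the remaining commands from the question ...
The Shell Script plugin picks the file up by its .sh extension, so the gutter run button and Run Configurations work on it directly.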
