I have a Python program that needs to run in Azure Kubernetes Service.
Below is my Dockerfile - Python is already installed:
#Ubuntu Base image with openjdk8 with TomEE
FROM demo.azurecr.io/ubuntu/tomee/openjdk8:8.0.x
RUN apt-get update && apt-get install -y telnet && apt-get install -y ksh && apt-get install -y python2.7.x && apt-get -y clean && rm -rf /var/lib/apt/lists/*
However, I don't know how to install pip and the dependent libraries I need (e.g. pymssql). How can I do that?
The best option is installing Miniconda in the Docker image. I always use it when I need Python in an image that doesn't ship with Python or pip.
Here is the part that installs Miniconda in one of my simple Docker images:
FROM debian
RUN apt-get update && apt-get install -y curl wget
RUN rm -rf /var/lib/apt/lists/*
# Download and install Miniconda non-interactively (-b installs to /root/miniconda3)
RUN wget \
https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh \
&& mkdir /root/.conda \
&& bash Miniconda3-latest-Linux-x86_64.sh -b \
&& rm -f Miniconda3-latest-Linux-x86_64.sh
# Put conda on PATH so later RUN steps (and the check below) can find it
ENV PATH="/root/miniconda3/bin:${PATH}"
RUN conda --version
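To cover the pymssql part of the question: once conda is on the PATH as above, the libraries can be installed inside the image like any other package. A minimal sketch, assuming the conda-forge channel is acceptable (pymssql is also on PyPI, so plain pip works too):
# install the driver library; conda resolves its native dependencies for you
RUN conda install -y -c conda-forge pymssql
# or, with the pip that ships with Miniconda:
# RUN pip install pymssql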
So my scenario is that I'm trying to create a Dockerfile that I can build on my Mac for running Spacy in production. The production server has an Nvidia GPU with CUDA. To get Spacy to use the GPU, I need the lib cupy-cuda117. That lib won't build on my Mac because it can't find a CUDA GPU. So what I'm trying to do is create an image on the Linux server that has the CUDA GPU, with cupy-cuda117 already pre-built in it. I'll then use that as the parent image for Docker, as all the other libs in my requirements.txt will build on my Mac.
My goal at the moment is to build that lib into the server image, but I'm not sure of the right path forward. Is it sudo pip3 install cupy-cuda117? Or should I create a venv and pip3 install cupy-cuda117? Basically my goal is to later add all the other app code and the full requirements.txt, and when pip3 install -r requirements.txt is run by Docker, it'll download/build/install everything except cupy-cuda117, because hopefully it'll see that it's already been built.
FYI, the handling of using the GPU on the prod server and the CPU on the dev computer I've already got sorted; it's just the building of that one package I'm stuck on. I basically just need it not to try and rebuild on my Mac. Thanks!
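For reference, the behaviour I'm relying on is that pip skips a requirement that an already-installed distribution satisfies, so if the parent image ships cupy-cuda117 at the pinned version, a later pip3 install -r requirements.txt should report it as already satisfied instead of rebuilding it. A rough sketch of that check (the version is whatever the parent image ends up with):
# inside the parent image: confirm what is already installed
pip3 show cupy-cuda117
# then pin that exact version in requirements.txt, e.g.:
# cupy-cuda117==<version reported above>
Here is the builder Dockerfile I build on the GPU server: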
FROM "debian:bullseye-20210902-slim" as builder
# install build dependencies
RUN apt-get update -y && apt-get install --no-install-recommends -y build-essential git locales \
&& apt-get clean && rm -f /var/lib/apt/lists/*_*
# Set the locale
RUN sed -i '/en_US.UTF-8/s/^# //g' /etc/locale.gen && locale-gen
ENV LANG en_US.UTF-8
ENV LANGUAGE en_US:en
ENV LC_ALL en_US.UTF-8
WORKDIR "/app"
RUN apt update -y && apt upgrade -y && apt install -y sudo
# Install Python 3.9 reqs
RUN sudo apt install -y --no-install-recommends wget libxml2 libstdc++6 zlib1g-dev libncurses5-dev libgdbm-dev libnss3-dev libssl-dev libsqlite3-dev libreadline-dev libffi-dev curl libbz2-dev
# Install Python 3.9
RUN wget --no-check-certificate https://www.python.org/ftp/python/3.9.1/Python-3.9.1.tgz && \
tar -xf Python-3.9.1.tgz && \
cd Python-3.9.1 && \
./configure --enable-optimizations && \
make -j $(nproc) && \
sudo make altinstall && \
cd .. && \
sudo rm -rf Python-3.9.1 && \
sudo rm -rf Python-3.9.1.tgz
# Install CUDA
RUN wget --no-check-certificate https://developer.download.nvidia.com/compute/cuda/11.7.1/local_installers/cuda_11.7.1_515.65.01_linux.run && \
sudo chmod +x cuda_11.7.1_515.65.01_linux.run && \
sudo ./cuda_11.7.1_515.65.01_linux.run --silent --override --toolkit --samples --toolkitpath=/usr/local/cuda-11.7 --samplespath=/usr/local/cuda --no-opengl-libs && \
sudo ln -s /usr/local/cuda-11.7 /usr/local/cuda && \
sudo rm -rf cuda_11.7.1_515.65.01_linux.run
## Add NVIDIA CUDA to PATH and LD_LIBRARY_PATH ##
RUN echo 'case ":${PATH}:" in\n\
*:"/usr/local/cuda-11.7/lib64":*)\n\
;;\n\
*)\n\
if [ -z "${PATH}" ] ; then\n\
PATH=/usr/local/cuda-11.7/bin\n\
else\n\
PATH=/usr/local/cuda-11.7/bin:$PATH\n\
fi\n\
esac\n\
case ":${LD_LIBRARY_PATH}:" in\n\
*:"/usr/local/cuda-11.7/lib64":*)\n\
;;\n\
*)\n\
if [ -z "${LD_LIBRARY_PATH}" ] ; then\n\
LD_LIBRARY_PATH=/usr/local/cuda-11.7/lib64\n\
else\n\
LD_LIBRARY_PATH=/usr/local/cuda-11.7/lib64:$LD_LIBRARY_PATH\n\
fi\n\
esac\n\
export PATH LD_LIBRARY_PATH\n\
export GLPATH=/usr/lib/x86_64-linux-gnu\n\
export GLLINK=-L/usr/lib/x86_64-linux-gnu\n\
export DFLT_PATH=/usr/lib\n'\
>> ~/.bashrc
ENV PATH="$PATH:/usr/local/cuda-11.7/bin"
ENV LD_LIBRARY_PATH="/usr/local/cuda-11.7/lib64"
ENV GLPATH="/usr/lib/x86_64-linux-gnu"
ENV GLLINK="-L/usr/lib/x86_64-linux-gnu"
ENV DFLT_PATH="/usr/lib"
RUN python3.9 -m pip install -U wheel setuptools
RUN sudo pip3.9 install torch torchvision torchaudio
RUN sudo pip3.9 install -U 'spacy[cuda117,transformers]'
# set runner ENV
ENV ENV="prod"
CMD ["bash"]
My local Dockerfile is this:
FROM myacct/myimg:latest
ENV ENV=prod
WORKDIR /code
COPY ./requirements.txt /code/requirements.txt
COPY ./requirements /code/requirements
RUN pip3 install --no-cache-dir -r /code/requirements.txt
COPY ./app /code/app
ENV ENV=prod
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "80"]
I have followed the tutorial for Azure Functions using Python.
Everything went smoothly.
For the next step I need to add a C-compiled dependency.
I just added the C compiler plus the rows that build the dependency.
I have edited the Dockerfile and it now looks like this:
FROM mcr.microsoft.com/azure-functions/python:3.0-python3.7
ENV AzureWebJobsScriptRoot=/home/site/wwwroot \
AzureFunctionsJobHost__Logging__Console__IsEnabled=true
COPY requirements.txt /
RUN pip install -r /requirements.txt
COPY . /home/site/wwwroot
FROM julia:1.3
RUN apt-get update && apt-get install -y gcc g++ && rm -rf /var/lib/apt/lists/*
FROM python:3.7
RUN pip install numpy
RUN wget http://prdownloads.sourceforge.net/ta-lib/ta-lib-0.4.0-src.tar.gz && \
tar -xvzf ta-lib-0.4.0-src.tar.gz && \
cd ta-lib/ && \
./configure --prefix=/usr && \
make && \
make install
RUN rm -R ta-lib ta-lib-0.4.0-src.tar.gz
When I build this Dockerfile it looks fine,
but when I run it, it just opens up a GCC prompt.
What am I doing wrong?
Thanks
I found an issue with your multi-stage FROM statements: every FROM starts a new build stage, so only what comes after the last FROM ends up in the final image, and everything installed in the earlier stages is thrown away. Also, you needed to add apt-get install make.
The following works:
FROM mcr.microsoft.com/azure-functions/python:3.0-python3.7
ENV AzureWebJobsScriptRoot=/home/site/wwwroot \
AzureFunctionsJobHost__Logging__Console__IsEnabled=true
COPY requirements.txt /
RUN pip install -r /requirements.txt
COPY . /home/site/wwwroot
# Adding "apt-get install make" here
RUN apt-get update && apt-get install -y make gcc g++ && rm -rf /var/lib/apt/lists/*
RUN pip install numpy
RUN wget http://prdownloads.sourceforge.net/ta-lib/ta-lib-0.4.0-src.tar.gz && \
tar -xvzf ta-lib-0.4.0-src.tar.gz && \
cd ta-lib/ && \
./configure --prefix=/usr && \
make && \
make install
RUN rm -R ta-lib ta-lib-0.4.0-src.tar.gz
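One caveat: if your requirements.txt also lists the Python wrapper for this library (the TA-Lib package on PyPI), that pip install has to happen after the C library above has been built and installed, or the wrapper's build will fail. A minimal sketch of that extra step, assuming the wrapper is what your function code imports:
# install the Python bindings now that the native ta-lib library is under /usr
RUN pip install TA-Lib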
I'm trying to build a Docker image like this:
FROM ubuntu:latest
RUN apt update && apt upgrade -y && \
apt install -y git wget libsuitesparse-dev gcc g++ swig && \
cd ~ && wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh && \
sh Miniconda3-latest-Linux-x86_64.sh -b && rm Miniconda3-latest-Linux-x86_64.sh && \
PATH=$PATH:~/miniconda3/condabin && \
conda init bash && conda upgrade -y conda && /bin/bash -c "source ~/.bashrc" && \
pip install numpy scipy matplotlib scikit_umfpack
However, /bin/bash -c "source ~/.bashrc" does not work (the source runs in a child bash process, so the PATH change it makes never reaches the shell that runs pip), and I end up with /bin/sh: 1: pip: not found.
How can I build a Docker image that installs Miniconda and then installs the Python requirements with pip?
I would recommend using a pre-existing Docker image that already has Anaconda installed. For example, Anaconda publishes its own images (e.g. continuumio/anaconda3 on Docker Hub), and there may be others on Docker Hub that also come with Anaconda. If you have already tried an image with Anaconda and it didn't meet your needs, let me know.
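As a rough illustration of that suggestion, here is a minimal sketch that starts from the official Miniconda image (continuumio/miniconda3), where conda and pip are already on the PATH so no .bashrc sourcing is needed; the package list is just the one from the question:
FROM continuumio/miniconda3
# native build dependencies for scikit_umfpack
RUN apt-get update && apt-get install -y git wget libsuitesparse-dev gcc g++ swig && \
    rm -rf /var/lib/apt/lists/*
# pip from the Miniconda base environment is already first on PATH
RUN pip install numpy scipy matplotlib scikit_umfpack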
I'm using the Cloudera Hive ODBC driver in my code and I'm trying to containerize the app.
Below is my Dockerfile:
FROM ubuntu:18.04
FROM continuumio/anaconda3
FROM node:10
RUN conda update -n base -c defaults conda
RUN conda create -n env python=3.7
RUN echo "conda activate env" > ~/.bashrc
ENV PATH /opt/conda/envs/env/bin:$PATH
RUN apt-get update && apt-get install -y \
curl apt-utils apt-transport-https debconf-utils gcc build-essential \
&& rm -rf /var/lib/apt/lists/*
RUN apt-get update && apt-get install -y \
python-pip python-dev python-setuptools \
--no-install-recommends \
&& rm -rf /var/lib/apt/lists/*
RUN pip install --upgrade pip
RUN pip install pyyaml pandas numpy pymysql sqlalchemy schedule tornado
RUN apt-get update && apt-get install -y --no-install-recommends git unzip unixodbc unixodbc-dev
RUN conda install -c conda-forge turbodbc=3.1.1
RUN apt-get update && apt-get install -y gettext nano vim -y
RUN yarn install --modules-folder ./static
WORKDIR /app
COPY entry.sh /usr/local/bin/
COPY . /app/
ENV SSH_PASSWD "root:Docker!"
RUN apt-get update \
&& apt-get install -y --no-install-recommends dialog \
&& apt-get update \
&& apt-get install -y --no-install-recommends openssh-server \
&& echo "$SSH_PASSWD" | chpasswd
COPY sshd_config /etc/ssh/
COPY entry.sh /usr/local/bin/
RUN chmod u+x /usr/local/bin/entry.sh
EXPOSE 5000 2222 22 80 8000
CMD ["entry.sh"]
The image builds successfully, but when I run it, I see the error below:
Traceback (most recent call last):
File "app.py", line 14, in <module>
from abc_scheduler import scheduler_main
File "/app/abc_scheduler.py", line 5, in <module>
from methods import Methods
File "/app/methods.py", line 10, in <module>
from utils import *
File "/app/utils.py", line 2, in <module>
from turbodbc import connect, make_options
ModuleNotFoundError: No module named 'turbodbc'
I have tried many other ODBC setups inside my Dockerfile, but no luck. Any help would be great.
As suggested by @DavidMaze, I managed to create a working Dockerfile, shown below. The differences from my original one are installing the Cloudera ODBC driver from the RPM (via alien) and installing turbodbc with conda install --name env instead of a plain conda install.
FROM ubuntu:latest
FROM continuumio/anaconda3
FROM node:10
RUN conda update -n base -c defaults conda
RUN conda create -n env python=3.7
RUN echo 'conda init bash' >/.bashrc
RUN echo "conda activate env" > ~/.bashrc
ENV PATH /opt/conda/envs/env/bin:$PATH
RUN apt-get update && apt-get install -y \
curl apt-utils apt-transport-https debconf-utils gcc build-essential \
&& rm -rf /var/lib/apt/lists/*
RUN apt-get update && apt-get install -y \
python-pip python-dev python-setuptools \
--no-install-recommends \
&& rm -rf /var/lib/apt/lists/*
RUN pip install --upgrade pip
# ==================TURBODBC========================
RUN apt-get update
RUN apt-get upgrade -y
RUN apt-get dist-upgrade -y
RUN apt-get install -y alien # optional
COPY ClouderaHiveODBC-2.6.1.1001-1.x86_64.rpm /opt/cloudera/
RUN alien /opt/cloudera/ClouderaHiveODBC-2.6.1.1001-1.x86_64.rpm
RUN dpkg -i clouderahiveodbc_2.6.1.1001-2_amd64.deb
# ==================END=============================
RUN conda install --name env -c conda-forge turbodbc=4.1.1 tornado=6.0.4 pyyaml pymysql schedule sqlalchemy pyarrow numpy=1.19.3 \
    pandas=1.1.4 pybind11
COPY odbc.ini /etc/
RUN apt-get update && apt-get install -y gettext nano vim -y
RUN yarn install --modules-folder ./static
WORKDIR /app
COPY . /app/
ENV SSH_PASSWD "root:Docker!"
RUN apt-get update \
&& apt-get install -y --no-install-recommends dialog \
&& apt-get update \
&& apt-get install -y --no-install-recommends openssh-server \
&& echo "$SSH_PASSWD" | chpasswd
COPY sshd_config /etc/ssh/
COPY entry.sh /usr/local/bin/
RUN chmod u+x /usr/local/bin/entry.sh
EXPOSE 9988 2222 22 80 8000
CMD ["entry.sh"]
Keep a copy of ClouderaHiveODBC-2.6.1.1001-1.x86_64.rpm in the build context (current directory).
Keep the files below there as well:
odbc.ini - contains the DB/DSN info
entry.sh - a shell script whose main command is python app.py (see the sketch after the sshd_config contents below)
sshd_config - a plain file without any extension, with the contents shown below:
Port 2222
ListenAddress 0.0.0.0
LoginGraceTime 180
X11Forwarding yes
Ciphers aes128-cbc,3des-cbc,aes256-cbc
MACs hmac-sha1,hmac-sha1-96
StrictModes yes
SyslogFacility DAEMON
PrintMotd no
IgnoreRhosts no
#deprecated option
#RhostsAuthentication no
RhostsRSAAuthentication yes
RSAAuthentication no
PasswordAuthentication yes
PermitEmptyPasswords no
PermitRootLogin yes
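entry.sh itself isn't shown here, so the following is only a guess at its shape, assuming it follows the usual Azure App Service pattern of starting sshd before the app (the Dockerfile installs openssh-server and exposes port 2222):
#!/bin/sh
# start the SSH daemon used by the Azure SSH feature (assumed)
service ssh start
# hand over to the application
python app.py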
I want to expand on the answer by showing an approach that works without needing conda at all; in other words, a minimum viable pip-only Docker setup for installing turbodbc. I've documented the full solution in a GitHub comment in the official turbodbc repo.
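Since the linked comment isn't reproduced here, a rough outline of what such a pip-only setup looks like (the apt package names are the usual turbodbc build prerequisites and may need adjusting for your base image):
FROM python:3.7-slim
# compiler plus ODBC and Boost headers needed to build turbodbc from source
RUN apt-get update && \
    apt-get install -y --no-install-recommends build-essential unixodbc-dev libboost-all-dev && \
    rm -rf /var/lib/apt/lists/*
# helper packages first, then turbodbc itself
RUN pip install pybind11 numpy pyarrow && \
    pip install turbodbc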
I'm trying to install graph-tool for Anaconda Python 3.5 on Ubuntu 14.04 (x64), but it turns out that's a real trick.
I tried this approach, but ran into this problem:
The following specifications were found to be in conflict:
- graph-tool
Use "conda info <package>" to see the dependencies for each package.
Digging through the dependencies led to a dead-end at gobject-introspection
So I tried another approach:
I installed boost with conda, then tried to ./configure, make, and make install graph-tool... which only got as far as ./configure:
===========================
Using python version: 3.5.2
===========================
checking for boostlib >= 1.54.0... yes
checking whether the Boost::Python library is available... yes
checking whether boost_python is the correct library... no
checking whether boost_python-py27 is the correct library... no
checking whether boost_python-py27 is the correct library... (cached) no
checking whether boost_python-py27 is the correct library... (cached) no
checking whether boost_python-py35 is the correct library... yes
checking whether the Boost::IOStreams library is available... yes
configure: error: Could not link against boost_python-py35 !
I know this is something about environment variables for the ./configure command and conda installing libboost into Anaconda's non-standard location; I just don't know what to do, and my Google-fu is failing me. So this is another dead end.
Can anyone who's had to install graph-tool recently on linux-64 give me a walkthrough? It's a fresh VM running in VMware Workstation 10.0.7.
For those that run into similar issues, try changing the order of conda channels first with:
$ conda config --add channels ostrokach
$ conda config --add channels defaults
$ conda config --add channels conda-forge
then:
$ conda install graph-tool
This is for installing graph-tool 2.26 for Anaconda Python 3.5 on Ubuntu 14.04.
Note: as of this writing, the conda install of graph-tool from the ostrokach channel was only at version 2.18.
Here's the Dockerfile I use to install graph-tool 2.26. There's likely a cleaner way, but so far this is the only thing I've managed to cobble together that actually works.
NOTE: If you're unfamiliar with Dockerfiles and you'd just like to do the install from a terminal, ignore the first line (starting with FROM), ignore every occurrence of the word RUN, and what you're left with is a series of commands to execute in a terminal.
FROM [your 14.04 base image]
RUN conda upgrade -y conda
RUN conda upgrade -y matplotlib
RUN \
add-apt-repository -y ppa:ubuntu-toolchain-r/test && \
apt-get update -y && \
apt-get install -y gcc-5 g++-5 && \
update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-5 60 --slave /usr/bin/g++ g++ /usr/bin/g++-5
RUN wget https://github.com/CGAL/cgal/archive/releases/CGAL-4.10.2.tar.gz && \
tar xzf CGAL-4.10.2.tar.gz && \
cd cgal-releases-CGAL-4.10.2/ && \
cmake . && \
make && \
make install
RUN cd /tmp && \
# note: master branch of repo appears relatively stable, has not been updated since 2016
git clone https://github.com/sparsehash/sparsehash.git && \
cd sparsehash && \
./configure && \
make && \
make install
RUN apt-get update
RUN apt-get install -y build-essential g++ python-dev autotools-dev libicu-dev build-essential libbz2-dev libboost-all-dev
RUN apt-get install -y autogen autoconf libtool shtool
# install boost
RUN cd /tmp && \
wget https://dl.bintray.com/boostorg/release/1.66.0/source/boost_1_66_0.tar.gz && \
tar xzvf boost_1_66_0.tar.gz && \
cd boost_1_66_0 && \
sudo ./bootstrap.sh --prefix=/usr/local && \
sudo ./b2 && \
sudo ./b2 install
# install newer cairo
RUN cd /tmp && \
wget https://cairographics.org/releases/cairo-1.14.12.tar.xz && \
tar xf cairo-1.14.12.tar.xz && \
cd cairo-1.14.12 && \
./configure && \
make && \
sudo make install
RUN cd /tmp && \
wget https://download.gnome.org/sources/libsigc++/2.99/libsigc++-2.99.10.tar.xz && \
tar xf libsigc++-2.99.10.tar.xz && \
cd libsigc++-2.99.10 && \
./configure && \
make && \
sudo make install && \
sudo cp ./sigc++config.h /usr/local/include/sigc++-3.0/sigc++config.h
RUN cd /tmp && \
wget https://www.cairographics.org/releases/cairomm-1.15.5.tar.gz && \
tar xf cairomm-1.15.5.tar.gz && \
cd cairomm-1.15.5 && \
./configure && \
make && \
sudo make install && \
sudo cp ./cairommconfig.h /usr/local/include/cairomm-1.16/cairomm/cairommconfig.h
RUN conda install -y -c conda-forge boost pycairo
RUN conda install -y -c numba numba=0.36.2
RUN conda install -y -c libboost py-boost && \
conda update -y cffi dbus expat pycairo pandas scipy numpy harfbuzz setuptools boost
RUN apt-get install -y apt-file dbus libdbus-1-dev && \
apt-file update
RUN apt-get install -y graphviz
RUN conda install -y -c conda-forge python-graphviz
RUN sudo apt-get install -y valgrind
RUN apt-get install -y libcgal-dev libcairomm-1.0 libcairomm-1.0-dev libcairo2-dev python-cairo-dev
RUN conda install -y -c conda-forge pygobject
RUN conda install -y -c ostrokach gtk
RUN cd /tmp && \
wget https://git.skewed.de/count0/graph-tool/repository/release-2.26/archive.tar.bz2 && \
bunzip2 archive.tar.bz2 && \
tar -xf archive.tar && \
cd graph-tool-release-2.26-b89e6b4e8c5dba675997d6f245b301292a5f3c59 && \
# Fix problematic parts of the graph-tool configure.ac file
sed -i 's/PKG_INSTALLDIR/#PKG_INSTALLDIR/' ./configure.ac && \
sed -i 's/AM_PATH_PYTHON(\[2\.7\])/AM_PATH_PYTHON(\[3\.5\])/' ./configure.ac && \
sed -i 's/\${PYTHON}/\/usr\/local\/anaconda3\/bin\/python/' ./configure.ac && \
sed -i '$a ACLOCAL_AMFLAGS = -I m4' ./Makefile.am && \
sudo ./autogen.sh && \
sudo ./configure CPPFLAGS="-I/usr/local/include -I/usr/local/anaconda3/pkgs/pycairo-1.15.4-py35h1b9232e_1/include -I/usr/local/include/cairo -I/usr/local/include/sigc++-3.0 -I/usr/include/freetype2" \
LDFLAGS="-L/usr/local/include -L/usr/local/lib/cairo -L/usr/local/include/sigc++-3.0 -L/usr/include/freetype2" \
PYTHON="/usr/local/anaconda3/bin/python" \
PYTHON_VERSION=3.5 && \
sudo make && \
sudo make install
Warning: compiling graph-tool might take a couple of hours and require more than 7 GB of RAM.
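Once the build finishes, a quick sanity check is to import the module with the same interpreter the configure step above pointed at:
/usr/local/anaconda3/bin/python -c "import graph_tool; print(graph_tool.__version__)"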