How to write a Dockerfile for a custom Python project? - python

I'm pretty new to Docker, and I need to create a container that will run as an Apache Mesos task.
The problem is that I can't find any relevant examples. They are all centered around web development, which is not my case.
I have a pure Python project with a large number of dependencies (such as Berkeley's Caffe or OpenCV).
How do I write a Dockerfile that properly installs all the dependencies (and how do I find out what they are)?

The Docker Hub registry contains a number of official language images, which you can use as your base image.
https://hub.docker.com/_/python/
The image documentation explains how to build your Python project, including how its dependencies are pulled in. A typical project layout looks like this:
├── Dockerfile <-- Docker build file
├── requirements.txt <-- List of pip dependencies
└── your-daemon-or-script.py <-- Python script to run
The image supports both Python 2 and 3; you specify which in the Dockerfile:
FROM python:3-onbuild
CMD [ "python", "./your-daemon-or-script.py" ]
The base image uses special ONBUILD instructions to do all the hard work for you.
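If you want to see what those ONBUILD triggers actually do (or to avoid them, since the -onbuild image variants have since been deprecated), the explicit equivalent looks roughly like this; a sketch, with paths and the script name purely illustrative:
FROM python:3
WORKDIR /usr/src/app
# install dependencies first so this layer can be cached between builds
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
# then copy the rest of the project
COPY . .
CMD [ "python", "./your-daemon-or-script.py" ]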

The official Docker site has some step-by-step and reference documentation.
However, to get you started: what might help is to think about what you would do if you were to install and start your project on a fresh machine. You'd probably do something like this...
apt-get update
apt-get install -y python python-opencv wget ...
# copy your app into /myapp/
python /myapp/myscript.py
This maps more or less one-to-one to
FROM ubuntu:14.04
MAINTAINER Vast Academician <vast@example.com>
RUN apt-get update && apt-get install -y python python-opencv wget ...
# source paths are relative to the build context (the directory you run docker build in)
COPY myapp /myapp
CMD ["python", "/myapp/myscript.py"]
The above is untested, of course, but you probably get the idea.
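Once a Dockerfile like that sits next to your code, building and running the image is something like the following (the image name myapp is just an example):
docker build -t myapp .
docker run --rm myapp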

Related

Can not add python lib to existing Docker image on Ubuntu server

Good day,
I'm trying to deploy a Telegram bot on an AWS Ubuntu server, but I can't run the application because the server says (when I run docker-compose up):
there is no name: asyncpg
However, I installed it manually on the server with
pip3 install asyncpg
and I checked later that it is in the "packages" folder.
However, I sort of understand where the problem comes from. When I first ran
sudo docker-compose up
it used this file, my Dockerfile:
FROM python:3.8
WORKDIR /src
COPY requirements.txt /src
RUN pip install -r requirements.txt
COPY . /src
At that point requirements.txt lacked this library. I edited requirements.txt with nano and tried to run
docker-compose up
again, but I ran into a similar problem:
there is no asyncpg package
So, as I understand it, docker-compose up reuses the already created image, in which there is no such package. I tried different solutions from Stack Overflow, like build and freeze, but nothing helped, probably because I don't quite understand what I'm doing; I'm a beginner at programming and Python.
How can I add this package to the existing Docker image?
After you have installed the package manually inside the running container, you save that change back into the Docker image by committing the container with:
docker commit <container-id> <image-name>
Let's take an example.
You have an image called application.
You run the image and get back a container ID, say 1b390cd1dc2d.
Now you can go into the running container with:
docker exec -it 1b390cd1dc2d /bin/bash
Next, install the package:
pip3 install asyncpg
Then exit the running container with exit.
Finally, use the commit command shown above to update the image:
docker commit 1b390cd1dc2d application
This updates the image by adding the required library to it.
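Since you have already added asyncpg to requirements.txt, an alternative worth noting (a sketch, run from the directory containing docker-compose.yml) is simply to rebuild the image so the edited requirements file is picked up:
docker-compose build
docker-compose up
# or, in one step:
docker-compose up --build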

Divio Cloud deployment error: ENOGIT git is not installed or not in the PATH

I updated all my add-ons and the Python version in the Divio Control Dashboard to the recommended versions, and after that I can't deploy my project. The error message is:
---> Running in fb3fc5000391
bower bootstrap-sass-official#3.3.5 ENOGIT git is not installed or not in the PATH
Stack trace:
Error: git is not installed or not in the PATH
at createError (/opt/nvm/versions/node/v6.10.1/lib/node_modules/bower/lib/util/createError.js:4:15)
at GitHubResolver.GitResolver (/opt/nvm/versions/node/v6.10.1/lib/node_modules/bower/lib/core/resolvers/GitResolver.js:45:15)
at GitHubResolver.GitRemoteResolver (/opt/nvm/versions/node/v6.10.1/lib/node_modules/bower/lib/core/resolvers/GitRemoteResolver.js:10:17)
at new GitHubResolver (/opt/nvm/versions/node/v6.10.1/lib/node_modules/bower/lib/core/resolvers/GitHubResolver.js:13:23)
at /opt/nvm/versions/node/v6.10.1/lib/node_modules/bower/lib/core/resolverFactory.js:20:16
The issue you're seeing is that when the Docker image is built and the commands in the Dockerfile are executed, something needs Git, but can't find it.
What you need to install
You need to install Git, which you can do in the Dockerfile with:
RUN apt-get update && \
apt-get install -y git
Where to run the command
You need to run it before the command that requires Git.
In fact, since Git is quite a low-level tool that is often used by installation processes, you want to install it as early as possible, for example right after the FROM command that specifies the base image.
See How to install system packages in a project in the Divio documentation.
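Concretely, the placement looks like this (a sketch; the actual base image of a Divio project will differ):
FROM python:3.6
# install system packages needed during the build as early as possible
RUN apt-get update && \
    apt-get install -y git
# ... the rest of the Dockerfile (pip install, bower install, etc.) follows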
Why you need to do this now
You mention that you updated the Python version of your project. In Divio Cloud projects, this can be done via the Control Panel. The latest versions of Divio Python base projects include slimmed-down base images, which don't include all the system packages that were previously installed (Git is amongst them).
See also The Dockerfile, which gives some details of the way the Dockerfile is used in Divio projects.

Heroku container:push always re-installs conda packages

I've followed the python-miniconda tutorial offered by Heroku in order to create my own ML server on Python, which utilizes Anaconda and its packages.
Everything seems to be in order, however each time I wish to update the scripts located at /webapp by entering
heroku container:push
A complete re-installation of the pip (or rather, conda) dependencies is performed, which takes quite some time and seems illogical to me. My understanding of both the Docker and Heroku frameworks is very shaky, so I haven't been able to find a solution that allows me to push ONLY my code while leaving the container as is, without (re?)uploading an entire image.
Dockerfile:
FROM heroku/miniconda
ADD ./webapp/requirements.txt /tmp/requirements.txt
RUN pip install -qr /tmp/requirements.txt
ADD ./webapp /opt/webapp/
WORKDIR /opt/webapp
RUN conda install scikit-learn
RUN conda install opencv
CMD gunicorn --bind 0.0.0.0:$PORT wsgi
This happens because, once you update the webapp directory, you invalidate the build cache at the ADD ./webapp /opt/webapp/ line. Everything after that line needs to be rebuilt.
When building an image, Docker steps through the instructions in your Dockerfile, executing each in the order specified. As each instruction is examined, Docker looks for an existing image in its cache that it can reuse, rather than creating a new (duplicate) image.
Once the cache is invalidated, all subsequent Dockerfile commands generate new images and the cache is not used. (docs)
Hence, to take advantage of the build cache, your Dockerfile should be structured like this:
FROM heroku/miniconda
RUN conda install scikit-learn opencv
ADD ./webapp /opt/webapp/
RUN pip install -qr /opt/webapp/requirements.txt
WORKDIR /opt/webapp
CMD gunicorn --bind 0.0.0.0:$PORT wsgi
The two RUN conda commands are merged into a single statement to reduce the number of layers in the image, the two ADD commands are merged into one, and pip now installs the requirements from the application directory itself.
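One caveat with this ordering: because pip install now runs after ADD ./webapp, any code change still re-runs the pip step. If you also want the pip dependencies cached across code-only pushes, you can keep the requirements file in its own earlier layer, as in your original Dockerfile (a sketch):
FROM heroku/miniconda
RUN conda install scikit-learn opencv
ADD ./webapp/requirements.txt /tmp/requirements.txt
RUN pip install -qr /tmp/requirements.txt
ADD ./webapp /opt/webapp/
WORKDIR /opt/webapp
CMD gunicorn --bind 0.0.0.0:$PORT wsgi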

A better way to deploy a Debian-python hybrid application

I wrote a small application for Debian Linux that calls Python 2.7 to perform almost all of its functions.
The Python functions include, for example, remote database access, so the app depends on Python modules that are not in every Linux distribution by default.
The app is packaged as a dpkg file so it can be used on many other machines (with the same Linux distribution), using dpkg -i MyApp01.
But the Python dependencies have to be installed separately for the app to work, for example pip install mysql-connector-python-rf.
Now I want to use Docker to ship my dependencies with the app and make it work on other machines without having to install them as above.
Can Docker be used to do this? And how?
If not, is there a better approach to natively bundle the Python dependencies in the dpkg file (assuming the target machines have a similar environment)?
A container is an isolated environment, so you have to ship everything your program needs in order to run.
Your Dockerfile will be based on Debian, so begin with
FROM debian
and will have some
RUN apt-get update \
&& apt-get install -y mysoft mydependency1 mydependency2
and also
RUN pip install xxx
and end with something like
CMD ["python","myapp.py"]
Since your Python program certainly does things like
import module1, module2
those Python modules will need to be installed in your Dockerfile with a RUN directive.
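Putting those pieces together, a minimal Dockerfile for such an app might look like this (a sketch that assumes a Debian release which still ships Python 2; package and file names are placeholders for whatever your app actually needs):
FROM debian

# system-level dependencies
RUN apt-get update \
 && apt-get install -y python2.7 python-pip

# Python-level dependencies
RUN pip install mysql-connector-python-rf

# your application code
COPY . /myapp
WORKDIR /myapp

CMD ["python", "myapp.py"]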

What is the purpose of running a django application in a virtualenv inside a docker container?

What is the purpose of a virtualenv inside a Docker Django application? Python and the other dependencies are already installed in the image, but at the same time it's necessary to install lots of packages using pip, so it's still unclear to me whether the two approaches conflict.
Could you please explain the concept?
EDIT: Also, for example: I've created a virtualenv inside the Docker Django app, recently installed djangorestframework with pip (and ran pip freeze), and added it to INSTALLED_APPS in settings.py, but docker-compose up raises an error: No module named rest_framework. I've checked, and everything is correct. Could it be a Docker/virtualenv conflict?
Docker and containerization might inspire the illusion that you do not need a virtual environment. distutils' Glyph makes a very compelling argument against this misconception in this PyCon talk.
The same fundamental advantages of virtualenv apply to a container just as they do to a non-containerized application, because fundamentally you're still running a Linux distribution.
Debian and Red Hat are fantastically complex engineering projects. Integrating billions of lines of C code. For example, you can just apt install libavcodec. Or yum install ffmpeg.
Writing a working build system for one of those things is a PhD thesis. They integrate thousands of Python packages simultaneously into one working environment. They don't always tell you whether their tools use Python or not.
And so, you might want to docker exec some tools inside a container; they might be written in Python, and if you sudo pip install your application in there, now it's all broken.
So even in containers, isolate your application code from the system's.
Regardless of whether you're using Docker or not, you should always run your application in a virtual environment.
Now, in Docker in particular, using a virtualenv is a little trickier than it should be. Inside Docker each RUN command runs in isolation, and no state other than filesystem changes is kept from line to line. To install into a virtualenv you have to prepend the activation command on every line:
RUN apt-get install -y python-virtualenv
RUN virtualenv /appenv
RUN . /appenv/bin/activate; \
pip install -r requirements.txt
ENTRYPOINT . /appenv/bin/activate; \
run-the-app
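A common alternative to prepending the activation command (a sketch, not from the original answer) is to put the virtualenv's bin directory on the PATH once; every subsequent RUN, as well as the final CMD or ENTRYPOINT, then uses the virtualenv automatically:
RUN apt-get install -y python-virtualenv
RUN virtualenv /appenv
ENV PATH="/appenv/bin:$PATH"
RUN pip install -r requirements.txt
CMD ["run-the-app"]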
A virtualenv is there to isolate packages in a specific environment. Docker is also there to isolate settings in a specific environment. So, in essence, if you use Docker there isn't much benefit in using a virtualenv as well.
Just pip install things into the Docker environment directly; it will do no harm. To pip install the requirements, use the Dockerfile, where you can execute commands.
You can find a minimal example below (the base image is a placeholder for whichever image you actually use):
FROM python:3.8
COPY requirements.txt .
RUN pip install -r requirements.txt
