Sometimes I need to use modules that are not part of the default Python installation, and sometimes even distributions like Anaconda or Canopy do not include them. So every time I move my project to another machine or just reinstall Python, I need to download them again. So my question is: is there a way to store the necessary modules in the project folder and use them from there, without moving them to the default Python installation folder?
You can use a virtual environment or Docker to install the required modules in your project dir, so that they are isolated from your system Python installation. In fact, you don't need Python installed on your machine at all when using Docker.
Here is my workflow when developing a Django web app with Docker. If your project dir is /Projects/sampleapp, change the current working directory to the project dir and run the following.
Run a docker container from your terminal:
docker run \
-it --rm \
--name django_app \
-v ${PWD}:/app \
-w /app \
-e PYTHONUSERBASE=/app/.vendors \
-p 8000:8000 \
python:3.5 \
bash -c "export PATH=\$PATH:/app/.vendors/bin && bash"
# Command explanation:
#
# docker run Run a docker container
# -it Set interactive and allocate a pseudo-TTY
# --rm Remove the container on exit
# --name django_app Set the container name
# -v ${PWD}:/app Mount current dir as /app in the container
# -w /app Set the current working directory to /app
# -e PYTHONUSERBASE=/app/.vendors pip will install packages to /app/.vendors
# -p 8000:8000 Publish container port 8000 on host port 8000
# python:3.5 Use the Python:3.5 docker image
# bash -c "..." Add /app/.vendors/bin to PATH and open the shell
On the container's shell, install the required packages:
pip install django celery django-allauth --user
pip freeze > requirements.txt
The --user option, together with the PYTHONUSERBASE environment variable, makes pip install the packages into /app/.vendors.
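As a rough check of how this works, the sketch below (not part of the original workflow) asks a fresh Python process where its user base directory is when PYTHONUSERBASE is set. The helper name user_base_for is made up for illustration; site.getuserbase() is the standard-library call that reports the user base, which is where --user installs end up.

```python
import os
import subprocess
import sys


def user_base_for(pythonuserbase):
    """Return the user base directory a child Python process reports
    when PYTHONUSERBASE is set to the given path."""
    # Copy the current environment and override only PYTHONUSERBASE.
    env = dict(os.environ, PYTHONUSERBASE=pythonuserbase)
    result = subprocess.run(
        [sys.executable, "-c", "import site; print(site.getuserbase())"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Calling user_base_for("/app/.vendors") mirrors the -e PYTHONUSERBASE=/app/.vendors option passed to docker run above.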
Create the django project and develop the app as usual:
django-admin startproject sampleapp
cd sampleapp
python manage.py runserver 0.0.0.0:8000
The directory structure will look like this:
Projects/
    sampleapp/
        requirements.txt
        .vendors/    # Note: don't add this dir to your VCS
        sampleapp/
            manage.py
            ...
This configuration lets you install the packages in your project dir, isolated from your system. Note that you need to add requirements.txt to your VCS, but remember to exclude the .vendors/ dir.
When you need to move and run the project on another machine, run the docker command above and reinstall the required packages on the container's shell:
pip install -r requirements.txt --user
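The virtual environment route mentioned at the start of this answer can be sketched with the standard-library venv module. This is only an illustration: the function name create_project_venv and the .venv directory name are assumptions, not something the answer prescribes.

```python
import venv
from pathlib import Path


def create_project_venv(project_dir, name=".venv", with_pip=True):
    """Create a virtual environment inside the project folder and
    return its path. The ".venv" name is only a common convention."""
    env_dir = Path(project_dir) / name
    # with_pip=True bootstraps pip into the environment so packages
    # can then be installed with <env>/bin/python -m pip install ...
    venv.create(env_dir, with_pip=with_pip)
    return env_dir
```

As with .vendors/, such a directory would normally be excluded from version control and rebuilt from requirements.txt on each machine.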
Related
I have a Python app that runs every day to download images and saves them into a folder created for each day, like /home/ubuntu/images/yyyymmdd.
I have built a Docker image of my Python app on Ubuntu 20. When I run the app with the host directory mounted, the log message says the folder /home/ubuntu/images/20220123 was created, but I cannot see any such folder on the host.
I have checked the Docker folder /var/lib/docker and found that directories named with random hashes are created inside the containers and overlay2 folders. So I tried mounting onto each of those directories as below, but no luck.
sudo docker run -t -i -v /home/ubuntu/images:/var/lib/docker/containers --network=host testapp/img-downloader:0.0.1
sudo docker run -t -i -v /home/ubuntu/images:/var/lib/docker/overlay2 --network=host testapp/img-downloader:0.0.1
I can see that the data folder was created and the image files were saved under a path like this:
/var/lib/docker/overlay2/d52bcf61cae2e563c3c8561bab53b4bb2dd2ea2d633a14d40c96d7992fffae28/diff/home/ubuntu/images/20220123
What am I missing, so that the images are saved to a host directory like /home/ubuntu/images/20220123 instead of inside the Docker container?
My Dockerfile is as below -
FROM alpine:3.14
ENV PYTHONUNBUFFERED=1
RUN apk add --update --no-cache python3-dev mariadb-dev gcc musl-dev g++ && ln -sf python3 /usr/bin/python
RUN python3 -m ensurepip
RUN pip3 install --no-cache --upgrade pip setuptools
COPY ./requirements.txt /requirements.txt
WORKDIR /
RUN pip3 install -r requirements.txt
COPY ./ /
ENTRYPOINT [ "python3" ]
CMD [ "./main.py" ]
Please help here. thanks
...What am I missing, so that the images are saved to a host directory like /home/ubuntu/images/20220123 instead of inside the Docker container?
Presumably you mean that you want to save the images to a directory on the host and not inside the container. There's no need to mount /var/lib/docker/.... You need to ensure your program saves files to a path inside the container that is bind mounted from the host. Your app appears to write to /home/ubuntu/images inside the container, so that is the container path to mount, for example with -v /home/ubuntu/images:/home/ubuntu/images. A generic example:
mkdir ~/images # Create a directory on the host to hold the images
docker run -it --rm -v ~/images:/images alpine ash -c "mkdir /images/yesterday; mkdir /images/today; echo 'hello' > /images/today/msg.txt; echo 'done.'"
After the container exits, running ls -ld ~/images/* on the host will show the two directories that were created, and cat ~/images/today/msg.txt will print the content saved from inside the container (this simulates downloading images in the container).
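On the application side, the same idea can be sketched in Python: the app writes each day's folder under a base directory, and that base directory is the container path to bind mount. The environment variable IMAGES_DIR and the function names below are hypothetical, chosen only for this illustration.

```python
import os
from datetime import date
from pathlib import Path


def images_base(default="/images"):
    """Base directory for downloads; IMAGES_DIR is a hypothetical
    environment variable, not something the original app defines."""
    return Path(os.environ.get("IMAGES_DIR", default))


def dated_dir(base_dir, day=None):
    """Create and return the yyyymmdd folder for the given day
    (today if no day is given)."""
    day = day or date.today()
    target = Path(base_dir) / day.strftime("%Y%m%d")
    # parents=True also creates the base dir; exist_ok avoids errors on reruns.
    target.mkdir(parents=True, exist_ok=True)
    return target
```

If the container is started with -v ~/images:/images, a call such as dated_dir(images_base()) would then create the folder in the mounted directory, so it shows up under ~/images on the host.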
I'm experiencing differences with the contents of a container depending on whether I open a bash shell via docker run -i -t <container> bash or docker-compose run <container> bash and I don't know/understand how this is possible.
To aid in the explanation, please see this screenshot from my terminal. In both instances, I am running the image called blaze, which has been built from the Dockerfile in my code. One of the steps during the build is to create a virtualenv called venv; however, when I open a bash shell via docker-compose, this virtualenv doesn't seem to exist, unlike when I run docker run ....
I am relatively new to setting up my own builds with Docker, but surely if they are both referencing the same image, the output of ls within a bash shell should be the same? I would greatly appreciate any help or guidance to resources that would explain what exactly is going wrong here...
As an additional point, running docker images shows that both commands must be using the same image...
Thanks in advance!
This is my Dockerfile:
FROM blaze-base-image:latest
# add an URL that PIP automatically searches (e.g., Azure Artifact Store URL)
ARG INDEX_URL
ENV PIP_EXTRA_INDEX_URL=$INDEX_URL
# Copy source code to docker image
RUN mkdir /opt/app
COPY . /opt/app
RUN ls /opt/app
# Install Blaze pip dependencies
WORKDIR /opt/app
RUN python3.7 -m venv /opt/app/venv
RUN /opt/app/venv/bin/python -m pip install --upgrade pip
RUN /opt/app/venv/bin/python -m pip install keyring artifacts-keyring
RUN touch /opt/app/venv/pip.conf
RUN echo $'[global]\nextra-index-url=https://www.index.com' > /opt/app/venv/pip.conf
RUN /opt/app/venv/bin/python -m pip install -r /opt/app/requirements.txt
RUN /opt/app/venv/bin/python -m spacy download en_core_web_sm
# Comment
CMD ["echo", "Container build complete"]
And this is my docker-compose.yml:
version: '3'
services:
blaze:
build: .
image: blaze
volumes:
- .:/opt/app
There are two intersecting things going on here:
When you have a Compose volumes: or docker run -v option mounting host content over a container directory, the host content completely replaces what's in the image. If you don't have a ./venv directory on the host, then there won't be a /opt/app/venv directory in the container. That's why, when you docker-compose run blaze ..., the virtual environment is missing.
If you docker run a container, the only options that are considered are those in that specific docker run command. docker run doesn't know about the docker-compose.yml file and won't take options from there. That means there isn't this volume mount in the docker run case, which is why the virtual environment reappears.
Typically in Docker you don't need a virtual environment at all: the Docker image is isolated from other images and Python installations, and so it's safe and normal to install your application into the "system" Python. You also typically want your image to be self-contained and not depend on content from the host, so you wouldn't generally need the bind mount you show.
That would simplify your Dockerfile to:
FROM blaze-base-image:latest
# Any ARG will automatically appear as an environment variable to
# RUN directives; this won't be needed at run time
ARG PIP_EXTRA_INDEX_URL
# Creates the directory if it doesn't exist
WORKDIR /opt/app
# Install the Python-level dependencies
RUN pip install --upgrade pip
COPY requirements.txt .
RUN pip install -r requirements.txt
# The requirements.txt file should list every required package
# Install the rest of the application
COPY . .
# Set the main container command to run the application
CMD ["./app.py"]
The docker-compose.yml file can be similarly simplified to
version: '3.8' # '3' means '3.0'
services:
blaze:
build: .
# Compose picks its own image name
# Do not need volumes:, the image is self-contained
and then it will work consistently with either docker run or docker-compose run (or docker-compose up).
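If it is unclear which interpreter a command inside the container ends up using, a small diagnostic along these lines can help. It is only a sketch with made-up function names; the sys.prefix versus sys.base_prefix comparison is the usual check for environments created with the standard venv module.

```python
import sys


def in_virtualenv():
    """True when running from a venv-style virtual environment,
    where sys.prefix differs from the base installation prefix."""
    return sys.prefix != sys.base_prefix


def describe_interpreter():
    """Collect the details that usually matter when debugging
    which Python a container command is running."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "base_prefix": sys.base_prefix,
        "in_virtualenv": in_virtualenv(),
    }
```

Printing describe_interpreter() under both docker run and docker-compose run would show whether the two commands really start the same Python.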
I have built a docker image using a Dockerfile that does the following:
FROM my-base-python-image
WORKDIR /opt/data/projects/project/
RUN mkdir files
RUN rm -rf /etc/yum.repos.d/*.repo
COPY rss-centos-7-config.repo /etc/yum.repos.d/
COPY files/ files/
RUN python -m venv /opt/venv && . /opt/venv/activate
RUN yum install -y unzip
WORKDIR files/
RUN unzip file.zip && rm -rf file.zip && . /opt/venv/bin/activate && python -m pip install *
WORKDIR /opt/data/projects/project/
That builds an image that allows me to run a custom command. In a terminal, for instance, here is the command I run after activating my project venv:
python -m pathA.ModuleB -a inputfile_a.json -b inputfile_b.json -c
Arguments a & b are custom tags to identify input files. -c calls a block of code.
So to run the built image successfully, I run the container and map local files to input files:
docker run --rm -it -v /local/inputfile_a.json:/opt/data/projects/project/inputfile_a.json -v /local/inputfile_b.json:/opt/data/projects/project/inputfile_b.json image-name:latest bash -c 'source /opt/venv/bin/activate && python -m pathA.ModuleB -a inputfile_a.json -b inputfile_b.json -c'
Besides shortening file paths, is there anything I can do to shorten the docker run command? I'm thinking that adding a CMD and/or ENTRYPOINT to the Dockerfile would help, but I cannot figure out how to do it, as I get errors.
There are a couple of things you can do to improve this.
The simplest is to run the application outside of Docker. You mention that you have a working Python virtual environment. A design goal of Docker is that programs in containers can't generally access files on the host, so if your application is all about reading and writing host files, Docker may not be a good fit.
Your file paths inside the container are fairly long, and this is bloating your -v mount options. You don't need an /opt/data/projects/project prefix; it's very typical just to use short paths like /app or /data.
You're also installing your application into a Python virtual environment, but inside a Docker image, which provides its own isolation. As you're seeing in your docker run command and elsewhere, the mechanics of activating a virtual environment in Docker are a little hairy. It's also not necessary; just skip the virtual environment setup altogether. (You can also directly run /opt/venv/bin/python and it knows it "belongs to" a virtual environment, without explicitly activating it.)
Finally, in your setup.py file, you can use a setuptools entry_points declaration to provide a script that runs your named module.
That can reduce your Dockerfile to more or less
FROM my-base-python-image
# OS-level setup
RUN rm -rf /etc/yum.repos.d/*.repo
COPY rss-centos-7-config.repo /etc/yum.repos.d/
RUN yum install -y unzip
# Copy the application in
WORKDIR /app/files
COPY files/ ./
RUN unzip file.zip \
&& rm file.zip \
&& pip install *
# Typical runtime metadata
WORKDIR /app
CMD main-script --help
And then when you run it, you can:
# -v /local:/data just maps the entire directory
docker run --rm -it \
-v /local:/data \
image-name:latest \
main-script -a /data/inputfile_a.json -b /data/inputfile_b.json -c
You can also consider the docker run -w /data option to change the current directory, which would add a Docker-level argument but slightly shorten the script command line.
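The entry_points declaration mentioned earlier could look roughly like the sketch below. It assumes that pathA/ModuleB.py exposes a main() function, which the question does not confirm; the option names -a, -b and -c follow the command line shown in the question, while the dest names and help texts are invented for illustration.

```python
import argparse

# In setup.py this dict would be passed as setup(..., entry_points=ENTRY_POINTS).
# Assumption: pathA/ModuleB.py defines the main() function shown below.
ENTRY_POINTS = {
    "console_scripts": [
        "main-script = pathA.ModuleB:main",
    ],
}


def build_parser():
    """Command-line parser matching the -a / -b / -c options."""
    parser = argparse.ArgumentParser(prog="main-script")
    parser.add_argument("-a", dest="input_a", required=True, help="first input file")
    parser.add_argument("-b", dest="input_b", required=True, help="second input file")
    parser.add_argument("-c", dest="run_block", action="store_true",
                        help="run the optional block of code")
    return parser


def main(argv=None):
    """Entry point; argv=None makes argparse read sys.argv[1:]."""
    args = build_parser().parse_args(argv)
    print(f"a={args.input_a} b={args.input_b} c={args.run_block}")
    return 0  # console scripts pass the return value to sys.exit()
```

Once the package is installed with pip, setuptools generates a main-script executable on PATH, so neither python -m nor venv activation is needed in the docker run command.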
I'm deploying an app to parse pdfs and return their highlighted content. After submitting my build and deploying it on cloud run, I ran into this error:
ModuleNotFoundError: No module named 'popplerqt5'
I previously ran into this error when running it in a python3 virtualenv on my local machine. However, I resolved it by running
/usr/bin/python3 main.py
instead of
python3 main.py
Currently I am running the app from my Dockerfile and am hence unable to pull off the same method. This is my Dockerfile configuration.
FROM gcr.io/google-appengine/python
# Create a virtualenv for dependencies. This isolates these packages from
# system-level packages.
# Use -p python3 or -p python3.7 to select python version. Default is version 2.
RUN apt-get update
RUN apt-get install poppler-utils -y
RUN virtualenv -p python3 /env
# Setting these environment variables are the same as running
# source /env/bin/activate.
ENV VIRTUAL_ENV /env
ENV PATH /env/bin:$PATH
# Copy the application's requirements.txt and run pip to install all
# dependencies into the virtualenv.
RUN apt-get install -y python3-poppler-qt5
ADD requirements.txt /app/requirements.txt
RUN pip install Flask gunicorn
RUN pip install -r /app/requirements.txt
# Add the application source code.
ADD . /app
# Run a WSGI server to serve the application. gunicorn must be declared as
# a dependency in requirements.txt.
CMD gunicorn -b :$PORT main:app
How do I get around this error?
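The difference between /usr/bin/python3 and the virtualenv's python3 described above suggests the module may be visible to one interpreter and not the other. A small check like the following, run with each interpreter, would confirm that; it is a diagnostic sketch only, with made-up function names, and does not by itself fix the Dockerfile.

```python
import importlib.util
import sys


def module_available(name):
    """True if the running interpreter can locate the named
    top-level module on its import path."""
    return importlib.util.find_spec(name) is not None


def report(name):
    """One-line summary naming the interpreter that was asked."""
    status = "found" if module_available(name) else "NOT found"
    return f"{sys.executable}: {name} {status}"
```

Running print(report("popplerqt5")) once with /usr/bin/python3 and once with /env/bin/python would show which of the two can import the package.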
I have some files that I want to move into a Docker container.
But in the end Docker can't find one of the files.
The folder with the files on my local machine is /home/katalonne/flask4
File Structure if it matters:
The Dockerfile:
#
# First Flask App Dockerfile
#
#
# Pull base image.
FROM centos:7.0.1406
# Build commands
RUN yum install -y python-setuptools mysql-connector mysql-devel gcc python-devel
RUN easy_install pip
RUN mkdir /opt/flask4
WORKDIR /opt/flask4
ADD requirements.txt /opt/flask4
RUN pip install -r requirements.txt
ADD . /opt/flask4
# Define default command.
CMD ["python","hello.py"]
# Expose ports.
EXPOSE 5000
So I built the image with this command :
docker build -t flask4 .
I ran the container with volume by :
docker run -d -p 5000:5000 -v /home/Katalonne/flask4:/opt/flask4 --name web flask4
And when I want to run the file on the container :
docker logs -f web
I get this error that it can not find my hello.py file :
python: can't open file 'hello.py': [Errno 2] No such file or directory
What am I doing wrong?
P.S. : I'm a Docker and Linux partially-noob.
The files and directories that are located in the same location as your Dockerfile are indeed available (temporarily) to your docker build. But, after the docker build, unless you have used ADD or COPY to move those files permanently to the docker container, they will not be available to your docker container after the build is done. This file context is for the build, but you want to move them to the container.
You can add the following command:
...
ADD . /opt/flask4
ADD . .
# Define default command.
CMD ["python","hello.py"]
The line ADD . . should copy over all the things in your temporary build context to the container. The location that these files will go to is where your WORKDIR is pointing to (/opt/flask4).
If you only wanted to add hello.py to your container, then use
ADD hello.py hello.py
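So when CMD ["python","hello.py"] runs, the working directory is /opt/flask4, hello.py should be in there, and running python hello.py in that directory should work. Note, though, that the -v option in your docker run command mounts a host directory over /opt/flask4 at run time, hiding whatever the image has there. The question gives the folder as /home/katalonne/flask4 but mounts /home/Katalonne/flask4; if those two paths differ on disk, the mounted directory will be empty and hello.py will not be found.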
HTH.