Docker Container - source files disappearing - python

A docker image I am creating and sending to a client is somehow deleting its source code 24-48 hours after it is started. We can see this by exec onto the running container and talking a look around.
The service is a simple flask app. The service doesn't go down as the application doesn't experience an issue but the static files it should be yielding go missing (along with everything else copied in) so we start getting 404s. I can't think of anything that would explain this (especially considering that it takes time for it to occur)
FROM python:3.8-slim-buster
ARG USERNAME=calibrator
ARG USER_UID=1000
ARG USER_GID=$USER_UID
RUN apt-get update \
&& groupadd --gid $USER_GID $USERNAME \
&& useradd -s /bin/bash --uid $USER_UID --gid $USER_GID -m $USERNAME \
&& apt-get install -y sudo \
&& echo $USERNAME ALL=\(root\) NOPASSWD:ALL > /etc/sudoers.d/$USERNAME\
&& chmod 0440 /etc/sudoers.d/$USERNAME \
# Install open-cv packaged
&& apt-get install -y libsm6 libxext6 libxrender-dev libgtk2.0-dev libgl1-mesa-glx \
#
## Git
&& sudo apt-get install -y git-lfs \
#
## Bespoke setup
&& apt-get -y install unixodbc-dev \
#
## PostgresSQL
&& apt-get -y install libpq-dev
ENV PATH="/home/${USERNAME}/.local/bin:${PATH}"
ARG git_user
ARG git_password
RUN pip install --upgrade pip
RUN python3 -m pip install --user git+https://${git_user}:${git_password}#bitbucket.org/****
WORKDIR /home/calibrator
COPY requirements.txt requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
COPY app app
ENV FLASK_APP=app/app.py
EXPOSE 80
STOPSIGNAL SIGTERM
CMD ["uwsgi", "--http", ":80", "--module", "app.app", "--callable", "app", "--processes=1", "--master"]
version: "3.7"
services:
calibrator:
container_name: sed-calibrator-ui
image: sed-metadata-calibrator:2.0.3
restart: always
ports:
- "8081:80"
environment:
- STORE_ID=N0001
- DYNAMO_TABLE=****
- DYNAMO_REGION=****
- AWS_DEFAULT_REGION=****
- AWS_ACCESS_KEY_ID=****
- AWS_SECRET_ACCESS_KEY=****
The application reads in a single configuration file and connects to a database on startup and then defines the endpoints - none of which touch the filesystem again. How can the source code be deleting itself!?
Creating a new container resolves the issue.
Any suggestions in checking the client's environment would be appreciated because I cannot replicate it.
Clients versions
Docker Version - 18.09.7
Docker Compose version - 1.24.0

I was able to solve the problem by updating the kernel, it also worked with an older kernel (3.10)
works:
4.1.12-124.45.6.el7uek.x86_64
not work:
4.1.12-124.43.4.el7uek.x86_64
I do not know the reason that causes it, I only know that after updating the kernel the problem was solved. I hope it is your same problem

Related

Issue with scheduling the container

My docker-compose looks like this:
version: '3.7'
services:
app:
container_name: container1
#restart: unless-stopped
build:
context: .
dockerfile: Dockerfile
labels:
ofelia.enabled: "true"
ofelia.job-exec.app.schedule: "#every 30m"
ofelia.job-exec.app.command: "/app/Final_DM_For_All_Clients.py"
ofelia:
image: mcuadros/ofelia:latest
#restart: unless-stopped
depends_on:
- app
command: daemon --docker
volumes:
- /var/run/docker.sock:/var/run/docker.sock:r
After the first run, the second job run does not happen in 30 mins and I am getting the
following error during the second run.
container1-ofelia-1 | 2022-05-24T22:59:02.037Z common.go:121 ▶ ERROR [Job "app" (3d8aab0fe293)] Finished in "3.322350449s", failed: true, skipped: false, error: error creating exec: API error (409): Container 65f90ee04e25d1164d29ab911197d239873da3eafbc34823873d3ba2d791a0ad is not running
My docker file looks like this:
FROM python:3.8
#ADD Final_DM_For_All_Clients.py /
RUN /usr/local/bin/python -m pip install --upgrade pip
RUN pip install pandas
RUN pip install numpy
RUN pip install datetime
RUN pip install uuid
RUN apt-get update && apt-get install -y gcc unixodbc-dev
RUN pip install pyodbc
# install SQL Server drivers
#RUN apt-get update \
# && curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add - \
# && curl https://packages.microsoft.com/config/debian/9/prod.list \
# > /etc/apt/sources.list.d/mssql-release.list \
# msodbcsql17
RUN curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
RUN curl https://packages.microsoft.com/config/debian/11/prod.list > /etc/apt/sources.list.d/mssql-release.list
RUN apt-get update
RUN ACCEPT_EULA=Y apt-get install -y msodbcsql17
RUN ACCEPT_EULA=Y apt-get install -y mssql-tools
RUN curl -L "https://github.com/docker/compose/releases/download/1.29.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
RUN apt-get update && apt-get install -y cron && apt-get install vim -y
COPY . /app
WORKDIR /app
ADD df_curr.csv .
ADD df_prev.csv .
ADD df_roster.csv .
ADD df.csv .
ADD manager_name.csv .
ADD name_data.csv .
EXPOSE 8080
CMD python Final_DM_For_All_Clients.py
After the first run the container exits with code 0 indicating a successful run but on the second scheduled run i.e. at 1 hr, since the container is stopped, the scheduler is not able to run the script during the second run.
I think the issue is that you're using ofelia's exec job, which tries to docker exec into a running container, which fails.
What you are looking for is the run job which, according to the docs, either runs a new container or starts an existing one. So I believe you should change your labels definition like this:
ofelia:
... # Your configs
labels:
ofelia.enabled: "true"
ofelia.job-run.app.schedule: "#every 30m"
ofelia.job-run.app.container: "container1"
... # Your configs
The docs say you have to pass in the container parameter when trying to do a docker start. I am not sure how it plays when you run it for the first time, but I assume that app runs on its own the first time and later times ofelia starts the container again.

Why is there no console output from my docker container?

I have an application that consists of a docker container with a redis instance, another docker container with a rabbitmq server, and a third container in which I'm trying to start a number of celery workers that can interact with the redis and rabbitmq containers. I'm starting and stopping these workers via a rest api, and have checked that I'm able to do this on my host machine. However, after moving the setup to docker, it seems the workers are not behaving as expected. Whereas on my host machine (windows 10) I was able to see the reply from the rest api and console output from the workers, I can only see the response from the rest api (a log message) and no console output. It also seems that the workers are not accessing the redis and rabbitmq instances.
My docker container is built from a python3.6 (linux) base image. I have checked that everything is installed correctly, and there are no error logs. I build the image with the following dockerfile:
FROM python:3.6
WORKDIR /opt
# create a virtual environment and add it to PATH so that it is applied for
# all future RUN and CMD calls
ENV VIRTUAL_ENV=/opt/venv
RUN python3 -m venv $VIRTUAL_ENV
ENV PATH="$VIRTUAL_ENV/bin:$PATH"
# install msodbcsql17
RUN apt-get update \
&& apt-get install -y curl apt-transport-https gnupg2 \
&& curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
\
&& curl https://packages.microsoft.com/config/debian/9/prod.list >
/etc/apt/sources.list.d/mssql-release.list \
&& apt-get update \
&& ACCEPT_EULA=Y apt-get install -y msodbcsql17 mssql-tools
# Install Mono for pythonnet.
RUN apt-get update \
&& apt-get install --yes \
dirmngr \
clang \
gnupg \
ca-certificates \
# Dependency for pyodbc.
unixodbc-dev \
&& apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys
3FA7E0328081BFF6A14DA29AA6A19B38D3D831EF \
&& echo "deb http://download.mono-project.com/repo/debian
stretch/snapshots/5.20 main" | tee /etc/apt/sources.list.d/mono-official-
stable.list \
&& apt-get update \
&& apt-get install --yes \
mono-devel=5.20\* \
&& rm -rf /var/lib/apt/lists/*
COPY requirements.txt .
COPY src ./src
COPY setup.py ./setup.py
COPY config.json ./config.json
COPY Utility.dll ./Utility.dll
COPY settings_docker.ini ./settings.ini
COPY config.json ./config.json
COPY sql_config.json ./sql_config.json
RUN python3 -m venv $VIRTUAL_ENV \
# From here on, use virtual env's python.
&& venv/bin/pip install --upgrade pip \
&& venv/bin/pip install --no-cache-dir --upgrade pip setuptools wheel \
&& venv/bin/pip install --no-cache-dir -r requirements.txt \
# Dependency for pythonnet.
&& venv/bin/pip install --no-cache-dir pycparser \
&& venv/bin/pip install -U --no-cache-dir "pythonnet==2.5.1" \
# && python -m pip install --no-cache-dir "pyodbc==4.0.25" "pythonnet==2.5.1"
EXPOSE 8081
cmd python src/celery_artifacts/docker_workers/worker_app.py
And then run it with this command:
docker run --name app -p 8081:8081 app
I then attach the container to the same bridge network as the other 2:
docker network connect my_network app
Is there a way to see the same console output from my container as the one on the host?

Avoid dependency hell on docker

I build an AI application in Python involving quiet an amount of Python libraries. At this point, I would like to run my application inside of a docker container to make the AI App a service.
What are my options concerning dependencie so that all necessary libraries are downloaded automatically?
As an weak alternative, I tried this with a "requirement.txt" file on the same level as my Docker build file, but this didn't work.
Your Dockerfile will need instructions to install the requirements, e.g.
COPY requirement.txt requirement.txt
RUN pip install -r requirement.txt
Thank you for the very useful comments:
My dockerfile:
# Python 3.7.3
FROM python:3.7-slim
# Set the working directory to /app
WORKDIR /app
COPY greeter_server.py /app
COPY AspenTechClient.py /app
COPY OpcUa_pb2.py /app
COPY OpcUa_pb2_grpc.py /app
COPY helloworld_pb2.py /app
COPY helloworld_pb2_grpc.py /app
COPY Models.py /app
ADD ./requirement.txt /app
# Training & Validation data we need
RUN mkdir -p /app/output
RUN pip install -r requirement.txt
#RUN pip3 install grpcio grpcio-tools
#RUN pip install protobuf
#RUN pip install pandas
#RUN pip install scipy
#expose ports to outside container for web app access
EXPOSE 10500
# Argument to python command
CMD [ "python", "/app/greeter_server.py" ]
By the tips here, I already added the extra lines for "requirement.txt" and that works like a charm. Thank you very much!
Since I only want to run a deployment in the container, I will forseen trained models so no need for a GPU. For this I have a local machine. With an appropriate mount I deliver the .h5 to the container.
#pyeR_biz: Thank you very much for the tips about pipelines. This is something I didn't have experience with but certainly will do it in the near future.
You have several options. It depends a lot on the use case, the number of containers you will eventually build, production vs dev environment etc.
Generally if you have an AI application you will need a graphics card driver pre-installed on your host system for model training. Which means eventually you'll have to come up with a way to automate driver install or write instructions for end users to do that. For an app you might also need database drivers in the docker image, if your front or back end databases are outside the container. Here is a toned-down example of one of my uses cases with requirement being building docker for a data pipeline.
#Taken from puckel/docker-airflow
#can look up this image name on google to see which OS it is based on.
FROM python:3.6-slim-buster
LABEL maintainer="batman"
# Never prompt the user for choices on installation/configuration of packages
ENV DEBIAN_FRONTEND noninteractive
ENV TERM linux
# Set some default configuration for data pipeline management tool called airflow
ARG AIRFLOW_VERSION=1.10.9
ARG AIRFLOW_USER_HOME=/usr/local/airflow
ARG AIRFLOW_DEPS=""
ENV AIRFLOW_HOME=${AIRFLOW_USER_HOME}
# here install some linux dependencies required to run the pipeline.
# use apt-get install, apt-get auto-remove etc to reduce size of image
# curl and install sql server odbc driver for my linux
RUN set -ex \
&& buildDeps=' freetds-dev libkrb5-dev libsasl2-dev libssl-dev libffi-dev libpq-dev git' \
&& apt-get update -yqq \
&& apt-get upgrade -yqq \
&& apt-get install -yqq --no-install-recommends \
$buildDeps freetds-bin build-essential default-libmysqlclient-dev \
apt-utils curl rsync netcat locales gnupg wget \
&& useradd -ms /bin/bash -d ${AIRFLOW_USER_HOME} airflow \
&& curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add - \ #
&& curl https://packages.microsoft.com/config/debian/10/prod.list > /etc/apt/sources.list.d/mssql-release.list \
&& apt-get update \
&& ACCEPT_EULA=Y apt-get install -y msodbcsql17 \
&& ACCEPT_EULA=Y apt-get install -y mssql-tools \
&& pip install apache-airflow[crypto,celery,postgres,hive,jdbc,mysql,ssh${AIRFLOW_DEPS:+,}${AIRFLOW_DEPS}]==${AIRFLOW_VERSION} \
&& apt-get purge --auto-remove -yqq $buildDeps \
&& apt-get autoremove -yqq --purge \
&& apt-get clean \
&& rm -rf \
/var/lib/apt/lists/* \
/tmp/* \
/var/tmp/* \
/usr/share/man \
/usr/share/doc \
/usr/share/doc-base
# Install all required packages in python environment from requirements.txt (I generally remove version numbers if my python version are same)
ADD ./requirements.txt /config/
RUN pip install -r /config/requirements.txt
# CLEANUP
RUN apt-get autoremove -yqq --purge \
&& apt-get clean \
&& rm -rf \
/var/lib/apt/lists/* \
/tmp/* \
/var/tmp/* \
/usr/share/man \
/usr/share/doc \
/usr/share/doc-base
#CONFIGURATION
COPY script/entrypoint.sh /entrypoint.sh
COPY config/airflow.cfg ${AIRFLOW_USER_HOME}/airflow.cfg
# hand ownership of libraries to relevant user
RUN chown -R airflow: ${AIRFLOW_USER_HOME}
#expose ports to outside container for web app access
EXPOSE 8080 5555 8793
USER airflow
WORKDIR ${AIRFLOW_USER_HOME}
ENTRYPOINT ["/entrypoint.sh"]
CMD ["webserver"]
1) Select an appropriate base image which has the operating system you need.
2) Get your gpu drivers installed if you are training a model, not mandatory if you are serving the model

Can I copy a directory from some location outside the docker area to my dockerfile?

I have installed a library called fastai==1.0.59 via requirements.txt file inside my Dockerfile.
But the purpose of running the Django app is not achieved because of one error. To solve that error, I need to manually edit the files /site-packages/fastai/torch_core.py and site-packages/fastai/basic_train.py inside this library folder which I don't intend to.
Therefore I'm trying to copy the fastai folder itself from my host machine to the location inside docker image.
source location: /Users/AjayB/anaconda3/envs/MyDjangoEnv/lib/python3.6/site-packages/fastai/
destination location: ../venv/lib/python3.6/site-packages/ which is inside my docker image.
being new to docker, I tried this using COPY command like:
COPY /Users/AjayB/anaconda3/envs/MyDjangoEnv/lib/python3.6/site-packages/fastai/ ../venv/lib/python3.6/site-packages/
which gave me an error:
ERROR: Service 'app' failed to build: COPY failed: stat /var/lib/docker/tmp/docker-builder583041406/Users/AjayB/anaconda3/envs/MyDjangoEnv/lib/python3.6/site-packages/fastai: no such file or directory.
I tried referring this: How to include files outside of Docker's build context?
but seems like it bounced off my head a bit..
Please help me tackling this. Thanks.
Dockerfile:
FROM python:3.6-slim-buster AS build
MAINTAINER model1
ENV PYTHONUNBUFFERED 1
RUN python3 -m venv /venv
RUN apt-get update && \
apt-get upgrade -y && \
apt-get install -y git && \
apt-get install -y build-essential && \
apt-get install -y awscli && \
apt-get install -y unzip && \
apt-get install -y nano && \
apt-get install -y libsm6 libxext6 libxrender-dev
RUN apt-cache search mysql-server
RUN apt-cache search libmysqlclient-dev
RUN apt-get install -y libpq-dev
RUN apt-get install -y postgresql
RUN apt-cache search postgresql-server-dev-9.5
RUN apt-get install -y libglib2.0-0
RUN mkdir -p /model/
COPY . /model/
WORKDIR /model/
RUN pip install --upgrade awscli==1.14.5 s3cmd==2.0.1 python-magic
RUN pip install -r ./requirements.txt
EXPOSE 8001
RUN chmod -R 777 /model/
COPY /Users/AjayB/anaconda3/envs/MyDjangoEnv/lib/python3.6/site-packages/fastai/ ../venv/lib/python3.6/site-packages/
CMD python3 -m /venv/activate
CMD /model/my_setup.sh development
CMD export API_ENV = development
CMD cd server && \
python manage.py migrate && \
python manage.py runserver 0.0.0.0:8001
Short Answer
No
Long Answer
When you run docker build the current directory and all of its contents (subdirectories and all) are copied into a staging area called the 'build context'. When you issue a COPY instruction in the Dockerfile, docker will copy from the staging area into a layer in the image's filesystem.
As you can see, this procludes copying files from directories outside the build context.
Workaround
Either download the files you want from their golden-source directly into the image during the build process (this is why you often see a lot of curl statements in Dockerfiles), or you can copy the files (dirs) you need into the build-tree and check them into source control as part of your project. Which method you choose is entirely dependent on the nature of your project and the files you need.
Notes
There are other workarounds documented for this, all of them without exception break the intent of 'portability' of your build. The only quality solutions are those documented here (though I'm happy to add to this list if I've missed any that preserve portability).

Prevent docker-compose from reinstalling requirements.txt while using a built image

I have an app ABC, which I want to put on docker environment. I built a Dockerfile and got the image abcd1234 which I used in docker-compose.yml
But on trying to build the docker-compose, All the requirements.txt files are getting reinstalled. Can it not use the already existing image and prevent time from reinstalling it?
I'm new to docker and trying to understand all the parameters. Also, is the 'context' correct? in docker-compose.yml or it should contain path inside the Image?
PS, my docker-compose.yml is not in same directory of project because I'll be using multiple images to expose more ports.
docker-compose.yml:
services:
app:
build:
context: /Users/user/Desktop/ABC/
ports:
- "8000:8000"
image: abcd1234
command: >
sh -c "python manage.py migrate &&
python manage.py runserver 0.0.0.0:8000"
environment:
- PROJECT_ENV=development
Dockerfile:
FROM python:3.6-slim-buster AS build
MAINTAINER ABC
ENV PYTHONUNBUFFERED 1
RUN python3 -m venv /venv
RUN apt-get update && \
apt-get upgrade -y && \
apt-get install -y git && \
apt-get install -y build-essential && \
apt-get install -y awscli && \
apt-get install -y unzip && \
apt-get install -y nano
RUN apt-get install -y libsm6 libxext6 libxrender-dev
COPY . /ABC/
RUN apt-cache search mysql-server
RUN apt-cache search libmysqlclient-dev
RUN apt-get install -y libpq-dev
RUN apt-get install -y postgresql
RUN apt-cache search postgresql-server-dev-9.5
RUN pip install --upgrade awscli==1.14.5 s3cmd==2.0.1 python-magic
RUN pip install -r /ABC/requirements.txt
WORKDIR .
Please guide me on how to tackle these 2 scenarios. Thanks!
The context: directory is the directory on your host system that includes the Dockerfile. It's the same directory you would pass to docker build, and it frequently is just the current directory ..
Within the Dockerfile, Docker can cache individual build steps so that it doesn't repeat them, but only until it reaches the point where something has changed. That "something" can be a changed RUN line, but at the point of your COPY, if any file at all changes in your local source tree that also invalidates the cache for everything after it.
For this reason, a typical Dockerfile has a couple of "phases"; you can repeat this pattern in other languages too. You can restructure your Dockerfile in this order:
# 1. Base information; this almost never changes
FROM python:3.6-slim-buster AS build
MAINTAINER ABC
ENV PYTHONUNBUFFERED 1
WORKDIR /ABC
# 2. Install OS packages. Doesn't depend on your source tree.
# Frequently just one RUN line (but could be more if you need
# packages that aren't in the default OS package repository).
RUN apt-get update && \
DEBIAN_FRONTEND=noninteractive apt-get upgrade -y && \
DEBIAN_FRONTEND=noninteractive apt-get install -y \
build-essential unzip libxrender-dev libpq-dev
# 3. Copy _only_ the file that declares language-level dependencies.
# Repeat starting from here only if this file changes.
COPY requirements.txt .
RUN pip install -r requirements.txt
# 4. Copy the rest of the application in. In a compiled language
# (Javascript/Webpack, Typescript, Java, Go, ...) build it.
COPY . .
# 5. Explain how to run the application.
EXPOSE 8000
CMD python manage.py migrate && \
python manage.py runserver 0.0.0.0:8000

Categories

Resources