Python process never exits in Docker container during CircleCI workflow

Python process never exits in Docker container during CircleCI workflow - python

I have a Dockerfile that looks like this:
FROM python:3.6
WORKDIR /app
ADD . /app/
# Install system requirements
RUN apt-get update && \
xargs -a requirements_apt.txt apt-get install -y
# Install Python requirements
RUN python -m pip install --upgrade pip
RUN python -m pip install -r requirements_pip.txt
# Circle CI ignores entrypoints by default
ENTRYPOINT ["dostuff"]
I have a CircleCI config that does:
version: 2.1
orbs:
aws-ecr: circleci/aws-ecr#6.15.3
jobs:
benchmark_tests_dev:
docker:
- image: blah_blah_image:test_dev
#auth
steps:
- checkout
- run:
name: Compile and run benchmarks
command: make bench
workflows:
workflow_test_and_deploy_dev:
jobs:
- aws-ecr/build-and-push-image:
name: build_test_dev
context: my_context
account-url: AWS_ECR_ACCOUNT_URL
region: AWS_REGION
repo: my_repo
aws-access-key-id: AWS_ACCESS_KEY_ID
aws-secret-access-key: AWS_SECRET_ACCESS_KEY
dockerfile: Dockerfile
tag: test_dev
filters:
branches:
only: my-test-branch
- benchmark_tests_dev:
requires: [build_test_dev]
context: my_context
filters:
branches:
only: my-test-branch
- aws-ecr/build-and-push-image:
name: deploy_dev
requires: [benchmark_tests_dev]
context: my_context
account-url: AWS_ECR_ACCOUNT_URL
region: AWS_REGION
repo: my_repo
aws-access-key-id: AWS_ACCESS_KEY_ID
aws-secret-access-key: AWS_SECRET_ACCESS_KEY
dockerfile: Dockerfile
tag: test2
filters:
branches:
only: my-test-branch
make bench looks like:
bench:
python tests/benchmarks/bench_1.py
python tests/benchmarks/bench_2.py
Both benchmark tests follow this pattern:
# imports
# define constants
# Define functions/classes
if __name__ == "__main__":
# Run those tests
If I build my Docker container on my-test-branch locally, override the entrypoint to get inside of it, and run make bench from inside the container, both Python scripts execute perfectly and exit.
If I commit to the same branch and trigger the CircleCI workflow, the bench_1.py runs and then never exits. I have tried switching the order of the Python scripts in the make command. In that case, bench_2.py runs and then never exits. I have tried putting a sys.exit() at the end of the if __name__ == "__main__": block of both scripts and that doesn't force an exit on CircleCI. I the first script to be run will run to completion because I have placed logs throughout the script to track progress. It just never exits.
Any idea why these scripts would run and exit in the container locally but not exit in the container on CircleCI?
EDIT
I just realized "never exits" is an assumption I'm making. It's possible the script exits but the CircleCI job hangs silently after that? The point is the script runs, finishes, and the CircleCI job continues to run until I get a timeout error at 10 minutes (Too long with no output (exceeded 10m0s): context deadline exceeded).

Turns out the snowflake.connector Python lib we were using has this issue where if an error occurs during an open Snowflake connection, the connection is not properly closed and the process hangs. There is also another issue where certain errors in that lib are being logged and not raised, causing the first issue to occur silently.
I updated our snowflake IO handler to explicitly open/close a connection for every read/execute so that this doesn't happen. Now my scripts run just fine in the container on CircleCI. I still don't know why they ran in the container locally and not remotely, but I'm going to leave that one for the dev ops gods.

Related

How to make docker container running continuously?

I have a Docker image that is actually a server for a device. It is started from a Python script, and I made .sh to run it. However, whenever I run it, it says that it is executed and it ends (server exited with code 0). The only way I made it work is via docker-compose when I run it as detached container, then enter the container via bin/bash and execute the run script (beforementioned .sh) from it manually, then exit the container.
After that everything works as intended, but the issue arises when the server is rebooted. I have to do it manually all over again.
Did anyone else experience anything similar? If yes how can I fix this?
File that starts server (start.sh):
#!/bin/sh
python source/server/main.pyc &
python source/server/main_socket.pyc &
python source/server/main_monitor_server.pyc &
python source/server/main_status_server.pyc &
python source/server/main_events_server.pyc &
Dockerfile:
FROM ubuntu:trusty
RUN mkdir -p /home/server
COPY server /home/server/
EXPOSE 8854
CMD [ /home/server/start.sh ]
Docker Compose:
version: "3.9"
services:
server:
tty: yes
image: deviceserver:latest
container_name: server
restart: always
ports:
- "8854:8854"
deploy:
resources:
limits:
memory: 3072M

It's not a problem with docker-compose. Your docker container should not return (i.e block) even when launched with a simple docker run.
For that your CMD should run in the foreground.
I think the issue is that you're start.sh returns instead of blocking. Have you tried to remove the last '&' from your script (I'm not familiar with python and what these different processes are)?

In general you should run only one process per container. If you have five separate processes you need to run, you would typically run five separate containers.
The corollaries to this are that the main container command should be a foreground process; but also that you can run multiple containers off of the same image with different commands. In Compose you can override the command: separately for each container. So, for example, you can specify:
version: '3.8'
services:
main:
image: deviceserver:latest
command: ./main.py
socket:
image: deviceserver:latest
command: ./main_socket.py
et: cetera
If you're trying to copy-and-paste this exact docker-compose.yml file, make sure to set a WORKDIR in the Dockerfile so that the scripts are in the current directory, make sure the scripts are executable (chmod +x in your source repository), and make sure they start with a "shebang" line #!/usr/bin/env python3. You shouldn't need to explicitly say python anywhere.
FROM python:3.9 # not a bare Ubuntu image
WORKDIR /home/server # creates the directory too
COPY server ./ # don't need to duplicate the directory name here
RUN pip install -r requirements.txt
EXPOSE 8854 # optional, does almost nothing
CMD ["./main.py"] # valid JSON-array syntax; can be overridden
There are two major issues in the setup you show. The CMD is not a syntactically valid JSON array (the command itself is not "quoted") and so Docker will run it as a shell command; [ is an alias for test(1) and will exit immediately. If you do successfully run the script, the script launches a bunch of background processes and then exits, but since the script is the main container command, that will cause the container to exit as well. Running a set of single-process containers is generally easier to manage and scale than trying to squeeze multiple processes into a single container.

You can add sleep command in the end of your start.sh.
#!/bin/sh
python source/server/main.pyc &
python source/server/main_socket.pyc &
python source/server/main_monitor_server.pyc &
python source/server/main_status_server.pyc &
python source/server/main_events_server.pyc &
while true
do
sleep 1;
done

debug_toolbar module is not persisted after docker down / docker up

I started a new project using Django. This project is build using Docker with few containers and poetry to install all dependencies.
When I first run docker-compose up -d, everything is installed correctly. Actually, this problem is not related with Docker I suppose.
After I run that command, I'm running docker-compose exec python make -f automation/local/Makefile which has this content
Makefile
.PHONY: all
all: install-deps run-migrations build-static-files create-superuser
.PHONY: build-static-files
build-static-files:
python manage.py collectstatic --noinput
.PHONY: create-superuser
create-superuser:
python manage.py createsuperuser --noinput --user=${DJANGO_SUPERUSER_USERNAME} --email=${DJANGO_SUPERUSER_USERNAME}#zitec.com
.PHONY: install-deps
install-deps: vendor
vendor: pyproject.toml $(wildcard poetry.lock)
poetry install --no-interaction --no-root
.PHONY: run-migrations
run-migrations:
python manage.py migrate --noinput
pyproject.toml
[tool.poetry]
name = "some-random-application-name"
version = "0.1.0"
description = ""
authors = ["xxx <xxx#xxx.com>"]
[tool.poetry.dependencies]
python = ">=3.6"
Django = "3.0.8"
docusign-esign = "^3.4.0"
[tool.poetry.dev-dependencies]
pytest = "^3.4"
django-debug-toolbar = "^2.2"
Debug toolbar is installed by adding those entries under settings.py (MIDDLEWARE / INSTALLED_APP) and even DEBUG_TOOLBAR_CONFIG with next value: SHOW_TOOLBAR_CALLBACK.
Let me confirm that EVERYTHING works after fresh docker-compose up -d. The problem occurs after I stop container and start it again using next commands:
docker-compose down
docker-compose up -d
When I try to access the project it says that Module debug_toolbar does not exist!.
I read all questions from this website, but nothing worked for me.
Has anyone encountered this problem before?

That sounds like normal behavior. A container has a temporary filesystem, and when the container exits any changes that have been made in that filesystem will be permanently lost. Deleting and recreating containers is extremely routine (even just changing environment: or ports: settings in the docker-compose.yml file would cause that to happen).
You should almost never install software in a running container. docker exec is an extremely useful debugging tool, but it shouldn't be the primary way you interact with your container. In both cases you're setting yourself up to lose work if you ever need to change a Docker-level setting or update the base image.
For this example, you can split the contents of that Makefile into two parts, the install_deps target (that installs Python packages but doesn't have any external dependencies) and the rest (that will depend on a database running). You need to run the installation part at image-build time, but the Dockerfile can't access a database, so the remainder needs to happen at container-startup time.
So in your image's Dockerfile, RUN the installation part:
RUN make install-reps
You will also need an entrypoint script that does the rest of the first-time setup, then runs the main container command. This can look like:
#!/bin/sh
make run-migrations build-static-files create-superuser
exec "$#"
Then run this in your Dockerfile:
COPY entrypoint.sh .
ENTRYPOINT ["./entrypoint.sh"]
CMD python3 manage.py runserver --host 0.0.0.0
(I've recently seen a lot of Dockerfiles that have just ENTRYPOINT ["python3"]. Splitting ENTRYPOINT and CMD this way isn't especially useful; just move the python3 interpreter command into CMD.)

Local GitLab runner freezes while Shared GitLab.com runner succeeds

EDIT: As Rekovni pointed out, using a GitLab runner with Docker on a Windows machine is a problem. Installing the runner in a Linux-based virtual machine solved the problem.
I am developing a Python program using a conda environment. It is hosted on GitLab.com and I am using GitLab-CI to generate the documentation.
I configured the following .gitlab-ci.yml file for it:
image: continuumio/miniconda3:latest
before_script:
# Update conda and create environment, which is then activated.
- conda update -vvv -y -c conda-forge conda
- conda env create -f helpers/NAME.yml
- source activate NAME
# Correct installation.
- conda install -q -y gsl=2.2.1
pages:
script:
# Install make.
- apt-get update
- apt-get install -q -y build-essential
# Install Spinx-related packages.
- conda install -q -y sphinx sphinx_rtd_theme
# Create documentation.
- cd REPO/doc
- sphinx-apidoc -o source/ ../REPO --force --separate
- make html
# Transfer documentation to public pages folder.
- mv build/html/ ../../public/
artifacts:
paths:
- public
# only:
# - master
Running this script with a shared GitLab runner that is supplied with GitLab.com works and the documentation is generated and placed in the public folder.
For future unit tests (which take much longer), I want to provide a local runner on a Win 10 machine in my network. For this, I installed the gitlab-runner.exe and Docker Desktop. I successfully registered the runner with the project on GitLab.com.
The runner is using the following config.toml configuration file:
concurrent = 1
check_interval = 0
log_level = "info"
[session_server]
session_timeout = 1800
[[runners]]
name = "NAME"
url = "https://gitlab.com"
token = "TOKEN"
executor = "docker"
[runners.custom_build_dir]
[runners.docker]
tls_verify = false
image = "alpine:latest"
privileged = false
disable_entrypoint_overwrite = false
oom_kill_disable = false
disable_cache = false
volumes = ["/cache"]
shm_size = 0
[runners.cache]
[runners.cache.s3]
[runners.cache.gcs]
The problem is now that the local runner freezes during the execution of the above script without producing any error messages and I am at a loss on how to debug it. What I have is
The log of the script that is shown on the Job page on GitLab.com; and
The console output of the gitlab-runner.exe on the local machine.
Regarding 1., I see
[0KRunning with gitlab-runner 11.10.0 (3001a600)
...
[32;1mChecking out COMMIT_HASH as BRANCH_NAME...[0;m
...
[0K[32;1m$ conda update -vvv -y -c conda-forge conda[0;m
DEBUG conda.gateways.logging:set_verbosity(148): verbosity set to 3
...
...
...
TRACE conda.gateways.disk.update:rename(52): renaming /opt/conda/share/doc/openssl/html/man3/OSSL_STORE_LOADER_new.html => /opt/conda/share/doc/openssl/html/man3/OSSL_STORE_LOADER_new.html.c~
TRACE conda.core.path_actions:execute(1041): renaming share/doc/openssl/html/man3/OSSL_STORE_LOADER_set_close.html => share/doc/openssl/html/man3/OSSL_STORE_LOADER_set_close.html.c~
TRACE conda.gateways.disk.update:rename(52): renaming /opt/conda/share/doc/openssl/html/man3/OSSL_STORE_LOADER_set_close.html => /opt/conda/share/doc/openssl/html/man3/OSSL_STORE_LOADER_set_close.html.c~
TRACE conda.core.path_actions:execute(1041): renaming share/doc/openssl/html/man3/OSSL_STORE_LOADER_set_ctrl.html => share/doc/openssl/html/man3/OSSL_STORE_LOADER_set_ctrl.html.c~
where it abruptly stops without reaching the - conda env create -f helpers/NAME.yml line.
Regarding 2., I see
C:\GitLab-Runner>gitlab-runner.exe --debug run
Runtime platform arch=amd64 os=windows pid=14116 revision=3001a600 version=11.10.0Starting multi-runner from C:\GitLab-Runner\config.toml ... builds=0
Checking runtime mode GOOS=windows uid=-1
Configuration loaded builds=0
...
Feeding runners to channel builds=0
Checking for jobs... nothing runner=TOKEN
Feeding runners to channel builds=0
Checking for jobs... received job=203033130 repo_url=REPO_URL.git runner=TOKEN
...
Attaching to container HASH ... job=203033130 project=6249897 runner=TOKEN
Starting container HASH ... job=203033130 project=6249897 runner=TOKEN
Waiting for attach to finish HASH ... job=203033130 project=6249897 runner=TOKEN
Waiting for container HASH ... job=203033130 project=6249897 runner=TOKEN
Appending trace to coordinator... ok code=202 job=203033130 job-log=0-10348 job-status=running runner=TOKEN sent-log=1801-10347 status=202 Accepted
Appending trace to coordinator... ok code=202 job=203033130 job-log=0-19445 job-status=running runner=TOKEN sent-log=10348-19444 status=202 Accepted
...
Appending trace to coordinator... ok code=202 job=203033130 job-log=0-933150 job-status=running runner=TOKEN sent-log=241860-933149 status=202 Accepted
Submitting job to coordinator... ok code=200 job=203033130 job-status= runner=TOKEN
Submitting job to coordinator... ok code=200 job=203033130 job-status= runner=TOKEN
where it seems that the switch from Appending trace to coordinator to Submitting job to coordinator happens around the time when it gets stuck.
After this, 1. is not updated with any further information and 2. is stuck in a Submitting job to coordinator loop.
Does anyone know:
What the reason for the failure with a local runner could be (when the same script works with a shared runner)?
What I could do to debug this problem?
Thanks and all the best,
Thomas

GitLab CI doesn't currently offer a solution for using its runner with Docker on a Windows environment, however there is an epic at the moment which is tracking progress for this.
In one of the issues of the epic, a contributer has managed to get a working version of a gitlab-runner which uses Docker for Windows, with which more details can be found here.
A more common (and potentially easier) way of using Docker in a Windows environment, would be to install the gitlab-runner as a Shell runner, and call the Docker commands manually to run your tests.
Conversely, if you just want to keep using the same CI script, you could install a Linux VM on your Windows 10 machine, and have that host the docker runner!

Execute Python script inside a given docker-compose container

I have made a little python script to create a DB and some tables inside a RethinkDB
But now I'm trying to launch this python script inside my rethink container launched with docker-compose.
This is my docker-compose.yml rethink container config
# Rethink DB
rethink:
image: rethinkdb:latest
container_name: rethink
ports:
- 58080:8080
- 58015:28015
- 59015:29015
I'm trying to execute the script with after launching my container
docker exec -it rethink python src/app/db-install.py
But I get this error
rpc error: code = 2 desc = oci runtime error: exec failed: exec: "python": executable file not found in $PATH
Python is not found in me container. Is this possible to execute a python script inside a given container with docker-compose or with docker exec ?

First find out if you have python executable in the container:
docker exec -it rethink which python
If it exists, Use the absolute path provided by which command in previous step:
docker exec -it rethink /absolute/path/to/python src/app/db-install.py
If not, you can convert your python script to bash script, so you can run it without extra executables and libraries.
Or you can create a dockerfile, use base image, and install python.
dockerfile:
FROM rethinkdb:latest
RUN apt-get update && apt-get install -y python
Docker Compose file:
rethink:
build : .
container_name: rethink
ports:
- 58080:8080
- 58015:28015
- 59015:29015

Docker-compose
Assuming that python is installed, try:
docker-compose run --rm MY_DOCKER_COMPOSE_SERVICE MY_PYTHON_COMMAND
For a start, you might also just go into the shell at first and run a python script from the command prompt.
docker-compose run --rm MY_DOCKER_COMPOSE_SERVICE bash
In your case, MY_DOCKER_COMPOSE_SERVICE is 'rethink', and that is not the container name here, but the name of the service (first line rethink:), and only the service is run with docker-compose run, not the container.
The MY_PYTHON_COMMAND is, in your case of Python2, python src/app/db-install.py, but in Python3 it is python -m src/app/db-install (without the ".py"), or, if you have Python3 and Python2 installed, python3 -m src/app/db-install.
Dockerfile
To be able to run this python command, the Python file needs to be in the container. Therefore, in your Dockerfile that you need to call with build: ., you need to copy your build directory to a directory in the container of your choice
COPY $PROJECT_PATH /tmp
This /tmp will be created in your build directory. If you just write ".", you do not have any subfolder and save it directly in the build directory.
When using /tmp as the subfolder, you might write at the end of your Dockerfile:
WORKDIR /tmp
Docker-compose
Or if you do not change the WORKDIR from the build (".") context to /tmp and you still want to reach /tmp, run your Python file like /tmp/db-install.py.

The rethinkdb image is based on the debian:jessie image :
https://github.com/rethinkdb/rethinkdb-dockerfiles/blob/da98484fc73485fe7780546903d01dcbcd931673/jessie/2.3.5/Dockerfile
The debian:jessie image does not come with python installed.
So you will need to create your own Dockerfile, something like :
FROM rethinkdb:latest
RUN apt-get update && apt-get install -y python
Then change your docker-compose :
# Rethink DB
rethink:
build : .
container_name: rethink
ports:
- 58080:8080
- 58015:28015
- 59015:29015
build : . is the path to your Dockerfile.

Best practices for debugging vagrant+docker+flask

My goal is to run a flask webserver from a Docker container. Working on a Windows machine this requires Vagrant for creating a VM. Running vagrant up --provider=docker leads to the following complaint:
INFO interface: error: The container started either never left the "stopped" state or
very quickly reverted to the "stopped" state. This is usually
because the container didn't execute a command that kept it running,
and usually indicates a misconfiguration.
If you meant for this container to not remain running, please
set the Docker provider configuration "remains_running" to "false":
config.vm.provider "docker" do |d|
d.remains_running = false
end
This is my Dockerfile
FROM mrmrcoleman/python_webapp
EXPOSE 5000
# Install Python
RUN apt-get install -y python python-dev python-distribute python-pip
# Add and install Python modules
RUN pip install Flask
#copy the working directory to the container
ADD . /
CMD python run.py
And this is the Vagrantfile
Vagrant.configure("2") do |config|
config.vm.provider "docker" do |d|
d.build_dir = "." #searches for a local dockerfile
end
config.vm.synced_folder ".", "/vagrant", type: "rsync"
rsync__chown = false
end
Because the Vagrantfile and run.py work without trouble independently, I suspect I made a mistake in the Dockerfile. My question is twofold:
Is there something clearly wrong with the Dockerfile or the
Vagrantfile?
Is there a way to have vagrant/docker produce more
specific error messages?

I think the answer I was looking for is using the command
vagrant docker-logs
I broke the Dockerfile because I did not recognize good behaviour as such, because nothing really happens if the app runs as it should. docker-logs confirms that the flask app is listening for requests.

Is there something clearly wrong with the Dockerfile or the Vagrantfile?
Your Dockerfile and Vagrantfiles look good, but I think you need to modify the permissions of run.py to be executable:
...
#copy the working directory to the container
ADD . /
RUN chmod +x run.py
CMD python run.py
Does that work?
Is there a way to have vagrant/docker produce more specific error messages?
Try taking a look at the vagrant debugging page. Another approach I use is to log into the container and try running the script manually.
# log onto the vm running docker
vagrant ssh
# start your container in bash, assuming its already built.
docker run -it my/container /bin/bash
# now from inside your container try to start your app
python run.py
Also, if you want to view your app locally, you'll want to add port forwarding to your Vagrantfile.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.