One of my Docker containers runs a Flask application with API endpoints exposed. I'm trying to configure a cron job, by updating the crontab file, to consume the API at regular intervals.
Dockerfile
FROM nikolaik/python-nodejs:python3.7-nodejs14
ENV APP /deploy
.....
.......
COPY . /$APP
RUN pip install -e .
EXPOSE 8080
ADD entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT /entrypoint.sh
entrypoint.sh
#!/bin/sh
until PGPASSWORD=$POSTGRES_PASSWORD psql -h "postgres" -U "postgres" -c '\q'; do
>&2 echo "Postgres is unavailable - sleeping"
sleep 1
done
>&2 echo " "
/venv/bin/uwsgi app.ini
# Setup a cron schedule
>&2 echo "* * * * * /usr/bin/curl -X GET http://localhost:8080/import_data >/var/log/stdout1.log 2>/var/log/stderr1.log
# This extra line makes it a valid cron" > scheduler.txt
crontab scheduler.txt
cron -f
On executing the command
docker-compose up
scheduler.txt is never created or updated. I checked the container and none of the files these commands should create exist: scheduler.txt, stdout1.log, stderr1.log. The Postgres wait above works, but the statements below it are never executed.
Alternatively, this could be done by using a separate container for cron and creating a virtual network for the two containers, but I want to understand what goes wrong when I try to do it in the same container.
Let me know if more info is required.
My guess would be that you did not delete the container instances and used:
docker compose stop
instead of
docker compose down
In that case the container will only be stopped and not removed. When you then run the compose up command, the containers will simply be restarted and not rebuilt/recreated. Note that things that happen inside the Dockerfile only run/change when the container is first created (assuming that the image was correctly updated).
On your second note, I would add that the separate-container approach would be the better solution, in my humble opinion. A Docker container is typically meant to do only one thing. So if this container is running a service and exposing endpoints, a second container would consume said endpoints on an interval. This separation of functionality is quite easy to set up and makes it clearer which container is in charge of what.
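As a rough sketch of that split (assuming a compose file where the Flask service is called web and listens on 8080; the service names and the alpine tag are placeholders, not taken from your setup):
services:
  web:
    build: .
    ports:
      - "8080:8080"
  scheduler:
    image: alpine:3.19
    depends_on:
      - web
    # Compose puts both services on the same default network, so the
    # hostname "web" resolves; poll the import endpoint once a minute.
    command: sh -c 'while true; do wget -qO- http://web:8080/import_data || true; sleep 60; done'
With that split, the Flask container stays a plain web server and the polling logic is trivial to change or disable.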
Related
I’m new to working on Linux. I apologize if this is a dumb question. Despite searching for more than a week, I was not able to derive a clear answer to my question.
I’m running a very long Python program on Nvidia CPUs. The output is several CSV files. It takes a long time to compute the output, so I use nohup so that I can exit the session while it keeps running.
Let’s say main.py file is this
import numpy as np
import pandas as pd

if __name__ == '__main__':
    a = np.arange(1, 1000)
    data = a * 2
    filename = 'results.csv'
    output = pd.DataFrame(data, columns=["Output"])
    output.to_csv(filename)
The calculation of data is more complicated, of course. I build a Docker container and run this program inside it. When I use python main.py for a smaller-sized example, there is no problem: it writes the CSV files.
My question is this:
When I do nohup python main.py & and then check what's going on with tail -f nohup.out inside the Docker container, I can see what it is doing at that moment, but I cannot exit it and just let the execution run its course; it stops right there. How can I safely exit the screen that comes with tail -f nohup.out?
I also tried not checking on the code and letting it continue for two days, then returned. The output of tail -f nohup.out indicated that the execution had finished, but the CSV files were nowhere to be seen. Are they somehow bundled up inside nohup.out, or does this indicate something else is wrong?
If you're going to run this setup in a Docker container:
A Docker container runs only one process, as a foreground process; when that process exits, the container is finished. That process is almost always the script or server you're trying to run, not an interactive shell. But:
It's possible to use Docker constructs to run the container itself in the background, and collect its logs while it's running or after it completes.
A typical Dockerfile for a Python program like this might look like:
FROM python:3.10
# Create and use some directory; it can be anything, but do
# create _some_ directory.
WORKDIR /app
# Install Python dependencies as a separate step. Doing this first
# saves time if you repeat `docker build` without changing the
# requirements list.
COPY requirements.txt .
RUN pip install -r requirements.txt
# Copy in the rest of the application.
COPY . .
# Set the main container command to be the script.
CMD ["./main.py"]
The script should be executable (chmod +x main.py on your host) and begin with a "shebang" line (#!/usr/bin/env python3) so the system knows where to find the interpreter.
You will hear recommendations to use both CMD and ENTRYPOINT for the final line. It doesn't matter much to your immediate question. I prefer CMD for two reasons: it's easier to launch an alternate command to debug your container (docker run --rm your-image ls -l vs. docker run --rm --entrypoint ls your-image -l), and there's a very useful pattern of using ENTRYPOINT to do some initial setup (creating environment variables dynamically, running database migrations, ...) and then launching CMD.
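As a rough sketch of that ENTRYPOINT-then-CMD pattern (the setup steps below are placeholders, not something this particular project needs), the entrypoint script does its work and then hands off to whatever command was given, so the CMD ["./main.py"] above still ends up as the main process:
#!/bin/sh
# entrypoint.sh: one-time setup, then replace this shell with the CMD.
# (Placeholder setup steps; substitute whatever your project needs.)
# ./manage.py migrate          # e.g. run database migrations
# export SOME_VAR="computed"   # e.g. derive environment variables
exec "$@"
The Dockerfile would then pair ENTRYPOINT ["./entrypoint.sh"] with the CMD shown above.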
Having built the image, you can use the docker run -d option to launch it in the background, and then run docker logs to see what comes out of it.
# Build the image.
docker build -t long-python-program .
# Run it, in the background.
docker run -d --name run1 long-python-program
# Review its logs.
docker logs run1
If you're running this to produce files that need to be read back from the host, you need to mount a host directory into your container at the time you start it. You need to make a couple of changes to do this successfully.
In your code, you need to write the results somewhere different than your application code. You can't mount a host directory over the /app directory since it will hide the code you're actually trying to run.
import os

# Let the output location be overridden at run time; default to ./data.
data_dir = os.getenv('DATA_DIR', 'data')
filename = os.path.join(data_dir, 'results.csv')
Optionally, in your Dockerfile, create this directory and set a pointer to it. Since my sample code gets its location from an environment variable you can again use any path you want.
# Create the data directory.
RUN mkdir /data
ENV DATA_DIR=/data
When you launch the container, the docker run -v option mounts filesystems into the container. For this sort of output file you're looking for a bind mount that directly attaches a host directory to the container.
docker run -d --name run2 \
-v "$PWD/results:/data" \
long-python-program
In this example so far we haven't set the USER of the program, and it will run as root. You can change the Dockerfile to set up an alternate USER (which is good practice); you do not need to chown anything except the data directory to be owned by that user (leaving your code owned by root and not world-writeable is also good practice). If you do this, when you launch the container (on native Linux) you need to provide the host numeric user ID that can write to the host directory; you do not need to make other changes in the Dockerfile.
docker run -d --name run2 \
-u $(id -u) \
-v "$PWD/results:/data" \
long-python-program
1- A container runs as a foreground process. Use CMD or ENTRYPOINT in the Dockerfile.
2- Map a Docker volume to a directory on the Linux host.
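For example (a sketch that reuses the image name and host path from the longer answer above; adjust them to your own):
docker run -d -v "$PWD/results:/data" long-python-program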
A sample repo with the directory structure of what I'm working on is on GitHub here. To run the GitHub Action, you just need to go to the Action tab of the repo and run the Action manually.
I have a custom GitHub Action I've written as well with python as the base image in the Docker container but want the python version to be an input for the GitHub Action. In order to do so, I am creating a second intermediate Docker container to run with the python version input argument.
The problem I'm running into is that I don't have access to the files of the original repo that calls the GitHub Action. For example, say the repo is called python-sample-project and has folder structure:
python-sample-project
│ main.py
│ file1.py
│
└───folder1
│ │ file2.py
I see main.py, file1.py, and folder1/file2.py in the outer entrypoint.sh. However, in docker-action/entrypoint.sh I only see the Linux folder structure and the entrypoint.sh file that is copied over in docker-action/Dockerfile.
In the Alpine example I'm using, the action entrypoint.sh script looks like this:
#!/bin/sh -l
ALPINE_VERSION=$1
cd /docker-action
docker build -t docker-action --build-arg alpine_version="$ALPINE_VERSION" . && docker run docker-action
In docker-action/ I have a Dockerfile and entrypoint.sh script that should run for the inner container with the dynamic version of Alpine (or Python)
The docker-action/Dockerfile is as follows:
# Container image that runs your code
ARG alpine_version
FROM alpine:${alpine_version}
# Copies your code file from your action repository to the filesystem path `/` of the container
COPY entrypoint.sh /entrypoint.sh
RUN ["chmod", "+x", "/entrypoint.sh"]
# Code file to execute when the docker container starts up (`entrypoint.sh`)
ENTRYPOINT ["/entrypoint.sh"]
In docker-action/entrypoint.sh I run ls, but I do not see the repository files.
Is it possible to access main.py, file1.py, and folder1/file2.py from docker-action/entrypoint.sh?
There are generally two ways to get files from your repository available to a docker container you build and run. You either (1) add the files to the image when you build it or (2) mount the files into the container when you run it. There are some other ways, like specifying volumes, but that's probably out of scope for this case.
The Dockerfile docker-action/Dockerfile does not copy any files except for the entrypoint.sh script. Your entrypoint.sh also does not provide any mount points when running the container. Hence, the outcome you observe is the expected outcome based on these facts.
In order to resolve this, you must either (1) add COPY/ADD statements to your Dockerfile to copy files into the image (and set appropriate build context) OR (2) mount the files into the container when it runs by adding -v /source-path:/container-path to the docker run command in your entrypoint.sh.
See references:
COPY reference
Docker run reference
Though, this approach of building another container just to get a user-provided python version is a highly questionable practice for GitHub Actions and should probably be avoided. Consider leaning on the setup-python action instead.
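For example, a workflow could let callers pick the interpreter with setup-python instead of rebuilding an image per version (a sketch; the action versions, file name, and fixed "3.9" value are assumptions, and the version could just as well come from an input or a matrix):
on: workflow_dispatch
jobs:
  run:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.9"
      - run: python main.py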
The docker-in-docker problem
Nevertheless, if you continue this route and want to go about mounting the directory, you'll have to keep in mind that, when invoking docker from within a docker action on GitHub, the filesystem in the mount specification refers to the filesystem of the docker host, not the filesystem of the container.
It works on my machine?!
Counter to what you might experience running docker on a local system for example, this does not work in GitHub -- the working directory is not mounted:
docker run -v $(pwd):/opt/workspace \
--workdir /opt/workspace \
--entrypoint /bin/ls \
my-container "-R"
This doesn't work either:
docker run -v $GITHUB_WORKSPACE:$GITHUB_WORKSPACE \
--workdir $GITHUB_WORKSPACE \
--entrypoint /bin/ls \
my-container "-R"
This kind of thing would work perfectly fine if you tried it on a system running docker locally. What gives?
Dealing with the devil (daemon)
In Actions, the starting working directory, where files are checked out, is $GITHUB_WORKSPACE. In docker actions, that's /github/workspace. The workspace files appear there when your action runs because the Actions runner mounts the workspace from the host where the docker daemon is running.
You can see that in the command run when your action starts:
/usr/bin/docker run --name f884202608aa2bfab75b6b7e1f87b3cd153444_f687df --label f88420 --workdir /github/workspace --rm -e INPUT_ALPINE-VERSION -e HOME -e GITHUB_JOB -e GITHUB_REF -e GITHUB_SHA -e GITHUB_REPOSITORY -e GITHUB_REPOSITORY_OWNER -e GITHUB_RUN_ID -e GITHUB_RUN_NUMBER -e GITHUB_RETENTION_DAYS -e GITHUB_RUN_ATTEMPT -e GITHUB_ACTOR -e GITHUB_WORKFLOW -e GITHUB_HEAD_REF -e GITHUB_BASE_REF -e GITHUB_EVENT_NAME -e GITHUB_SERVER_URL -e GITHUB_API_URL -e GITHUB_GRAPHQL_URL -e GITHUB_WORKSPACE -e GITHUB_ACTION -e GITHUB_EVENT_PATH -e GITHUB_ACTION_REPOSITORY -e GITHUB_ACTION_REF -e GITHUB_PATH -e GITHUB_ENV -e RUNNER_OS -e RUNNER_NAME -e RUNNER_TOOL_CACHE -e RUNNER_TEMP -e RUNNER_WORKSPACE -e ACTIONS_RUNTIME_URL -e ACTIONS_RUNTIME_TOKEN -e ACTIONS_CACHE_URL -e GITHUB_ACTIONS=true -e CI=true -v "/var/run/docker.sock":"/var/run/docker.sock" -v "/home/runner/work/_temp/_github_home":"/github/home" -v "/home/runner/work/_temp/_github_workflow":"/github/workflow" -v "/home/runner/work/_temp/_runner_file_commands":"/github/file_commands" -v "/home/runner/work/my-repo/my-repo":"/github/workspace" f88420:2608aa2bfab75b6b7e1f87b3cd153444 "3.9.5"
The important bits are this:
-v "/home/runner/work/my-repo/my-repo":"/github/workspace"
-v "/var/run/docker.sock":"/var/run/docker.sock"
/home/runner/work/my-repo/my-repo is the path on the host, where the repository files are. As mentioned, that first line is what gets it mounted into /github/workspace in your action container when it gets run.
The second line is mounting the docker socket from the host to the action container. This means any time you call docker within your action, you're actually talking to the docker daemon outside of your container. This is important because that means when you use the -v argument inside your action, the arguments need to reflect directories that exist outside of the container.
So, what you would actually need to do instead is this:
docker run -v /home/runner/work/my-repo/my-repo:/opt/workspace \
--workdir /opt/workspace \
--entrypoint /bin/ls \
my-container "-R"
Becoming useful to others
And that works. If you only use it for the project itself. However, you have (among others) a remaining problem if you want this action to be consumable by other projects. How do you know where the workspace is on the host? This path will change for each repository, after all. GitHub does not guarantee these paths, either. They may be different on different platforms, or your action may be running on a self-hosted runner.
So how do you contend with that problem? There is no built-in environment variable that points specifically to this directory, unfortunately. However, by relying on an implementation detail, you might be able to get away with using the $RUNNER_WORKSPACE variable, which in this case will point to /home/runner/work/your-project. This is not the same place as the origin of $GITHUB_WORKSPACE, but it's close. You can use the GITHUB_REPOSITORY variable to build the path, though this isn't guaranteed to always be the case as far as I know:
PROJECT_NAME="$(basename ${GITHUB_REPOSITORY})"
WORKSPACE="${RUNNER_WORKSPACE}/${PROJECT_NAME}"
You also have some other things to fix, like the working directory from which you build.
TL;DR
You need to mount the files into the container when you run it. In GitHub, you're running docker-in-docker, so the paths you use to mount files work differently; you need to find the correct paths to pass to docker when it is called from within your action container.
A minimally working solution for the example project you linked is an entrypoint.sh in the root of the repo that looks like this:
#!/usr/bin/env sh
ALPINE_VERSION=$1
docker build -t docker-action \
-f ./docker-action/Dockerfile \
--build-arg alpine_version="$ALPINE_VERSION" \
./docker-action
PROJECT_NAME="$(basename ${GITHUB_REPOSITORY})"
WORKSPACE="${RUNNER_WORKSPACE}/${PROJECT_NAME}"
docker run --workdir=$GITHUB_WORKSPACE \
-v $WORKSPACE:$GITHUB_WORKSPACE \
docker-action "$@"
There are probably further concerns with your action, depending on what it does, like making available all the default and user-defined environment variables for the action to the 'inner' container, if that's important.
So, is this possible? Sure. Is it reasonable just to get a dynamic version of alpine/python? I don't think so. There's probably better ways of accomplishing what you want to do, like using setup-python, but that sounds like a different question.
The issue has appeared recently and the previously healthy container now enters a sleep loop when a shutit session is being created. The issue occurs only on Cloud Run and not locally.
Minimum reproducible code:
requirements.txt
Flask==2.0.1
gunicorn==20.1.0
shutit
Dockerfile
FROM python:3.9
# Allow statements and log messages to immediately appear in the Cloud Run logs
ENV PYTHONUNBUFFERED True
COPY requirements.txt ./
RUN pip install -r requirements.txt
# Copy local code to the container image.
ENV APP_HOME /myapp
WORKDIR $APP_HOME
COPY . ./
CMD exec gunicorn \
--bind :$PORT \
--worker-class "sync" \
--workers 1 \
--threads 1 \
--timeout 0 \
main:app
main.py
import os
import shutit
from flask import Flask, request
app = Flask(__name__)
# just to prove api works
@app.route('/ping', methods=['GET'])
def ping():
os.system('echo pong')
return 'OK'
# issue replication
@app.route('/healthcheck', methods=['GET'])
def healthcheck():
os.system("echo 'healthcheck'")
# hangs inside create_session
shell = shutit.create_session(echo=True, loglevel='debug')
# never shell.send reached
shell.send('echo Hello World', echo=True)
# never returned
return 'OK'
if __name__ == '__main__':
app.run(host='127.0.0.1', port=8080, debug=True)
cloudbuild.yaml
steps:
- id: "build_container"
name: "gcr.io/kaniko-project/executor:latest"
args:
- --destination=gcr.io/$PROJECT_ID/borked-service-debug:latest
- --cache=true
- --cache-ttl=99h
- id: "configure infrastructure"
name: "gcr.io/cloud-builders/gcloud"
entrypoint: "bash"
args:
- "-c"
- |
set -euxo pipefail
REGION="europe-west1"
CLOUD_RUN_SERVICE="borked-service-debug"
SA_NAME="$${CLOUD_RUN_SERVICE}@${PROJECT_ID}.iam.gserviceaccount.com"
gcloud beta run deploy $${CLOUD_RUN_SERVICE} \
--service-account "$${SA_NAME}" \
--image gcr.io/${PROJECT_ID}/$${CLOUD_RUN_SERVICE}:latest \
--allow-unauthenticated \
--platform managed \
--concurrency 1 \
--max-instances 10 \
--timeout 1000s \
--cpu 1 \
--memory=1Gi \
--region "$${REGION}"
cloud run logs that get looped:
Setting up prompt
In session: host_child, trying to send: export PS1_ORIGIN_ENV=$PS1 && PS1='OR''IGIN_ENV:rkkfQQ2y# ' && PROMPT_COMMAND='sleep .05||sleep 1'
================================================================================
Sending>>> export PS1_ORIGIN_ENV=$PS1 && PS1='OR''IGIN_ENV:rkkfQQ2y# ' && PROMPT_COMMAND='sleep .05||sleep 1'<<<, expecting>>>['\r\nORIGIN_ENV:rkkfQQ2y# ']<<<
Sending in pexpect session (68242035994000): export PS1_ORIGIN_ENV=$PS1 && PS1='OR''IGIN_ENV:rkkfQQ2y# ' && PROMPT_COMMAND='sleep .05||sleep 1'
Expecting: ['\r\nORIGIN_ENV:rkkfQQ2y# ']
export PS1_ORIGIN_ENV=$PS1 && PS1='OR''IGIN_ENV:rkkfQQ2y# ' && PROMPT_COMMAND='sleep .05||sleep 1'
root@localhost:/myapp# export PS1_ORIGIN_ENV=$PS1 && PS1='OR''IGIN_ENV:rkkfQQ2y# ' && PROMPT_COMMAND='sleep .05||sleep 1'
Stopped sleep .05
Stopped sleep 1
pexpect: buffer: b'' before: b'cm9vdEBsb2NhbGhvc3Q6L3B1YnN1YiMgIGV4cx' after: b'DQpPUklHSU5fRU5WOnJra2ZRUTJ5IyA='
Resetting default expect to: ORIGIN_ENV:rkkfQQ2y#
In session: host_child, trying to send: stty cols 65535
================================================================================
Sending>>> stty cols 65535<<<, expecting>>>ORIGIN_ENV:rkkfQQ2y# <<<
Sending in pexpect session (68242035994000): stty cols 65535
Expecting: ORIGIN_ENV:rkkfQQ2y#
ORIGIN_ENV:rkkfQQ2y# stty cols 65535
stty cols 65535
Stopped stty cols 65535
Stopped sleep .05
Stopped sleep 1
Workarounds tried:
Different regions: a few European(tier 1 and 2), Asia, US.
Build with docker instead of kaniko
Different CPU and Memory allocated to the container
Minimum number of containers 1-5 (to ensure CPU is always allocated to the container)
--no-cpu-throttling also made no difference
Maximum number of containers 1-30
Different GCP project
Different Docker base images (3.5-3.9 + various shas ranging from a year ago to recent ones)
I have reproduced your issue and we have discussed several possibilities; I think the issue is your Cloud Run service not being able to process requests and hence preparing to shut down (SIGTERM).
I am listing some possibilities for you to look at and analyse.
A good reason for your Cloud Run service failing to start is that the server process inside the container is configured to listen on the localhost (127.0.0.1) address. This refers to the loopback network interface, which is not accessible from outside the container, and therefore the Cloud Run health check cannot be performed, causing the service deployment to fail. To solve this, configure your application to start the HTTP server listening on all network interfaces, commonly denoted as 0.0.0.0.
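Applied to the main.py above, that would look roughly like the following (only relevant if the app is ever started directly; the gunicorn CMD in the Dockerfile already binds all interfaces via --bind :$PORT):
import os

if __name__ == '__main__':
    # app is the Flask instance defined earlier in main.py.
    # Listen on all interfaces so the container is reachable from outside,
    # and take the port from the PORT variable Cloud Run injects.
    app.run(host='0.0.0.0', port=int(os.environ.get('PORT', 8080)), debug=True)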
While searching for the Cloud Logging error you are getting, I came across this answer and a GitHub link from the shutit library developer which points to a technique for tracking inputs and outputs in complex container builds in shutit sessions. One good finding from the GitHub link: I think you will have to pass the session_type in shutit.create_session('bash') or shutit.create_session('docker'), which you are not specifying in the main.py file. That could be the reason your shutit session is failing.
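In other words, something along these lines (a sketch based on the linked answer; I have not verified the exact create_session signature against your shutit version):
import shutit

# Pass an explicit session type, as the shutit author's answer suggests;
# 'bash' drives a plain local shell session.
shell = shutit.create_session('bash', echo=True, loglevel='debug')
shell.send('echo Hello World', echo=True)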
Also, this issue could be due to some Linux kernel feature used by the shutit library which is not currently supported properly in gVisor. I am not sure how it was executed for you the first time. Most apps will work fine, or at least as well as in regular Docker, but may not provide 100% compatibility.
Cloud Run applications run on the gVisor container sandbox (which currently supports Linux only), which executes Linux kernel system calls made by your application in userspace. gVisor does not implement all system calls (see here). From this GitHub link: "If your app has such a system call (quite rare), it will not work on Cloud Run. Such an event is logged and you can use strace to determine when the system call was made in your app."
If you're running your code on Linux, install and enable strace: sudo apt-get install strace. Run your application with strace by prefacing your usual invocation with strace -f, where -f means to trace all child threads. For example, if you normally invoke your application with ./main, you can run it with strace by invoking /usr/bin/strace -f ./main.
From this documentation: if you feel your issue is caused by a limitation in the container sandbox, then in the Cloud Logging section of the GCP Console (not in the "Logs" tab of the Cloud Run section) you can look for Container Sandbox entries with DEBUG severity in the varlog/system logs, or use the Log Query:
resource.type="cloud_run_revision"
logName="projects/PROJECT_ID/logs/run.googleapis.com%2Fvarlog%2Fsystem"
For example: Container Sandbox: Unsupported syscall setsockopt(0x3,0x1,0x6,0xc0000753d0,0x4,0x0)
By default, container instances have min-instances turned off, with a setting of 0. We can change this default using the Cloud Console, the gcloud command line, or a YAML file, by specifying a minimum number of container instances to be kept warm and ready to serve requests.
You can also have a look at this documentation and GitHub Link which talks about the Cloud Run container runtime behaviour and troubleshooting for reference.
It's not a perfect replacement but you can use one of the following instead:
I'm not sure what the big picture is, so I'll add various options.
For remote automation tasks from a Flask web server we're using paramiko for its simplicity and quick setup, though you might prefer something like pyinfra for large projects or subprocess for small local tasks.
Paramiko - a bit more hands-on/manual than shutit; it runs commands over the SSH protocol.
example:
import paramiko
ip='server ip'
port=22
# you can also use ssh keys
username='username'
password='password'
cmd='some useful command'
ssh=paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect(ip,port,username,password)
stdin,stdout,stderr=ssh.exec_command(cmd)
outlines=stdout.readlines()
resp=''.join(outlines)
print(resp)
more examples
pyinfra - an Ansible-like library to automate tasks in an ad-hoc style
example to install a package using apt:
from pyinfra.operations import apt
apt.packages(
name='Ensure iftop is installed',
packages=['iftop'],
sudo=True,
update=True,
)
subprocess - like Paramiko, not as extensive as shutit, but it works like a charm
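A minimal subprocess sketch for comparison with the paramiko example above (the command is a placeholder):
import subprocess

# Run a local command, capture its output as text, and raise if it fails.
result = subprocess.run(['echo', 'some useful command'],
                        capture_output=True, text=True, check=True)
print(result.stdout.strip())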
I have to set up "dockerized" environments (integration, QA and production) on the same server (client's requirement). Each environment will be composed as follows:
rabbitmq
celery
flower
python 3 based application called "A" (specific branch per
environment)
On top of them, Jenkins will handle the deployment based on CI.
Using a set of containers per environment sounds like the best approach.
But now I need a process manager to run and supervise all of them:
3 rabbit containers,
3 celery/flower containers,
3 "A" containers,
1 jenkins container.
Supervisord seems to be the best choice, but during my tests I'm not able to "properly" restart a container. Here is a snippet of the supervisord.conf:
[program:docker-rabbit]
command=/usr/bin/docker run -p 5672:5672 -p 15672:15672 tutum/rabbitmq
startsecs=20
autorestart=unexpected
exitcodes=0,1
stopsignal=KILL
So I wonder what is the best way to separate each environment and be able to manage and supervise each service (a container).
[EDIT: My solution, inspired by Thomas's response]
Each container is run by a .sh script looking like the following one.
rabbit-integration.sh
#!/bin/bash
#set -x
SERVICE="rabbitmq"
SH_S="/path/to_shs"
export MY_ENV="integration"
. $SH_S/env_.sh
. $SH_S/utils.sh
SERVICE_ENV=$SERVICE-$MY_ENV
ID_FILE=/tmp/$SERVICE_ENV.name # file holding the container name
trap stop SIGHUP SIGINT SIGTERM # trap signal for calling the stop function
run_rabbitmq
$SH_S/env_.sh looks like:
# set env variable
...
case $MONARCH_ENV in
$INTEGRATION)
AMQP_PORT="5672"
AMQP_IP="172.17.42.1"
...
;;
$PREPRODUCTION)
AMQP_PORT="5673"
AMQP_IP="172.17.42.1"
...
;;
$PRODUCTION)
AMQP_PORT="5674"
REDIS_IP="172.17.42.1"
...
esac
$SH_S/utils.sh looks like:
#!/bin/bash
function random_name(){
echo "$SERVICE_ENV-$(cat /proc/sys/kernel/random/uuid)"
}
function stop (){
echo "stopping docker container..."
/usr/bin/docker stop `cat $ID_FILE`
}
function run_rabbitmq (){
# do no daemonize and use stdout
NAME="$(random_name)"
echo $NAME > $ID_FILE
/usr/bin/docker run -i --name "$NAME" -p $AMQP_IP:$AMQP_PORT:5672 -p $AMQP_ADMIN_PORT:15672 -e RABBITMQ_PASS="$AMQP_PASSWORD" myimage-rabbitmq &
PID=$!
wait $PID
}
Finally, myconfig.integration.conf looks like:
[program:rabbit-integration]
command=/path/sh_s/rabbit-integration.sh
startsecs=20
priority=90
autorestart=unexpected
exitcodes=0,1
stopsignal=TERM
In the case where I want to reuse the same container, the startup function looks like:
function _run_my_container () {
NAME="my_container"
/usr/bin/docker start -i $NAME &
PID=$!
wait $PID
rc=$?
if [[ $rc != 0 ]]; then
_run_my_container
fi
}
where
function _run_my_container (){
/usr/bin/docker run -p{} -v{} --name "$NAME" myimage &
PID=$!
wait $PID
}
Supervisor requires that the processes it manages do not daemonize, as per its documentation:
Programs meant to be run under supervisor should not daemonize
themselves. Instead, they should run in the foreground. They should
not detach from the terminal from which they are started.
This is largely incompatible with Docker, where the containers are subprocesses of the Docker daemon itself (and hence are not subprocesses of Supervisor).
To be able to use Docker with Supervisor, you could write an equivalent of the pidproxy program that works with Docker.
But really, the two tools aren't really architected to work together, so you should consider changing one or the other:
Consider replacing Supervisor with Docker Compose (which is designed to work with Docker); a rough sketch follows this list
Consider replacing Docker with Rocket (which doesn't have a "master" process)
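As a rough idea of the first option, the docker-rabbit program above maps onto a Compose service something like this (a sketch; the restart policy stands in for supervisord's autorestart, and the ports/image are copied from the snippet above):
services:
  rabbit-integration:
    image: tutum/rabbitmq
    ports:
      - "5672:5672"
      - "15672:15672"
    restart: unless-stopped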
You need to make sure you use stopsignal=INT in your supervisor config, then exec docker run normally.
[program:foo]
stopsignal=INT
command=docker run --rm whatever
At least this seems to work for me with docker version 1.9.1.
If you run docker from inside a shell script, it is very important that you have exec in front of the docker run command, so that docker run replaces the shell process and thus receives the SIGINT directly from supervisord.
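For example, a wrapper script would look something like this (the image name is a placeholder):
#!/bin/sh
# exec replaces the shell with the docker client process, so the SIGINT
# sent by supervisord (stopsignal=INT) reaches docker run directly.
exec docker run --rm my-image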
You can have Docker just not detach and then things work fine. We manage our Docker containers in this way through supervisor. Docker compose is great, but if you're already using Supervisor to manage non-docker things as well, it's nice to keep using it to have all your management in one place. We'll wrap our docker run in a bash script like the following and have supervisor track that, and everything works fine:
#!/bin/bash
TO_STOP=$(docker ps | grep "$SERVICE_NAME" | awk '{ print $1 }')
if [ "$TO_STOP" != '' ]; then
    docker stop "$SERVICE_NAME"
fi
TO_REMOVE=$(docker ps -a | grep "$SERVICE_NAME" | awk '{ print $1 }')
if [ "$TO_REMOVE" != '' ]; then
    docker rm "$SERVICE_NAME"
fi

docker run -a stdout -a stderr --name="$SERVICE_NAME" \
    --rm $DOCKER_IMAGE:$DOCKER_TAG
I found that executing docker run via supervisor actually works just fine, with a few precautions. The main thing one needs to avoid is allowing supervisord to send a SIGKILL to the docker run process, which will kill off that process but not the container itself.
For the most part, this can be handled by following the instructions in Why Your Dockerized Application Isn’t Receiving Signals. In short, one needs to:
Use the CMD ["/path/to/myapp"] form (same for ENTRYPOINT) instead of the shell form (CMD /path/to/myapp).
Pass --init to docker run.
If using an ENTRYPOINT, ensure its last line calls exec, so as to avoid spawning a new process.
If the above still isn't working, add a STOPSIGNAL to your Dockerfile.
Additionally, you'll want to make sure that your stopwaitsecs setting in supervisor is greater than the time your process might take to shutdown gracefully when it receives a SIGTERM (e.g., graceful_timeout if using gunicorn).
Here's a sample config to run a gunicorn container:
[program:gunicorn]
command=/usr/bin/docker run --init --rm -i -p 8000:8000 gunicorn
redirect_stderr=true
stopwaitsecs=31
Starting (docker run) the rabbitmq image results in a crash. The contents of startup_err:
Crash dump was written to: erl_crash.dump
init terminating in do_boot ()
startup_log
BOOT FAILED
===========
Error description:
{error,{cannot_create_mnesia_dir,"/var/lib/rabbitmq/mnesia/rabbit@localhost/",
eacces}}
Log files (may contain more information):
/var/log/rabbitmq/rabbit@localhost.log
/var/log/rabbitmq/rabbit@localhost-sasl.log
Stack trace:
[{rabbit_mnesia,ensure_mnesia_dir,0,
[{file,"src/rabbit_mnesia.erl"},{line,472}]},
{rabbit_node_monitor,prepare_cluster_status_files,0,
[{file,"src/rabbit_node_monitor.erl"},{line,99}]},
{rabbit,'-boot/0-fun-1-',0,[{file,"src/rabbit.erl"},{line,326}]},
{rabbit,start_it,1,[{file,"src/rabbit.erl"},{line,354}]},
{init,start_it,1,[]},
{init,start_em,1,[]}]
{"init terminating in do_boot",{rabbit,failure_during_boot,{error, {cannot_create_mnesia_dir,"/var/lib/rabbitmq/mnesia/rabbit#localhost/",eacces}}}}
Here's the rabbitmq portion of the Dockerfile:
RUN apt-get install rabbitmq-server -y
ENV RABBITMQ_CONFIG_FILE /etc/rabbitmq/rabbitmq
ADD rabbitmq.config /etc/rabbitmq/rabbitmq.config
# plugins --offline
RUN /usr/sbin/rabbitmq-plugins enable rabbitmq_management
RUN /usr/sbin/rabbitmq-plugins enable rabbitmq_shovel
RUN /usr/sbin/rabbitmq-plugins enable rabbitmq_shovel_management
EXPOSE 5672 15672 4369
VOLUME ["/var/log/rabbitmq", "/var/lib/rabbitmq/mnesia"]
and here's the docker run command. The rabbitmqbase variable holds the path to the host (my OSX) directory that the volumes are mapped to.
fab.local('docker run -itd -h rabbithost -p 5672:5672 -p 15672:15672 -p 4369:4369 -p 9001:9001 -v {}/data/mnesia:/var/lib/rabbitmq/mnesia -v {}/data/log:/var/log/rabbitmq --name rabbitmq dtwill/rabbitmq'.format(rabbitmqbase, rabbitmqbase))
So yes, it looks like a permissions issue... I'm not sure how to solve it.
[Update]
So I thought it could be the -h param and tried the boot2docker IP, localhost, and removing it altogether - it still crashes.
Thanks!
Indeed, rabbitmq can't write to /var/lib/rabbitmq/mnesia inside the container (which maps to rabbitmqbase + /data/mnesia on your host).
The reason for this is likely that /var/lib/rabbitmq/mnesia does not exist before docker mounts your volume. The mountpoint is therefore created by docker, but owned by a user different from rabbitmq.
Make sure that /var/lib/rabbitmq/mnesia exists in the image before starting the container.
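One way to follow that suggestion is to create the directory in the Dockerfile before the VOLUME line (a sketch; the rabbitmq user and group are the ones created by the Debian rabbitmq-server package):
# Create the mnesia and log directories up front and hand them to the
# rabbitmq user, so the mountpoints are not left owned by root.
RUN mkdir -p /var/lib/rabbitmq/mnesia /var/log/rabbitmq \
    && chown -R rabbitmq:rabbitmq /var/lib/rabbitmq /var/log/rabbitmq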