Is logging from multiple celery workers into one file safe? - python

I run celeryd in prefork mode with concurrency > 1, like below:
celery worker -c 100 -A x.y.z.tasks -f mylogfile.log --loglevel=INFO -n myworker
As Python's logging from multiple processes into one file is not safe (link), does Celery do something about this, such as dispatching log records to the main process and opening the file just once?
What if I redirect all logs to stderr (no -f) and pipe stderr to a file with supervisor?

Only the master process handles this log file, so you are safe.
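For plain Python code outside Celery, the pattern the question describes (child processes hand their records to one process that owns the file) can be sketched with the standard library's QueueHandler and QueueListener. This is only an illustration of that pattern, not Celery's own implementation; the function names and the log format are made up for the example.

```python
import logging
import logging.handlers
import multiprocessing as mp


def start_listener(log_path, queue):
    """Start a listener thread that is the only writer to log_path."""
    file_handler = logging.FileHandler(log_path)
    file_handler.setFormatter(
        logging.Formatter("%(processName)s %(levelname)s %(message)s")
    )
    listener = logging.handlers.QueueListener(queue, file_handler)
    listener.start()
    return listener


def worker(queue, n):
    """Log through the queue only; this code never opens the log file."""
    logger = logging.getLogger(f"worker{n}")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(logging.handlers.QueueHandler(queue))
    logger.info("hello from worker %d", n)


def run_demo(log_path="mylogfile.log", n_workers=4):
    """Spawn a few child processes that all log into one file via the queue."""
    queue = mp.Queue()
    listener = start_listener(log_path, queue)
    procs = [mp.Process(target=worker, args=(queue, n)) for n in range(n_workers)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    listener.stop()  # handles the records still queued, then stops the thread
    for handler in listener.handlers:
        handler.close()
```

Calling run_demo() from under an `if __name__ == "__main__":` guard should leave one line per worker in mylogfile.log, all written by the parent process.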

Related

Detect and Initiate celery worker in Python Code

Normally I run the following in a terminal to start the worker process:
celery -A myapp worker --loglevel=info
What I want to do now is achieve this from Python code:
I will check whether a worker process has already been started,
and only if not will I run this command (from Python code).
How can I achieve that?
There is no need for that, as Celery gives you a standard way to do it:
--pidfile PIDFILE Optional file used to store the process pid. The
program won't start if this file already exists and
the pid is still alive.
So simply change how you start your worker to something like celery -A myapp worker --loglevel=info --pidfile celery1.pid
If you open another terminal and run the same command, it will not start, because the PID file already exists and its PID is still alive.
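If you still want the check on the Python side, as the question asks, a minimal sketch could read the same PID file and only launch the worker when no live process owns it. The function names are hypothetical, the app name myapp and the file celery1.pid are taken from the command above, and the liveness check with os.kill(pid, 0) assumes a POSIX system.

```python
import os
import subprocess


def pid_is_alive(pid):
    """Return True if a process with this pid exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, it kills nothing
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def worker_is_running(pidfile):
    """Return True if pidfile exists and names a live process."""
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (FileNotFoundError, ValueError):
        return False
    return pid_is_alive(pid)


def ensure_worker(pidfile="celery1.pid"):
    """Start the worker unless one is already running; return the Popen or None."""
    if worker_is_running(pidfile):
        return None
    return subprocess.Popen(
        ["celery", "-A", "myapp", "worker", "--loglevel=info", "--pidfile", pidfile]
    )
```

Since Celery performs the same check itself when --pidfile is given, the Python check mainly saves you from spawning a process that would exit immediately.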

Trying to get supervisor to create a worker for python-rq

I am trying to get supervisor to spawn a worker following this pattern using python-RQ, much like what is mentioned in this stackoverflow question. I can start workers manually from the terminal as follows:
$ venv/bin/rq worker
14:35:27 Worker rq:worker:fd403822518a4f21802fa0dc417e526a: started, version 1.2.2
14:35:27 *** Listening on default...
It works great. I can confirm the worker exists in another terminal:
$ venv/bin/rq info
0 queues, 0 jobs total
fd403822518a4f21802fa0dc417e526a (b'' 41735): idle default
1 workers, 0 queues
Now to start a worker using supervisor.... Here is my supervisord.conf file, located in the same directory.
[supervisord]
;[program:worker]
command=venv/bin/rq worker
process_name=%(program_name)s-%(process_num)s
numprocs=1
directory=.
stopsignal=TERM
autostart=false
autorestart=false
I can start supervisor as follows:
$ venv/bin/supervisord -n
2020-03-05 14:36:45,079 INFO supervisord started with pid 41757
However, checking for a new worker, I see it's not there.
$ venv/bin/rq info
0 queues, 0 jobs total
0 workers, 0 queues
I have tried a multitude of other ways to get this worker to start, such as...
... within the virtual environment:
$ source venv/bin/activate
(venv) $ rq worker
*** Listening on default...
... using a shell file
#!/bin/bash
source /venv/bin/activate
rq worker low
$ ./start.sh
*** Listening on default...
... using a python script
$ venv/bin/python3 worker.py
*** Listening on default...
When started manually they all work fine. Changing the command= in supervisord.conf doesn't seem to make a difference. There is no worker to be found. What am I missing? Why won't supervisor start a worker? I am running this on macOS and my file structure is as follows:
.
|--__pycache__
|--supervisord.conf
|--supervisord.log
|--supervisord.pid
|--main.py
|--task.py
|--venv
   |--bin
      |--rq
      |--supervisord
      |--...etc
   |--include
   |--lib
   |--pyenv.cfg
Thanks in advance.
I had two problems with supervisord.conf, which were preventing the worker from starting. The corrected config file is as follows:
[supervisord]
[program:worker]
command=venv/bin/rqworker
process_name=%(program_name)s-%(process_num)s
numprocs=1
directory=.
stopsignal=TERM
autostart=true
autorestart=false
First, the line [program:worker] was in fact commented out. I must have copied it from the commented-out sample file without noticing. However, removing the comment marker still didn't start the worker. I also had to set autostart=true, because with autostart=false supervisord does not start the program when supervisord itself starts.
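Both mistakes are easy to miss by eye, so here is a rough sketch of a sanity check using Python's configparser, which (like supervisord) treats lines starting with ; as comments. The function name find_problems is made up for this example, and the check only covers the two issues described above.

```python
import configparser


def find_problems(conf_text):
    """Return a list of human-readable problems found in a supervisord config."""
    # interpolation=None keeps values like %(program_name)s as plain text
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    programs = [s for s in parser.sections() if s.startswith("program:")]
    problems = []
    if not programs:
        problems.append("no [program:x] section found (is the header commented out?)")
    for name in programs:
        # supervisord defaults autostart to true when the key is absent
        if not parser.getboolean(name, "autostart", fallback=True):
            problems.append(f"[{name}] has autostart=false and will not start on launch")
    return problems
```

Run against the original file, this reports the missing program section; with the header restored but autostart=false left in place, it reports the autostart setting instead.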

How to revoke the task while running

When I send a task and try to revoke it:
app=Celery()
app.control.revoke(task.id)
#or
app.control.revoke(task.id, terminate=True)
I get this error:
[2019-09-05 05:27:50,110: ERROR/MainProcess] pidbox command error: NotImplementedError("<class 'celery.concurrency.gevent.TaskPool'> does not implement kill_job",)
I'm using gevent.
celery -A MyApp worker -l info -P gevent
What's wrong?
The gevent pool does not support killing jobs. The prefork pool does, because it is as simple as killing the worker process that is running the task you want to terminate; the same goes for threading.
There is an issue about this with a proposed solution (https://github.com/celery/celery/issues/4019), but nobody has made a PR.

Celeryd multi with supervisord

I am trying to run supervisord (3.2.2) with celery multi.
It seems that supervisord can't handle it. A single celery worker works fine.
This is my supervisord configuration
celery multi v3.1.20 (Cipater)
> Starting nodes...
> celery1#parzee-dev-app-sfo1: OK
Stale pidfile exists. Removing it.
> celery2#parzee-dev-app-sfo1: OK
Stale pidfile exists. Removing it.
celeryd.conf
; ==================================
; celery worker supervisor example
; ==================================
[program:celery]
; Set full path to celery program if using virtualenv
command=/usr/local/src/imbue/application/imbue/supervisorctl/celeryd/celeryd.sh
process_name = %(program_name)s%(process_num)d#%(host_node_name)s
directory=/usr/local/src/imbue/application/imbue/conf/
numprocs=2
stderr_logfile=/usr/local/src/imbue/application/imbue/log/celeryd.err
logfile=/usr/local/src/imbue/application/imbue/log/celeryd.log
stdout_logfile_backups = 10
stderr_logfile_backups = 10
stdout_logfile_maxbytes = 50MB
stderr_logfile_maxbytes = 50MB
autostart=true
autorestart=false
startsecs=10
I'm using the following supervisord variables to emulate the way I start celery:
%(program_name)s
%(process_num)d
#
%(host_node_name)s
Supervisorctl
supervisorctl
celery:celery1#parzee-dev-app-sfo1 FATAL Exited too quickly (process log may have details)
celery:celery2#parzee-dev-app-sfo1 FATAL Exited too quickly (process log may have details)
I tried changing this value in /usr/local/lib/python2.7/dist-packages/supervisor/options.py from 0 to 1:
numprocs_start = integer(get(section, 'numprocs_start', 1))
I still get:
celery:celery1#parzee-dev-app-sfo1 FATAL Exited too quickly (process log may have details)
celery:celery2#parzee-dev-app-sfo1 EXITED May 14 12:47 AM
Celery is starting but supervisord is not keeping track of it.
root#parzee-dev-app-sfo1:/etc/supervisor#
ps -ef | grep celery
root 2728 1 1 00:46 ? 00:00:02 [celeryd: celery1#parzee-dev-app-sfo1:MainProcess] -active- (worker -c 16 -n celery1#parzee-dev-app-sfo1 --loglevel=DEBUG -P processes --logfile=/usr/local/src/imbue/application/imbue/log/celeryd.log --pidfile=/usr/local/src/imbue/application/imbue/log/1.pid)
root 2973 1 1 00:46 ? 00:00:02 [celeryd: celery2#parzee-dev-app-sfo1:MainProcess] -active- (worker -c 16 -n celery2#parzee-dev-app-sfo1 --loglevel=DEBUG -P processes --logfile=/usr/local/src/imbue/application/imbue/log/celeryd.log --pidfile=/usr/local/src/imbue/application/imbue/log/2.pid)
celery.sh
source ~/.profile
CELERY_LOGFILE=/usr/local/src/imbue/application/imbue/log/celeryd.log
CELERYD_OPTS=" --loglevel=DEBUG"
CELERY_WORKERS=2
CELERY_PROCESSES=16
cd /usr/local/src/imbue/application/imbue/conf
exec celery multi start $CELERY_WORKERS -P processes -c $CELERY_PROCESSES -n celeryd#{HOSTNAME} -f $CELERY_LOGFILE $CELERYD_OPTS
Similar:
Running celeryd_multi with supervisor
How to use Supervisor + Django + Celery with multiple Queues and Workers?
Since supervisor monitors (starts/stops/restarts) processes, each process must run in the foreground (it must not be daemonized).
Celery multi daemonizes itself, so it can't be run under supervisor.
You can create a separate program for each worker and group them into one.
[program:worker1]
command=celery worker -l info -n worker1
[program:worker2]
command=celery worker -l info -n worker2
[group:workers]
programs=worker1,worker2
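If you need more than a couple of workers, the sections can be generated instead of typed by hand. A small sketch with configparser follows; the helper name build_worker_config is invented for this example, and the command line simply mirrors the one used above.

```python
import configparser
import io


def build_worker_config(n_workers, group="workers"):
    """Return supervisor config text with one program per worker plus a group."""
    parser = configparser.ConfigParser(interpolation=None)
    names = [f"worker{i}" for i in range(1, n_workers + 1)]
    for name in names:
        parser[f"program:{name}"] = {"command": f"celery worker -l info -n {name}"}
    parser[f"group:{group}"] = {"programs": ",".join(names)}
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


if __name__ == "__main__":
    print(build_worker_config(2))
```

With n_workers=2 this produces the same three sections as the hand-written config.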
You can also write a shell script that keeps a daemonizing process in the foreground, like this.
#! /usr/bin/env bash
set -eu
# Must match the pid file that the daemon actually writes
pidfile="/var/run/your-daemon.pid"
command=/usr/sbin/your-daemon
# Proxy signals
function kill_app(){
    kill "$(cat "$pidfile")"
    exit 0 # exit okay
}
trap "kill_app" SIGINT SIGTERM
# Launch daemon
celery multi start 2 -l INFO
sleep 2
# Loop while the pidfile and the process exist
while [ -f "$pidfile" ] && kill -0 "$(cat "$pidfile")" ; do
    sleep 0.5
done
exit 1 # exit unexpected (exit statuses are limited to 0-255)

Cannot setup Celery as daemon on server

I cannot set up Celery as a daemon on my server (django 1.6.11, celery 3.1, Ubuntu 14.04).
I have tried a lot of options. Can anyone post a complete working configuration for running celery as a daemon?
I am very disappointed with the official docs (http://docs.celeryproject.org/en/latest/tutorials/daemonizing.html#generic-init-scripts): none of it works, and there is no full step-by-step tutorial. There are zero (!!!) videos on YouTube on how to set up the daemon.
Right now I can run celery simply with celery worker -A engine -l info -E
and tasks from django are executed successfully.
I have created these configs:
/etc/defaults/celery
# Name of nodes to start
# here we have a single node
CELERYD_NODES="w1"
# or we could have three nodes:
#CELERYD_NODES="w1 w2 w3"
# Absolute path to "manage.py"
CELERY_BIN="/var/www/engine/manage.py"
# How to call manage.py
CELERYD_MULTI="celery multi"
# Extra command-line arguments to the worker
CELERYD_OPTS="--time-limit=300 --concurrency=2"
# %N will be replaced with the first part of the nodename.
CELERYD_LOG_FILE="/var/log/celery/%N.log"
CELERYD_PID_FILE="/var/run/celery/%N.pid"
# Workers should run as an unprivileged user.
CELERYD_USER="root"
CELERYD_GROUP="root"
/etc/init.d/celeryd
Taken from https://github.com/celery/celery/blob/3.1/extra/generic-init.d/celeryd without changes.
Now, when I go to the console and run:
cd /etc/init.d
celery multi start w1
I see this output:
celery multi v3.1.11 (Cipater)
> Starting nodes...
> w1#engine: OK
So, no errors! But tasks are not invoked and I cannot figure out what's wrong.
I would suggest using Supervisor. It is a better option than init scripts, because you can run multiple Celery instances for different projects on one server. You can find an example Supervisor config in the Celery repo, or use this fully working example from my project:
# /etc/supervisor/conf.d/celery.conf
[program:celery]
command=/home/newspos/.virtualenvs/newspos/bin/celery worker -A newspos --loglevel=INFO
user=newspos
environment=DJANGO_SETTINGS_MODULE="newspos.settings"
directory=/home/newspos/projects/newspos/
autostart=true
autorestart=true
stopwaitsecs = 600
killasgroup=true
startsecs=10
stdout_logfile=/var/log/celery/newspos-celeryd.log
stderr_logfile=/var/log/celery/newspos-celeryd.log
