How to properly configure Celery logging with Django?

I use Django and Celery to schedule a task, but I have an issue with logging: the log records don't propagate properly. As you can see in the code below, I have configured the standard Python logging module and Celery's get_task_logger.
import logging

from celery import Celery
from celery.utils.log import get_task_logger

# Configure logging
logging.basicConfig(filename='example.log', level=logging.DEBUG)

# Create Celery application and Celery logger
app = Celery('capital')
logger = get_task_logger(__name__)

@app.task()
def candle_updated(d, f):
    logging.warning('This is a log')
    logger.info('This is another log')
    return d + f
I use the django-celery-beat extension to set up periodic tasks from the Django admin. This module stores the schedule in the Django database and provides a convenient admin interface for managing periodic tasks at runtime.
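For reference, the same kind of schedule can also be created programmatically through django-celery-beat's models rather than through the admin. A minimal sketch (the task path, task name, and 6-second interval are taken from the beat output below; the args value is a placeholder):
from django_celery_beat.models import IntervalSchedule, PeriodicTask

# Run marketsdata.tasks.candle_updated every 6 seconds
schedule, _ = IntervalSchedule.objects.get_or_create(
    every=6,
    period=IntervalSchedule.SECONDS,
)
PeriodicTask.objects.get_or_create(
    name='Candles update',
    task='marketsdata.tasks.candle_updated',
    interval=schedule,
    args='[1, 2]',  # JSON-encoded positional arguments for candle_updated(d, f)
)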
As recommended in the documentation, I start the worker and the scheduler this way:
$ celery -A capital beat -l info --scheduler django_celery_beat.schedulers:DatabaseScheduler
celery beat v4.4.0 (cliffs) is starting.
__ - ... __ - _
LocalTime -> 2020-04-02 22:33:32
Configuration ->
. broker -> redis://localhost:6379//
. loader -> celery.loaders.app.AppLoader
. scheduler -> django_celery_beat.schedulers.DatabaseScheduler
. logfile -> [stderr]@%INFO
. maxinterval -> 5.00 seconds (5s)
[2020-04-02 22:33:32,630: INFO/MainProcess] beat: Starting...
[2020-04-02 22:33:32,631: INFO/MainProcess] Writing entries...
[2020-04-02 22:33:32,710: INFO/MainProcess] Scheduler: Sending due task Candles update (marketsdata.tasks.candle_updated)
[2020-04-02 22:33:32,729: INFO/MainProcess] Writing entries...
[2020-04-02 22:33:38,726: INFO/MainProcess] Scheduler: Sending due task Candles update (marketsdata.tasks.candle_updated)
[2020-04-02 22:33:44,751: INFO/MainProcess] Scheduler: Sending due task Candles update (marketsdata.tasks.candle_updated)
Everything seems to run fine: there is output in the console every 6 seconds (the frequency of the periodic task), so the task appears to be executed in the background, but I can't verify it. The problem is that the file example.log stays empty. What could be the reason?

Did you start a worker node as well? beat is just the scheduler; you have to run a worker too:
celery -A capital worker -l info
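If the goal is also to get task output into example.log, note that the worker configures its own logging and may ignore a plain logging.basicConfig call. One approach is to attach a file handler through Celery's logging signals; a minimal sketch (the log format is an assumption, the file name comes from the question):
import logging

from celery import Celery
from celery.signals import after_setup_logger, after_setup_task_logger

app = Celery('capital')

@after_setup_logger.connect       # worker logger
@after_setup_task_logger.connect  # logger returned by get_task_logger()
def setup_file_logging(logger, **kwargs):
    handler = logging.FileHandler('example.log')
    handler.setLevel(logging.DEBUG)
    handler.setFormatter(logging.Formatter('[%(asctime)s: %(levelname)s] %(message)s'))
    logger.addHandler(handler)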

Related

How to run a task periodically in celery?

I want to run a task every 10 seconds as a Celery periodic task. This is my code in celery.py:
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'DjangoCelery1.settings')

app = Celery('DjangoCelery1')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()

@app.on_after_finalize.connect
def setup_periodic_tasks(sender, **kwargs):
    sender.add_periodic_task(10, test.s('hello'), name='add every 10')

@app.task
def test(arg):
    print(arg)
    with open("test.txt", "w") as myfile:
        myfile.write(arg)
Then I run it with the following command:
celery -A DjangoCelery1 beat -l info
It seems to run, and in the terminal I get the following output:
celery beat v4.4.2 (cliffs) is starting.
__ - ... __ - _
LocalTime -> 2020-04-26 15:56:48
Configuration ->
. broker -> amqp://guest:**@localhost:5672//
. loader -> celery.loaders.app.AppLoader
. scheduler -> celery.beat.PersistentScheduler
. db -> celerybeat-schedule
. logfile -> [stderr]@%INFO
. maxinterval -> 5.00 minutes (300s)
[2020-04-26 15:56:48,483: INFO/MainProcess] beat: Starting...
[2020-04-26 15:56:48,499: INFO/MainProcess] Scheduler: Sending due task add every 10 (DjangoCelery1.celery.test)
[2020-04-26 15:56:53,492: INFO/MainProcess] Scheduler: Sending due task add every 10 (DjangoCelery1.celery.test)
[2020-04-26 15:56:58,492: INFO/MainProcess] Scheduler: Sending due task add every 10 (DjangoCelery1.celery.test)
[2020-04-26 15:57:03,492: INFO/MainProcess] Scheduler: Sending due task add every 10 (DjangoCelery1.celery.test)
[2020-04-26 15:57:08,492: INFO/MainProcess] Scheduler: Sending due task add every 10 (DjangoCelery1.celery.test)
[2020-04-26 15:57:13,492: INFO/MainProcess] Scheduler: Sending due task add every 10 (DjangoCelery1.celery.test)
But the task is not run: no message is printed and no text file is created.
What is the problem?
This is only the beat process; now you need to run another process as well:
celery -A tasks worker ...
so that a worker can consume and handle the tasks you're triggering via beat.
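During development you can also embed the beat scheduler inside the worker with the -B flag, so a single process both schedules and executes the tasks (convenient locally, but not recommended for production):
celery -A DjangoCelery1 worker -B -l info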

Celery sees my task but does not execute it

I'm trying to run a Celery chain locally, with a Redis broker.
The problem: as you can see below the worker command, the tasks are never run. Nothing happens.
All the needed information is below.
The redis-server is running
redis-server
[15048] 05 Feb 11:36:30 * Server started, Redis version 2.4.5
[15048] 05 Feb 11:36:30 * DB loaded from disk: 0 seconds
[15048] 05 Feb 11:36:30 * The server is now ready to accept connections on port 6379
...
The celery beat is also running
celery beat -b redis://localhost:6379/0
celery beat v4.2.1 (windowlicker) is starting.
__ - ... __ - _
LocalTime -> 2020-02-05 11:37:30
Configuration ->
. broker -> redis://localhost:6379/0
. loader -> celery.loaders.default.Loader
. scheduler -> celery.beat.PersistentScheduler
. db -> celerybeat-schedule
. logfile -> [stderr]@%WARNING
. maxinterval -> 5.00 minutes (300s)
But when I run the worker, it sees the tasks but does not execute them:
celery -A module.tasks worker --loglevel=info -Q training_queue -P eventlet
[tasks]
. module.tasks.generate_the_chain
. module.tasks.post_pipeline_message
. module.tasks.train_by_site_post
[2020-02-05 11:39:14,700: INFO/MainProcess] Connected to redis://localhost:6379/0
[2020-02-05 11:39:16,746: INFO/MainProcess] mingle: searching for neighbors
[2020-02-05 11:39:20,804: INFO/MainProcess] mingle: all alone
[2020-02-05 11:39:21,836: INFO/MainProcess] pidbox: Connected to redis://localhost:6379/0.
[2020-02-05 11:39:22,850: INFO/MainProcess] celery@HOST ready.
Nothing happens after this line.
Below is the code:
__init__.py
import os
from celery import Celery

CELERY_APP_NAME = 'app_name'
CELERY_BROKER = 'redis://localhost:6379/0'
CELERY_BACKEND = CELERY_BROKER

CELERY = Celery(CELERY_APP_NAME,
                backend=CELERY_BACKEND,
                broker=CELERY_BROKER,
                include=['module.tasks'])
And tasks.py
import os
import logging
import time

from celery import Celery, chain, signature
from celery.schedules import crontab

from module import CELERY

@CELERY.on_after_finalize.connect
def setup_periodic_tasks(sender, **kwargs):
    site_id = "f1ae"
    sign = signature(
        'module.tasks.generate_the_chain',
        args=(site_id),
        queue='training_queue'
    )
    sender.add_periodic_task(5.0, sign, name='training')

@CELERY.task
def generate_the_chain(site_id):
    print("Hello World")
    """
    This function runs the chain of functions and is then
    scheduled with Celery beat
    """
    chain = signature(
        'module.tasks.site_post',
        args=(site_id),
        queue='training_queue'
    )
    chain |= signature(
        'module.tasks.post_pipeline_message',
        args=(),
        queue='training_queue'
    )
    chain.apply_async()

@CELERY.task
def site_post(site_id):
    print(f"Hello {site_id}")
    return True

@CELERY.task
def post_pipeline_message(success):
    if success is True:
        LOGGER.info("Pipeline build with success")
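For what it's worth, one detail to watch in signatures like the ones above: in Python, args=(site_id) is just the string itself, because a one-element tuple needs a trailing comma. A minimal sketch of the corrected call (same hypothetical module and queue names as in the question):
from celery import signature

site_id = "f1ae"
sign = signature(
    'module.tasks.generate_the_chain',
    args=(site_id,),          # trailing comma makes this a one-element tuple
    queue='training_queue',
)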

Celery worker disappeared without errors

I set up Celery with a Django app and Redis as the broker.
@task
def proc(product_id, url, did, did_name):
    # some long operation here
    ...

@task
def Scraping(product_id, num=None):
    if num:
        num = int(num)  # so I can set how many subtasks to run at once
    res = group([proc.s(product_id, url, did, dis[did]) for did in dis.keys()[:num]])()
    result = res.get()
    return sum(result)
The first few subtasks run successfully, but then one of the workers disappears and new tasks remain stuck in RECEIVED status, because the worker that should handle them no longer exists.
I set up minimal concurrency and 2 workers in /etc/default/celeryd.
I monitored CPU and memory usage; no high load was detected.
There are no errors in the Celery logs!
What's wrong?
[2015-12-19 04:00:30,131: INFO/MainProcess] Task remains.tasks.proc[fd0ec29c-436f-4f60-a1b6-3785342ac173] succeeded in 20.045763085s: 6
[2015-12-19 04:17:28,895: INFO/MainProcess] missed heartbeat from w2@server.domain.com
[2015-12-19 04:17:28,897: DEBUG/MainProcess] w2@server.domain.com joined the party
[2015-12-19 05:11:44,057: INFO/MainProcess] missed heartbeat from w2@server.domain.com
[2015-12-19 05:11:44,058: DEBUG/MainProcess] w2@server.domain.com joined the party
SOLUTION:
If you use django-celery and want to run Celery as a daemon, do not use the app() approach (http://docs.celeryproject.org/en/latest/userguide/application.html). Instead, point your daemon setup in /etc/default/celeryd directly at your project's manage.py: CELERYD_MULTI="$CELERYD_CHDIR/manage.py celeryd_multi"
Do not disable heartbeats!
To use Celery directly through manage.py you need to:
Set CELERY_APP="" in /etc/default/celeryd, because if you don't, the init script will build the run command with the old "app" argument.
Add the line export DJANGO_SETTINGS_MODULE="your_app.settings" to the celeryd config if you don't use the default settings.
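Putting those pieces together, the relevant part of /etc/default/celeryd would look roughly like this (paths, node names, and the settings module are placeholders, not taken from the original post):
# /etc/default/celeryd (sketch)
CELERYD_CHDIR="/path/to/your/project"                     # directory containing manage.py
CELERYD_MULTI="$CELERYD_CHDIR/manage.py celeryd_multi"    # run celery through manage.py
CELERY_APP=""                                             # keep empty so no --app argument is passed
CELERYD_NODES="w1 w2"
export DJANGO_SETTINGS_MODULE="your_app.settings"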

Print statement in Celery scheduled task doesn't appear in terminal

When I run celery -A tasks2.celery worker -B I want to see "celery task" printed every second. Currently nothing is printed. Why isn't this working?
from app import app
from celery import Celery
from datetime import timedelta

celery = Celery(app.name, broker='amqp://guest:@localhost/', backend='amqp://guest:@localhost/')
celery.conf.update(CELERY_TASK_RESULT_EXPIRES=3600,)

@celery.task
def add(x, y):
    print("celery task")
    return x + y

CELERYBEAT_SCHEDULE = {
    'add-every-30-seconds': {
        'task': 'tasks2.add',
        'schedule': timedelta(seconds=1),
        'args': (16, 16)
    },
}
This is the only output after starting the worker and beat:
[tasks]
. tasks2.add
[INFO/Beat] beat: Starting...
[INFO/MainProcess] Connected to amqp://guest:**@127.0.0.1:5672//
[INFO/MainProcess] mingle: searching for neighbors
[INFO/MainProcess] mingle: all alone
You wrote the schedule, but didn't add it to the celery config. So beat saw no scheduled tasks to send. The example below uses celery.config_from_object(__name__) to pick up config values from the current module, but you can use any other config method as well.
Once you configure it properly, you will see messages from beat about sending scheduled tasks, as well as the output from those tasks as the worker receives and runs them.
from celery import Celery
from datetime import timedelta

celery = Celery(__name__)
celery.config_from_object(__name__)

@celery.task
def say_hello():
    print('Hello, World!')

CELERYBEAT_SCHEDULE = {
    'every-second': {
        'task': 'example.say_hello',
        'schedule': timedelta(seconds=5),
    },
}
$ celery -A example.celery worker -B -l info
[tasks]
. example.say_hello
[2015-07-15 08:23:54,350: INFO/Beat] beat: Starting...
[2015-07-15 08:23:54,366: INFO/MainProcess] Connected to amqp://guest:**@127.0.0.1:5672//
[2015-07-15 08:23:54,377: INFO/MainProcess] mingle: searching for neighbors
[2015-07-15 08:23:55,385: INFO/MainProcess] mingle: all alone
[2015-07-15 08:23:55,411: WARNING/MainProcess] celery@netsec-ast-15 ready.
[2015-07-15 08:23:59,471: INFO/Beat] Scheduler: Sending due task every-second (example.say_hello)
[2015-07-15 08:23:59,481: INFO/MainProcess] Received task: example.say_hello[2a9d31cb-fe11-47c8-9aa2-51690d47c007]
[2015-07-15 08:23:59,483: WARNING/Worker-3] Hello, World!
[2015-07-15 08:23:59,484: INFO/MainProcess] Task example.say_hello[2a9d31cb-fe11-47c8-9aa2-51690d47c007] succeeded in 0.0012782540870830417s: None
In version 4.1.0 you have to add a logger to your task.py file like so:
import random

from celery.utils.log import get_task_logger

logger = get_task_logger(__name__)

@task(name="multiply_two_numbers")  # assumes a task decorator is imported/available
def mul(x, y):
    total = x * (y * random.randint(3, 100))
    # HERE:
    logger.info('Adding {0} + {1}'.format(x, y))
    return total
Stated halfway down in the docs here if you want more info:
http://docs.celeryproject.org/en/latest/userguide/tasks.html
Make sure you run a celery beat process for scheduled tasks:
celery beat --app app.celery
Check the docs here: http://celery.readthedocs.org/en/latest/userguide/periodic-tasks.html#starting-the-scheduler
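A worker process is still needed to actually execute what beat schedules, for example:
celery -A app.celery worker --loglevel=info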
I actually had issues getting mine to print on the command prompt because I was using the wrong command, but I found and forked a project that helped. These commands worked for me:
(If on Mac ) celery -A Project worker --loglevel=info
(If on Windows) celery -A Project worker -l info --pool=solo

Celery / RabbitMQ / Django not running tasks

I am hoping someone can help me as I've looked on Stack Overflow and cannot find a solution to my problem. I am running a Django project and have Supervisor, RabbitMQ and Celery installed. RabbitMQ is up and running and Supervisor is ensuring my celerybeat is running, however, while it logs that the beat has started and sends tasks every 5 minutes (see below), the tasks never actually execute:
My supervisor program conf:
[program:nrv_twitter]
; Set full path to celery program if using virtualenv
command=/Users/tsantor/.virtualenvs/nrv_env/bin/celery beat -A app --loglevel=INFO --pidfile=/tmp/nrv-celerybeat.pid --schedule=/tmp/nrv-celerybeat-schedule
; Project dir
directory=/Users/tsantor/Projects/NRV/nrv
; Logs
stdout_logfile=/Users/tsantor/Projects/NRV/nrv/logs/celerybeat_twitter.log
redirect_stderr=true
autorestart=true
autostart=true
startsecs=10
user=tsantor
; if rabbitmq is supervised, set its priority higher so it starts first
priority=999
Here is the output of the log from the program above:
[2014-12-16 20:29:42,293: INFO/MainProcess] beat: Starting...
[2014-12-16 20:34:08,161: INFO/MainProcess] Scheduler: Sending due task gettweets-every-5-mins (twitter.tasks.get_tweets)
[2014-12-16 20:39:08,186: INFO/MainProcess] Scheduler: Sending due task gettweets-every-5-mins (twitter.tasks.get_tweets)
[2014-12-16 20:44:08,204: INFO/MainProcess] Scheduler: Sending due task gettweets-every-5-mins (twitter.tasks.get_tweets)
[2014-12-16 20:49:08,205: INFO/MainProcess] Scheduler: Sending due task gettweets-every-5-mins (twitter.tasks.get_tweets)
[2014-12-16 20:54:08,223: INFO/MainProcess] Scheduler: Sending due task gettweets-every-5-mins (twitter.tasks.get_tweets)
Here is my celery.py settings file:
from datetime import timedelta

BROKER_URL = 'amqp://guest:guest@localhost//'
CELERY_DISABLE_RATE_LIMITS = True

CELERYBEAT_SCHEDULE = {
    'gettweets-every-5-mins': {
        'task': 'twitter.tasks.get_tweets',
        'schedule': timedelta(seconds=300)  # 300 = every 5 minutes
    },
}
Here is my celeryapp.py:
from __future__ import absolute_import
import os
from django.conf import settings
from celery import Celery
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'app.settings')
app = Celery('app')
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
Here is my twitter/tasks.py:
from __future__ import absolute_import
import logging

from celery import shared_task

from twitter.views import IngestTweets

log = logging.getLogger('custom.log')

@shared_task
def get_tweets():
    """
    Get tweets and save them to the DB
    """
    instance = IngestTweets()
    IngestTweets.get_new_tweets(instance)
    log.info('Successfully ingested tweets via celery task')
    return True
The get_tweets task never gets executed; however, I know the code works, because I can run get_tweets manually and it works fine.
I have spent two days trying to figure out why it is sending due tasks but not executing them. Any help is greatly appreciated. Thanks in advance.
user2097159, thanks for pointing me in the right direction; I was not aware that I also had to run a worker under Supervisor. I thought it was either a worker or a beat, but now I understand that I need a worker to handle the tasks and a beat to fire them off periodically.
Below is the missing worker config for supervisor:
[program:nrv_celery_worker]
; Worker
command=/Users/tsantor/.virtualenvs/nrv_env/bin/celery worker -A app --loglevel=INFO
; Project dir
directory=/Users/tsantor/Projects/NRV/nrv
; Logs
stdout_logfile=/Users/tsantor/Projects/NRV/nrv/logs/celery_worker.log
redirect_stderr=true
autostart=true
autorestart=true
startsecs=10
user=tsantor
numprocs=1
; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 600
; When resorting to send SIGKILL to the program to terminate it
; send SIGKILL to its whole process group instead,
; taking care of its children as well.
killasgroup=true
; if rabbitmq is supervised, set its priority higher
; so it starts first
priority=998
I then reset the RabbitMQ queue. Now that I have both the beat and worker programs managed via supervisor, all is working as intended. Hope this helps someone else out.
You need to start both a worker process and a beat process. You can create separate processes as described in tsantor's answer, or you can create a single process with both a worker and a beat. This can be more convenient during development (but is not recommended for production).
From "Starting the scheduler" in the Celery documentation:
You can also embed beat inside the worker by enabling the workers -B option, this is convenient if you’ll never run more than one worker node, but it’s not commonly used and for that reason isn’t recommended for production use:
$ celery -A proj worker -B
For example Supervisor config files, see https://github.com/celery/celery/tree/master/extra/supervisord/ (linked from "Daemonization").
