The celery worker is reporting:
[2015-09-29 16:13:50,411: INFO/MainProcess] Task xxx.tasks.tasks.process_files[e91db27d-9d16-487f-acae-d6966205ba16] succeeded in 52.9951906s: None
From my client, I can access some task info:
from celery.result import AsyncResult
task_id = 'e91db27d-9d16-487f-acae-d6966205ba16'
res = AsyncResult(task_id)
print(res.state)
print(res.info)
I would also like to know:
When was the task triggered?
How long did it take to complete?
Can I access this information from the AsyncResult? I see nothing in the documentation.
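One hedged workaround (not part of the original question): record the timestamps yourself with Celery's task_prerun and task_postrun signals and persist them somewhere the client can read, for example Redis. The receivers must live in a module the worker imports (e.g. the tasks module), and the 'task-meta:' key prefix below is made up for illustration.
import time
from celery.signals import task_prerun, task_postrun
from redis import Redis

redis_conn = Redis()

@task_prerun.connect
def record_start(task_id=None, **kwargs):
    # runs in the worker just before the task body starts
    redis_conn.hset('task-meta:' + task_id, 'started', time.time())

@task_postrun.connect
def record_runtime(task_id=None, **kwargs):
    # runs in the worker after the task body returns
    started = redis_conn.hget('task-meta:' + task_id, 'started')
    if started is not None:
        redis_conn.hset('task-meta:' + task_id, 'runtime',
                        time.time() - float(started))
The client can then read both fields for a given task_id with redis_conn.hgetall('task-meta:' + task_id).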
Related
The Tasker class sets up the initial job when it is instantiated. Basically, what I want is: put a job in the 'main_queue'; if the same job is already running or queued in the 'process_queue', return from the current 'main_queue' job; otherwise enqueue a job in the 'process_queue'. When that process job finishes, put a new job in the 'main_queue'.
However, the 'process_queue' keeps reporting the same job id for the whole duration, even though the outputs show it should have finished, so a new job is never enqueued for processing. Is there a deadlock happening that I am unable to see?
main_queue worker
$ rq worker main_queue --with-scheduler
22:44:19 Worker rq:worker:7fe23a24ae404135a10e301f7509eb7e: started, version 1.9.0
22:44:19 Subscribing to channel rq:pubsub:7fe23a24ae404135a10e301f7509eb7e
22:44:19 *** Listening on main_queue...
22:44:19 Trying to acquire locks for main_queue
22:44:19 Scheduler for main_queue started with PID 3747
22:44:19 Cleaning registries for queue: main_queue
22:44:33 main_queue: tasks.redis_test_job() (e90e0dff-bbcc-48ab-afed-6d1ba8b020a8)
None
Job is enqueued to process_queue!
22:44:33 main_queue: Job OK (e90e0dff-bbcc-48ab-afed-6d1ba8b020a8)
22:44:33 Result is kept for 500 seconds
22:44:47 main_queue: tasks.redis_test_job() (1a7f91d0-73f4-466e-92f4-9f918a9dd1e9)
<Job test_job: tasks.print_job()>
!!Scheduler added job to main but same job is already queued in process_queue!!
22:44:47 main_queue: Job OK (1a7f91d0-73f4-466e-92f4-9f918a9dd1e9)
22:44:47 Result is kept for 500 seconds
process_queue worker
$ rq worker process_queue
22:44:24 Worker rq:worker:d70daf20ff324c18bc17f0ea9576df52: started, version 1.9.0
22:44:24 Subscribing to channel rq:pubsub:d70daf20ff324c18bc17f0ea9576df52
22:44:24 *** Listening on process_queue...
22:44:24 Cleaning registries for queue: process_queue
22:44:33 process_queue: tasks.print_job() (test_job)
The process job executed.
22:44:42 process_queue: Job OK (test_job)
22:44:42 Result is kept for 500 seconds
tasker.py
from datetime import timedelta
from redis import Redis
from rq import Queue
import tasks

class Tasker():
    def __init__(self):
        # RedisClient is presumably the project's own Redis wrapper (not shown)
        self.tasker_conn = RedisClient().conn
        self.process_queue = Queue(name='process_queue', connection=Redis(),
                                   default_timeout=-1)
        self.main_queue = Queue(name='main_queue', connection=Redis(),
                                default_timeout=-1)
        self.__setup_tasks()

    def __setup_tasks(self):
        self.main_queue.enqueue_in(timedelta(seconds=3), tasks.redis_test_job)
tasks.py
from datetime import timedelta
from time import sleep
from redis import Redis
from rq import Queue
import tasks

def redis_test_job():
    q = Queue('process_queue', connection=Redis(), default_timeout=-1)
    queued = q.fetch_job('test_job')
    print(queued)
    if queued:
        print("!!Scheduler added job to main but same job is already queued in process_queue!!")
        return False
    else:
        q.enqueue(tasks.print_job, job_id='test_job')
        print("Job is enqueued to process_queue!")
        return True

def print_job():
    sleep(8)
    print("The process job executed.")
    q = Queue('main_queue', connection=Redis(), default_timeout=-1)
    q.enqueue_in(timedelta(seconds=5), tasks.redis_test_job)
From the docs, enqueued jobs have a result_ttl that defaults to 500 seconds if you don't set it, so the finished 'test_job' stays fetchable from 'process_queue' for that long and your fetch_job check keeps finding it.
If you want to change that, e.g. to make the job and its result live for only 1 second, enqueue your job like this:
q.enqueue(tasks.print_job, job_id='test_job', result_ttl=1)
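Alternatively, a minimal sketch (assuming RQ ~1.9, the version in the worker logs): check the fetched job's status rather than only its existence, so a finished-but-not-yet-expired job does not block re-enqueueing.
from redis import Redis
from rq import Queue
import tasks

def redis_test_job():
    q = Queue('process_queue', connection=Redis(), default_timeout=-1)
    queued = q.fetch_job('test_job')
    # only treat the job as "already there" if it is still waiting or running
    if queued and not queued.is_finished:
        print("test_job is still queued or running in process_queue")
        return False
    q.enqueue(tasks.print_job, job_id='test_job')
    print("Job is enqueued to process_queue!")
    return True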
I tried to create a Celery task that should run every minute, alongside a Redis server.
To start Redis I ran "redis-server".
To start Celery I ran "celery -A tasks worker --loglevel=info".
This is my tasks.py file
from celery import Celery
from celery.schedules import crontab
from celery.task import periodic_task

app = Celery('tasks', backend='redis://localhost', broker='redis://localhost')

@app.task
def add(x, y):
    return x + y

@periodic_task(run_every=(crontab(minute='1')), name="run_every_minute", ignore_result=True)
def run_every_minute():
    print("hehe")
    return "ok"
When I ran this in the Python console:
from tasks import run_every_minute
z = run_every_minute.delay()
I got the following output in the terminal running Celery:
[2019-06-05 01:35:02,591: INFO/MainProcess] Received task: run_every_minute[06498b4b-1d13-45af-b91c-fb10476e0aa3]
[2019-06-05 01:35:02,595: WARNING/Worker-2] hehe
[2019-06-05 01:35:02,599: INFO/MainProcess] Task run_every_minute[06498b4b-1d13-45af-b91c-fb10476e0aa3] succeeded in
0.004713802001788281s: 'ok'
But this should execute every minute, since it is a periodic task. How can I make that happen?
Also, how can we execute a Celery task at a specific time, say 5:30 GMT (for example)?
OK, based on the comments:
First, periodic_task needs the scheduler/beat to be started (see the Periodic Tasks docs); with it running, the scheduler will send the task according to the run_every parameter:
celery -A tasks beat
Next, if you need the task sent every minute, the crontab needs to look like this:
@periodic_task(run_every=(crontab(minute='*')), name="run_every_minute", ignore_result=True)
def run_every_minute():
    print("hehe")
    return "ok"
With minute='*' it will send the task every minute; minute='1' will send the task once every hour, at minute one.
Answering your last comment:
run_every=(crontab(minute='1'))
You have specified 'minute of hour' = 1, so celery beat runs your periodic task every hour at minute '1', e.g. 00:01, 01:01 and so on.
You should also set the hour attribute of your crontab, probably as a range.
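For the "run at 5:30 GMT" part of the question, a hedged sketch reusing the question's tasks.py names (app.conf.timezone is the Celery 4.x setting; older versions use CELERY_TIMEZONE, and beat still has to be running for the schedule to fire):
from celery import Celery
from celery.schedules import crontab
from celery.task import periodic_task

app = Celery('tasks', backend='redis://localhost', broker='redis://localhost')
app.conf.timezone = 'GMT'  # interpret crontab fields in GMT

@periodic_task(run_every=crontab(hour=5, minute=30), name="run_at_0530_gmt")
def run_at_0530_gmt():
    # sent by beat once a day at 05:30 GMT
    return "ok"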
I set up Celery with a Django app and Redis as the broker.
from celery import group
# (the question uses a bare @task decorator, presumably provided by the
# project's Celery setup; url and dis come from elsewhere and are not shown)

@task
def proc(product_id, url, did, did_name):
    ## some long operation here
    pass

@task
def Scraping(product_id, num=None):
    if num:
        num = int(num)  # this is so I can set how many subtasks run now
    res = group([proc.s(product_id, url, did, dis[did]) for did in dis.keys()[:num]])()
    result = res.get()
    return sum(result)
The first few subtasks run successfully, but later a worker disappears and new tasks remain in RECEIVED status, because the worker that should handle them no longer exists.
I set minimal concurrency and 2 workers in /etc/default/celeryd.
I monitor CPU and memory usage; no high load is detected.
There are no errors in the Celery logs!
What's wrong?
[2015-12-19 04:00:30,131: INFO/MainProcess] Task remains.tasks.proc[fd0ec29c-436f-4f60-a1b6-3785342ac173] succeeded in 20.045763085s: 6
[2015-12-19 04:17:28,895: INFO/MainProcess] missed heartbeat from w2@server.domain.com
[2015-12-19 04:17:28,897: DEBUG/MainProcess] w2@server.domain.com joined the party
[2015-12-19 05:11:44,057: INFO/MainProcess] missed heartbeat from w2@server.domain.com
[2015-12-19 05:11:44,058: DEBUG/MainProcess] w2@server.domain.com joined the party
SOLUTION:
If you use django-celery and want to run Celery as a daemon, do not use an app() instance (http://docs.celeryproject.org/en/latest/userguide/application.html). Instead, point /etc/default/celeryd directly at your project's manage.py:
CELERYD_MULTI="$CELERYD_CHDIR/manage.py celeryd_multi"
Do not disable heartbeats!
To run Celery directly through manage.py you also need to:
set CELERY_APP="" in /etc/default/celeryd, because if you don't, beat will build its run command with the old "app" argument;
add the line export DJANGO_SETTINGS_MODULE="your_app.settings" to the celeryd config if you are not using the default settings.
I'm trying to execute a periodic task using Celery to delete users who didn't activate their account in time. The screenshot below shows that the task is correctly discovered and executed, but when I check the database, no changes have been made.
The Celery task:
# tasks.py
from celery.task.schedules import crontab
from celery.decorators import periodic_task
from celery.utils.log import get_task_logger
from .utils import unconfirmed_users_delete

logger = get_task_logger(__name__)

# A periodic task that will run every minute (the symbol "*" means every)
@periodic_task(run_every=(crontab(hour="*", minute="*", day_of_week="*")))
def delete_unconfirmed_users():
    return unconfirmed_users_delete()
The queryset to execute (checked in the Django shell and working correctly):
# utils.py
from django.contrib.auth.models import User
from django.utils import timezone

def unconfirmed_users_delete():
    return User.objects.filter(is_active=False).filter(profile__key_expires__lt=timezone.now()).delete()
The task is correctly called every minute.
What could be wrong?
As @schillingt mentioned, most of the time we forget to (re)start the worker process for the periodic task.
This happens because there are two pieces: a beat scheduler, which schedules the tasks, and a worker, which executes them.
celery -A my_task beat # schedule tasks
celery worker -A my_task -l info # consume tasks
A much better solution is to have a single worker that both schedules and executes tasks. You can do that using:
celery worker -A my_task -l info --beat  # schedule & consume tasks
This schedules the periodic tasks and consumes them.
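For completeness, a hedged sketch of the same idea with an explicit beat_schedule entry (Celery 4.x style; the task path tasks.delete_unconfirmed_users and the broker URL are assumed): beat reads this schedule and sends the task, and a worker started with --beat, or a separate worker, executes it.
from celery import Celery
from celery.schedules import crontab

app = Celery('my_task', broker='redis://localhost')

app.conf.beat_schedule = {
    'delete-unconfirmed-users-every-minute': {
        'task': 'tasks.delete_unconfirmed_users',  # assumed task path
        'schedule': crontab(),  # a bare crontab() fires every minute
    },
}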
I tried using this code to dynamically add/remove scheduled tasks.
My tasks.py file looks like this:
from celery.decorators import task
import logging

log = logging.getLogger(__name__)

@task
def mytask():
    log.debug("Executing task")
    return
The problem is that the tasks do not actually execute (i.e. there is no log output), but I get the following messages in my Celery log file, exactly on schedule:
[2013-05-10 04:53:00,005: INFO/MainProcess] Got task from broker: cron.tasks.mytask[dfcf397b-e30b-45bd-9f5f-11a17a51b6c4]
[2013-05-10 04:54:00,007: INFO/MainProcess] Got task from broker: cron.tasks.mytask[f013b3cd-6a0f-4060-8bcc-3bb51ffaf092]
[2013-05-10 04:55:00,007: INFO/MainProcess] Got task from broker: cron.tasks.mytask[dfc0d563-ff4b-4132-955a-4293dd3a9ac7]
[2013-05-10 04:56:00,012: INFO/MainProcess] Got task from broker: cron.tasks.mytask[ba093535-0d70-4dc5-89e4-441b72cfb61f]
I can definitely confirm that the logger is configured correctly and working fine. If I call result = mytask.delay() in the interactive shell, result.state stays PENDING indefinitely.
EDIT: See also Django Celery Periodic Tasks Run But RabbitMQ Queues Aren't Consumed
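A hedged diagnostic sketch (not from the original post): ask the running workers which tasks they have registered, are executing, or have prefetched, to check whether cron.tasks.mytask is actually known to a consuming worker. The app name and broker URL below are assumptions.
from celery import Celery

app = Celery('cron', broker='amqp://localhost')  # assumed broker URL

inspector = app.control.inspect()
print(inspector.registered())  # per-worker list of registered task names
print(inspector.active())      # tasks each worker is currently executing
print(inspector.reserved())    # tasks prefetched by workers but not yet running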