I have four time-consuming tasks that should execute one after another, with the result of each task fed in as the input of the next one. So I chose a Celery chain, as in the following code:
mychain = chain(task1.s({'a': 1}), task2.s(), task3.s(), task4.s())
mychain.apply_async()
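For context, the expectation with chain is that each task's return value becomes the first argument of the next signature. A minimal sketch of that behaviour (the task bodies and broker URL here are hypothetical; only the chain call mirrors the code above):

from celery import Celery, chain

app = Celery('myapp', broker='redis://localhost:6379/0')  # broker URL is a placeholder

@app.task
def task1(params):
    # the first task receives the explicit argument {'a': 1}
    return {'step1': params}

@app.task
def task2(previous_result):
    # each later task receives the previous task's return value
    return {'step2': previous_result}

# task3 and task4 would follow the same pattern
chain(task1.s({'a': 1}), task2.s()).apply_async()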
But the execution order of the tasks is:
task1() ---> task4() ---> task3() ---> task2()
I don't know what happened.
I run a web server with Tornado, and it triggers the tasks through the chain.
logging:
[2018-07-23 18:34:12,816][pid:25557][tid:140228657469056][util.py:109] DEBUG: chain: fetch({}) | callback() | convert() | format()
The other tasks run in the Celery worker.
logging:
[2018-07-23 18:34:12,816: INFO/MainProcess] Received task: fetch[045acf81-274b-457c-8bb5-6d0248264b76]
[2018-07-23 18:34:17,786: INFO/MainProcess] Received task: format[103b4ffa-57db-4b04-a745-7dfee5786695]
[2018-07-23 18:34:18,227: INFO/MainProcess] Received task: convert[81ddbaf9-37b3-406a-b608-a05affa97f45]
[2018-07-23 18:34:20,942: INFO/MainProcess] Received task: callback[b1ea7c70-db45-4501-9859-7ad22532c38a]
The reason was that the Celery versions on the two machines were different!
Once we installed the same Celery version on both, it worked.
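For anyone hitting the same issue, a quick way to confirm that both machines run the same Celery version is to compare the reported version strings (a trivial check, shown here in Python):

import celery

# Run this on each machine: the two outputs should be identical.
print(celery.__version__)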
I want to send a chain of tasks at worker startup, like in this https://stackoverflow.com/a/14589445/3922534 answer, but the tasks run out of order.
Logs from worker
[2022-07-12 20:51:47,369: INFO/MainProcess] Task task.add_newspapers[5de1f446-65af-472a-a4b6-d9752142b588] received
[2022-07-12 20:51:47,372: WARNING/MainProcess] Now Runing Newspaper Function
[2022-07-12 20:51:47,408: INFO/MainProcess] Task task.check_tasks_are_created[33d6b9d1-660b-4a80-a726-6f167e246480] received
[2022-07-12 20:51:47,412: WARNING/MainProcess] Now Runing Podcast Function
[2022-07-12 20:51:47,427: INFO/MainProcess] Task task.add_newspapers[5de1f446-65af-472a-a4b6-d9752142b588] succeeded in 0.0470000000204891s: 'Now Runing Podcast Function'
[2022-07-12 20:51:47,432: INFO/MainProcess] Task task.add_yt_channels[26179491-2632-46bd-95c1-9e9dbb9e8130] received
[2022-07-12 20:51:47,433: WARNING/MainProcess] None
[2022-07-12 20:51:47,457: INFO/MainProcess] Task task.check_tasks_are_created[33d6b9d1-660b-4a80-a726-6f167e246480] succeeded in 0.0470000000204891s: None
[2022-07-12 20:51:47,463: INFO/MainProcess] Task task.add_podcasts[ad94a119-c6b2-475a-807b-b1a73bef589e] received
[2022-07-12 20:51:47,468: WARNING/MainProcess] Now Runing Check Tasks are Created Function
[2022-07-12 20:51:47,501: INFO/MainProcess] Task task.add_yt_channels[26179491-2632-46bd-95c1-9e9dbb9e8130] succeeded in 0.06299999984912574s: 'Now Runing Check Tasks are Created Function'
[2022-07-12 20:51:47,504: INFO/MainProcess] Task task.add_podcasts[ad94a119-c6b2-475a-807b-b1a73bef589e] succeeded in 0.030999999959021807s: 'Now Runing Yotube Channels Function'
Code showing how I send the task:
@worker_ready.connect
def at_start(sender, **k):
    with sender.app.connection() as conn:
        # sender.app.send_task(name='task.print_word', args=["I Send Task On Startup"], connection=conn,)
        # ch = [add_newspapers.s(), add_podcasts.s(), add_yt_channels.s(), check_tasks_are_created.s()]
        ch = [
            signature("task.add_podcasts"),
            signature("task.add_yt_channels"),
            signature("task.check_tasks_are_created"),
        ]
        sender.app.send_task(name='task.add_newspapers', chain=ch, connection=conn,)
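A hedged aside, not a confirmed diagnosis: the worker consumes the entries of the chain message field from the end of the list (they are popped last-to-first), so signatures passed through send_task(..., chain=...) execute in reverse of the listed order, which matches the log above. Under that assumption, one workaround is simply to reverse the list:

@worker_ready.connect
def at_start(sender, **k):
    with sender.app.connection() as conn:
        ch = [
            signature("task.add_podcasts"),
            signature("task.add_yt_channels"),
            signature("task.check_tasks_are_created"),
        ]
        # Reversed on the assumption that the chain entries are popped from
        # the end, so the last element of the list is the first to run.
        sender.app.send_task(name='task.add_newspapers', chain=ch[::-1], connection=conn)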
Then I tried to run the chain the way apply_async() is normally called, but it runs on every worker. I want it to run just once, on one worker:
@worker_ready.connect
def at_start(sender, **k):
    chain(add_newspapers.s(), add_podcasts.s(), add_yt_channels.s(), check_tasks_are_created.s()).apply_async()
Then I tried to recognize a specific worker and only then call .apply_async(), but it never enters the if statement.
Documentation https://docs.celeryq.dev/en/latest/userguide/signals.html#celeryd-init
celery -A celery_app.celery worker --loglevel=INFO -P gevent --concurrency=40 -n celeryworker1
@worker_ready.connect
def at_start(sender, **k):
    print("This is host name ", sender.hostname)
    if sender == "celery@celeryworker1":
        with sender.app.connection() as conn:
            chain(add_newspapers.s(), add_podcasts.s(), add_yt_channels.s(), check_tasks_are_created.s()).apply_async()
Am I doing something wrong or is it just a bug?
Since none of the tasks needs the return value of the previous one, you can run the chain as:
chain(add_newspapers.si(),add_podcasts.si(),add_yt_channels.si(),check_tasks_are_created.si()).apply_async()
(change the calls from s() to si())
You can read about immutability in the Celery documentation on signatures.
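For illustration, the difference between the two kinds of signature in a chain (same tasks as above):

from celery import chain

# .s()  -> mutable signature: the parent's return value is prepended to the arguments,
#          so add_podcasts would be called as add_podcasts(<result of add_newspapers>)
# .si() -> immutable signature: the parent's result is ignored and the task is called
#          only with the arguments given here (none in this case)
chain(add_newspapers.si(), add_podcasts.si(),
      add_yt_channels.si(), check_tasks_are_created.si()).apply_async()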
A @worker_ready.connect handler will run on every worker. So, if you have 10 workers, you will send the same task 10 times when they broadcast the worker_ready signal. Is this intentional?
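If it is not intentional, one possible guard is to dispatch only from a single, named worker. A sketch, assuming the node name produced by -n celeryworker1 and comparing sender.hostname rather than the sender object itself (as the earlier attempt did):

from celery import chain
from celery.signals import worker_ready
# assumes the tasks are importable from the project's task module

@worker_ready.connect
def at_start(sender, **k):
    # sender.hostname is the worker's node name, e.g. "celeryworker1@<machine>"
    # when the worker is started with -n celeryworker1.
    if sender.hostname.split("@")[0] == "celeryworker1":
        chain(add_newspapers.si(), add_podcasts.si(),
              add_yt_channels.si(), check_tasks_are_created.si()).apply_async()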
I use Django and Celery to schedule a task, but I have an issue with the logger because it doesn't propagate properly. As you can see in the code below, I have configured both the Python logging module and Celery's get_task_logger.
import logging
from celery import Celery
from celery.utils.log import get_task_logger

# Configure logging
logging.basicConfig(filename='example.log', level=logging.DEBUG)

# Create Celery application and Celery logger
app = Celery('capital')
logger = get_task_logger(__name__)

@app.task()
def candle_updated(d, f):
    logging.warning('This is a log')
    logger.info('This is another log')
    return d + f
I use the django-celery-beat extension to set up periodic tasks from the Django admin. This module stores the schedule in the Django database and provides a convenient admin interface to manage periodic tasks at runtime.
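As an aside, the same database-backed schedule can also be created programmatically through the django-celery-beat models. A sketch using the task path and the 6-second interval mentioned below (the args values are placeholders for candle_updated(d, f)):

import json
from django_celery_beat.models import IntervalSchedule, PeriodicTask

# A 6-second interval, matching the task frequency described in this question.
every_six_seconds, _ = IntervalSchedule.objects.get_or_create(
    every=6,
    period=IntervalSchedule.SECONDS,
)

PeriodicTask.objects.get_or_create(
    name='Candles update',
    task='marketsdata.tasks.candle_updated',
    interval=every_six_seconds,
    args=json.dumps([1, 2]),  # placeholder values for the d and f parameters
)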
As recommended in the documentation, I start the worker and the scheduler this way:
$ celery -A capital beat -l info --scheduler django_celery_beat.schedulers:DatabaseScheduler
celery beat v4.4.0 (cliffs) is starting.
__ - ... __ - _
LocalTime -> 2020-04-02 22:33:32
Configuration ->
. broker -> redis://localhost:6379//
. loader -> celery.loaders.app.AppLoader
. scheduler -> django_celery_beat.schedulers.DatabaseScheduler
. logfile -> [stderr]#%INFO
. maxinterval -> 5.00 seconds (5s)
[2020-04-02 22:33:32,630: INFO/MainProcess] beat: Starting...
[2020-04-02 22:33:32,631: INFO/MainProcess] Writing entries...
[2020-04-02 22:33:32,710: INFO/MainProcess] Scheduler: Sending due task Candles update (marketsdata.tasks.candle_updated)
[2020-04-02 22:33:32,729: INFO/MainProcess] Writing entries...
[2020-04-02 22:33:38,726: INFO/MainProcess] Scheduler: Sending due task Candles update (marketsdata.tasks.candle_updated)
[2020-04-02 22:33:44,751: INFO/MainProcess] Scheduler: Sending due task Candles update (marketsdata.tasks.candle_updated)
Everything seems to run fine: there is output in the console every 6 seconds (the frequency of the periodic task), so the task appears to be executed in the background, but I can't verify it. The problem is that the file example.log stays empty. What could be the reason?
Did you start a worker node as well? beat is just the scheduler; you have to run a worker too:
celery -A capital worker -l info
A task runs once, and long Celery tasks (5-6 hours long) start to duplicate themselves approximately every hour, up to 4 copies (the concurrency parameter).
Logs:
[2016-08-19 07:43:08,505: INFO/MainProcess] Received task: doit[ed09d5fd-ba07-4cd5-96eb-7ae546bf94db]
[2016-08-19 07:45:44,067: INFO/MainProcess] Received task: doit[7cbc4633-0687-499f-876c-3298ffdf90f9]
[2016-08-19 08:41:16,611: INFO/MainProcess] Received task: doit[ed09d5fd-ba07-4cd5-96eb-7ae546bf94db]
[2016-08-19 08:48:36,623: INFO/MainProcess] Received task: doit[7cbc4633-0687-499f-876c-3298ffdf90f9]
Task code
@task()
def doit(company_id, cls):
    p = cls.objects.get(id=company_id)
The Celery worker is started with --concurrency=4 -Ofair; the broker is Redis 3.0.5.
Python package versions:
Django==1.8.14
celery==3.1.18
redis==2.10.3
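A hedged aside rather than a confirmed diagnosis: with the Redis broker, unacknowledged messages are re-delivered once the visibility_timeout transport option expires, and its default is one hour, which lines up with duplicates appearing roughly every hour for tasks that run 5-6 hours. If that is the cause here, the usual mitigation is to raise the timeout above the longest expected runtime, e.g. in the Celery 3.x settings:

# settings.py (Celery 3.x style, matching celery==3.1.18)
BROKER_TRANSPORT_OPTIONS = {
    'visibility_timeout': 8 * 3600,  # seconds: longer than the longest task (5-6 hours)
}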
I set up Celery with a Django app and Redis as the broker.
@task
def proc(product_id, url, did, did_name):
    ## some long operation here
    pass

@task
def Scraping(product_id, num=None):
    if num:
        num = int(num)  ## so I can set how many subtasks to run now
    res = group([proc.s(product_id, url, did, dis[did]) for did in dis.keys()[:num]])()
    result = res.get()
    return sum(result)
The first few subtasks run successfully, but later one of the workers disappears and new tasks stay in RECEIVED status, because the worker that is supposed to handle them no longer exists.
I set minimal concurrency and 2 workers in /etc/default/celeryd.
I monitor CPU and memory usage; no high load is detected.
There are no errors in the Celery logs!
What's wrong?
[2015-12-19 04:00:30,131: INFO/MainProcess] Task remains.tasks.proc[fd0ec29c-436f-4f60-a1b6-3785342ac173] succeeded in 20.045763085s: 6
[2015-12-19 04:17:28,895: INFO/MainProcess] missed heartbeat from w2@server.domain.com
[2015-12-19 04:17:28,897: DEBUG/MainProcess] w2@server.domain.com joined the party
[2015-12-19 05:11:44,057: INFO/MainProcess] missed heartbeat from w2@server.domain.com
[2015-12-19 05:11:44,058: DEBUG/MainProcess] w2@server.domain.com joined the party
SOLUTION:
If you use django-celery and want to run Celery as a daemon, don't point it at an app() instance (http://docs.celeryproject.org/en/latest/userguide/application.html); instead, configure /etc/default/celeryd to go directly through your project's manage.py, like this: CELERYD_MULTI="$CELERYD_CHDIR/manage.py celeryd_multi"
Do not disable heartbeats!
To use Celery directly through manage.py you also need to:
1. set CELERY_APP="" in /etc/default/celeryd, because otherwise beat will build its run command with the old "app" argument;
2. add the line export DJANGO_SETTINGS_MODULE="your_app.settings" to the celeryd config if you don't use the default settings.
Both settings are collected in the sketch below.
I tried using this code to dynamically add/remove scheduled tasks.
My tasks.py file looks like this:
from celery.decorators import task
import logging

log = logging.getLogger(__name__)

@task
def mytask():
    log.debug("Executing task")
    return
The problem is that the tasks do not actually execute (i.e. there is no log output), but I get the following messages in my celery log file, exactly on schedule:
[2013-05-10 04:53:00,005: INFO/MainProcess] Got task from broker: cron.tasks.mytask[dfcf397b-e30b-45bd-9f5f-11a17a51b6c4]
[2013-05-10 04:54:00,007: INFO/MainProcess] Got task from broker: cron.tasks.mytask[f013b3cd-6a0f-4060-8bcc-3bb51ffaf092]
[2013-05-10 04:55:00,007: INFO/MainProcess] Got task from broker: cron.tasks.mytask[dfc0d563-ff4b-4132-955a-4293dd3a9ac7]
[2013-05-10 04:56:00,012: INFO/MainProcess] Got task from broker: cron.tasks.mytask[ba093535-0d70-4dc5-89e4-441b72cfb61f]
I can definitely confirm that the logger is configured correctly and working fine. If I call result = mytask.delay() in the interactive shell, result.state stays PENDING indefinitely.
EDIT: See also Django Celery Periodic Tasks Run But RabbitMQ Queues Aren't Consumed