Celery apply_async task gets called multiple times

I have created a task:
@app.task(bind=True, max_retries=1)
def notify_feedback(self, req_id):
    # some things
    pass
I call this task from my view with a delay of one hour, like this:
later = datetime.datetime.utcnow() + datetime.timedelta(hours=1)
notify_feedback.apply_async((req_id,), eta=later)
When I checked SQS, Messages in Flight showed a count of 1 pending.
After one hour, notify_feedback gets called multiple times. Has anyone encountered this kind of issue with Celery?
Celery 4.1.0 is used.

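I faced this issue as well, but with a task delayed by more than one hour.
Setting this in settings.py solved it for me:
BROKER_TRANSPORT_OPTIONS = {'visibility_timeout': 86400}
The visibility timeout defines the number of seconds to wait for the worker to acknowledge the task before the message is redelivered to another worker. A task with an ETA is held by the worker, unacknowledged, until it runs, so an ETA longer than the visibility timeout leads to redelivery.
More details are in the Celery documentation on the broker visibility timeout.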
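As a rough illustration of the rule above, the timeout can be derived from the longest ETA you expect to schedule. The helper names `visibility_timeout_for` and `will_be_redelivered` and the 10-minute margin are assumptions made for this sketch; they are not part of Celery.

```python
import datetime


def visibility_timeout_for(longest_eta, margin=datetime.timedelta(minutes=10)):
    """Return a visibility timeout in seconds that outlasts the longest ETA."""
    return int((longest_eta + margin).total_seconds())


def will_be_redelivered(eta_delay, visibility_timeout):
    """True if a task held this long would pass the timeout unacknowledged."""
    return eta_delay.total_seconds() >= visibility_timeout


# The question schedules the task one hour ahead.
one_hour = datetime.timedelta(hours=1)
BROKER_TRANSPORT_OPTIONS = {"visibility_timeout": visibility_timeout_for(one_hour)}
```

A fixed large value such as 86400 works just as well; the point is only that the timeout must exceed the longest delay.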

Related

Celery Django: running periodic tasks after the previous one is done [django-celery-beat]

I want to use the django-celery-beat library to make some changes in my database periodically. I set the task to run every 10 minutes. Everything works fine as long as the task takes less than 10 minutes; if it lasts longer, the next task starts while the first one is still doing its calculations, and that causes an error.
My task looks like this:
from celery import shared_task
from .utils.database_blockchain import BlockchainVerify
@shared_task()
def run_function():
    build_block = BlockchainVerify()
    return "Database updated"
Is there a way to avoid starting the same task if the previous one isn't done?

There definitely is a way: locking.
There is a whole page in the Celery documentation on this: Ensuring a task is only executed one at a time.
In short, you can use a cache or even a database to hold a lock, and every time a task starts, check whether the lock is still held or has already been released.
Be aware that the task may fail or run longer than expected. Task failure can be handled by giving the lock an expiration time. Set that expiration long enough to cover the case where the task is still running.
There is already a good thread on SO - link.
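A minimal sketch of the locking idea, assuming a cache with an atomic "add if absent" operation. `InMemoryCache`, `task_lock` and `LOCK_EXPIRE` are hypothetical names for this example; the in-memory class only stands in for a shared cache such as Redis or memcached and does not protect across processes.

```python
import time
from contextlib import contextmanager


class InMemoryCache:
    """Stand-in for a shared cache; single-process only, for illustration."""

    def __init__(self):
        self._store = {}

    def add(self, key, value, timeout):
        """Set key only if it is absent or expired; return True on success."""
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and entry[1] > now:
            return False
        self._store[key] = (value, now + timeout)
        return True

    def delete(self, key):
        self._store.pop(key, None)


cache = InMemoryCache()
LOCK_EXPIRE = 60 * 15  # seconds; should exceed the longest expected run


@contextmanager
def task_lock(lock_id, owner):
    """Try to take the lock; release it on exit only if we took it."""
    acquired = cache.add(lock_id, owner, LOCK_EXPIRE)
    try:
        yield acquired
    finally:
        if acquired:
            cache.delete(lock_id)


def run_function():
    with task_lock("run_function-lock", "worker-1") as acquired:
        if not acquired:
            return "Skipped: previous run still in progress"
        # the real work, e.g. BlockchainVerify(), would go here
        return "Database updated"
```

If the worker dies while holding the lock, the expiration is what eventually frees it, which is why the timeout has to be chosen with the task's worst-case duration in mind.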

Is this the right approach to run long running async tasks?

I am trying to come up with a notification service for a list of events. The event data is available in the database and gets updated every few minutes by some mechanism. Two minutes before the next event, I need to read this database and send the data to my subscribers as a reminder that the event is about to start. These times are not fixed; they depend on the start time of the next event.
Right now I am creating a Celery worker for every user who subscribes. I make that worker sleep until the next event, at which point it resumes and sends out the message.
Something like this:
nextEventDelay = events.getTimeToNextEventInSeconds()
sleep(nextEventDelay)
SendEventNotification()
But I know this is not good. It works for one or two people, but spawning 1000 workers for 1000 users will not scale.
So my solution? I am thinking of creating a single worker process that monitors the database for subscribers and, once a notification is due, reads from the database and sends it to them. But this takes care of only one event. Should I keep this in an infinite loop to notify about the next event?
I am using Celery for async task management with Redis. The application is a Python Flask application. Let me know if you need any more info. Thanks.

Using Celery beat, you could run a job every x seconds to check whether any events are within two minutes of starting. You could then trigger your 'reminder' jobs from that task.
Here is the documentation for periodic celery tasks.
http://docs.celeryproject.org/en/latest/userguide/periodic-tasks.html
I would suggest you stay far away from long running celery tasks as I have not had a great experience with them.
Here is some untested pseudo code to get you started.
from celery import Celery
app = Celery()
@app.on_after_configure.connect
def setup_periodic_tasks(sender, **kwargs):
    # check for upcoming events every 20 seconds
    sender.add_periodic_task(20.0, trigger_reminders.s(), name='check for upcoming events')
@app.task
def trigger_reminders(*args, **kwargs):
    # get_upcoming_events() is a placeholder for your own database query
    upcoming_events = get_upcoming_events()
    for event in upcoming_events:
        # queue one notification task per upcoming event
        send_notification.delay(event)
@app.task
def send_notification(*args, **kwargs):
    # send the user notification here
    pass
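One detail the pseudo code leaves open is how to pick the events that are due. With a 20-second poll and a two-minute window, the same event would match several polls in a row, so the selection also has to remember what was already sent. The sketch below is one way to do that; `select_due_events`, the event dictionaries and the `notified` set are assumptions for illustration (in practice the "already notified" flag would live in the database).

```python
import datetime

REMINDER_WINDOW = datetime.timedelta(minutes=2)


def select_due_events(events, now, already_notified):
    """Return events starting within the window that were not notified yet."""
    due = []
    for event in events:
        time_left = event["starts_at"] - now
        in_window = datetime.timedelta(0) <= time_left <= REMINDER_WINDOW
        if in_window and event["id"] not in already_notified:
            due.append(event)
            already_notified.add(event["id"])  # remember it for later polls
    return due


now = datetime.datetime(2024, 1, 1, 12, 0, 0)
events = [
    {"id": 1, "starts_at": now + datetime.timedelta(seconds=90)},
    {"id": 2, "starts_at": now + datetime.timedelta(minutes=10)},
]
notified = set()
first_pass = select_due_events(events, now, notified)
second_pass = select_due_events(events, now + datetime.timedelta(seconds=20), notified)
```

Here the first poll picks up event 1 only, and the second poll 20 seconds later returns nothing because event 1 was already recorded.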

How to retry a celery task without duplicating it - SQS

I have a Celery task that takes a message from an SQS queue and tries to run it. If it fails, it is supposed to retry every 10 minutes, up to 144 times. What I think is happening is that the failed message goes back into the queue and, at the same time, a new one is created, so there are now 2. These 2 fail again and follow the same pattern, creating 2 more for a total of 4 messages. So if I let it run for some time, the queue gets clogged.
What I am not getting is the proper way to retry without duplicating. Below is the code that retries. Please see if someone can guide me here.
from celery import shared_task
from celery.exceptions import MaxRetriesExceededError
@shared_task
def send_br_update(bgc_id, xref_id, user_id, event):
    from myapp.models.mappings import BGC
    try:
        bgc = BGC.objects.get(pk=bgc_id)
        return bgc.send_br_update(user_id, event)
    except BGC.DoesNotExist:
        # nothing to update if the object is gone
        pass
    except MaxRetriesExceededError:
        # give up silently once all retries are used
        pass
    except Exception as exc:
        # retry every 10 minutes (600 s), up to 144 times, i.e. for about 24 hours
        raise send_br_update.retry(exc=exc, countdown=600, max_retries=144)
Update:
More explanation of the issue...
A user creates an object in my database. Other users act upon that object, and as they change its state, my code emits signals. The signal handler then initiates a Celery task, which means that it connects to the desired SQS queue and submits a message to the queue. The Celery server running the workers sees the new message and tries to execute the task. This is where it fails and the retry logic comes in.
According to the Celery documentation, to retry a task all we need to do is raise self.retry() with countdown and/or max_retries. If a Celery task raises an exception, it is considered failed. I am not sure how SQS handles this. All I know is that one task fails and there are two in the queue, both of these fail and then there are 4 in the queue, and so on...

This is neither a Celery nor an SQS issue.
The real issue is the workflow, i.e. the way you send messages to the MQ service and handle them, which causes the duplication. You would face the same problem with any other MQ service.
Imagine your flow:
1. Script: reads the task message. MQ message: locked for 30 seconds.
2. Script: the task fails. MQ message: the lock times out, and the message is free to be grabbed again.
3. Script: creates another task message.
4. Script: repeats step 1. MQ message: there are now 2 messages with the same task, so step 1 launches 2 tasks.
So if the task keeps failing, the messages keep multiplying: 2, 4, 8, 16, 32...
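The growth described above can be checked with a few lines of arithmetic. This is only a toy model of the flow, under the assumption that every message fails in every round; `messages_in_queue` is a name made up for this sketch.

```python
def messages_in_queue(rounds):
    """Count messages after each failing round, starting from one message."""
    counts = [1]
    for _ in range(rounds):
        in_queue = counts[-1]
        redelivered = in_queue  # originals reappear when their lock times out
        retries = in_queue      # each failure also submits a new retry message
        counts.append(redelivered + retries)
    return counts


growth = messages_in_queue(5)
```

After five failing rounds a single message has become 32, which matches the 2, 4, 8, 16, 32 sequence.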
If the Celery script is meant to "recreate the failed task and send it to the message queue", you want to make sure these messages can only be read ONCE. **You MUST discard the task message after it has been read one time, even if the task failed.**
There are at least 2 ways to do this; choose one.
1. Delete the message before recreating the task. OR
2. In SQS, you can enforce this by creating a dead-letter queue, configuring the Redrive Policy, and setting Maximum Receives to 1. This makes sure a message whose task has already been read is never recycled.
You may prefer method 2, because method 1 requires you to configure Celery to "consume" (read and delete) the message as soon as it reads it, which is not very practical (and you must make sure you delete it before creating a new message for the failed task).
The dead-letter queue also lets you check whether Celery crashed: a message that has been read once but not consumed (deleted) means the program stopped somewhere.
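For method 2, the redrive policy is stored on the source queue as a JSON string in the RedrivePolicy attribute. The sketch below only builds that attribute; the helper name `redrive_attributes` and the ARN are hypothetical, and the dead-letter queue itself is assumed to exist already.

```python
import json


def redrive_attributes(dead_letter_queue_arn, max_receives=1):
    """Build the queue attribute that sends a message to the DLQ after max_receives reads."""
    policy = {
        "deadLetterTargetArn": dead_letter_queue_arn,
        "maxReceiveCount": str(max_receives),
    }
    return {"RedrivePolicy": json.dumps(policy)}


# Hypothetical ARN, for illustration only.
attributes = redrive_attributes("arn:aws:sqs:us-east-1:123456789012:celery-dead-letter")
# With boto3, this dict could then be applied with
# sqs_client.set_queue_attributes(QueueUrl=queue_url, Attributes=attributes)
```

The same setting can be made in the SQS console; the code form is mainly useful if the queues are created by a deployment script.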

This is probably a little late, but I have written a backoff policy for Celery + SQS as a patch.
You can see how it is implemented in this repository
https://github.com/galCohen88/celery_sqs_retry_policy/blob/master/svc/celery.py

Celery's expires option doesn't work

I'm playing around with Celery, and I'm trying to set up a periodic task with CELERYBEAT_SCHEDULE. Here is my configuration:
from datetime import timedelta
CELERY_TIMEZONE = 'Europe/Kiev'
CELERYBEAT_SCHEDULE = {
    'run-task-every-5-seconds': {
        'task': 'tasks.run_every_five_seconds',
        'schedule': timedelta(seconds=5),
        'options': {
            'expires': 10,
        }
    },
}
# the task
@app.task()
def run_every_five_seconds():
    return '5 seconds passed'
When running beat with celery -A celery_app beat, the task doesn't seem to expire. Then I read that there might be an issue with beat, so that it does not take the expires option into account.
So I tried a task that gets called manually:
@app.task()
def print_hello():
    while True:
        print(datetime.datetime.now())
        sleep(1)
I am calling the task this way:
print_hello.apply_async(args=[], expires=5)
The worker's console tells me that my task will expire, but it doesn't expire either. It keeps executing indefinitely.
Received task: tasks.print_hello[05ee0175-cf3a-492b-9601-1450eaaf8ef7] expires:[2016-01-15 00:08:03.707062+02:00]
Is there something I am doing wrong?

I think you have misunderstood the expires argument.
The documentation says: "The task will not be executed after the expiration time." ref. It means that execution will not start if the expiration time has passed. If execution has already started, it will run to completion.
Your configuration adds a task to the task queue every 5 seconds. If execution does not start within 10 seconds of the task being added to the queue, the task is discarded. However, the task is executed immediately because there is a free Celery worker available.
Your code example adds a task that is discarded if the execution is not started in 5 seconds.
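To get the functionality you want, you can replace 'expires': 10, with 'expires': datetime.datetime.now() + timedelta(seconds=10),. That sets expires to an absolute time. Note that this expression is evaluated once, when the configuration is loaded, so the deadline is fixed at 10 seconds after startup rather than recomputed for each run.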

To add to the previous answer, the purpose of the expires parameter is captured at: https://github.com/celery/celery/issues/591
Let me explain with an example.
Let's say you schedule a task to be executed every 5 minutes, so Celery beat adds the task to the task queue every 5 minutes. Now, if the worker is down for some reason, it does not pick up any tasks from the queue. The queue grows over time with many repetitive tasks. As soon as the worker starts, it has a huge backlog and wastes time doing the old tasks.
Solution? The expires parameter.
Each task now has, let's say, a 1-minute expiry time. When the worker is online again, it discards all old tasks that have expired and only works on the latest unexpired task. Thanks to this, the worker doesn't waste time on old repetitive tasks.
Best Practice
When you don't know what to set the expires time to, it's a good idea to set it equal to the schedule interval.
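The backlog example can be put into numbers with a small simulation. This is not Celery code, just a model of the decision a worker makes when it comes back; `tasks_to_run` and the timings are assumptions chosen for the example.

```python
def tasks_to_run(enqueued_at, expires_after, worker_back_at):
    """Split queued tasks into those a returning worker runs and those it discards."""
    run, discarded = [], []
    for sent in enqueued_at:
        if worker_back_at <= sent + expires_after:
            run.append(sent)        # still within its expiry window
        else:
            discarded.append(sent)  # expired while the worker was offline
    return run, discarded


# Beat enqueues every 300 s; the worker is offline until t = 1810 s.
enqueued_at = list(range(0, 1801, 300))  # 0, 300, ..., 1800
run, discarded = tasks_to_run(enqueued_at, expires_after=60, worker_back_at=1810)
```

With a 60-second expiry, only the task enqueued at t = 1800 is executed and the six older ones are dropped; without expires, all seven would run back to back.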

Same task executed multiple times

I have ETA tasks that get sent to a Redis broker for Celery. It is a single Celery and Redis instance, both on the same machine.
The problem is that tasks are getting executed multiple times. I've seen tasks executed 4 to 11 times.
I set the visibility timeout to 12 hours, given that my ETAs are between 4 and 11 hours (determined at runtime):
BROKER_TRANSPORT_OPTIONS = {'visibility_timeout': 12 * 60 * 60}
Even with that, tasks still get executed multiple times.
Initially, the task in question was not idempotent, so I tried adding a DB check to make it idempotent.
It looks something like this:
@app.task
def foo(side_effect_action):
    if side_effect_action.executed:
        return ALREADY_EXECUTED
    else:
        do_side_effect()
        side_effect_action.executed = True
        side_effect_action.save()  # hits the db
        return JUST_EXECUTED
It turns out that the Celery worker gets to the task again before foo is able to call side_effect_action.save() and save the state, so in every case side_effect_action.executed is still False when it is checked, and the side effect gets executed multiple times.
Any ideas on how I can solve this issue?

I switched my Celery broker to RabbitMQ to avoid this issue. It is unfortunate, since I now have one more component in my webapp (I still need Redis for something else), but it did solve the bug where ETA tasks were executed multiple times.
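For anyone who has to stay on Redis, the race in the question comes from the check and the save being two separate steps. One possible way to close that gap, which is not from this thread, is to claim the row with a single conditional UPDATE before doing the work. The sketch uses sqlite3 and a made-up table so that it runs on its own; the table name, `calls` list and string return values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE side_effect_action "
    "(id INTEGER PRIMARY KEY, executed INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO side_effect_action (id) VALUES (1)")
conn.commit()

calls = []  # records how often the side effect actually ran


def do_side_effect(action_id):
    calls.append(action_id)


def foo(action_id):
    # Claim the row first: only one caller can flip executed from 0 to 1.
    cursor = conn.execute(
        "UPDATE side_effect_action SET executed = 1 WHERE id = ? AND executed = 0",
        (action_id,),
    )
    conn.commit()
    if cursor.rowcount == 0:
        return "ALREADY_EXECUTED"
    do_side_effect(action_id)
    return "JUST_EXECUTED"


results = [foo(1), foo(1), foo(1)]
```

The trade-off is that the row is marked as executed before the side effect runs, so a crash in between leaves the action claimed but not done; whether that is acceptable depends on the task.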
