Flask + Celery task duplications on 3rd party notifications

Flask + Celery task duplications on 3rd party notifications - python

I have a flask app which sends emails/SMSs to users at a specific time using the ETA/Countdown celery functions with Redis as a broker, The issue is the emails & SMS tasks duplicate randomly - sometimes users get 10 emails/SMSs sometimes users get 20+ for these tasks and the task is only supposed to run once off. The data flow:
Initial function schedule_event_main calls the ETA tasks with the notifications
date_event = datetime.combine(day, time.max)
schedule_ratings_email.apply_async([str(event[0])], eta=date_event)
schedule_ratings_sms.apply_async([str(event[0])], eta=date_event)
Inside function schedule_ratings_email & schedule_ratings_sms task is the .delay task function which creates the individual celery tasks to send out the emails + SMSs to the various guests for an event.
#app.task(bind=True)
def schedule_ratings_email(self,event_id):
""" Fetch feed of URLs to crawl and queue up a task to grab and process
each url. """
try:
url = SITE_URL + 'user/dashboard'
guests = db.session.query(EventGuest).filter(EventGuest.event_id == int(event_id)).all()
event_details = db.session.query(Event).filter(Event.id == event_id).first()
if guests:
if event_details.status == "archived":
for guest in guests:
schedule_individual_ratings_emails.delay(guest.user.first_name, guest.event.host.first_name, guest.user.email,url)
except Exception as e:
log.error("Error processing ratings email for %s" % event_id, exc_info=e)
# self.retry()
This is the final .delay individual task for sending the notifications:
#app.task()
def schedule_individual_ratings_emails(guest_name, host, guest, url):
try:
email_rating(guest_name, host, guest, url)
except Exception as e:
log.error("Error processing ratings email for %s", exc_info=e)
I've tried multiple SO answers and tweaked a lot of variables including celery settings however the notifications are still duplicating. It's only the ETA/Countdown tasks and ONLY with 3rd party servers as I have certain ETA tasks which have DB data writing and those tasks don't have any issues.
This is both an issue on local and heroku (production). Current tech stack:
Flask==1.0.2
celery==4.1.0
Redis 4.0.9
Celery startup: worker: celery worker --app openseat.tasks --beat --concurrency 1 --loglevel info
Celery config details:
CELERY_ACKS_LATE = True
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TIMEZONE = 'Africa/Johannesburg'
CELERY_ENABLE_UTC = True

Related

Celery send task but worker sometimes doesn't received it

Basically, everything works as expected, but occasionally the worker does not receive the sent task. There is no the sent task in flower and no logs in worker logs, but we know that task was sent because we log task id.
We are usin Redis for Backend and Broker and have multiple celery workers with the following configuration:
import logging
from config import (
CELERY_BROKER_URL,
CELERY_RESULT_BACKEND,
)
worker_concurrency = 1
worker_redirect_stdouts_level = logging.INFO
worker_prefetch_multiplier = 1
broker_url = CELERY_BROKER_URL
result_backend = CELERY_RESULT_BACKEND
result_backend_transport_options = {
# Reschedule tasks if they are not processed in 3 hours.
"visibility_timeout": 60 * 60 * 3,
# Enable priorities
"queue_order_strategy": "priority",
}
Sending a task:
def process(self, message):
config = get_configs()
link_result = signature(...)
fallback = signature(...)
link_error = create_link_error(...)
result = self._app.send_task(
TASK_NAME,
args=(
message.session_id,
message.chunk_id,
message.audio_resource_uri,
message.start_offset_sec,
config,
),
priority=5, # Medium redis priority
link=link_result,
link_error=link_error,
)
logger.info(f"Executed task {TASK_NAME} with id: {result.task_id}")
Has anyone experienced similar behavior?

Celery consumer does not receive messages from SQS queue on LocalStack

I have a SQS queue on a LocalStack server and I'm trying to consume messages from it with a Celery consumer.
It seams that the consumer is properly attached to the queue, for example the queue sqs-test-queue, but it does not receive any message when I try to send one with aws command.
My celeryconfig.py looks like this:
from kombu import (
Exchange,
Queue
)
broker_transport_options = {'region': REGION}
broker_transport = 'sqs'
accept_content = ['application/json']
result_serializer = 'json'
content_encoding = 'utf-8'
task_serializer = 'json'
worker_enable_remote_control = False
worker_send_task_events = False
result_backend = None
task_queues = (
Queue('sqs-test-queue', exchange=Exchange(''), routing_key='sqs-test-queue'),
)
and my tasks.py module looks like this:
from celery import Celery
from kombu.utils.url import quote
AWS_ACCESS_KEY = quote("AWS_ACCESS_KEY")
AWS_SECRET_KEY = quote("AWS_SECRET_KEY")
LOCALSTACK = "<IP>:<PORT>"
broker_url = "sqs://{access}:{secret}#{host}".format(access=AWS_ACCESS_KEY,
secret=AWS_SECRET_KEY,
host=LOCALSTACK)
app = Celery('tasks', broker=broker_url, backend=None)
app.config_from_object('celeryconfig')
#app.task(bind=True, name='tasks.consume', acks_late=True, ignore_result=True)
def consume(self, msg):
# DO SOMETHING WITH THE RECEIVED MESSAGE
return True
Tried to execute it with celery -A tasks worker -l INFO -Q sqs-test-queue and everything seams OK:
...
[tasks]
. tasks.consume
[... INFO/MainProcess] Connected to sqs://AWS_ACCESS_KEY:**#<IP>:<PORT>//
[... INFO/MainProcess] celery#local ready
but when I try to send a message with aws sqs send-message --endpoint-url=http://<IP>:<PORT> --queue-url=http://localhost:<PORT>/queue/sqs-test-queue --message-body="Test message", nothing happens.
What am I doing wrong? Have I missed something in the configuration maybe?
PS: If I try to run the command aws sqs receive-message --endpoint-url=http://<IP>:<PORT> --queue-url=http://localhost:<PORT>/queue/sqs-test-queue, I'm able to get the message.
NOTE:
I'm using Python 3.7.0 and my pip freeze looks like this:
boto3==1.10.16
botocore==1.13.16
celery==4.3.0
kombu==4.6.6
pycurl==7.43.0.3
...

I am going through the same thing as you. To fix it I did a couple of things:
I set the HOSTNAME_EXTERNAL and HOSTNAME env variables in localstack
Set broker_url to sqs://{access}:{secret}#{host}:{port} (as you have it)
Make sure that the celery worker's broker_transport_options does not include the config item: wait_time_seconds since this causes errors with localstack as of February 7th, 2020 (check issue here).
Once I did those two things, it started working, hope it helps.

Celery can't publish or consume arbitrary messages to/from any message queue system. Use kombu for that - that is what Celery uses behind the scenes too.

Celery worker doesn't takes updated values from the database

I am building a webapp using flask and using celery to send mails periodically.
The problem is, whenever the there is a new entry in the database, celery doesn't sees it and continues to use the old entries. I have to restart the celery worker each time to make it work properly. Celery beat is running and I am using redis as a broker.
Celery related functions:
#celery.on_after_configure.connect
def setup_periodic_tasks(sender, **kwargs):
sender.add_periodic_task(30.0, appointment_checkout, name='appointment_checkout')
#celery.task(name='app.Blueprints.periods.appointment_checkout')
def appointment_checkout():
from app.Blueprints.api.routes import fetchAllAppointments, fetch_user_email, fetch_user_phone
from app.Blueprints import db
dt = datetime.now() + timedelta(minutes=10)
#fa = Appointment.query.filter_by(date = dt.strftime("%Y-%m-%d"))
fa = fetchAllAppointments()
for i in fa:
# send emails to clients and counsellors
try:
if(str(i.date.year) != dt.strftime("%Y") or str(i.date.month) != dt.strftime("%m") or str(i.date.day) != dt.strftime("%d")):
continue
except:
continue
if(i.reminderFlag == 1):
continue
if(int(dt.strftime("%H")) == int(i.time.hour) and int(dt.strftime("%M")) == int(i.time.minute)):
client = fetch_user_email(i.user)
counsellor = fetch_user_email(i.counsellor)
client_phone = fetch_user_phone(i.user)
counsellor_phone = fetch_user_phone(i.counsellor)
i.reminderFlag = 1
db.session.add(i)
db.session.commit()
# client email
subject = "appointment notification"
msg = "<h1>Greetings</h1></p>This is to notify you that your appointment is about to begin soon.</p>"
sendmail.delay(subject, msg, client)
sendmail.delay(subject, msg, counsellor)
sendmsg.delay(client_phone, msg)
sendmsg.delay(counsellor_phone, msg)
When I add something in the appointment table, celery doesn't sees the new entry. After restarting the celery worker it sees it.
I am running beat and worker using the following commands:
celery -A periods beat --loglevel=INFO
celery -A periods worker --loglevel=INFO --concurrency=2

Celery, RabbitMQ messages keep getting sent

Here is the setup - django project with celery and a CloudAMQP rabbitMQ worker doing the message brokering.
My Celery/RabbitMQ settings:
# RabbitMQ & Celery settings
BROKER_URL = 'ampq://guest:guest#localhost:5672/' # Understandably fake
BROKER_POOL_LIMIT = 1
BROKER_CONNECTION_TIMEOUT = 30
BROKER_HEARTBEAT = 30
CELERY_SEND_EVENTS = False
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
A docker container running celery with the following command:
bash -c 'cd django && celery -A pkm_main worker -E -l info --concurrency=3'
The shared_task definition:
from __future__ import absolute_import
from celery import shared_task
#shared_task
def push_notification(user_id, message):
logging.critical('Push notifications sent')
return {'status': 'success'}
And me actually calling it when something happens (I have omitted some of the code because it does not seem to be relevant):
from notificatons.tasks import push_notification
def like_this(self, **args):
# Do like stuff and then do .delay()
push_notification.delay(media.user.id, request.user.username + ' has liked your item')
So when this is ran - everything seems fine and dandy - the output looks like so:
worker_1 | [2016-03-25 09:03:34,888: INFO/MainProcess] Received task: notifications.tasks.push_notification[8443bd88-fa02-4ea4-9bff-8fbec8c91516]
worker_1 | [2016-03-25 09:03:35,333: CRITICAL/Worker-1] Push notifications sent
worker_1 | [2016-03-25 09:03:35,336: INFO/MainProcess] Task notifications.tasks.push_notification[8443bd88-fa02-4ea4-9bff-8fbec8c91516] succeeded in 0.444933412999s: {'status': 'success'}
So from what I gather the task has been ran and executed properly, the messages should be stopped and RabbitMQ should stop.
But in my RabbitMQ Management I see messages getting published and delivered non-stop:
So what I'm gathering from this is that RabbitMQ is trying to send some sort of confirmation and failing and retrying? Is there a way to actually turn this behavior off?
All help and advice is warmly welcomed.
EDIT: Forgot to mentions something important - until I call on push_notification.delay() the message tab is empty save for the heartbeat that comes and goes every 30 seconds. Only after I have called .delay() does this happen.
EDIT 2: CELERYBEAT_SCHEDULE settings (I've tried running with and without them - there was no difference but adding them just in case)
CELERYBEAT_SCHEDULE = {
"minutely_process_all_notifications": {
'task': 'transmissions.tasks.process_all_notifications',
'schedule': crontab(minute='*')
}
}
EDIT 3: Added View code. Also I'm not using the CELERYBEAT_SCHEDULE. I'm just keeping the config in the code for future scheduled tasks
from notifications.tasks import push_notification
class MediaLikesView(BaseView):
def post(self, request, media_id):
media = self.get_object(media_id)
data = {}
data['media'] = media.id
data['user'] = request.user.id
serializer = MediaLikeSerializer(data=data)
if serializer.is_valid():
like = serializer.save()
push_notification.delay(media.user.id, request.user.username + ' has liked your item')
serializer = MediaGetLikeSerializer(like)
return self.get_mocked_pagination_response(status=status.HTTP_204_NO_CONTENT)
return self.get_mocked_pagination_response(serializer.errors, status=status.HTTP_400_BAD_REQUEST)

It's Celery's mingle and gossiping. Disable by adding --without-gossip --without-mingle --without-heartbeat to the command line arguments.
Also don't forget to set BROKER_HEARTBEAT = None when you've disabled heartbeats on the commandline, otherwise you'll disconnected after 30s. It's most often better to rely on TCP keepalive then AMQP heartbeats, or even worse, Celery's own heartbeats.

Python - Retry a failed Celery task from another queue

I'm posting a data to a web-service in Celery. Sometimes, the data is not posted to web-service because of the internet is down, and the task is retried infinite times until it is posted. The retrying of the task is un-necessary because the net was down and hence its not required to re-try it again.
I thought of a better solution, ie if a task fails thrice (retrying a min of 3 times), then it is shifted to another queue. This queue contains list of all failed tasks.
Now when the internet is up and the data is posted over the net , ie the task has been completed from the normal queue, it then starts processing the tasks from the queue having failed tasks.
This will not waste the CPU memory of retrying the task again and again.
Here's my code :- As of right now, I'm just retrying the task again, But I doubt whether that'll be the right way of doing it.
#shared_task(default_retry_delay = 1 * 60, max_retries = 10)
def post_data_to_web_service(data,url):
try :
client = SoapClient(
location = url,
action = 'http://tempuri.org/IService_1_0/',
namespace = "http://tempuri.org/",
soap_ns='soap', ns = False
)
response= client.UpdateShipment(
Weight = Decimal(data['Weight']),
Length = Decimal(data['Length']),
Height = Decimal(data['Height']),
Width = Decimal(data['Width']) ,
)
except Exception, exc:
raise post_data_to_web_service.retry(exc=exc)
How do I maintain 2 queues simultaneous and trying to execute tasks from both the queues.
Settings.py
BROKER_URL = 'redis://localhost:6379/0'
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'

By default celery adds all tasks to queue named celery. So you can run your task here and when an exception occurs, it retries, once it reaches maximum retries, you can shift them to a new queue say foo
from celery.exceptions import MaxRetriesExceededError
#shared_task(default_retry_delay = 1 * 60, max_retries = 10)
def post_data_to_web_service(data,url):
try:
#do something with given args
except MaxRetriesExceededError:
post_data_to_web_service([data, url], queue='foo')
except Exception, exc:
raise post_data_to_web_service.retry(exc=exc)
When you start your worker, this task will try to do something with given data. If it fails it will retry 10 times with a dealy of 60 seconds. Then when it encounters MaxRetriesExceededError it posts the same task to new queue foo.
To consume these tasks you have to start a new worker
celery worker -l info -A my_app -Q foo
or you can also consume this task from the default worker if you start it with
celery worker -l info -A my_app -Q celery,foo

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Flask + Celery task duplications on 3rd party notifications - python

Related

Celery send task but worker sometimes doesn't received it

Celery consumer does not receive messages from SQS queue on LocalStack

Celery worker doesn't takes updated values from the database

Celery, RabbitMQ messages keep getting sent

Python - Retry a failed Celery task from another queue

Categories

Resources