Celery task hangs after calling .delay() in Django - python

While calling the .delay() method of an imported task from a django application, the process gets stuck and the request is never completed.
We also don't get any error on the console.
Setting up a set_trace() with pdb results in the same thing.
The following questions were reviewed which didn't help resolve the issue:
Calling celery task hangs for delay and apply_async
celery .delay hangs (recent, not an auth problem)
Eg.:
backend/settings.py
CELERY_BROKER_URL = os.environ.get("CELERY_BROKER", RABBIT_URL)
CELERY_RESULT_BACKEND = os.environ.get("CELERY_BROKER", RABBIT_URL)
backend/celery.py
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'backend.settings')
app = Celery('backend')
app.config_from_object('django.conf:settings', namespace='CELERY')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks()
#app.task(bind=True)
def debug_task(self):
print('Request: {0!r}'.format(self.request))
app/tasks.py
import time
from celery import shared_task
#shared_task
def upload_file(request_id):
time.sleep(request_id)
return True
app/views.py
from rest_framework.views import APIView
from .tasks import upload_file
class UploadCreateAPIView(APIView):
# other methods...
def post(self, request, *args, **kwargs):
id = request.data.get("id", None)
# business logic ...
print("Going to submit task.")
import pdb; pdb.set_trace()
upload_file.delay(id) # <- this hangs the runserver as well as the set_trace()
print("Submitted task.")

The issue was with the setup of the celery application with Django. We need to make sure that the celery app is imported and initialized in the following file:
backend\__init__.py
from __future__ import absolute_import, unicode_literals
# This will make sure the app is always imported when
# Django starts so that shared_task will use this app.
from .celery import app as celery_app
__all__ = ('celery_app',)

I've run into this issue that Celery calls through delay or apply_async may randomly hang the program indefinitely. I tried the all broker_transport_options and retry_policy options to let Celery to recover, but it still happens. Then I found this solution to enforce an execution time limit for an execution block/function by using underlying Python signal handlers.
#contextmanager
def time_limit(seconds):
def signal_handler(signum, frame):
raise TimeoutException("Timed out!")
signal.signal(signal.SIGALRM, signal_handler)
signal.alarm(seconds)
try:
yield
finally:
signal.alarm(0)
def my_function():
with time_limit(3):
celery_call.apply_sync(kwargs={"k1", "v1"}, expires=30)

Related

Task results are saved only when all tasks are done

I want to monitor/report task statuses, but tasks are saved only when all tasks are done. I want them to be saved as soon as they start.
'''inside my Queue project'''
celery.py
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'Queue.settings')
app = Celery('Queue',
broker='redis://localhost:6379',
backend='django-db',
task_track_started=True,
include=['index.tasks'])
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()
inside "index" app
tasks.py
from __future__ import absolute_import, unicode_literals
from celery import shared_task, current_task
import time
#shared_task
def gozle():
time.sleep(15)
return 1
views.py
def index(request):
gozle.delay()
return HttpResponse('admin')
I expect that as I visit index page my task be triggered and be recorded into ResultTask, but it waits 15 seconds then be recorded.
The start of the task should be acknowledged immediately, you should see in your celery server a log entry that says that gozle task started, then it wait 15 seconds and will exit, this exit state will also be on the celery server log. If that is not the case maybe it has something to do with task_acks_late enabled in celery config, here is the doc, making it only acknowledge tasks after they end.

Django celery trigger manage.py cmd #periodic_task

i want to run a manage.py cmd from celery as a periodic task every x Minutes but every time i try to accomplish that as show below i get the following error:
[2019-01-17 01:36:00,006: INFO/MainProcess] Received task: Delete
unused media file(s)[3dd2b93b-e32a-4736-8b24-028b9ad8da35]
[2019-01-17 01:36:00,007: WARNING/ForkPoolWorker-3] Scanning for
unused media files [2019-01-17 01:36:00,008: WARNING/ForkPoolWorker-3]
Unknown command: 'cleanup_unused_media --noinput --remove-empty-dirs'
[2019-01-17 01:36:00,008: INFO/ForkPoolWorker-3] Task Delete unused
media file(s)[3dd2b93b-e32a-4736-8b24-028b9ad8da35] succeeded in
0.0008139749998008483s: None
tasks.py
from celery import Celery
from celery.schedules import crontab
from celery.task import periodic_task
from celery.utils.log import get_task_logger
import requests
from django.core import management
logger = get_task_logger(__name__)
app = Celery('tasks', broker='redis://127.0.0.1')
...
#periodic_task(run_every=(crontab(minute='*/90')), name="Delete unused media file(s)", ignore_result=True)
def delete_unused_media():
try:
print("Scanning for unused media files")
management.call_command('cleanup_unused_media --noinput --remove-empty-dirs')
return "success, old media files have been deleted"
except Exception as e:
print(e)
that management cmd comes from the following project:
https://github.com/akolpakov/django-unused-media
is my management call simply wrong or whats the deal?
thanks in advance
UPDATE (Celery Config):
settings.py
INSTALLED_APPS = [
...
'celery',
'django_unused_media',
...
# Celery Settings:
BROKER_URL = 'redis://127.0.0.1:6379'
CELERY_RESULT_BACKEND = 'redis://127.0.0.1:6379'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = 'UTC'
...
CACHES = {
"default": {
"BACKEND": "django_redis.cache.RedisCache",
"LOCATION": [
"redis://127.0.0.1:6379/0",
#"redis://127.0.0.1:6379/1",
],
"OPTIONS": {
"CLIENT_CLASS": "django_redis.client.DefaultClient",
"SOCKET_CONNECT_TIMEOUT": 15, # in seconds
"SOCKET_TIMEOUT": 15, # in seconds
"COMPRESSOR": "django_redis.compressors.zlib.ZlibCompressor",
"CONNECTION_POOL_KWARGS": {"max_connections": 1000, "retry_on_timeout": True}
}
}
}
celery.py
from __future__ import absolute_import, unicode_literals
from celery import Celery
import os
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'MyProject.settings')
app = Celery('MyProject')
# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
# should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks()
#app.task(bind=True)
def debug_task(self):
print('Request: {0!r}'.format(self.request))
_init__.py
from __future__ import absolute_import, unicode_literals
# This will make sure the app is always imported when
# Django starts so that shared_task will use this app.
from .celery import app as celery_app
__all__ = ('celery_app')
e.g. This task is working fine:
#periodic_task(run_every=(crontab(minute='*/1')), name="Get exchange rate(s)", ignore_result=True)
def get_exchange_rate():
api_url = "https://api.coinmarketcap.com/v1/ticker/?limit=1"
try:
exchange_rate = requests.get(api_url).json()
logger.info("BTC Exchange rate updated successfully.")
except Exception as e:
print(e)
exchange_rate = dict()
return exchange_rate
The issue is the way you're calling call_command. call_command takes the name of the command as its first argument, followed by the arguments passed positionally. You're passing the whole lot as a single string. Try changing it to:
management.call_command('cleanup_unused_media', '--noinput', '--remove-empty-dirs')
For anyone using Django 1.7+, it seems that simply import the settings module is not enough.
i got it working like this
from celery import Celery
from celery.schedules import crontab
from celery.task import periodic_task
from celery.utils.log import get_task_logger
import requests, os, django
from django.core import management
logger = get_task_logger(__name__)
app = Celery('tasks', broker='redis://127.0.0.1')
#periodic_task(run_every=(crontab(minute='*/1')), name="Delete unused media file(s)", ignore_result=True)
def delete_unused_media():
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "MyProject.settings")
django.setup()
try:
print("Scanning for unused media files")
management.call_command('cleanup_unused_media', '--noinput', '--remove-empty-dirs')
return "success, old media files have been deleted"
except Exception as e:
print(e)

Celery task not run at the specified time

I created celery task and it should start each hour at the 0th minute, but it does not run. What doing wrong?
celery
from __future__ import absolute_import, unicode_literals
import os
import pytz
from celery import Celery
from datetime import datetime
from celery.schedules import crontab
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'app.settings')
app = Celery('app', broker='amqp://rabbit:5672')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()
#app.on_after_configure.connect
def setup_periodic_tasks(sender, **kwargs):
sender.add_periodic_task(crontab(minute=0, hour='0,1,2,3,4,5,6,7,8,9,10,11,12, \
13,14,15,16,17,18,19,20,21,22,23,24'),
task.s())
#app.task
def task():
#any code
In the terminal, I see this info, but the task is not running
[2018-05-09 19:05:16,275: INFO/Beat] beat: Starting...
Try change your task code to something like this:
from celery.task import periodic_task
from celery.schedules import crontab
#periodic_task(
run_every=(crontab(minute='*/60')),
name="task_name")
def run_some_task():
'''some code'''

How to run a function periodically with Flask and Celery?

I have a flask app that roughly looks like this:
app = Flask(__name__)
#app.route('/',methods=['POST'])
def foo():
data = json.loads(request.data)
# do some stuff
return "OK"
Now in addition I would like to run a function every ten seconds from that script. I don't want to use sleep for that. I have the following celery script in addition:
from celery import Celery
from datetime import timedelta
celery = Celery('__name__')
CELERYBEAT_SCHEDULE = {
'add-every-30-seconds': {
'task': 'tasks.add',
'schedule': timedelta(seconds=10)
},
}
#celery.task(name='tasks.add')
def hello():
app.logger.info('run my function')
The script works fine, but the logger.info is not executed. What am I missing?
Do you have Celery worker and Celery beat running? Scheduled tasks are handled by beat, which queues the task mentioned when appropriate. Worker then actually crunches the numbers and executes your task.
celery worker --app myproject--loglevel=info
celery beat --app myproject
Your task however looks like it's calling the Flask app's logger. When using the worker, you probably don't have the Flask application around (since it's in another process). Try using a normal Python logger for the demo task.
Well, celery beat can be embedded in regular celery worker as well, with -B parameter in your command.
celery -A --app myproject --loglevel=info -B
It is only recommended for the development environment. For production, you should run beat and celery workers separately as documentation mentions. Otherwise, your periodic task will run more than one time.
A celery task by default will run outside of the Flask app context and thus it won't have access to Flask app instance. However it's very easy to create the Flask app context while running a task by using app_context method of the Flask app object.
app = Flask(__name__)
celery = Celery(app.name)
#celery.task
def task():
with app.app_context():
app.logger.info('running my task')
This article by Miguel Grinberg is a very good place to get a primer on the basics of using Celery in a Flask application.
First install the redis on machine and check it is running or not.
install the python dependencies
celery
redis
flask
folder structure
project
app
init.py
task.py
main.py
write task.py
from celery import Celery
from celery.schedules import crontab
from app import app
from app.scrap import product_data
from celery.utils.log import get_task_logger
logger = get_task_logger(__name__)
def make_celery(app):
#Celery configuration
app.config['CELERY_BROKER_URL'] = 'redis://127.0.0.1:6379'
app.config['CELERY_RESULT_BACKEND'] = 'db+postgresql://user:password#172.17.0.3:5432/mydatabase'
app.config['CELERY_RESULT_EXTENDED']=True
app.config['CELERYBEAT_SCHEDULE'] = {
# Executes every minute
'periodic_task-every-minute': {
'task': 'periodic_task',
'schedule': crontab(minute="*")
}
}
celery = Celery(app.import_name, broker=app.config['CELERY_BROKER_URL'])
celery.conf.update(app.config)
TaskBase = celery.Task
class ContextTask(TaskBase):
abstract = True
def __call__(self, *args, **kwargs):
with app.app_context():
return TaskBase.__call__(self, *args, **kwargs)
celery.Task = ContextTask
return celery
celery = make_celery(app)
#celery.task(name="periodic_task",bind=True)
def testing(self):
file1 = open("../myfile.txt", "a")
# writing newline character
file1.write("\n")
file1.write("Today")
#faik
print("Running")
self.request.task_name = "state"
logger.info("Hello! from periodic task")
return "Done"
write init.py
from flask import Flask, Blueprint,request
from flask_restx import Api,Resource,fields
from flask_sqlalchemy import SQLAlchemy
import redis
from rq import Queue
app = Flask(__name__)
app.config['SECRET_KEY']='7c09ebc8801a0ce8fb82b3d2ec51b4db'
app.config['SQLALCHEMY_DATABASE_URI']='sqlite:///site.db'
db=SQLAlchemy(app)
command to run celery beat and worker
celery -A app.task.celery beat
celery -A app.task.celery worker --loglevel=info

Celery tasks in Django are always blocking

I have the following setup in my django settings:
CELERY_TASK_RESULT_EXPIRES = timedelta(minutes=30)
CELERY_CHORD_PROPAGATES = True
CELERY_ACCEPT_CONTENT = ['json', 'msgpack', 'yaml']
CELERY_ALWAYS_EAGER = True
CELERY_EAGER_PROPAGATES_EXCEPTIONS = True
BROKER_URL = 'django://'
CELERY_RESULT_BACKEND='djcelery.backends.database:DatabaseBackend'
I've included this under my installed apps:
'djcelery',
'kombu.transport.django'
My project structure is (django 1.5)
proj
|_proj
__init__.py
celery.py
|_apps
|_myapp1
|_models.py
|_tasks.py
This is my celery.py file:
from __future__ import absolute_import
import os
from celery import Celery
from django.conf import settings
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'proj.settings.dev')
app = Celery('proj')
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS, related_name='tasks')
In the main __init__.pyI have:
from __future__ import absolute_import
from .celery import app as celery_app
And finally in myapp1/tasks.py I define my task:
#task()
def retrieve():
# Do my stuff
Now, if I launch a django interactive shell and I launch the retrieve task:
result = retrieve.delay()
it always seems to be a blocking call, meaning that the prompt is bloked until the function returns. The result status is SUCCESS, the function actually performs the operations BUT it seems not to be async. What am I missing?
it seems like CELERY_ALWAYS_EAGER causes this
if this is True, all tasks will be executed locally by blocking until
the task returns. apply_async() and Task.delay() will return an
EagerResult instance, which emulates the API and behavior of
AsyncResult, except the result is already evaluated.
That is, tasks will be executed locally instead of being sent to the
queue.

Categories

Resources