How to make Celery + Django + Pytest work together - python

I'm trying to test an endpoint that triggers a Celery task, but the task doesn't seem to run in the test.
django==4.1.5
celery==5.2.7
pytest==7.2.1
pytest-django==4.5.2
An endpoint:
def do_some_stuff(blah: Blah) -> Blah:
    res = cool_task.apply_async(
        kwargs={
            'cool_id': int(pk),
            'config': config,
            'name': RESUBMIT.value,
        },
        link=update_status.si(
            cool_id=int(pk),
            new_status="why is this so hard",
        ),
    )
    [...]
A test:
@pytest.mark.django_db
def test_my_thing(django_client: Client) -> None:
    [...]
    response = django_client.post(f"/api/myendpoint/{mything.id}/do_some_stuff/")
It hits the endpoint and gets a 202 back as expected, but Celery doesn't seem to pick up the task in the test. The update_status callback updates the DB, and I'm not seeing that happen.
I've tried creating a Celery app in the test, creating a worker in the test, changing the test to use the main DB instead of the test DB, and setting override_settings with BROKER_BACKEND='memory'.
I'd like a full working example. It seems kind of basic, but it's eluding me. I don't understand what combination of fixtures and overrides I need for this to work.
Everything works when I actually run the application.
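One common combination is to force tasks to run eagerly in the test process. The sketch below rests on assumptions not stated above (the Celery app lives in myproject/celery.py), and whether link callbacks fire in eager mode has varied between Celery versions, so verify that update_status actually runs:

# conftest.py -- a minimal sketch, not a verified solution: run tasks inline so
# no broker or worker is needed and the task sees the test database.
import pytest

from myproject.celery import app as celery_app  # hypothetical import path


@pytest.fixture(autouse=True)
def eager_celery():
    celery_app.conf.task_always_eager = True      # execute tasks synchronously
    celery_app.conf.task_eager_propagates = True  # re-raise task exceptions in the test
    yield
    celery_app.conf.task_always_eager = False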

Related

Pytest with Celery tasks works but shows "ERROR in consumer: Received unregistered task of type X"

I have Pytest working to test Celery tasks based on this Stack Overflow Q&A: Celery's pytest fixtures (celery_worker and celery_app) does not work.
conftest.py
import pytest


@pytest.fixture(scope="session")
def celery_config():
    return {
        "broker_url": REDIS_URL,
        "result_backend": REDIS_URL,
    }
Tests are passing with the configuration below:
import pytest


@pytest.mark.usefixtures("celery_session_app")
@pytest.mark.usefixtures("celery_session_worker")
class TestMyCeleryTask:
    def test_run_task(self) -> None:
        ...
All of the tests are passing. However, no matter what order I import the Celery App and/or my tasks, I always receive the following output:
ERROR in consumer: Received unregistered task of type 'my_celery_task'.
The message has been ignored and discarded.
Did you remember to import the module containing this task?
Or maybe you're using relative imports?
Please see
http://docs.celeryq.org/en/latest/internals/protocol.html
for more information.
Note that I am using the old-school class-based Task approach rather than using the decorator to convert functions into classes.
Whichever Celery pytest fixture you use to obtain a Celery app instance, your tasks need to be registered on that particular instance.
In this case, you are using the Celery session instance:
@pytest.mark.usefixtures("celery_session_app")
To register your Celery task for all tests, you could add the following to conftest.py:
conftest.py
import pytest

from my_application import MyCeleryTask


@pytest.fixture(scope="session", autouse=True)
def celery_register_tasks(celery_session_app):
    celery_session_app.register_task(MyCeleryTask)


@pytest.fixture(scope="session")
def celery_config():
    return {
        "broker_url": REDIS_URL,
        "result_backend": REDIS_URL,
    }
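For reference, a class-based task along these lines might look like the sketch below (the real MyCeleryTask from the question is not shown, so the shape here is assumed):

from celery import Task


class MyCeleryTask(Task):
    # The registered name must match what the worker receives; a mismatch
    # produces the "Received unregistered task of type ..." error.
    name = "my_celery_task"

    def run(self, *args, **kwargs):
        ...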

Running a single test works, but running multiple tests fails - Flask and Pytest

This is really strange. I have the following simple flask application:
- root
  - myapp
    - a route with /subscription_endpoint
  - tests
    - test_az.py
    - test_bz.py
test_az.py and test_bz.py both look the same. There is a setup (taken from https://diegoquintanav.github.io/flask-contexts.html) and then one simple test:
import pytest
from myapp import create_app
import json


@pytest.fixture(scope='module')
def app(request):
    from myapp import create_app
    return create_app('testing')


@pytest.fixture(autouse=True)
def app_context(app):
    """Creates a flask app context"""
    with app.app_context():
        yield app


@pytest.fixture
def client(app_context):
    return app_context.test_client(use_cookies=True)


def test_it(client):
    sample_payload = {"test": "test"}
    response = client.post("/subscription_endpoint", json=sample_payload)
    assert response.status_code == 500
Running pytest runs both files, but test_az.py succeeds while test_bz.py fails: the HTTP request returns a 404 error, meaning test_bz cannot find the route in the app.
If I run them individually, they both succeed. This is very strange! It seems like the first test is somehow influencing the second test.
I actually added a third test, test_cz.py, which fails as well, so only the first one ever passes. I feel like this has something to do with those fixtures, but I have no idea where to look.
Create a conftest.py for your fixtures (e.g. the client fixture) and use the same fixture in both tests.
If the provided code really is repeated in the other file, then you are creating two separate fixtures for a client. I would first clean that up: create one conftest.py that contains all the fixtures and use them in your tests, as sketched below; that might help you.
Also check out how to use pytest as described in the Flask documentation.
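A minimal sketch of such a conftest.py, reusing the fixtures from the question (the session scope here is an assumption; module scope would work as well):

# conftest.py -- shared fixtures for test_az.py, test_bz.py, test_cz.py
import pytest

from myapp import create_app


@pytest.fixture(scope="session")
def app():
    return create_app("testing")


@pytest.fixture(autouse=True)
def app_context(app):
    """Push a fresh application context for every test."""
    with app.app_context():
        yield app


@pytest.fixture
def client(app):
    return app.test_client(use_cookies=True)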

Django Celery: Something wrong with shared_task

I use django-rest-framework and Celery.
This is my views.py:
# GET /server/test/<para>/
class Testcelery(APIView):
    def test(self):
        print(celery_test())

    def get(self, request, para, format=None):
        print('test')
        self.test()
        # result = add.delay(4, 4)
        # print(result.id)
        result = OrderedDict()
        result['result'] = 'taskid'
        result['code'] = status.HTTP_200_OK
        result['message'] = 'success'
        return Response(result, status=status.HTTP_200_OK)
This is a simple Celery task:
@shared_task()
def celery_test():
    print('celerytest')
    return True
When I debug Django, execution reaches the test method, but the program gets stuck at the next step, inside the call in local.py, which is where the error happens. The debugger just stops there.
There are several problems:
Tasks are supposed to be run with whatever.delay() (http://docs.celeryproject.org/en/latest/userguide/calling.html#basics); a sketch of this follows below.
I wouldn't call a class Testsomething unless it's a test class.
Make sure the worker is running and that it is initialized correctly. Don't forget to check that the broker is running correctly as well.
When debugging, take into account that the Celery worker is a different process. Your debugger is probably attached only to the process running manage.py. If you run the worker as a command from the IDE, it will probably be easier to debug.
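As a rough sketch (view and task names taken from the question; the tasks import path is an assumption), the view could dispatch the task to the worker instead of calling it in-process:

from collections import OrderedDict

from rest_framework import status
from rest_framework.response import Response
from rest_framework.views import APIView

from .tasks import celery_test  # assumed location of the @shared_task


class Testcelery(APIView):
    def get(self, request, para, format=None):
        async_result = celery_test.delay()  # returns immediately; the worker runs it
        result = OrderedDict()
        result['result'] = async_result.id
        result['code'] = status.HTTP_200_OK
        result['message'] = 'success'
        return Response(result, status=status.HTTP_200_OK)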

Is it possible to skip delegating a Celery task if the same task name and params are already queued on the server?

Say that I have this task:
def do_stuff_for_some_time(some_id):
    e = Model.objects.get(id=some_id)
    e.domanystuff()
and I'm using it like so:
do_stuff_for_some_time.apply_async(args=[some_id], queue='some_queue')
The problem I'm facing is that there are a lot of repetitive tasks with the same args, and they are bogging down the queue.
Is it possible to apply_async only if the same task with the same args is not already in the queue?
celery-singleton solves this requirement.
Caveat: it requires a Redis broker (for distributed locks).
pip install celery-singleton
Use the Singleton task base class:
from celery_singleton import Singleton


@celery_app.task(base=Singleton)
def do_stuff_for_some_time(some_id):
    e = Model.objects.get(id=some_id)
    e.domanystuff()
From the docs: calls to do_stuff.delay() will either queue a new task or return an AsyncResult for the currently queued/running instance of the task.
I would try a mix of a cache lock and a task result backend that stores each task's result:
The cache lock will prevent tasks with the same arguments from being added to the queue multiple times. The Celery documentation contains a nice example of a cache lock implementation here, but if you don't want to create it yourself, you can use the celery-once module.
For a task result backend, we will use the recommended django-celery-results, which creates a TaskResult table that we will query for task results.
Example:
Install and configure django-celery-results:
settings.py:
INSTALLED_APPS = (
    ...,
    'django_celery_results',
)

CELERY_RESULT_BACKEND = 'django-db'  # You can also use 'django-cache'
./manage.py migrate django_celery_results
Install and configure the celery-once module:
tasks.py:
from celery import Celery
from celery_once import QueueOnce
from time import sleep

celery = Celery('tasks', broker='amqp://guest@localhost//')
celery.conf.ONCE = {
    'backend': 'celery_once.backends.Redis',
    'settings': {
        'url': 'redis://localhost:6379/0',
        'default_timeout': 60 * 60
    }
}


@celery.task(base=QueueOnce)
def do_stuff_for_some_time(some_id):
    e = Model.objects.get(id=some_id)
    e.domanystuff()
At this point, if a task with the same arguments is going to be executed,
an AlreadyQueued exception will be raised.
Let's use the above:
from celery_once import AlreadyQueued
from django_celery_results.models import TaskResult

try:
    result = do_stuff_for_some_time.delay(some_id)
except AlreadyQueued:
    result = TaskResult.objects.get(task_args=some_id)
Caveats:
Mind that at the time an AlreadyQueued exception arises, the initial task with argument=some_id may not have executed yet and therefore will not have a result in the TaskResult table.
Mind everything in your code that can go wrong and hang any of the above processes (because it will do that!).
Extra Reading:
Another Task with Lock DIY implementation
django-celery-results' TaskResult model.
I am not really sure if Celery has such an option. However, I would like to suggest a workaround (a rough sketch follows below):
1) Create a model for all the Celery tasks being queued. In that model, save the task_name, queue_name, and the parameters.
2) Use get_or_create on that model for every Celery task that is ready to be queued.
3) If created = True from step 2, allow the task to be added to the queue; otherwise do not add it.
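A rough sketch of that workaround (all model and helper names here are invented; the task itself would also need to delete its row once it finishes):

import json

from django.db import models


class QueuedTask(models.Model):
    task_name = models.CharField(max_length=255)
    queue_name = models.CharField(max_length=255)
    params = models.CharField(max_length=500)  # kwargs serialized to JSON

    class Meta:
        unique_together = ("task_name", "queue_name", "params")


def enqueue_once(task, queue, **kwargs):
    key = json.dumps(kwargs, sort_keys=True)
    record, created = QueuedTask.objects.get_or_create(
        task_name=task.name, queue_name=queue, params=key,
    )
    if created:
        task.apply_async(kwargs=kwargs, queue=queue)
    # else: an identical task is already queued, so skip it.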

How to cache SQL Alchemy calls with Flask-Cache and Redis?

I have a Flask app that takes parameters from a web form, queries a DB with SQL Alchemy and returns Jinja-generated HTML showing a table with the results. I want to cache the calls to the DB. I looked into Redis (Using redis as an LRU cache for postgres), which led me to http://pythonhosted.org/Flask-Cache/.
Now I am trying to use Redis + Flask-Cache to cache the calls to the DB. Based on the Flask-Cache docs, it seems like I need to set up a custom Redis cache.
class RedisCache(BaseCache):
    def __init__(self, servers, default_timeout=500):
        pass


def redis(app, config, args, kwargs):
    args.append(app.config['REDIS_SERVERS'])
    return RedisCache(*args, **kwargs)
From there I would need to do something like:
# not sure what to put for args or kwargs
cache = redis(app, config={'CACHE_TYPE': 'redis'})

app = Flask(__name__)
cache.init_app(app)
I have two questions:
What do I put for args and kwargs? What do these mean? How do I set up a Redis cache with Flask-Cache?
Once the cache is set up, it seems like I would want to somehow "memoize" the calls to the DB so that if the method gets the same query, it has the output cached. How do I do this? My best guess would be to wrap the call to SQLAlchemy in a method that could then be given the memoize decorator? That way, if two identical queries were passed to the method, Flask-Cache would recognize this and return the appropriate response. I'm guessing it would look like this:
@cache.memoize(timeout=50)
def queryDB(q):
    return q.all()
This seems like a fairly common use of Redis + Flask + Flask-Cache + SQLAlchemy, but I am unable to find a complete example to follow. If someone could post one, that would be super helpful, not just for me but for others down the line.
You don't need to create a custom RedisCache class. The docs are just showing how you would create new backends that are not available in Flask-Cache. A RedisCache is already available in werkzeug >= 0.7, which you probably already have installed because it is one of the core dependencies of Flask.
This is how I ran Flask-Cache with the Redis backend:
import time

from flask import Flask
from flask_cache import Cache

app = Flask(__name__)
cache = Cache(app, config={'CACHE_TYPE': 'redis'})


@cache.memoize(timeout=60)
def query_db():
    time.sleep(5)
    return "Results from DB"


@app.route('/')
def index():
    return query_db()


app.run(debug=True)
The reason you're getting "ImportError: redis is not a valid FlaskCache backend" is probably that you don't have the redis Python library installed, which you can fix with:
pip install redis
Your Redis config would look something like this:
cache = Cache(app, config={
    'CACHE_TYPE': 'redis',
    'CACHE_KEY_PREFIX': 'fcache',
    'CACHE_REDIS_HOST': 'localhost',
    'CACHE_REDIS_PORT': '6379',
    'CACHE_REDIS_URL': 'redis://localhost:6379'
})
Putting @cache.memoize over a method that grabs the info from the DB should work.
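For example, a small sketch of memoizing on the plain form parameters instead of on a Query object (the model and field names here are invented):

@cache.memoize(timeout=300)
def query_db(name, min_age):
    # Identical (name, min_age) pairs are served from the Redis cache until the timeout.
    return User.query.filter(User.name == name, User.age >= min_age).all()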
