How to refresh cache from DB in django rest service? - python

The main intent of this question is to know how to refresh cache from db (which is populated by some other team not in our control) in django rest service which will then be used in serving requests received on rest end point.
Currently I am using the following approach but my concern is since python (cpython with GIL) is not multithreaded then can we have following type of code in rest service where one thread is populating cache every 30 mins and main thread is serving requests on rest end point.Here is sample code only for illustration.
# mainproject.__init__.py
globaldict = {} # cache
class MyThread(Thread):
def __init__(self, event):
Thread.__init__(self)
self.stopped = event
def run(self):
while not self.stopped.wait(1800):
refershcachefromdb() # function that takes around 5-6 mins for refreshing cache (global datastructure) from db
refershcachefromdb() # this is explicitly called to initially populate cache
thread = MyThread(stop_flag)
thread.start() # started thread that will refresh cache every 30 mins
# views.py
import mainproject
#api_view(['GET'])
def get_data(request):
str_param = request.GET.get('paramid')
if str_param:
try:
paramids = [int(x) for x in str_param.split(",")]
except ValueError:
return JsonResponse({'Error': 'This rest end point only accept comma seperated integers'}, status=422)
# using global cache to get records
output_dct_lst = [mainproject.globaldict[paramid] for paramid in paramids if paramid in mainproject.globaldict]
if not output_dct_lst:
return JsonResponse({'Error': 'Data not available'}, status=422)
else:
return JsonResponse(output_dct_lst, status=200, safe=False)

Related

Multi-threading use of SQLAlchemy Sessions

I've been reading about the multi-threading use of the Sqlalchemy session, but still wonder if it is the best way to use one global session for multiple threads and allow only one thread to use it at once or to limit the number of sessions created and make a new session per thread and have like a limit on the number of sessions running together. In my project, I need to make several requests simultaneously, and while I wait for the response to those requests, the session is still active. Once the response is received, I add it to the PostgreSQL DB and close the session. Up to 6-7 requests per second are fine and 85 concurrent sessions seem to be enough, but when I increase it to 8-9 requests per second, due to larger response time, many sessions get stuck and don't close in time.
def plan_request(self, url, time_num):
logger.info(f'Sleeping for {time_num} seconds')
session_manager = GetNewLocalSession()
time.sleep(time_num)
# after delay new session is created using session manager I made
session_local = session_manager.get_new_local_session()
self.tool.test(session_local, item)
self.session_manager.remove_local_session(session_local)
session_local.close()
db.Session.remove()
class GetNewLocalSession():
def __init__(self):
self.sessions = []
def get_new_local_session(self):
print(f"Sessions used: {len(self.sessions)}")
if len(self.sessions) <= 85:
# creates new scoped_session object
session_local = Session()
self.sessions.append(session_local)
return session_local
else:
success = False
while not success:
if len(self.sessions) <= 85:
session_local = Session()
self.sessions.append(session_local)
success = True
else:
time.sleep(0.1)
return session_local
def remove_local_session(self, session_local):
if session_local in self.sessions:
self.sessions.remove(session_local)
else:
pass

Python: Flask simple task queue without external libraries not working

I'm trying to do a simple task queue with Flask and without any DB.
In the most simple version, I have two endpoints. Submit job and Check status.
Submit job will add request to queue and check status will return the status of a job id (queued, running, failed, finished).
The workflow is as follows:
user submits a job
job is added to queue
user will check status of the job every 5 seconds
every check of status will trigger a function that checks if the number of running jobs is smaller than the maximum number of jobs (from config). If the number is smaller, it will span another thread with the job on top of queue.
This is the simplified code:
app = Flask(__name__)
def finish_job(job_id):
finished.append(job_id)
last = running.pop(job_id)
last.close()
def remove_finished():
for j in list(running.keys()):
if not running[j].is_alive():
finish_job(j)
def start_jobs():
while len(running) < config.threads and len(queue_list) > 0:
print('running now', len(running))
next_job = queue.pop()
queue_list.remove(next_job[0])
start_job(*next_job)
#app.route("/Simulation", methods=['POST'])
#authenticate
def submit_job():
# create id
job_id = str(uuid.uuid4())
job_data = request.data.decode('utf-8')
queue.append((job_id, job_data))
queue_list.add(job_id)
return 'QUEUED', 200
#app.route("/Simulation/<uuid:job_id>", methods=['GET'])
#authenticate
def check_status(job_id: uuid):
job_id = str(job_id)
remove_finished()
start_jobs()
if job_id in running:
r = 'RUNNING'
elif job_id in queue_list:
r = 'QUEUED'
elif job_id in finished:
r = 'COMPLETED'
else:
r = 'FAILED'
return status_response(r), 200
running = {}
finished = []
queue = []
queue_list = set()
app.run()
Now, the problem is, that if multiple users submit a check status request at the same time, and there is only one slot free for running a task, both requests will spawn the job.
Is there some way to force Flask to only run one instance of a function at a time?
Thank you
after much searching, I have finally found an answer for this.
As of Flask 1.0, the builtin WSGI server runs threaded by default.
So, I just needed to add parameter to stop threads
app.run(threaded=False)

How to let a Flask web page (route) run in the background while on another web page(route)

So i'm creating this application and a part of it is a web page where a trading algorithm is testing itself using live data. All that is working but the issue is if i leave (exit) the web page, it stops. I was wondering how i can keep it running in the background indefinitely as i want the algorithm to keep doing it's thing.
This is the route which i would like to run in the background.
#app.route('/live-data-source')
def live_data_source():
def get_live_data():
live_options = lo.Options()
while True:
live_options.run()
live_options.update_strategy()
trades = live_options.get_all_option_trades()
trades = trades[0]
json_data = json.dumps(
{'data': trades})
yield f"data:{json_data}\n\n"
time.sleep(5)
return Response(get_live_data(), mimetype='text/event-stream')
I've looked into multi-threading but not too sure if that's the right thing for the job. I am kind of still new to flask so hence the poor question. If you need more info, please do comment.
You can do it the following way - this is 100% working example below. Note, in production use Celery for such tasks, or write another one daemon app (another process) by yourself and feed it with tasks from http server with the help of message queue (e.g. RabbitMQ) or with the help of common database.
If any questions regarding code below, feel free to ask, it was quite good exercise for me:
from flask import Flask, current_app
import threading
from threading import Thread, Event
import time
from random import randint
app = Flask(__name__)
# use the dict to store events to stop other treads
# one event per thread !
app.config["ThreadWorkerActive"] = dict()
def do_work(e: Event):
"""function just for another one thread to do some work"""
while True:
if e.is_set():
break # can be stopped from another trhead
print(f"{threading.current_thread().getName()} working now ...")
time.sleep(2)
print(f"{threading.current_thread().getName()} was stoped ...")
#app.route("/long_thread", methods=["GET"])
def long_thread_task():
"""Allows to start a new thread"""
th_name = f"Th-{randint(100000, 999999)}" # not really unique actually
stop_event = Event() # is used to stop another thread
th = Thread(target=do_work, args=(stop_event, ), name=th_name, daemon=True)
th.start()
current_app.config["ThreadWorkerActive"][th_name] = stop_event
return f"{th_name} was created!"
#app.route("/stop_thread/<th_id>", methods=["GET"])
def stop_thread_task(th_id):
th_name = f"Th-{th_id}"
if th_name in current_app.config["ThreadWorkerActive"].keys():
e = current_app.config["ThreadWorkerActive"].get(th_name)
if e:
e.set()
current_app.config["ThreadWorkerActive"].pop(th_name)
return f"Th-{th_id} was asked to stop"
else:
return "Sorry something went wrong..."
else:
return f"Th-{th_id} not found"
#app.route("/", methods=["GET"])
def index_route():
text = ("/long_thread - create another thread. "
"/stop_thread/th_id - stop thread with a certain id. "
f"Available Threads: {'; '.join(current_app.config['ThreadWorkerActive'].keys())}")
return text
if __name__ == '__main__':
app.run(host="0.0.0.0", port=9999)

Python muiltithreading is mixing the data of different request in django

I am using python muiltithreading for achieving a task which is like 2 to 3 mins long ,i have made one api endpoint in django project.
Here is my code--
from threading import Thread
def myendpoint(request):
print("hello")
lis = [ *args ]
obj = Model.objects.get(name =" jax")
T1 = MyThreadClass(lis, obj)
T1.start()
T1.deamon = True
return HttpResponse("successful", status=200)
Class MyThreadClass(Thread):
def __init__(self,lis,obj):
Thread.__init__(self)
self.lis = lis
self.obj = obj
def run(self):
for i in lis:
res =Func1(i)
self.obj.someattribute = res
self.obj.save()
def Func1(i):
'''Some big codes'''
context =func2(*args)
return context
def func2(*args):
"' some codes "'
return res
By this muiltithreading i can achieve the quick response from the django server on calling the endpoint function as the big task is thrown in another tread and execution of the endpoint thread is terminated on its return statement without keeping track of the spawned thread.
This part works for me correctly if i hit the url once , but if i hit the url 2 times as soon as 1st execution starts then on 2nd request i can see my request on console. But i cant get any response from it.
And if i hit the same url from 2 different client at the same time , both the individual datas are getting mixed up and i see few records of one client's request on other client data.
I am testing it to my local django runserver.
So guys please help , and i know about celery so dont recommend celery. Just tell me why this thing is happening or can it be fixed . As my task is not that long to use celery. I want to achieve it by muiltithreading.

Flask Celery task locking

I am using Flask with Celery and I am trying to lock a specific task so that it can only be run one at a time. In the celery docs it gives a example of doing this Celery docs, Ensuring a task is only executed one at a time. This example that was given was for Django however I am using flask I have done my best to convert this to work with Flask however I still see myTask1 which has the lock can be run multiple times.
One thing that is not clear to me is if I am using the cache correctly, I have never used it before so all of it is new to me. One thing from the doc's that is mentioned but not explained is this
Doc Notes:
In order for this to work correctly you need to be using a cache backend where the .add operation is atomic. memcached is known to work well for this purpose.
Im not truly sure what that means, should i be using the cache in conjunction with a database and if so how would I do that? I am using mongodb. In my code I just have this setup for the cache cache = Cache(app, config={'CACHE_TYPE': 'simple'}) as that is what was mentioned in the Flask-Cache doc's Flask-Cache Docs
Another thing that is not clear to me is if there is anything different I need to do as I am calling my myTask1 from within my Flask route task1
Here is an example of my code that I am using.
from flask import (Flask, render_template, flash, redirect,
url_for, session, logging, request, g, render_template_string, jsonify)
from flask_caching import Cache
from contextlib import contextmanager
from celery import Celery
from Flask_celery import make_celery
from celery.result import AsyncResult
from celery.utils.log import get_task_logger
from celery.five import monotonic
from flask_pymongo import PyMongo
from hashlib import md5
import pymongo
import time
app = Flask(__name__)
cache = Cache(app, config={'CACHE_TYPE': 'simple'})
app.config['SECRET_KEY']= 'super secret key for me123456789987654321'
######################
# MONGODB SETUP
#####################
app.config['MONGO_HOST'] = 'localhost'
app.config['MONGO_DBNAME'] = 'celery-test-db'
app.config["MONGO_URI"] = 'mongodb://localhost:27017/celery-test-db'
mongo = PyMongo(app)
##############################
# CELERY ARGUMENTS
##############################
app.config['CELERY_BROKER_URL'] = 'amqp://localhost//'
app.config['CELERY_RESULT_BACKEND'] = 'mongodb://localhost:27017/celery-test-db'
app.config['CELERY_RESULT_BACKEND'] = 'mongodb'
app.config['CELERY_MONGODB_BACKEND_SETTINGS'] = {
"host": "localhost",
"port": 27017,
"database": "celery-test-db",
"taskmeta_collection": "celery_jobs",
}
app.config['CELERY_TASK_SERIALIZER'] = 'json'
celery = Celery('task',broker='mongodb://localhost:27017/jobs')
celery = make_celery(app)
LOCK_EXPIRE = 60 * 2 # Lock expires in 2 minutes
#contextmanager
def memcache_lock(lock_id, oid):
timeout_at = monotonic() + LOCK_EXPIRE - 3
# cache.add fails if the key already exists
status = cache.add(lock_id, oid, LOCK_EXPIRE)
try:
yield status
finally:
# memcache delete is very slow, but we have to use it to take
# advantage of using add() for atomic locking
if monotonic() < timeout_at and status:
# don't release the lock if we exceeded the timeout
# to lessen the chance of releasing an expired lock
# owned by someone else
# also don't release the lock if we didn't acquire it
cache.delete(lock_id)
#celery.task(bind=True, name='app.myTask1')
def myTask1(self):
self.update_state(state='IN TASK')
lock_id = self.name
with memcache_lock(lock_id, self.app.oid) as acquired:
if acquired:
# do work if we got the lock
print('acquired is {}'.format(acquired))
self.update_state(state='DOING WORK')
time.sleep(90)
return 'result'
# otherwise, the lock was already in use
raise self.retry(countdown=60) # redeliver message to the queue, so the work can be done later
#celery.task(bind=True, name='app.myTask2')
def myTask2(self):
print('you are in task2')
self.update_state(state='STARTING')
time.sleep(120)
print('task2 done')
#app.route('/', methods=['GET', 'POST'])
def index():
return render_template('index.html')
#app.route('/task1', methods=['GET', 'POST'])
def task1():
print('running task1')
result = myTask1.delay()
# get async task id
taskResult = AsyncResult(result.task_id)
# push async taskid into db collection job_task_id
mongo.db.job_task_id.insert({'taskid': str(taskResult), 'TaskName': 'task1'})
return render_template('task1.html')
#app.route('/task2', methods=['GET', 'POST'])
def task2():
print('running task2')
result = myTask2.delay()
# get async task id
taskResult = AsyncResult(result.task_id)
# push async taskid into db collection job_task_id
mongo.db.job_task_id.insert({'taskid': str(taskResult), 'TaskName': 'task2'})
return render_template('task2.html')
#app.route('/status', methods=['GET', 'POST'])
def status():
taskid_list = []
task_state_list = []
TaskName_list = []
allAsyncData = mongo.db.job_task_id.find()
for doc in allAsyncData:
try:
taskid_list.append(doc['taskid'])
except:
print('error with db conneciton in asyncJobStatus')
TaskName_list.append(doc['TaskName'])
# PASS TASK ID TO ASYNC RESULT TO GET TASK RESULT FOR THAT SPECIFIC TASK
for item in taskid_list:
try:
task_state_list.append(myTask1.AsyncResult(item).state)
except:
task_state_list.append('UNKNOWN')
return render_template('status.html', data_list=zip(task_state_list, TaskName_list))
Final Working Code
from flask import (Flask, render_template, flash, redirect,
url_for, session, logging, request, g, render_template_string, jsonify)
from flask_caching import Cache
from contextlib import contextmanager
from celery import Celery
from Flask_celery import make_celery
from celery.result import AsyncResult
from celery.utils.log import get_task_logger
from celery.five import monotonic
from flask_pymongo import PyMongo
from hashlib import md5
import pymongo
import time
import redis
from flask_redis import FlaskRedis
app = Flask(__name__)
# ADDING REDIS
redis_store = FlaskRedis(app)
# POINTING CACHE_TYPE TO REDIS
cache = Cache(app, config={'CACHE_TYPE': 'redis'})
app.config['SECRET_KEY']= 'super secret key for me123456789987654321'
######################
# MONGODB SETUP
#####################
app.config['MONGO_HOST'] = 'localhost'
app.config['MONGO_DBNAME'] = 'celery-test-db'
app.config["MONGO_URI"] = 'mongodb://localhost:27017/celery-test-db'
mongo = PyMongo(app)
##############################
# CELERY ARGUMENTS
##############################
# CELERY USING REDIS
app.config['CELERY_BROKER_URL'] = 'redis://localhost:6379/0'
app.config['CELERY_RESULT_BACKEND'] = 'mongodb://localhost:27017/celery-test-db'
app.config['CELERY_RESULT_BACKEND'] = 'mongodb'
app.config['CELERY_MONGODB_BACKEND_SETTINGS'] = {
"host": "localhost",
"port": 27017,
"database": "celery-test-db",
"taskmeta_collection": "celery_jobs",
}
app.config['CELERY_TASK_SERIALIZER'] = 'json'
celery = Celery('task',broker='mongodb://localhost:27017/jobs')
celery = make_celery(app)
LOCK_EXPIRE = 60 * 2 # Lock expires in 2 minutes
#contextmanager
def memcache_lock(lock_id, oid):
timeout_at = monotonic() + LOCK_EXPIRE - 3
print('in memcache_lock and timeout_at is {}'.format(timeout_at))
# cache.add fails if the key already exists
status = cache.add(lock_id, oid, LOCK_EXPIRE)
try:
yield status
print('memcache_lock and status is {}'.format(status))
finally:
# memcache delete is very slow, but we have to use it to take
# advantage of using add() for atomic locking
if monotonic() < timeout_at and status:
# don't release the lock if we exceeded the timeout
# to lessen the chance of releasing an expired lock
# owned by someone else
# also don't release the lock if we didn't acquire it
cache.delete(lock_id)
#celery.task(bind=True, name='app.myTask1')
def myTask1(self):
self.update_state(state='IN TASK')
print('dir is {} '.format(dir(self)))
lock_id = self.name
print('lock_id is {}'.format(lock_id))
with memcache_lock(lock_id, self.app.oid) as acquired:
print('in memcache_lock and lock_id is {} self.app.oid is {} and acquired is {}'.format(lock_id, self.app.oid, acquired))
if acquired:
# do work if we got the lock
print('acquired is {}'.format(acquired))
self.update_state(state='DOING WORK')
time.sleep(90)
return 'result'
# otherwise, the lock was already in use
raise self.retry(countdown=60) # redeliver message to the queue, so the work can be done later
#celery.task(bind=True, name='app.myTask2')
def myTask2(self):
print('you are in task2')
self.update_state(state='STARTING')
time.sleep(120)
print('task2 done')
#app.route('/', methods=['GET', 'POST'])
def index():
return render_template('index.html')
#app.route('/task1', methods=['GET', 'POST'])
def task1():
print('running task1')
result = myTask1.delay()
# get async task id
taskResult = AsyncResult(result.task_id)
# push async taskid into db collection job_task_id
mongo.db.job_task_id.insert({'taskid': str(taskResult), 'TaskName': 'myTask1'})
return render_template('task1.html')
#app.route('/task2', methods=['GET', 'POST'])
def task2():
print('running task2')
result = myTask2.delay()
# get async task id
taskResult = AsyncResult(result.task_id)
# push async taskid into db collection job_task_id
mongo.db.job_task_id.insert({'taskid': str(taskResult), 'TaskName': 'task2'})
return render_template('task2.html')
#app.route('/status', methods=['GET', 'POST'])
def status():
taskid_list = []
task_state_list = []
TaskName_list = []
allAsyncData = mongo.db.job_task_id.find()
for doc in allAsyncData:
try:
taskid_list.append(doc['taskid'])
except:
print('error with db conneciton in asyncJobStatus')
TaskName_list.append(doc['TaskName'])
# PASS TASK ID TO ASYNC RESULT TO GET TASK RESULT FOR THAT SPECIFIC TASK
for item in taskid_list:
try:
task_state_list.append(myTask1.AsyncResult(item).state)
except:
task_state_list.append('UNKNOWN')
return render_template('status.html', data_list=zip(task_state_list, TaskName_list))
if __name__ == '__main__':
app.secret_key = 'super secret key for me123456789987654321'
app.run(port=1234, host='localhost')
Here is also a screen shot you can see that I ran myTask1 two times and myTask2 a single time. Now I have the expected behavior for myTask1. Now myTask1 will be run by a single worker if another worker attempt to pick it up it will just keep retrying based on whatever i define.
In your question, you point out this warning from the Celery example you used:
In order for this to work correctly you need to be using a cache backend where the .add operation is atomic. memcached is known to work well for this purpose.
And you mention that you don't really understand what this means. Indeed, the code you show demonstrates that you've not heeded that warning, because your code uses an inappropriate backend.
Consider this code:
with memcache_lock(lock_id, self.app.oid) as acquired:
if acquired:
# do some work
What you want here is for acquired to be true only for one thread at a time. If two threads enter the with block at the same time, only one should "win" and have acquired be true. This thread that has acquired true can then proceed with its work, and the other thread has to skip doing the work and try again later to acquire the lock. In order to ensure that only one thread can have acquired true, .add must be atomic.
Here's some pseudo code of what .add(key, value) does:
1. if <key> is already in the cache:
2. return False
3. else:
4. set the cache so that <key> has the value <value>
5. return True
If the execution of .add is not atomic, this could happen if two threads A and B execute .add("foo", "bar"). Assume an empty cache at the start.
Thread A executes 1. if "foo" is already in the cache and finds that "foo" is not in the cache, and jumps to line 3 but the thread scheduler switches control to thread B.
Thread B also executes 1. if "foo" is already in the cache, and also finds that "foo" is not in the cache. So it jumps to line 3 and then executes line 4 and 5 which sets the key "foo" to the value "bar" and the call returns True.
Eventually, the scheduler gives control back to Thread A, which continues executing 3, 4, 5 and also sets the key "foo" to the value "bar" and also returns True.
What you have here is two .add calls that return True, if these .add calls are made within memcache_lock this entails that two threads can have acquired be true. So two threads could do work at the same time, and your memcache_lock is not doing what it should be doing, which is only allow one thread to work at a time.
You are not using a cache that ensures that .add is atomic. You initialize it like this:
cache = Cache(app, config={'CACHE_TYPE': 'simple'})
The simple backend is scoped to a single process, has no thread-safety, and has an .add operation which is not atomic. (This does not involve Mongo at all by the way. If you wanted your cache to be backed by Mongo, you'd have to specify a backed specifically made to send data to a Mongo database.)
So you have to switch to another backend, one that guarantees that .add is atomic. You could follow the lead of the Celery example and use the memcached backend, which does have an atomic .add operation. I don't use Flask, but I've does essentially what you are doing with Django and Celery, and used the Redis backend successfully to provide the kind of locking you're using here.
I also found this to be a surprisingly hard problem. Inspired mainly by Sebastian's work on implementing a distributed locking algorithm in redis I wrote up a decorator function.
A key point to bear in mind about this approach is that we lock tasks at the level of the task's argument space, e.g. we allow multiple game update/process order tasks to run concurrently, but only one per game. That's what argument_signature achieves in the code below. You can see documentation on how we use this in our stack at this gist:
import base64
from contextlib import contextmanager
import json
import pickle as pkl
import uuid
from backend.config import Config
from redis import StrictRedis
from redis_cache import RedisCache
from redlock import Redlock
rds = StrictRedis(Config.REDIS_HOST, decode_responses=True, charset="utf-8")
rds_cache = StrictRedis(Config.REDIS_HOST, decode_responses=False, charset="utf-8")
redis_cache = RedisCache(redis_client=rds_cache, prefix="rc", serializer=pkl.dumps, deserializer=pkl.loads)
dlm = Redlock([{"host": Config.REDIS_HOST}])
TASK_LOCK_MSG = "Task execution skipped -- another task already has the lock"
DEFAULT_ASSET_EXPIRATION = 8 * 24 * 60 * 60 # by default keep cached values around for 8 days
DEFAULT_CACHE_EXPIRATION = 1 * 24 * 60 * 60 # we can keep cached values around for a shorter period of time
REMOVE_ONLY_IF_OWNER_SCRIPT = """
if redis.call("get",KEYS[1]) == ARGV[1] then
return redis.call("del",KEYS[1])
else
return 0
end
"""
#contextmanager
def redis_lock(lock_name, expires=60):
# https://breadcrumbscollector.tech/what-is-celery-beat-and-how-to-use-it-part-2-patterns-and-caveats/
random_value = str(uuid.uuid4())
lock_acquired = bool(
rds.set(lock_name, random_value, ex=expires, nx=True)
)
yield lock_acquired
if lock_acquired:
rds.eval(REMOVE_ONLY_IF_OWNER_SCRIPT, 1, lock_name, random_value)
def argument_signature(*args, **kwargs):
arg_list = [str(x) for x in args]
kwarg_list = [f"{str(k)}:{str(v)}" for k, v in kwargs.items()]
return base64.b64encode(f"{'_'.join(arg_list)}-{'_'.join(kwarg_list)}".encode()).decode()
def task_lock(func=None, main_key="", timeout=None):
def _dec(run_func):
def _caller(*args, **kwargs):
with redis_lock(f"{main_key}_{argument_signature(*args, **kwargs)}", timeout) as acquired:
if not acquired:
return TASK_LOCK_MSG
return run_func(*args, **kwargs)
return _caller
return _dec(func) if func is not None else _dec
Implementation in our task definitions file:
#celery.task(name="async_test_task_lock")
#task_lock(main_key="async_test_task_lock", timeout=UPDATE_GAME_DATA_TIMEOUT)
def async_test_task_lock(game_id):
print(f"processing game_id {game_id}")
time.sleep(TASK_LOCK_TEST_SLEEP)
How we test against a local celery cluster:
from backend.tasks.definitions import async_test_task_lock, TASK_LOCK_TEST_SLEEP
from backend.tasks.redis_handlers import rds, TASK_LOCK_MSG
class TestTaskLocking(TestCase):
def test_task_locking(self):
rds.flushall()
res1 = async_test_task_lock.delay(3)
res2 = async_test_task_lock.delay(5)
self.assertFalse(res1.ready())
self.assertFalse(res2.ready())
res3 = async_test_task_lock.delay(5)
res4 = async_test_task_lock.delay(5)
self.assertEqual(res3.get(), TASK_LOCK_MSG)
self.assertEqual(res4.get(), TASK_LOCK_MSG)
time.sleep(TASK_LOCK_TEST_SLEEP)
res5 = async_test_task_lock.delay(3)
self.assertFalse(res5.ready())
(as a goodie there's also a quick example of how to setup a redis_cache)
With this setup, you should still expect to see workers receiving the task, since the lock is checked inside of the task itself. The only difference will be that the work won't be performed if the lock is acquired by another worker.
In the example given in the docs, this is the desired behavior; if a lock already exists, the task will simply do nothing and finish as successful. What you want is slightly different; you want the work to be queued up instead of ignored.
In order to get the desired effect, you would need to make sure that the task will be picked up by a worker and performed some time in the future. One way to accomplish this would be with retrying.
#task(bind=True, name='my-task')
def my_task(self):
lock_id = self.name
with memcache_lock(lock_id, self.app.oid) as acquired:
if acquired:
# do work if we got the lock
print('acquired is {}'.format(acquired))
return 'result'
# otherwise, the lock was already in use
raise self.retry(countdown=60) # redeliver message to the queue, so the work can be done later

Categories

Resources