Django and concurrent tasks in Celery

I have a complicated scenario I need to tackle.
I'm using Celery to run tasks in parallel. My tasks involve HTTP requests, so I'm planning to use Celery together with eventlet.
Let me explain my scenario:
I have two tasks that can run in parallel and a third task that needs to work on the output of those two. I'm therefore using a Celery group to run the two tasks and a Celery chain to pass their output to the third task once they finish.
Now it gets complicated: the third task needs to spawn multiple tasks that I would like to run in parallel, and I would like to collect all of their outputs and process them in another task.
So I created a group for the multiple tasks, together with a chain to process all the information.
I guess I'm missing some basic information about Celery's concurrency primitives. I had a single Celery task that worked well, but I needed to make it faster.
This is a simplified sample of the code:
@app.task
def task2():
    return "aaaa"

@app.task
def task3():
    return "bbbb"

@app.task
def task4():
    work = group(...) | task5.s(...)
    work()

@app.task
def task1():
    tasks = [task2.s(a, b), task3.s(c, d)]
    work = group(tasks) | task4.s()
    return work()
This is how I start this operation:
task = task1.apply_async(kwargs=kwargs, queue='queue1')
I save task.id and poll the server every 30 seconds to see whether results are available:
results = task1.AsyncResult(task_id)
if results.ready():
    res = results.get()

Related

How to configure celery for concurrent execution with multi-process?

I have a task that talks to an external API, the json response is quite large and I have to make this call multiple times followed by further python processing. To make this less time-consuming, I initially tried:
from multiprocessing import Process

def make_call(*args, **kwargs):
    pass

def make_another(*args, **kwargs):
    pass

def get_calls():
    return make_call, make_another

def task(*args, **kwargs):
    # One child process per callable returned by get_calls()
    procs = [Process(target=call, args=(), kwargs={}) for call in get_calls()]
    _start = [proc.start() for proc in procs]
    _join = [proc.join() for proc in procs]

#
transaction.on_commit(lambda: task.delay())
However, I ran into an AssertionError: daemonic processes are not allowed to have children. What would be my best approach to speed up a celery task with additional processes?

A Celery worker already creates many processes. Take advantage of the many worker processes instead of creating child processes. You can delegate work amongst the celery workers instead. This will result in a more stable/reliable execution.
You could either just create many tasks from your client code or make use of celery's primitives like chains or chords to parallelize the work. These can also be composed with other primitives like groups, etc.
For example, in your scenario, you may have two tasks: one to make the API call(s) (make_api_call) and another to parse the response (parse_response). You can chain these together.
# chain another task when a task completes successfully
res = make_api_call.apply_async((0,), link=parse_response.s())
# chain syntax 1
result_1 = chain(make_api_call.s(1), parse_response.s())
# syntax 2 with | operator
result_b = make_api_call.s(2) | parse_response.s()
# can group chains
job = group([
    chain(make_api_call.s(i), parse_response.s())
    for i in range(3)
])
result = job.apply_async()
This is just a generic example. You can create tasks and compose them however your workflow needs. See: Canvas: Designing Work-flows for more information.

Run a same celery task in loop

How do I run this kind of Celery task properly?
@app.task
def add(x):
    return x + 1

def some_func():
    result = 'result'
    for i in range(10):
        task_id = uuid()
        add.apply_async((i,), task_id=task_id)
    return result
I need each task to run only after the previous one has completed.
I tried using time.sleep(), but in that case returning result waits until all tasks are completed. I need result to be returned immediately while the 10 tasks run sequentially in the background.
There is group() in Celery, but it runs tasks in parallel.

Finally, I solved it by using immutable signatures and a chain:
tasks = [
    add.si(x).set(task_id=uuid())
    for x in range(10)
]
chain(*tasks).apply_async()

If some_func() is executed outside Celery (say a script is used as "producer" to just send those tasks to be executed), then nothing stops you from calling .get() on AsyncResult to wait for task to finish, and loop that as much as you like.
If, however, you want to execute that loop as some sort of Celery workflow, then you have to build a Chain and use it.

Can Celery pass a Status Update to a non-Blocking Caller?

I am using Celery to asynchronously perform a group of operations. There are a lot of these operations and each may take a long time, so rather than send the results back in the return value of the Celery worker function, I'd like to send them back one at a time as custom state updates. That way the caller can implement a progress bar with a change state callback, and the return value of the worker function can be of constant size rather than linear in the number of operations.
Here is a simple example in which I use the Celery worker function add_pairs_of_numbers to add a list of pairs of numbers, sending back a custom status update for every added pair.
#!/usr/bin/env python
"""
Run worker with:
celery -A tasks worker --loglevel=info
"""
from celery import Celery

app = Celery("tasks", broker="pyamqp://guest@localhost//", backend="rpc://")

@app.task(bind=True)
def add_pairs_of_numbers(self, pairs):
    for x, y in pairs:
        self.update_state(state="SUM", meta={"x":x, "y":y, "x+y":x+y})
    return len(pairs)

def handle_message(message):
    if message["status"] == "SUM":
        x = message["result"]["x"]
        y = message["result"]["y"]
        print(f"Message: {x} + {y} = {x+y}")

def non_looping(*pairs):
    task = add_pairs_of_numbers.delay(pairs)
    result = task.get(on_message=handle_message)
    print(result)

def looping(*pairs):
    task = add_pairs_of_numbers.delay(pairs)
    print(task)
    while True:
        pass

if __name__ == "__main__":
    import sys
    if sys.argv[1:] and sys.argv[1] == "looping":
        looping((3,4), (2,7), (5,5))
    else:
        non_looping((3,4), (2,7), (5,5))
If you run just ./tasks.py it executes the non_looping function. This does the standard Celery thing: makes a delayed call to the worker function and then uses get to wait for the result. A handle_message callback function prints each message, and the number of pairs added is returned as the result. This is what I want.
$ ./tasks.py
Message: 3 + 4 = 7
Message: 2 + 7 = 9
Message: 5 + 5 = 10
3
Though the non-looping scenario is sufficient for this simple example, the real world task I'm trying to accomplish is processing a batch of files instead of adding pairs of numbers. Furthermore the client is a Flask REST API and therefore cannot contain any blocking get calls. In the script above I simulate this constraint with the looping function. This function starts the asynchronous Celery task, but does not wait for a response. (The infinite while loop that follows simulates the web server continuing to run and handle other requests.)
If you run the script with the argument "looping" it runs this code path. Here it immediately prints the Celery task ID then drops into the infinite loop.
$ ./tasks.py looping
a39c54d3-2946-4f4e-a465-4cc3adc6cbe5
The Celery worker logs show that the add operations are performed, but the caller doesn't define a callback function, so it never gets the results.
(I realize that this particular example is embarrassingly parallel, so I could use chunks to divide this up into multiple tasks. However, in my non-simplified real-world case I have tasks that cannot be parallelized.)
What I want is to be able to specify a callback in the looping scenario. Something like this.
def looping(*pairs):
    task = add_pairs_of_numbers.delay(pairs, callback=handle_message) # There is no such callback.
    print(task)
    while True:
        pass
In the Celery documentation and all the examples I can find online (for example this), there is no way to define a callback function as part of the delay call or its apply_async equivalent. You can only specify one as part of a get callback. That's making me think this is an intentional design decision.
In my REST API scenario I can work around this by having the Celery worker process send a "status update" back to the Flask server in the form of an HTTP post, but this seems weird because I'm starting to replicate messaging logic in HTTP that already exists in Celery.
Is there any way to write my looping scenario so that the caller receives callbacks without making a blocking call, or is that explicitly forbidden in Celery?

It's a pattern that Celery does not support, although you can (somewhat) work around it by posting custom state updates from your task, as described here.
Use update_state() to update a task’s state:
@app.task(bind=True)
def upload_files(self, filenames):
    for i, file in enumerate(filenames):
        if not self.request.called_directly:
            self.update_state(state='PROGRESS',
                meta={'current': i, 'total': len(filenames)})
The reason that Celery does not support such a pattern is that task producers (callers) are strongly decoupled from task consumers (workers). The only communication channels between the two are the broker, which carries messages from producers to consumers, and the result backend, which carries results from consumers to producers. The closest you can currently get is polling the task state, or writing a custom result backend that lets you post events via AMQP RPC or Redis subscriptions.
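Polling the task state, the closest option named above, can be reduced to a single non-blocking lookup per client request. The sketch below is a minimal illustration, not part of Celery's API: describe_task is a hypothetical helper, and its argument is only assumed to behave like celery.result.AsyncResult (attributes id, state, info and the method ready()). FakeResult is a made-up stand-in so the sketch runs without a broker.

```python
from typing import Any, Dict


def describe_task(result: Any) -> Dict[str, Any]:
    """Build a JSON-friendly status payload without blocking.

    Assumption: `result` behaves like celery.result.AsyncResult, i.e. it
    exposes `.id`, `.state`, `.info` and `.ready()`.
    """
    payload = {"id": result.id, "state": result.state, "done": result.ready()}
    if result.state == "PROGRESS":
        # For a custom state, `.info` is the meta dict given to update_state().
        payload["progress"] = result.info
    elif result.state == "SUCCESS":
        # For a successful task, `.info` is the task's return value.
        payload["result"] = result.info
    return payload


class FakeResult:
    """Hypothetical stand-in for AsyncResult, used only to run the sketch."""

    def __init__(self, task_id: str, state: str, info: Any) -> None:
        self.id, self.state, self.info = task_id, state, info

    def ready(self) -> bool:
        # Mirrors the states Celery treats as finished.
        return self.state in {"SUCCESS", "FAILURE", "REVOKED"}


in_progress = describe_task(FakeResult("abc", "PROGRESS", {"current": 1, "total": 3}))
finished = describe_task(FakeResult("abc", "SUCCESS", 3))
```

In a web handler, the same helper would presumably be called with the real object, for example describe_task(upload_files.AsyncResult(task_id)), each time the client asks for progress. The caller never blocks; it only sees the most recent state stored in the result backend.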

What is the cleanest way to write and run a DAG of tasks?

I want to write and run a directed acyclic graph (DAG) with several tasks running in serial or parallel. Ideally it would look like:
def task1():
    # ...
def task2():
    # ...
graph = Sequence([
    task1,
    task2,
    Parallel([
        task3,
        task4
    ]),
    task5
])
graph.run()
It would run 1 -> 2 -> (3 and 4 concurrently) -> 5. The tasks need to access the global scope to store results, write logs and access command line parameters.
My use case is writing a deployment script. Parallel tasks are IO-bound: typically waiting on a remote server to complete a step.
I looked into threading, asyncio, Airflow, but did not find any simple library that would allow this without some boilerplate code to traverse and control the graph's execution. Does anything like that exist?

Here's a quick proof-of-concept implementation. It can be used like:
graph = sequence(
    lambda: print(1),
    lambda: print(2),
    parallel(
        lambda: print(3),
        lambda: print(4),
        sequence(
            lambda: print(5),
            lambda: print(6))),
    lambda: print(7))
graph()
1
2
3
5
6
4
7
sequence produces a function that wraps a for loop, and parallel produces a function that wraps use of a thread pool:
from typing import Callable
from multiprocessing.pool import ThreadPool

Task = Callable[[], None]
_pool: ThreadPool = ThreadPool()

def sequence(*tasks: Task) -> Task:
    def run():
        for task in tasks:
            task()
    return run # Returning "run" to be used as a task by other "sequence" and "parallel" calls

def parallel(*tasks: Task) -> Task:
    def run():
        _pool.map(lambda f: f(), tasks) # Delegate to a pool used for IO tasks
    return run
Each call to sequence and parallel returns a new "Task" (a function taking no arguments and returning nothing). That task can then be called by other, outer calls to sequence and parallel.
Things to note about the ThreadPool:
While this does use a thread pool for parallel, the GIL means that only one thread executes Python bytecode at a time. parallel is therefore essentially useless for CPU-bound tasks; it helps only when tasks spend their time waiting on IO, during which the GIL is released.
I haven't specified how many threads the pool should begin with. It defaults to the number of CPUs available to you. You can pass a different number as the first parameter to ThreadPool if you want more.
For brevity, I'm not cleaning up the ThreadPool. You should definitely do that though if you use this.
Even though ThreadPool is a part of multiprocessing, confusingly it uses threads not processes.
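The cleanup point can be handled by giving each parallel call its own executor, so that its threads are joined when the call returns. This is a variation on the proof of concept above, not the answer's code: it swaps multiprocessing.pool.ThreadPool for concurrent.futures.ThreadPoolExecutor, and the log list exists only to make the ordering observable.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

Task = Callable[[], None]


def sequence(*tasks: Task) -> Task:
    def run() -> None:
        for task in tasks:
            task()
    return run


def parallel(*tasks: Task) -> Task:
    def run() -> None:
        # A fresh executor per call: leaving the `with` block waits for the
        # worker threads and releases them, so nothing is left to clean up.
        with ThreadPoolExecutor(max_workers=max(1, len(tasks))) as pool:
            futures = [pool.submit(task) for task in tasks]
            for future in futures:
                future.result()  # re-raises any exception from a worker
    return run


log: List[int] = []
graph = sequence(
    lambda: log.append(1),
    parallel(lambda: log.append(2), lambda: log.append(3)),
    lambda: log.append(4),
)
graph()
```

Because every parallel call sizes its own executor to the number of tasks it was given, a nested parallel cannot starve waiting for a thread from a shared, already-busy pool. The trade-off is that threads are created and destroyed on each call.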

You mentioned that your tasks are IO-bound, which means that asyncio would be a good candidate for this. You can try the aiodag library, an extremely light interface on top of asyncio that lets you easily define asynchronous DAGs:
import asyncio
from aiodag import task

@task
async def task1():
    ...

@task
async def task2(x):
    ...

@task
async def task3(x):
    ...

@task
async def task4(x):
    ...

@task
async def task5(x, y):
    ...

# rest of task funcs
async def main():
    t1 = task1()
    t2 = task2(t1)
    t3 = task3(t2) # t3/t4 take t2, when t2 finishes, will run concurrently
    t4 = task4(t2)
    t5 = task5(t3, t4) # will wait until t3/t4 finish to execute
    await t5

loop = asyncio.new_event_loop()
loop.run_until_complete(main())
Check out the readme on the github page for aiodag for a bit of detail on how the dag is constructed/optimally executed.
https://github.com/aa1371/aiodag
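For comparison, the same 1 -> 2 -> (3 and 4 concurrently) -> 5 shape can be written with the standard library alone, using asyncio.gather for the parallel step. This sketch does not use aiodag; the step coroutine, its names and its delays are made up for illustration.

```python
import asyncio
from typing import List

order: List[str] = []


async def step(name: str, delay: float = 0.0) -> str:
    """Stand-in for an IO-bound task: wait, then record that it finished."""
    await asyncio.sleep(delay)
    order.append(name)
    return name


async def main() -> List[str]:
    await step("task1")
    await step("task2")
    # task3 and task4 are started together and awaited as a unit.
    await asyncio.gather(step("task3", 0.02), step("task4", 0.01))
    await step("task5")
    return order


result = asyncio.run(main())
```

Here the graph is encoded directly in the control flow of main, which is simple for a fixed shape like this one but means the dependencies are not available as data the way they are with a DAG library.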
If you don't want to be tied to async functions, then check out dask's delayed interface. The definition of the dag works the same way as aiodag's, where the dag is constructed by function invocations. Dask will seamlessly handle executing your dag in the optimal parallel scheme, and can distribute over an arbitrarily large cluster to perform the parallel executions as well.
https://docs.dask.org/en/latest/delayed.html

Running a task after all tasks have been completed

I'm writing an application which needs to run a series of tasks in parallel and then a single task with the results of all the tasks run:
@celery.task
def power(value, expo):
    return value ** expo

@celery.task
def amass(values):
    print(str(values))
It's a very contrived and oversimplified example, but hopefully the point comes across well. Basically, I have many items which need to run through power, but I only want to run amass on the results from all of the tasks. All of this should happen asynchronously, and I don't need anything back from the amass method.
Does anyone know how to set this up in celery so that everything is executed asynchronously and a single callback with a list of the results is called after all is said and done?
I've set up this example to run with a chord as Alexander Afanasiev recommended:
from time import sleep
import random

tasks = []
for i in range(10):
    tasks.append(power.s(i, 2))
    sleep(random.randint(10, 1000) / 1000.0) # sleep for 10-1000ms
callback = amass.s()
r = chord(tasks)(callback)
Unfortunately, in the above example, all tasks in tasks are started only when the chord method is called. Is there a way that each task can start separately and then I could add a callback to the group to run when everything has finished?

Here's a solution which worked for my purposes:
tasks.py:
from time import sleep
import random

@celery.task
def power(value, expo):
    sleep(random.randint(10, 1000) / 1000.0) # sleep for 10-1000ms
    return value ** expo

@celery.task
def amass(results, tasks):
    completed_tasks = []
    for task in tasks:
        if task.ready():
            completed_tasks.append(task)
            results.append(task.get())
    # remove completed tasks
    tasks = list(set(tasks) - set(completed_tasks))
    if len(tasks) > 0:
        # resend the task to execute at least 1 second from now
        # (countdown is an apply_async option; delay() would pass it to the task itself)
        amass.apply_async((results, tasks), countdown=1)
    else:
        # we're done
        print(results)
Use case:
tasks = []
for i in range(10):
    tasks.append(power.delay(i, 2))
amass.delay([], tasks)
What this should do is start all of the tasks as soon as possible asynchronously. Once they've all been posted to the queue, the amass task will also be posted to the queue. The amass task will keep reposting itself until all of the other tasks have been completed.

Celery has plenty of tools for most workflows you can imagine.
It seems you need to make use of a chord. Here's a quote from the docs:
A chord is just like a group but with a callback. A chord consists of
a header group and a body, where the body is a task that should
execute after all of the tasks in the header are complete.

Taking a look at this snippet from your question, it looks like you are passing a list as the chord header, rather than a group:
from time import sleep
import random

tasks = []
for i in range(10):
    tasks.append(power.s(i, 2))
    sleep(random.randint(10, 1000) / 1000.0) # sleep for 10-1000ms
callback = amass.s()
r = chord(tasks)(callback)
Converting the list to a group should result in the behaviour you're expecting:
...
callback = amass.s()
tasks = group(tasks)
r = chord(tasks)(callback)

The answer that @alexander-afanasiev gave you is essentially right: use a chord.
Your code is OK, but tasks.append(power.s(...)) does not actually execute the subtask; it only adds a signature to a list. It is the chord(...)(...) call that sends the messages to the broker: one per subtask in the tasks list, plus one more for the callback subtask. The call to chord returns as soon as it can.
If you want to know when the chord has finished, you can poll for completion as you would with a single task, using r.ready() in your sample.
