Is there a way to track the progress of a chord, preferably in a tqdm bar?
For example, if we take the example from the documentation, we would create this file:
# proj/tasks.py
@app.task
def add(x, y):
    return x + y

@app.task
def tsum(numbers):
    return sum(numbers)
and then run this script:
from celery import chord
from proj.tasks import add, tsum
chord(add.s(i, i) for i in range(100))(tsum.s()).get()
How could we track the progression on the chord?
We cannot use update_state since the chord() object is not a function.
We cannot use collect() since chord()(callback) blocks the script until the results are ready.
Ideally I would envision something like this custom tqdm subclass for Dask, however I've been unable to find a similar solution.
Any help or hint much appreciated!
So I found a way around it.
First, chord()(callback) doesn't actually block the script, only the .get() part does. It just might take a long time to publish all tasks to the broker. Luckily, there's a simple way to track this publishing process through signals. We can create a progress bar before the publishing begins and modify the example handler from the documentation to update it:
from tqdm import tqdm
from celery.signals import after_task_publish

publish_pbar = tqdm(total=100, desc="Publishing tasks")

@after_task_publish.connect(sender='tasks.add')
def task_sent_handler(sender=None, headers=None, body=None, **kwargs):
    publish_pbar.update(1)

c = chord(add.s(i, i) for i in range(100))(tsum.s())

# The script will resume once all tasks are published so close the pbar
publish_pbar.close()
However, this only works for publishing tasks, since this signal is executed in the process that sent the task. The task_success signal is executed in the worker process, so the same trick would only update a bar in the worker log (to the best of my understanding).
So to track progress once all tasks have been published and the script resumes, I turned to worker stats from app.control.inspect().stats(). This returns a dict with various stats, among which are the completed tasks. Here's my implementation:
import time

tasks_pbar = tqdm(total=100, desc="Executing tasks")
previous_total = 0
current_total = 0
while current_total < 100:
    # Query the workers once per iteration; stats() may return None if no worker replies
    stats = app.control.inspect().stats() or {}
    current_total = sum(worker['total'].get('tasks.add', 0) for worker in stats.values())
    if current_total > previous_total:
        tasks_pbar.update(current_total - previous_total)
        previous_total = current_total
    time.sleep(0.5)  # avoid flooding the workers with inspect requests
results = c.get()
tasks_pbar.close()
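The summing step in the loop above can be pulled into a small helper that is easy to check without a broker. This is only a sketch: `task_total` is a hypothetical name, and the sample dict merely mimics the shape described above (one entry per worker, each with a 'total' mapping of task name to count). It treats a `None` reply as "no workers answered".

```python
def task_total(stats, task_name):
    """Sum the per-worker counter for task_name.

    stats is expected to have the shape of app.control.inspect().stats();
    None (no worker replied) and missing keys count as zero.
    """
    if not stats:
        return 0
    return sum(worker.get("total", {}).get(task_name, 0)
               for worker in stats.values())


# Hypothetical sample data with the same shape, for illustration only.
sample = {
    "celery@worker1": {"total": {"tasks.add": 40, "tasks.tsum": 0}},
    "celery@worker2": {"total": {"tasks.add": 25}},
    "celery@worker3": {"total": {}},
}
print(task_total(sample, "tasks.add"))  # 65
print(task_total(None, "tasks.add"))    # 0
```

As far as I can tell from the Celery documentation, `total` counts tasks the worker has accepted since start-up, so it may be worth recording a baseline before sending the chord and subtracting it, and the bar may run slightly ahead of actual completion.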
Finally, I think it might be necessary to give names to the tasks, both for filtering by the signal handler and for the stats() dict, so do not forget to add this to your tasks:
# proj/tasks.py
@app.task(name='tasks.add')
def add(x, y):
    return x + y
If someone can find a better solution, please do share!
I am trying to understand how to collect data for computing resource use, for example, the average number of customers waiting in line. I looked at the documentation at the following link, but it is just too much for me. I am looking for an example of how it is used and how to compute time-based average line length. I appreciate any guidance.
https://simpy.readthedocs.io/en/latest/topical_guides/monitoring.html#monitoring-your-processes
Resources do not have utilization logging. You will need to collect that yourself.
Monkey patching is a way to wrap resource requests with code that collects the stats without changing how resource requests are called. A simpler way is to make a logger and add a log call where you need it. That is how I did it in my example. The downside is that you have to remember to add the logging code wherever you need it.
Simple resources have the following properties for collecting stats: capacity, count (number of users with a resource), users (list of users with a resource), queue (list of pending resource requests).
"""
A quick example on how to get average line length
using a custom logger class
programmer: Michael R. Gibbs
"""
import simpy
import numpy as np
import pandas as pd

class LineLogger():
    """
    logs the size of a resource line
    """

    def __init__(self, env):
        self.env = env
        # the raw samples and the log built from them
        self.samples = []
        self.samples_df = pd.DataFrame(columns=['time','len'])

    def log(self, time, length):
        """
        log a time and length of the resource request queue

        time: time the measure is taken
        length: length of the queue
        """
        # DataFrame.append was removed in pandas 2.0, so rebuild the log from a list
        self.samples.append({'time': time, 'len': length})
        self.samples_df = pd.DataFrame(self.samples, columns=['time','len'])

    def get_ave_line(self):
        """
        finds the time weighted average of the queue length
        """
        # use the next row to figure out how long the queue was at that length
        self.samples_df['time_span'] = self.samples_df['time'].shift(-1) - self.samples_df['time']

        # drop the last row because it would have an infinite time span
        trimmed_samples_df = self.samples_df[0:-1]

        ave = np.average(trimmed_samples_df['len'], weights=trimmed_samples_df['time_span'])
        return ave

def task(env, res, line_logger):
    """
    A simple task that grabs a resource for a bit of time
    """
    with res.request() as req:  # Generate a request event
        # requester enters queue for resource
        line_logger.log(env.now, len(res.queue))
        yield req
        # requester got a resource and leaves request queue
        # if the resource was available when the request was made, then time in queue will be 0
        line_logger.log(env.now, len(res.queue))
        # keep resource to build a queue
        yield env.timeout(3.5)

def gen_tasks(env, res, line_logger):
    """
    generates 5 tasks to seize a resource
    building a queue over time
    """
    for i in range(5):
        env.process(task(env, res, line_logger))
        # put some time between requests
        yield env.timeout(1)

if __name__ == '__main__':
    env = simpy.Environment()
    res = simpy.Resource(env, capacity=1)
    line_logger = LineLogger(env)
    env.process(gen_tasks(env, res, line_logger))
    env.run(100)
    print("finish sim")
    print("average queue length is: ", line_logger.get_ave_line())
    print()
    print("log data")
    print(line_logger.samples_df)
    print()
    print("done")
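The time-weighted average itself does not depend on pandas. As a minimal sketch of the same calculation in plain Python (the function name `time_weighted_average` and the sample log are made up for illustration), each logged value is weighted by how long it stayed in effect, and the last sample only marks the end of the observation window:

```python
def time_weighted_average(samples):
    """Return the time-weighted mean of a step function.

    samples is a list of (time, value) pairs sorted by time; each value is
    assumed to hold until the next sample. The final sample closes the
    window, so its value gets no weight.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    total_time = samples[-1][0] - samples[0][0]
    if total_time <= 0:
        raise ValueError("samples must span a positive amount of time")
    area = 0.0
    for (t0, value), (t1, _) in zip(samples, samples[1:]):
        area += value * (t1 - t0)
    return area / total_time


# Hypothetical log: queue length 0 for 2 time units, 1 for 1 unit, 2 for 1 unit.
log = [(0, 0), (2, 1), (3, 2), (4, 0)]
print(time_weighted_average(log))  # 0.75
```

A plain mean of the logged lengths would give a different number, because it ignores how long each length lasted.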
Use a process running in parallel with your main process to monitor utilization. Here is boilerplate code for a generator function you can use in the monitoring process.
data = []

def monitor_process(env, resource):
    """
    Generator for monitoring process
    that shares the environment with the main process
    and collects information.
    """
    while True:
        item = (env.now,
                resource.count,
                len(resource.queue))
        data.append(item)
        yield env.timeout(0.25)
This generator function polls the resource object every 0.25 time units (four times per unit of simulation time) and appends the result to a list. You can change the polling frequency. Call this generator like so:
env.process(monitor_process(env, target_resource))
When you call env.run(until=120) (for example) to run your main process, this process will run in parallel and log resource statistics.
I have implemented monkey-patching for comparison to this approach. Monkey-patching decorates some of a resource's methods with logging features. The code is more elegant but also more complex. Moreover, with monkey-patching, the resource stats will be logged each time an event occurs, i.e. any of the target resource's get, put, request or release methods is called. The approach I have shown here will log resource stats at regular time intervals and the code is relatively simpler.
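Once the simulation has run, the polled tuples still have to be reduced to statistics. The following is a rough sketch of one way to do that; `summarize` and the sample tuples are hypothetical, and it relies on the samples being evenly spaced so that a plain mean approximates the time-weighted average:

```python
def summarize(data, capacity):
    """Average the polled (time, count, queue_len) tuples.

    With a fixed polling interval every sample covers the same amount of
    simulated time, so an unweighted mean is a reasonable approximation.
    """
    if not data:
        raise ValueError("no samples collected")
    n = len(data)
    mean_in_use = sum(count for _, count, _ in data) / n
    mean_queue = sum(queue_len for _, _, queue_len in data) / n
    return {"utilization": mean_in_use / capacity, "mean_queue": mean_queue}


# Hypothetical samples for a resource with capacity 2.
sample_data = [(0.0, 0, 0), (0.25, 1, 0), (0.5, 2, 1), (0.75, 2, 3)]
print(summarize(sample_data, capacity=2))
# {'utilization': 0.625, 'mean_queue': 1.0}
```

A shorter polling interval gives a closer approximation at the cost of more samples.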
Hope this helps.
Cheers!
I am using Celery to asynchronously perform a group of operations. There are a lot of these operations and each may take a long time, so rather than send the results back in the return value of the Celery worker function, I'd like to send them back one at a time as custom state updates. That way the caller can implement a progress bar with a change state callback, and the return value of the worker function can be of constant size rather than linear in the number of operations.
Here is a simple example in which I use the Celery worker function add_pairs_of_numbers to add a list of pairs of numbers, sending back a custom status update for every added pair.
#!/usr/bin/env python
"""
Run worker with:
celery -A tasks worker --loglevel=info
"""
from celery import Celery

app = Celery("tasks", broker="pyamqp://guest@localhost//", backend="rpc://")

@app.task(bind=True)
def add_pairs_of_numbers(self, pairs):
    for x, y in pairs:
        self.update_state(state="SUM", meta={"x":x, "y":y, "x+y":x+y})
    return len(pairs)

def handle_message(message):
    if message["status"] == "SUM":
        x = message["result"]["x"]
        y = message["result"]["y"]
        print(f"Message: {x} + {y} = {x+y}")

def non_looping(*pairs):
    task = add_pairs_of_numbers.delay(pairs)
    result = task.get(on_message=handle_message)
    print(result)

def looping(*pairs):
    task = add_pairs_of_numbers.delay(pairs)
    print(task)
    while True:
        pass

if __name__ == "__main__":
    import sys
    if sys.argv[1:] and sys.argv[1] == "looping":
        looping((3,4), (2,7), (5,5))
    else:
        non_looping((3,4), (2,7), (5,5))
If you run just ./tasks.py it executes the non_looping function. This does the standard Celery thing: makes a delayed call to the worker function and then uses get to wait for the result. A handle_message callback function prints each message, and the number of pairs added is returned as the result. This is what I want.
$ ./tasks.py
Message: 3 + 4 = 7
Message: 2 + 7 = 9
Message: 5 + 5 = 10
3
Though the non-looping scenario is sufficient for this simple example, the real world task I'm trying to accomplish is processing a batch of files instead of adding pairs of numbers. Furthermore the client is a Flask REST API and therefore cannot contain any blocking get calls. In the script above I simulate this constraint with the looping function. This function starts the asynchronous Celery task, but does not wait for a response. (The infinite while loop that follows simulates the web server continuing to run and handle other requests.)
If you run the script with the argument "looping" it runs this code path. Here it immediately prints the Celery task ID then drops into the infinite loop.
$ ./tasks.py looping
a39c54d3-2946-4f4e-a465-4cc3adc6cbe5
The Celery worker logs show that the add operations are performed, but the caller doesn't define a callback function, so it never gets the results.
(I realize that this particular example is embarrassingly parallel, so I could use chunks to divide this up into multiple tasks. However, in my non-simplified real-world case I have tasks that cannot be parallelized.)
What I want is to be able to specify a callback in the looping scenario. Something like this.
def looping(*pairs):
    task = add_pairs_of_numbers.delay(pairs, callback=handle_message) # There is no such callback.
    print(task)
    while True:
        pass
In the Celery documentation and all the examples I can find online (for example this), there is no way to define a callback function as part of the delay call or its apply_async equivalent. You can only specify one as part of a get callback. That's making me think this is an intentional design decision.
In my REST API scenario I can work around this by having the Celery worker process send a "status update" back to the Flask server in the form of an HTTP post, but this seems weird because I'm starting to replicate messaging logic in HTTP that already exists in Celery.
Is there any way to write my looping scenario so that the caller receives callbacks without making a blocking call, or is that explicitly forbidden in Celery?
This pattern is not supported by Celery, although you can (somewhat) work around it by posting custom state updates from your task, as described here.
Use update_state() to update a task's state:
@app.task(bind=True)
def upload_files(self, filenames):
    for i, file in enumerate(filenames):
        if not self.request.called_directly:
            self.update_state(state='PROGRESS',
                meta={'current': i, 'total': len(filenames)})
Celery does not support such a pattern because task producers (callers) are strongly decoupled from task consumers (workers). The only channels between the two are the broker, which carries messages from producers to consumers, and the result backend, which carries results from consumers back to producers. The closest you can currently get is polling the task state, or writing a custom result backend that lets you post events via AMQP RPC or Redis subscriptions.
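To make the polling option concrete, here is a minimal sketch of the part that does not need a broker: turning a task's state and metadata into something a REST endpoint could return. The helper name `progress_payload` is hypothetical. In a Flask view one would presumably look the task up with `AsyncResult(task_id)` and pass in its `state` and `info` attributes, which hold the state name and the `meta` dict given to `update_state()`.

```python
def progress_payload(state, info):
    """Turn a task state and its metadata into a JSON-friendly dict."""
    if state == "PROGRESS" and isinstance(info, dict):
        total = info.get("total") or 0
        current = info.get("current", 0)
        percent = round(100 * current / total, 1) if total else 0.0
        return {"state": state, "current": current,
                "total": total, "percent": percent}
    if state == "SUCCESS":
        return {"state": state, "percent": 100.0}
    # PENDING, STARTED, FAILURE, ...: report the state without progress numbers.
    return {"state": state, "percent": None}


print(progress_payload("PROGRESS", {"current": 3, "total": 12}))
# {'state': 'PROGRESS', 'current': 3, 'total': 12, 'percent': 25.0}
```

The client then calls the endpoint on a timer, so no request handler ever blocks on get().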
I have a python app where user can initiate a certain task.
The whole purpose of a task is to execute a given number of POST/GET requests at a particular interval to a given URL.
So the user gives N, the number of requests, and V, the number of requests per second.
What is the best way to design such a task, taking into account that due to I/O latency the actual requests-per-second rate could be higher or lower?
First of all, I decided to use Celery with Eventlet, because otherwise I would need dozens of workers, which is not acceptable.
My naive approach:
Client starts a task using task.delay()
Inside task I do something like this:
@task
def task(number_of_requests, time_period):
    for _ in range(number_of_requests):
        start = time.time()
        params_for_concrete_subtask = ...
        # .... do some IO with monkey_patched eventlet requests library
        elapsed = (time.time() - start)
        # If we completed this subtask too fast
        if elapsed < time_period / number_of_requests:
            eventlet.sleep(time_period / number_of_requests)
A working example is here.
If we are too fast, we wait in order to keep the desired speed. If we are too slow, that is OK from the client's perspective, since we do not violate the requests/second requirement. But will this resume correctly if I restart Celery?
I think this should work but I thought there is a better way.
In Celery I can define a task with a particular rate limit, which almost matches the guarantee I need. So I could use Celery's group feature and write:
@task(rate_limit=...)
def task(...):
    #

task_executor = task.s(number_of_requests, time_period)
group(task_executor(params_for_concrete_task) for params_for_concrete_task in ...).delay()
But here I hardcode the rate_limit, which is dynamic, and I do not see a way of changing it. I saw an example:
task.s(....).set(... params ...)
But when I tried to pass rate_limit to the set method, it did not work.
Another, maybe better, idea was to use Celery's periodic task scheduler. With the default implementation, the periods and the tasks to be executed periodically are fixed.
I need to be able to dynamically create tasks that run periodically a given number of times with a specific rate limit. Maybe I need to run my own scheduler that takes tasks from a DB? But I do not see any documentation on this.
Another approach was to try the chain function, but I could not figure out whether it has a delay-between-tasks parameter.
If you want to adjust the rate_limit dynamically you can do it using the following code. It is also creating the chain() at runtime.
Run this and you will see that we successfully override the rate_limit of 5/sec to 0.5/sec.
test_tasks.py
from celery import Celery, signature, chain
import datetime as dt

app = Celery('test_tasks')
app.config_from_object('celery_config')

@app.task(bind=True, rate_limit=5)
def test_1(self):
    print(dt.datetime.now())

app.control.broadcast('rate_limit',
                      arguments={'task_name': 'test_tasks.test_1',
                                 'rate_limit': 0.5})

test_task = signature('test_tasks.test_1').set(immutable=True)
l = [test_task] * 100
workflow = chain(*l)  # renamed so the imported chain() is not shadowed
res = workflow()
I also tried to override the attribute from within the class, but in my opinion the rate_limit is set when the task is registered by the worker, which is why .set() has no effect. I'm speculating here; one would have to check the source code.
Solution 2
Implement your own waiting mechanism using the end time of the previous call. In a chain, the return value of each function is passed to the next one.
So it would look like this:
from celery import Celery, signature, chain
import datetime as dt
import time

app = Celery('test_tasks')
app.config_from_object('celery_config')

@app.task(bind=True)
def test_1(self, prev_endtime=None, wait_seconds=5):
    # A default of dt.datetime.now() would be evaluated only once, at import time
    if prev_endtime is None:
        prev_endtime = dt.datetime.now()
    wait = dt.timedelta(seconds=wait_seconds)
    print(dt.datetime.now() - prev_endtime)
    wait = wait - (dt.datetime.now() - prev_endtime)
    # total_seconds() keeps the sign; .seconds is never negative
    wait = wait.total_seconds()
    print(wait)
    time.sleep(max(0, wait))
    now = dt.datetime.now()
    print(now)
    return now

#app.control.rate_limit('test_tasks.test_1', '0.5')

test_task = signature('test_tasks.test_1')
l = [test_task] * 100
workflow = chain(*l)  # renamed so the imported chain() is not shadowed
res = workflow()
I think this is actually more reliable than the broadcast.
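The wait computation in Solution 2 can be isolated and checked on its own. This is a small sketch with a hypothetical helper name, `seconds_to_wait`. One detail worth knowing is that `timedelta.seconds` is never negative, so a task that is already late needs `total_seconds()` to end up with a zero wait rather than a very long one.

```python
import datetime as dt


def seconds_to_wait(prev_endtime, now, min_interval_seconds):
    """Seconds still to sleep so calls end up at least min_interval_seconds apart."""
    remaining = dt.timedelta(seconds=min_interval_seconds) - (now - prev_endtime)
    # total_seconds() keeps the sign, so a late call clamps to zero.
    return max(0.0, remaining.total_seconds())


prev = dt.datetime(2024, 1, 1, 12, 0, 0)
print(seconds_to_wait(prev, prev + dt.timedelta(seconds=2), 5))  # 3.0
print(seconds_to_wait(prev, prev + dt.timedelta(seconds=6), 5))  # 0.0
print(dt.timedelta(seconds=-1).seconds)                          # 86399
```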
Maybe I am misunderstanding something about celery but I have been stuck on this for quite a while.
I have a bunch of simple subtasks that I want to run in parallel, and I want to iterate over them as they complete rather than waiting for all of them to complete. I tried this:
def task_generator():
    for row in db:
        yield mytask.s(row)

from celery.result import ResultSet

r = ResultSet(t.delay() for t in task_generator())
for result in r.iterate():
    print(result)
However celery runs all of the tasks first and the iteration begins only after all of the tasks are completed, despite the docs for ResultSet.iterate reading "Iterate over the return values of the tasks as they finish one by one."
So how do I iterate over the task results as they are completed?
I've implemented this, and have used it in the past. I think .iterate() is deprecated though, so I am working on a new solution myself.
from celery import group

@task
def task_one(foo, bar):
    return foo + bar

subtasks = []
for x in range(10):
    subtasks.append(task_one.s(x, x+1))

results = group(subtasks)() # call this

for result in results.iterate(propagate=False):
    answer = result.iteritems().next()
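Since .iterate() may be deprecated, one alternative is to poll the individual results and yield each value as soon as its task reports ready. The generator below is a sketch under that assumption, not Celery's own mechanism: `iter_as_completed` is a hypothetical helper that relies only on the `ready()` and `get()` methods that `AsyncResult` provides, and `FakeResult` is a stand-in so the example runs without a broker.

```python
import time


def iter_as_completed(results, poll_interval=0.5, sleep=time.sleep):
    """Yield each result's value as soon as that result is ready."""
    pending = list(results)
    while pending:
        still_pending = []
        for res in pending:
            if res.ready():
                yield res.get()
            else:
                still_pending.append(res)
        pending = still_pending
        if pending:
            sleep(poll_interval)


class FakeResult:
    """Stand-in for an AsyncResult: becomes ready after a number of polls."""

    def __init__(self, value, polls_until_ready):
        self.value = value
        self.polls_left = polls_until_ready

    def ready(self):
        self.polls_left -= 1
        return self.polls_left < 0

    def get(self):
        return self.value


fakes = [FakeResult("slow", 2), FakeResult("fast", 0), FakeResult("medium", 1)]
print(list(iter_as_completed(fakes, sleep=lambda _: None)))
# ['fast', 'medium', 'slow']
```

With real tasks, the list passed in would be the `AsyncResult` objects returned by `delay()`, and the poll interval trades latency against load on the result backend.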
I'm writing an application which needs to run a series of tasks in parallel and then a single task with the results of all the tasks run:
@celery.task
def power(value, expo):
    return value ** expo

@celery.task
def amass(values):
    print(str(values))
It's a very contrived and oversimplified example, but hopefully the point comes across well. Basically, I have many items which need to run through power, but I only want to run amass on the results from all of the tasks. All of this should happen asynchronously, and I don't need anything back from the amass method.
Does anyone know how to set this up in celery so that everything is executed asynchronously and a single callback with a list of the results is called after all is said and done?
I've setup this example to run with a chord as Alexander Afanasiev recommended:
from time import sleep
import random

tasks = []
for i in range(10):
    tasks.append(power.s((i, 2)))
    sleep(random.randint(10, 1000) / 1000.0) # sleep for 10-1000ms

callback = amass.s()
r = chord(tasks)(callback)
Unfortunately, in the above example, all tasks in tasks are started only when the chord method is called. Is there a way that each task can start separately and then I could add a callback to the group to run when everything has finished?
Here's a solution which worked for my purposes:
tasks.py:
from time import sleep
import random

@celery.task
def power(value, expo):
    sleep(random.randint(10, 1000) / 1000.0) # sleep for 10-1000ms
    return value ** expo

@celery.task
def amass(results, tasks):
    completed_tasks = []
    for task in tasks:
        if task.ready():
            completed_tasks.append(task)
            results.append(task.get())

    # remove completed tasks
    tasks = list(set(tasks) - set(completed_tasks))

    if len(tasks) > 0:
        # resend the task to execute at least 1 second from now
        # (countdown is an apply_async option; delay() would pass it to the task as a keyword argument)
        amass.apply_async((results, tasks), countdown=1)
    else:
        # we done
        print(results)
Use Case:
tasks = []
for i in range(10):
    tasks.append(power.delay(i, 2))

amass.delay([], tasks)
What this should do is start all of the tasks as soon as possible asynchronously. Once they've all been posted to the queue, the amass task will also be posted to the queue. The amass task will keep reposting itself until all of the other tasks have been completed.
Celery has plenty of tools for most of the workflows you can imagine.
It seems you need to make use of a chord. Here's a quote from the docs:
A chord is just like a group but with a callback. A chord consists of
a header group and a body, where the body is a task that should
execute after all of the tasks in the header are complete.
Taking a look at this snippet from your question, it looks like you are passing a list as the chord header, rather than a group:
from time import sleep
import random

tasks = []
for i in range(10):
    tasks.append(power.s((i, 2)))
    sleep(random.randint(10, 1000) / 1000.0) # sleep for 10-1000ms

callback = amass.s()
r = chord(tasks)(callback)
Converting the list to a group should result in the behaviour you're expecting:
...
callback = amass.s()
tasks = group(tasks)
r = chord(tasks)(callback)
The answer that @alexander-afanasiev gave you is essentially right: use a chord.
Your code is OK, but tasks.append(power.s((i, 2))) does not actually execute the subtask; it just adds the signature to a list. It is chord(...)(...) that sends one message to the broker for each subtask in the tasks list, plus one more message for the callback subtask. When you call chord, it returns as soon as it can, without waiting for the results.
If you want to know when the chord has finished you can poll for completion like with a single task using r.ready() in your sample.
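Polling with r.ready() usually needs a timeout so the caller does not spin forever. The following is a minimal sketch: `wait_until_ready` is a hypothetical helper that assumes only that the result object has a `ready()` method, as `AsyncResult` does, and `FakeChordResult` stands in for `r` so the example runs without a broker.

```python
import time


def wait_until_ready(result, timeout=10.0, interval=0.2,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll result.ready() until it is True or the timeout expires.

    Returns True if the result became ready, False on timeout.
    """
    deadline = clock() + timeout
    while not result.ready():
        if clock() >= deadline:
            return False
        sleep(interval)
    return True


class FakeChordResult:
    """Stand-in for the chord's result: ready after a number of polls."""

    def __init__(self, polls_until_ready):
        self.polls_left = polls_until_ready

    def ready(self):
        self.polls_left -= 1
        return self.polls_left < 0


print(wait_until_ready(FakeChordResult(3), interval=0))                 # True
print(wait_until_ready(FakeChordResult(10**9), timeout=0, interval=0))  # False
```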