I am writing an app based on the asyncio framework. The app interacts with an API that has a rate limit (a maximum of 2 calls per second). I moved the methods that call the API into Celery so that I could use it as a rate limiter, but that looks like overhead.
Is there a way to create a new asyncio event loop (or something else) that guarantees that no more than n coroutines are executed per second?
The accepted answer is accurate. Note however that, usually, one would want to get as close to 2QPS as possible. This method doesn't offer any parallelisation, which could be a problem if make_io_call() takes longer than a second to execute. A better solution would be to pass a semaphore to make_io_call, that it can use to know whether it can start executing or not.
Here is such an implementation: RateLimitingSemaphore will only release its context once the rate limit drops below the requirement.
import asyncio
from collections import deque
from datetime import datetime


class RateLimitingSemaphore:
    def __init__(self, qps_limit, loop=None):
        self.loop = loop or asyncio.get_event_loop()
        self.qps_limit = qps_limit

        # The number of calls that are queued up, waiting for their turn.
        self.queued_calls = 0

        # The times of the last N executions, where N=qps_limit - this should allow us to calculate the QPS within the
        # last ~ second. Note that this also allows us to schedule the first N executions immediately.
        self.call_times = deque()

    async def __aenter__(self):
        self.queued_calls += 1
        while True:
            cur_rate = 0
            if len(self.call_times) == self.qps_limit:
                cur_rate = len(self.call_times) / (self.loop.time() - self.call_times[0])
            if cur_rate < self.qps_limit:
                break
            interval = 1. / self.qps_limit
            elapsed_time = self.loop.time() - self.call_times[-1]
            await asyncio.sleep(self.queued_calls * interval - elapsed_time)
        self.queued_calls -= 1

        if len(self.call_times) == self.qps_limit:
            self.call_times.popleft()
        self.call_times.append(self.loop.time())

    async def __aexit__(self, exc_type, exc, tb):
        pass


async def test(qps):
    executions = 0

    async def io_operation(semaphore):
        nonlocal executions
        async with semaphore:
            executions += 1

    semaphore = RateLimitingSemaphore(qps)
    start = datetime.now()
    # gather() accepts bare coroutines; asyncio.wait() no longer does on Python 3.11+.
    await asyncio.gather(*(io_operation(semaphore) for i in range(5*qps)))
    dt = (datetime.now() - start).total_seconds()
    print('Desired QPS:', qps, 'Achieved QPS:', executions / dt)


if __name__ == "__main__":
    asyncio.run(test(100))
Will print Desired QPS: 100 Achieved QPS: 99.82723898022084
I believe you can write a loop like this:
while True:
    t0 = loop.time()
    await make_io_call()
    dt = loop.time() - t0
    if dt < 0.5:
        # The loop= argument of asyncio.sleep() was removed in Python 3.10.
        await asyncio.sleep(0.5 - dt)
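The loop above waits for each call to finish before pacing the next one. As a rough sketch of a variant that only spaces out the moments at which calls start (so slow calls may overlap), the following uses a hypothetical `IntervalLimiter` helper; that class, `wait_turn`, and the stand-in `make_io_call` are illustrative names and not part of asyncio.

```python
import asyncio


class IntervalLimiter:
    """Hypothetical helper: spaces call starts at least `interval` seconds apart."""

    def __init__(self, calls_per_second):
        self.interval = 1.0 / calls_per_second
        self._lock = asyncio.Lock()
        self._next_start = 0.0  # loop time at which the next call may begin

    async def wait_turn(self):
        loop = asyncio.get_running_loop()
        # The lock makes waiting callers take their turn one at a time.
        async with self._lock:
            delay = self._next_start - loop.time()
            if delay > 0:
                await asyncio.sleep(delay)
            self._next_start = loop.time() + self.interval


async def make_io_call(limiter, started):
    # Stand-in for the real API request; it only records when it started.
    await limiter.wait_turn()
    started.append(asyncio.get_running_loop().time())
    await asyncio.sleep(0.2)  # pretend the request itself is slow


async def demo(calls_per_second=20, n_calls=5):
    limiter = IntervalLimiter(calls_per_second)
    started = []
    await asyncio.gather(*(make_io_call(limiter, started) for _ in range(n_calls)))
    return started


if __name__ == "__main__":
    times = asyncio.run(demo())
    # Gaps between consecutive call starts, in seconds.
    print([round(b - a, 3) for a, b in zip(times, times[1:])])
```

For the 2 calls per second limit from the question, `IntervalLimiter(2)` would be the corresponding setting; the demo uses a higher rate only so that it finishes quickly.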
I would like to know how I can execute the task group 'tg_fast' immediately and then continue the task group 'tg_main' (or start it again if continuing is not possible).
Using asyncio.gather() gives the same result as TaskGroup.
import asyncio

async def another_coro(i):
    print(i)
    await asyncio.sleep(.1)

async def coro(i):
    if i == 1:
        async with asyncio.TaskGroup() as tg_fast:
            tg_fast.create_task(another_coro(i * 10))
            tg_fast.create_task(another_coro(i * 100))
            # await asyncio.gather(*[another_coro(i * 10), another_coro(i * 100)])
    else:
        print(i)
        await asyncio.sleep(.1)

async def main():
    async with asyncio.TaskGroup() as tg_main:
        for i in range(0, 3):
            tg_main.create_task(coro(i))

asyncio.run(main(), debug=True)
This prints 0 => 2 => 10 => 100
But I would like a way to get: 0 => 10 => 100 => ... OR 0 => 100 => 10 => ...
The goal is to start 10 and 100 after 0 and before 2.
Thank you very much for your help.
Edit:
I want to call 'another_coro' concurrently, not wait for one and start the second one afterwards.
And I don't need them to finish; I can run both up to 'await asyncio.sleep(.1)' and then let the event loop continue.
For this to work, you have to deliberately add another mechanism to prioritize tasks, and it has to be applied explicitly to your other tasks in the "non-priority" group.
It could be done by, for example, subclassing asyncio.TaskGroup and adding a priority mechanism to the __aexit__ method, so that when a group is about to exit (and all its tasks are about to be awaited), it checks a central registry of all instances of your specialized TaskGroup for a running TaskGroup with greater priority, and then waits until that one exits.
That would work without changing any code in your tasks, only how you instantiate your groups. On the other hand, it would not prevent the non-prioritized tasks from stepping in and running parts of their code at any other point where they await (or otherwise yield to the asyncio loop).
Another approach, for which I wrote the snippet below, requires you to change the lower-priority tasks at certain points and call a specialized sleep in them (it can be called with a delay of 0, just like asyncio.sleep). The points where these calls are placed become explicit points where your tasks yield priority to the tasks that should run first.
This allows greater flexibility, is more explicit, and is guaranteed to pause your lower-priority work; the downside is that you have to explicitly add the "checkpoints" in your code.
Note that this works by the modified .sleep method simply not returning while any higher-priority task is running.
import asyncio
from heapq import heappush, heapify

granularity = 0.01

class PriorityGroups:
    def __init__(self):
        self.priority_queue = []
        self.counter = 0

    async def sleep(self, delay, priority=10):
        counter = self.counter
        self.counter += 1
        steps = delay / granularity
        # Each step sleeps for one granularity unit; using the constant directly
        # avoids a division by zero when delay is 0.
        step_delay = granularity
        step = 0
        heappush(self.priority_queue, (priority, counter))
        try:
            # Keep sleeping while the delay has not elapsed or a higher-priority sleeper is still registered.
            while step < steps or (self.priority_queue and self.priority_queue[0][0] < priority):
                await asyncio.sleep(step_delay)
                step += 1
        finally:
            self.priority_queue.remove((priority, counter))
            heapify(self.priority_queue)

priority_group = PriorityGroups()

async def another_coro(i, priority=1):
    await priority_group.sleep(.1, priority)
    print(i)

async def coro(i):
    if i == 1:
        async with asyncio.TaskGroup() as tg_fast:
            tg_fast.create_task(another_coro(i * 10))
            tg_fast.create_task(another_coro(i * 100))
            # await asyncio.gather(*[another_coro(i * 10), another_coro(i * 100)])
    else:
        await priority_group.sleep(.1)
        print(i)

async def main():
    async with asyncio.TaskGroup() as tg_main:
        for i in range(0, 3):
            tg_main.create_task(coro(i))

asyncio.run(main(), debug=True)
So, just call sleep on the same PriorityGroups instance, optionally passing a lower priority number (a lower number means higher priority) for things that should run first. Because the control lives in a PriorityGroups instance, you can even have parallel, nested groups of tasks and priority tasks, and one group won't interfere with the others.
I'm currently migrating some Python code that used to be blocking to use asyncio with async/await. It is a lot of code to migrate at once so I would prefer to do it gradually and have metrics. With that thing in mind I want to create a decorator to wrap some functions and know how long they are blocking the event loop. For example:
def measure_blocking_code(f):
    def wrapper(*args, **kwargs):
        # ?????
        # It should measure JUST 1 second
        # not 5 which is what the whole async function takes
    return wrapper

@measure_blocking_code
async def my_function():
    my_blocking_function() # Takes 1 seconds
    await my_async_function() # Takes 2 seconds
    await my_async_function_2() # Takes 2 seconds
I know the event loop has a debug mode that already reports this, but I need to get that information for specific functions.
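For reference, the debug mode mentioned above can be sketched as follows. With `debug=True`, asyncio logs a warning on the "asyncio" logger whenever a single step holds the loop longer than `loop.slow_callback_duration`. The `ListHandler` class and `run_with_debug` function are illustrative helpers, and the sleep durations are arbitrary.

```python
import asyncio
import logging
import time


class ListHandler(logging.Handler):
    """Collects log messages in a list so they can be inspected afterwards."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


async def my_function():
    time.sleep(0.2)           # blocking: the loop cannot run anything else
    await asyncio.sleep(0.1)  # non-blocking: the loop is free meanwhile


async def main():
    # Report any single step that holds the loop longer than 50 ms
    # (the default threshold is 100 ms).
    asyncio.get_running_loop().slow_callback_duration = 0.05
    await my_function()


def run_with_debug():
    handler = ListHandler()
    logger = logging.getLogger("asyncio")
    logger.addHandler(handler)
    try:
        asyncio.run(main(), debug=True)
    finally:
        logger.removeHandler(handler)
    return handler.messages


if __name__ == "__main__":
    for message in run_with_debug():
        print(message)
```

The warning names the task that was slow, but it applies to the whole loop, which is why it does not answer the per-function question asked here.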
TLDR;
This decorator does the job:
import asyncio
import time

def measure_blocking_code(f):
    async def wrapper(*args, **kwargs):
        t = 0
        coro = f(*args, **kwargs)  # forward the arguments to the wrapped function
        try:
            while True:
                # Only the time spent inside .send() is blocking time.
                t0 = time.perf_counter()
                future = coro.send(None)
                t1 = time.perf_counter()
                t += t1 - t0
                while not future.done():
                    await asyncio.sleep(0)
                future.result()  # raises exceptions if any
        except StopIteration as e:
            print(f'Function took {t:.2e} sec')
            return e.value
    return wrapper
Explanation
This workaround exploits the conventions used in the asyncio implementation in CPython. These conventions are a superset of PEP-492. In other words:
You can generally use async/await without knowing these details.
This might not work with other async libraries like trio.
An asyncio coroutine object (coro) can be executed by calling its .send() method. This will only run the blocking code, until an async call yields a Future object. By only measuring the time spent in .send(), the duration of the blocking code can be determined.
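To make the .send() mechanism concrete without involving an event loop, here is a minimal sketch that drives a coroutine by hand. The `suspend` awaitable and the `drive` function are hypothetical names invented for this illustration; a real asyncio coroutine would yield a Future where this one yields a plain marker string.

```python
import time
import types


@types.coroutine
def suspend(marker):
    # Hypothetical awaitable: a generator-based coroutine whose `yield`
    # hands `marker` to whoever is driving the coroutine with .send().
    received = yield marker
    return received


async def my_function():
    time.sleep(0.05)                   # blocking section
    answer = await suspend("waiting")  # suspension point
    return answer * 2


def drive(coro):
    """Run `coro` by hand, timing only the code executed inside .send()."""
    blocked = 0.0
    yielded = []
    value = None  # the first .send() on a fresh coroutine must pass None
    while True:
        t0 = time.perf_counter()
        try:
            item = coro.send(value)
        except StopIteration as stop:
            blocked += time.perf_counter() - t0
            return stop.value, yielded, blocked
        blocked += time.perf_counter() - t0
        yielded.append(item)
        value = 21  # stand-in for the result an event loop would deliver


if __name__ == "__main__":
    result, yielded, blocked = drive(my_function())
    print(result, yielded, f"{blocked:.3f}s blocked")
```

Each .send() call runs the coroutine up to its next suspension point, so the accumulated time covers the blocking sections only.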
I finally found the way. I hope it helps somebody
import asyncio
import time

def measure(f):
    async def wrapper(*args, **kwargs):
        coro_wrapper = f(*args, **kwargs).__await__()
        fut = asyncio.Future()
        total_time = 0

        def done(arg=None):
            nonlocal total_time
            try:
                start_time = time.perf_counter()
                next_fut = coro_wrapper.send(arg)
                end_time = time.perf_counter()
                total_time += end_time - start_time
                next_fut.add_done_callback(done)
            except StopIteration as e:
                # e.value carries the wrapped coroutine's return value.
                fut.set_result(e.value)
            except Exception as e:
                fut.set_exception(e)

        done()
        res = await fut
        print('Blocked for: ' + str(total_time) + ' seconds')
        return res
    return wrapper
It's well known that asyncio is designed to speed up servers and increase the number of requests a web server can handle. However, in my test today I was shocked to find that, for switching between tasks, using threads is much faster than using coroutines (even with a thread lock as a guarantee). Does that mean using coroutines is pointless?
I wonder why; could anyone please help me figure it out?
Here's my test code: it increments a shared counter 2,000,000 times in two tasks, taking turns.
from threading import Thread , Lock
import time , asyncio

def thread_speed_test():
    def add1():
        nonlocal count
        for i in range(single_test_num):
            mutex.acquire()
            count += 1
            mutex.release()

    mutex = Lock()
    count = 0
    thread_list = list()
    for i in range(thread_num):
        thread_list.append(Thread(target = add1))

    st_time = time.time()
    for thr in thread_list:
        thr.start()
    for thr in thread_list:
        thr.join()
    ed_time = time.time()
    print("runtime" , count)
    print(f'threading finished in {round(ed_time - st_time,4)}s ,speed {round(single_test_num * thread_num / (ed_time - st_time),4)}q/s' ,end='\n\n')

def asyncio_speed_test():
    count = 0

    # Note: asyncio.coroutine was removed in Python 3.11; types.coroutine fills this role there.
    @asyncio.coroutine
    def switch():
        yield

    async def add1():
        nonlocal count
        for i in range(single_test_num):
            count += 1
            await switch()

    async def main():
        tasks = asyncio.gather( *(add1() for i in range(thread_num))
                              )
        st_time = time.time()
        await tasks
        ed_time = time.time()
        print("runtime" , count)
        print(f'asyncio finished in {round(ed_time - st_time,4)}s ,speed {round(single_test_num * thread_num / (ed_time - st_time),4)}q/s')

    asyncio.run(main())

if __name__ == "__main__":
    single_test_num = 1000000
    thread_num = 2
    thread_speed_test()
    asyncio_speed_test()
I got the following result on my PC:
2000000
threading finished in 0.9332s ,speed 2143159.1985q/s
2000000
asyncio finished in 16.044s ,speed 124657.3379q/s
Addendum:
I noticed that as the number of threads increases, threading mode gets slower while async mode gets faster.
Here are my test results:
# asyncio #
thread_num    switches per second    average time per switch (ns)
2             122296                 8176
32            243502                 4106
128           252571                 3959
512           253258                 3948
4096          239334                 4178

# threading #
thread_num    switches per second    average time per switch (ns)
2             2278386                438
4             737829                 1350
8             393786                 2539
16            367123                 2720
32            369260                 2708
64            381061                 2624
512           381403                 2622
To make a more fair comparison, I changed your code slightly.
I replaced your simple Lock with a Condition. This allowed me to force a thread switch after each iteration of the counter. The Condition.wait() function call always blocks the thread where the call is made; the thread continues only when another thread calls Condition.notify(). Therefore a thread switch must occur.
This is not the case with your test. A task switch will only occur when the thread scheduler causes one, since the logic of your code never causes a thread to block. The Lock.release() function does not block the caller, unlike Condition.wait().
There is one small difficulty: the last running thread will block forever when it calls Condition.wait() for the last time. That is why I introduced a simple counter to keep track of how many running threads are left. Also, when a thread is finished with its loop it has to make one final call to Condition.notify() in order to release the next thread.
The only change I made to your async test is to replace the "yield" statement with await asyncio.sleep(0). This was for compatibility with Python 3.8. I also reduced the number of trials by a factor of 10.
Timings were on a fairly old Win10 machine with Python 3.8.
As you can see, the threading code is quite a bit slower. That's what I would expect. One of the reasons to have async/await is because it's more lightweight than the threading mechanism.
from threading import Thread , Condition
import time , asyncio

def thread_speed_test():
    def add1():
        nonlocal count
        nonlocal thread_count
        for i in range(single_test_num):
            with mutex:
                mutex.notify()
                count += 1
                if thread_count > 1:
                    mutex.wait()
        thread_count -= 1
        with mutex:
            mutex.notify()

    mutex = Condition()
    count = 0
    thread_count = thread_num
    thread_list = list()
    for i in range(thread_num):
        thread_list.append(Thread(target = add1))

    st_time = time.time()
    for thr in thread_list:
        thr.start()
    for thr in thread_list:
        thr.join()
    ed_time = time.time()
    print("runtime" , count)
    print(f'threading finished in {round(ed_time - st_time,4)}s ,speed {round(single_test_num * thread_num / (ed_time - st_time),4)}q/s' ,end='\n\n')

def asyncio_speed_test():
    count = 0

    async def switch():
        await asyncio.sleep(0)

    async def add1():
        nonlocal count
        for i in range(single_test_num):
            count += 1
            await switch()

    async def main():
        tasks = asyncio.gather(*(add1() for i in range(thread_num)) )
        st_time = time.time()
        await tasks
        ed_time = time.time()
        print("runtime" , count)
        print(f'asyncio finished in {round(ed_time - st_time,4)}s ,speed {round(single_test_num * thread_num / (ed_time - st_time),4)}q/s')

    asyncio.run(main())

if __name__ == "__main__":
    single_test_num = 100000
    thread_num = 2
    thread_speed_test()
    asyncio_speed_test()
runtime 200000
threading finished in 4.0335s ,speed 49584.7548q/s
runtime 200000
asyncio finished in 1.7519s ,speed 114160.9466q/s
I am not sure, you might be comparing apples to oranges.
You are basically punishing async, sort of forcing it to switch contexts, which takes time, while the threads are allowed to run freely.
asyncio is intended for tasks that have to wait for input for some time. This is not the case in your benchmark.
For a fair comparison you should simulate some realistic delay.
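One way to simulate such a delay, as a rough sketch: let every task wait for a fixed time, using time.sleep in the threaded version and asyncio.sleep in the coroutine version. `DELAY`, `N_TASKS`, `run_threads`, and `run_asyncio` are arbitrary illustrative names and values, not measurements from the benchmark above.

```python
import asyncio
import time
from threading import Thread

DELAY = 0.05   # simulated I/O wait per task, in seconds (arbitrary)
N_TASKS = 50   # number of concurrent "requests" (arbitrary)


def run_threads():
    done = []

    def worker():
        time.sleep(DELAY)  # blocks only this thread
        done.append(1)     # list.append is thread-safe in CPython

    threads = [Thread(target=worker) for _ in range(N_TASKS)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(done), time.perf_counter() - start


def run_asyncio():
    async def worker():
        await asyncio.sleep(DELAY)  # yields to the event loop while waiting
        return 1

    async def main():
        start = time.perf_counter()
        results = await asyncio.gather(*(worker() for _ in range(N_TASKS)))
        return len(results), time.perf_counter() - start

    return asyncio.run(main())


if __name__ == "__main__":
    print("threads:", run_threads())
    print("asyncio:", run_asyncio())
```

In both versions the waits overlap, so the total time should stay far below N_TASKS * DELAY; how the two compare beyond that depends on the machine and on the number of tasks.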
I am using a script that handles ws requests and responds to them with JSON. However, at the same time that information must be inserted into or updated in the DB. I don't want to wait for the DB; I should return the response as soon as possible. I am using "bottle" in Python to do so. How can I achieve this?
I found the answer: asyncio. However, it only works on Python 3. This is the site I found; it has a good explanation.
https://medium.freecodecamp.org/a-guide-to-asynchronous-programming-in-python-with-asyncio-232e2afa44f6
And this is an example from their resources.
import asyncio
import time
from datetime import datetime

async def custom_sleep():
    print('SLEEP', datetime.now())
    # Note: time.sleep() blocks the event loop; `await asyncio.sleep(1)` would let the other task run meanwhile.
    time.sleep(1)

async def factorial(name, number):
    f = 1
    for i in range(2, number+1):
        print('Task {}: Compute factorial({})'.format(name, i))
        await custom_sleep()
        f *= i
    print('Task {}: factorial({}) is {}\n'.format(name, number, f))

start = time.time()
loop = asyncio.get_event_loop()
tasks = [
    asyncio.ensure_future(factorial("A", 3)),
    asyncio.ensure_future(factorial("B", 4)),
]
loop.run_until_complete(asyncio.wait(tasks))
loop.close()
end = time.time()
print("Total time: {}".format(end - start))
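Applied to the original question (respond first, write to the DB afterwards), the idea could look roughly like the sketch below. It assumes the handler already runs inside an asyncio event loop, which a plain bottle application does not provide by itself; `handle_request`, `save_to_db`, and the in-memory `saved` list are hypothetical stand-ins.

```python
import asyncio

saved = []                # stand-in for the database
background_tasks = set()  # keeps references so pending tasks are not garbage-collected


async def save_to_db(record):
    await asyncio.sleep(0.1)  # pretend this is a slow INSERT/UPDATE
    saved.append(record)


async def handle_request(payload):
    # Schedule the write but do not await it: the response goes out first.
    task = asyncio.create_task(save_to_db(payload))
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)
    return {"status": "ok", "echo": payload}


async def main():
    response = await handle_request({"id": 1})
    write_pending = len(saved) == 0          # the write has not finished yet
    await asyncio.gather(*background_tasks)  # e.g. during shutdown
    return response, write_pending


if __name__ == "__main__":
    print(asyncio.run(main()))
```

In a synchronous handler, handing the write to a worker thread or a queue would be the comparable approach.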
I am running pool.map on a big data array and I want to print a report to the console every minute.
Is it possible? As I understand it, Python is a synchronous language, so it can't do this the way Node.js does.
Perhaps it can be done with threading... or how?
from multiprocessing.pool import ThreadPool
from time import sleep

finished = 0

def make_job(item):
    sleep(1)
    global finished
    finished += 1

# I want to call this function every minute
def display_status():
    print('finished: ' + str(finished))

def main():
    data = [...]
    pool = ThreadPool(45)
    results = pool.map(make_job, data)
    pool.close()
    pool.join()
You can use a permanent threaded timer, like those from this question: Python threading.timer - repeat function every 'n' seconds
from threading import Timer,Event

class PerpetualTimer(object):

    # give it a cycle time (t) and a callback (hFunction)
    def __init__(self,t,hFunction):
        self.t=t
        self.stop = Event()
        self.hFunction = hFunction
        self.thread = Timer(self.t,self.handle_function)

    def handle_function(self):
        self.hFunction()
        # a Timer fires only once, so schedule a fresh one for the next cycle
        self.thread = Timer(self.t,self.handle_function)
        if not self.stop.is_set():
            self.thread.start()

    def start(self):
        self.stop.clear()
        self.thread.start()

    def cancel(self):
        self.stop.set()
        self.thread.cancel()
Basically this is just a wrapper for a Timer object that creates a new Timer object every time your desired function is called. Don't expect millisecond accuracy (or even close) from this, but for your purposes it should be ideal.
Using this your example would become:
finished = 0

def make_job(item):
    sleep(1)
    global finished
    finished += 1

def display_status():
    print('finished: ' + str(finished))

def main():
    data = [...]
    pool = ThreadPool(45)

    # set up the monitor to run the function every minute
    monitor = PerpetualTimer(60,display_status)
    monitor.start()
    results = pool.map(make_job, data)
    pool.close()
    pool.join()
    monitor.cancel()
EDIT:
A cleaner solution may be (thanks to comments below):
from threading import Event,Thread

class RepeatTimer(Thread):
    def __init__(self, t, callback, event):
        Thread.__init__(self)
        self.stop = event
        self.wait_time = t
        self.callback = callback
        self.daemon = True

    def run(self):
        while not self.stop.wait(self.wait_time):
            self.callback()
Then in your code:
def main():
    data = [...]
    pool = ThreadPool(45)
    stop_flag = Event()
    RepeatTimer(60,display_status,stop_flag).start()
    results = pool.map(make_job, data)
    pool.close()
    pool.join()
    stop_flag.set()
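Putting the pieces together for Python 3, a self-contained sketch of the same idea might look like this. The short interval, the item count, the `finished_lock`, and the `reports` list are assumptions added so the example runs in well under a second; the structure (a reporter thread looping on Event.wait while pool.map runs) follows the RepeatTimer approach above.

```python
import threading
import time
from multiprocessing.pool import ThreadPool

finished = 0
finished_lock = threading.Lock()
reports = []  # collected status lines, so the demo can be checked


def make_job(item):
    global finished
    time.sleep(0.01)     # stand-in for real work
    with finished_lock:  # += on a shared int is not atomic
        finished += 1


def display_status():
    line = "finished: " + str(finished)
    reports.append(line)
    print(line)


def reporter(stop, interval):
    # Event.wait returns False on timeout and True once the event is set.
    while not stop.wait(interval):
        display_status()


def main(n_items=200, interval=0.05):
    global finished
    finished = 0
    stop = threading.Event()
    monitor = threading.Thread(target=reporter, args=(stop, interval), daemon=True)
    monitor.start()
    with ThreadPool(8) as pool:
        pool.map(make_job, range(n_items))
    stop.set()
    monitor.join()
    return finished


if __name__ == "__main__":
    print("total:", main())
```

For the once-a-minute report from the question, `interval=60` would be the matching setting.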
One way to do this, is to use main thread as the monitoring one. Something like below should work:
def main():
    data = [...]
    results = []
    step = 0
    pool = ThreadPool(16)
    pool.map_async(make_job, data, callback=results.extend)
    pool.close()
    while True:
        if results:
            break
        step += 1
        sleep(1)
        if step % 60 == 0:
            print "status update" + ...
I've used .map_async() instead of .map() because the latter is synchronous. You will probably also need to replace results.extend with something more efficient. Finally, due to the GIL, the speed improvement may be much smaller than expected.
BTW, it is a little bit funny that you wrote that Python is synchronous in a question that asks about ThreadPool ;).
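As a sketch of a variant that does not depend on the callback filling a list, the object returned by map_async can be polled directly through its ready() and wait(timeout) methods. The job body, the pool size, and the `report_every` value are placeholders chosen so the example finishes quickly.

```python
import time
from multiprocessing.pool import ThreadPool


def make_job(item):
    time.sleep(0.01)  # stand-in for real work
    return item * 2


def main(data, report_every=0.05):
    status_lines = []
    pool = ThreadPool(8)
    async_result = pool.map_async(make_job, data)
    pool.close()
    while True:
        # wait() returns after `report_every` seconds or as soon as the
        # work is done, whichever comes first.
        async_result.wait(report_every)
        if async_result.ready():
            break
        status_lines.append("still working...")
    pool.join()
    return async_result.get(), status_lines


if __name__ == "__main__":
    results, status_lines = main(list(range(100)))
    print(len(results), "results,", len(status_lines), "status updates")
```

This also terminates when the job list is empty, a case in which a loop that waits for a non-empty results list would never exit.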
Consider using the time module. The time.time() function returns the current UNIX time.
For example, calling time.time() right now returns 1410384038.967499. One second later, it will return 1410384039.967499.
The way I would do this would be to use a while loop in the place of results = pool(...), and on every iteration to run a check like this:
last_time = time.time()
while (...):
    new_time = time.time()
    if new_time > last_time+60:
        print "status update" + ...
        last_time = new_time
    (your computation here)
So that will check if (at least) a minute has elapsed since your last status update. It should print a status update approximately every sixty seconds.
Sorry that this is an incomplete answer, but I hope this helps or gives you some useful ideas.
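A runnable version of this time-check loop might look like the sketch below. It uses time.monotonic() rather than time.time() so that system clock adjustments cannot disturb the interval, and it shrinks the interval from 60 seconds to a fraction of a second for demonstration; `process_all` and its parameters are illustrative names.

```python
import time


def process_all(items, report_every=0.05):
    """Process items one by one, recording a status line every `report_every` seconds."""
    reports = []
    done = 0
    last_report = time.monotonic()  # monotonic: unaffected by clock adjustments
    for item in items:
        time.sleep(0.01)            # stand-in for the real computation
        done += 1
        now = time.monotonic()
        if now - last_report >= report_every:
            reports.append("finished: " + str(done))
            last_report = now
    return done, reports


if __name__ == "__main__":
    done, reports = process_all(range(30))
    print(done, reports)
```

Because the check only happens between items, a single long-running item delays the report; that is the trade-off of doing this without a second thread.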