I'm using Python 3.6 and trying to use asyncio to run tasks concurrently. I thought asyncio.gather and asyncio.ensure_future would be the tools to use, but they don't seem to be working as I expected. Could someone give me pointers?
Here is my code:
import time
import asyncio

async def f1():
    print('Running func 1')
    time.sleep(4)
    print('Returning from func 1')
    return 1

async def f2():
    print('Running func 2')
    time.sleep(6)
    print('Returning from func 2')
    return 2

async def f3():
    print('Running func 3')
    time.sleep(1)
    print('Returning from func 3')
    return 3

async def foo():
    calls = [
        asyncio.ensure_future(f())
        for f in [f1, f2, f3]
    ]
    res = await asyncio.gather(*calls)
    print(res)

loop = asyncio.get_event_loop()
start = time.time()
loop.run_until_complete(foo())
end = time.time()
print(f'Took {end - start} seconds')
print('done')
I would expect the 3 functions to run independently of each other, but each one seems to be blocked behind the other. This is the output I get:
Running func 1
Returning from func 1
Running func 2
Returning from func 2
Running func 3
Returning from func 3
[1, 2, 3]
Took 11.009816884994507 seconds
done
I would have expected it to take 6 seconds, with the bottleneck being f2.
First off, welcome to StackOverflow.
When you run code inside the event loop, that code MUST use async libraries, or be run in an executor, if you don't want it to block the entire process.
That way, the event loop can send the task to the background, to be executed in a worker or in the event loop itself if you use async libraries. Meanwhile, the event loop can attend to the next function or portion of code, and repeat the same process.
Once any background task has finished, the event loop picks it up and returns its value.
In your case, if you use the async version of sleep, you should get the expected results. For example:
async def f1():
    print('Running func 1')
    await asyncio.sleep(4)
    print('Returning from func 1')
    return 1
I haven't read this tutorial myself, but I hope it's useful for finding your solution.
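If the blocking call can't be replaced with an async one (e.g. it comes from a third-party library), the executor route mentioned above also works. A minimal sketch, assuming the time.sleep(4) from the question must stay blocking:

import asyncio
import time

async def f1():
    loop = asyncio.get_event_loop()
    print('Running func 1')
    # Hand the blocking call to the default thread pool so the
    # event loop stays free to run f2 and f3 in the meantime.
    await loop.run_in_executor(None, time.sleep, 4)
    print('Returning from func 1')
    return 1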
Related
I've run two variants of code that, to me, should run exactly identically - so I'm very surprised to see different output from each...
First up:
from concurrent.futures import ThreadPoolExecutor
from time import sleep

executor = ThreadPoolExecutor(max_workers=2)

def func(x):
    print(f"In func {x}")
    sleep(1)
    return True

foo = executor.map(func, range(0, 10))

for f in foo:
    print(f"blah {f}")
    if f:
        break

print("Shutting down")
executor.shutdown(wait=False)
print("Shut down")
this outputs the following, showing the remaining futures being run to completion. While that surprised me at first, I believe it's consistent with the docs (in the absence of cancel_futures being set to True), as per https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.Executor.shutdown: "Regardless of the value of wait, the entire Python program will not exit until all pending futures are done executing."
In func 0
In func 1
In func 2
In func 3
blah True
Shutting down
Shut down
In func 4
In func 5
In func 6
In func 7
In func 8
In func 9
So that's fine. But here's the odd thing - if I refactor to call that within a function, it behaves differently. See minor tweak:
from concurrent.futures import ThreadPoolExecutor
from time import sleep

def run_test():
    executor = ThreadPoolExecutor(max_workers=2)

    def func(x):
        print(f"In func {x}")
        sleep(1)
        return True

    foo = executor.map(func, range(0, 10))

    for f in foo:
        print(f"blah {f}")
        if f:
            break

    print("Shutting down")
    executor.shutdown(wait=False)
    print("Shut down")

run_test()
this outputs the following, suggesting the futures are cancelled in this case:
In func 0
In func 1
In func 2
blah True
Shutting down
In func 3
Shut down
So I guess something is happening as the executor falls out of scope at the end of run_test()? But this seems to contradict the docs (which don't mention this), and surely the executor similarly falls out of scope at the end of the first script??
Seen at both Python 3.8 and 3.9.
I expected the same output in the two cases, but they didn't match.
This surprised me too. This code also reproduces your behaviour
from concurrent.futures import ThreadPoolExecutor
from time import sleep

def run_test():
    executor = ThreadPoolExecutor(max_workers=2)

    def func(x):
        print(f"In func {x}")
        sleep(1)

    foo = executor.map(func, range(0, 10))
    # a
    x = next(foo)
    # b
    print("Shutting down")
    executor.shutdown(wait=False)
    print("Shut down")

run_test()
If you run it as-is, it runs for the first couple of integers between 0 and 10 and then exits. If you comment out the line between # a and # b, then it runs all 10.
The reason, as far as I can tell, is that if you loop over the generator object (foo) at all (or call next() on it), the code enters the result_iterator generator that Executor.map returns, in the CPython concurrent.futures._base source code.
When the run_test() function exits and foo goes out of scope, you end up in that generator's finally block, which cancels all pending futures.
In your example without a function, I believe your guess is correct that it is related to the order in which objects go out of scope. You can see this by commenting / un-commenting the line between # a and # b below
from concurrent.futures import ThreadPoolExecutor
from time import sleep

executor = ThreadPoolExecutor(max_workers=2)

def func(x):
    print(f"In func {x}")
    sleep(1)
    return True

foo = executor.map(func, range(0, 10))
next(foo)
# a
# del foo
# b
print("Shutting down")
executor.shutdown(wait=False)
print("Shut down")
I'm currently migrating some Python code that used to be blocking to use asyncio with async/await. It is a lot of code to migrate at once, so I would prefer to do it gradually and have metrics. With that in mind, I want to create a decorator to wrap some functions and know how long they block the event loop. For example:
def measure_blocking_code(f):
    def wrapper(*args, **kwargs):
        # ?????
        # It should measure JUST 1 second
        # not 5 which is what the whole async function takes
    return wrapper

@measure_blocking_code
async def my_function():
    my_blocking_function()  # Takes 1 second
    await my_async_function()  # Takes 2 seconds
    await my_async_function_2()  # Takes 2 seconds
I know the event loop has a debug mode that already reports this, but I need that information for specific functions.
TLDR;
This decorator does the job:
import asyncio
import time

def measure_blocking_code(f):
    async def wrapper(*args, **kwargs):
        t = 0
        coro = f(*args, **kwargs)  # forward the call arguments to the coroutine
        try:
            while True:
                t0 = time.perf_counter()
                future = coro.send(None)
                t1 = time.perf_counter()
                t += t1 - t0
                while not future.done():
                    await asyncio.sleep(0)
                future.result()  # raises exceptions if any
        except StopIteration as e:
            print(f'Function took {t:.2e} sec')
            return e.value
    return wrapper
Explanation
This workaround exploits the conventions used in the asyncio implementation in CPython. These conventions are a superset of PEP 492. In other words:
- You can generally use async/await without knowing these details.
- This might not work with other async libraries like trio.
An asyncio coro object (coro) can be executed by calling its .send() method. This will only run the blocking code, until an async call yields a Future object. By measuring only the time spent in .send(), the duration of the blocking code can be determined.
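To make that concrete, here is a tiny sketch (mine, not from the answer above) of driving a coroutine by hand with .send(None). It relies on the same CPython convention: a coroutine with no pending await runs to completion and raises StopIteration carrying the return value.

async def work():
    return 1 + 1  # purely blocking body, no await

coro = work()
try:
    # Runs the blocking code; if an await on a pending Future were hit,
    # send() would instead return that Future.
    coro.send(None)
except StopIteration as e:
    print('returned', e.value)  # prints: returned 2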
I finally found a way. I hope it helps somebody:
import asyncio
import time

def measure(f):
    async def wrapper(*args, **kwargs):
        coro_wrapper = f(*args, **kwargs).__await__()
        fut = asyncio.Future()
        total_time = 0

        def done(arg=None):
            try:
                nonlocal total_time
                start_time = time.perf_counter()
                next_fut = coro_wrapper.send(arg)
                end_time = time.perf_counter()
                total_time += end_time - start_time
                next_fut.add_done_callback(done)
            except StopIteration as e:
                fut.set_result(e.value)  # the wrapped coroutine's return value
            except Exception as e:
                fut.set_exception(e)

        done()
        res = await fut
        print('Blocked for: ' + str(total_time) + ' seconds')
        return res
    return wrapper
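For completeness, a quick usage sketch of the decorator above (the function body here is made up for illustration):

@measure
async def my_function():
    time.sleep(1)           # blocking: counted by the wrapper
    await asyncio.sleep(2)  # yields to the event loop: not counted

asyncio.run(my_function())  # prints roughly: Blocked for: 1.0 seconds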
I get this error:
D:\pythonstuff\demo.py:28: DeprecationWarning: The explicit passing of coroutine objects to asyncio.wait() is deprecated since Python 3.8, and scheduled for removal in Python 3.11.
await asyncio.wait([
Waited 1 second!
Waited 5 second!
Time passed: 0hour:0min:5sec
Process finished with exit code 0
When I run the code:
import asyncio
import time

class class1():
    async def function_inside_class(self):
        await asyncio.sleep(1)
        print("Waited 1 second!")

    async def function_inside_class2(self):
        await asyncio.sleep(5)
        print("Waited 5 second!")

def tic():
    global _start_time
    _start_time = time.time()

def tac():
    t_sec = round(time.time() - _start_time)
    (t_min, t_sec) = divmod(t_sec, 60)
    (t_hour, t_min) = divmod(t_min, 60)
    print('Time passed: {}hour:{}min:{}sec'.format(t_hour, t_min, t_sec))

object = class1()

async def main():
    tic()
    await asyncio.wait([
        object.function_inside_class(),
        object.function_inside_class2()
    ])
    tac()

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
loop.close()
Are there any good alternatives to asyncio.wait? I don't want a warning in console every time I launch my application.
Edit: I don't want to just hide the error, that's bad practice, and I'm looking for other ways to do the same or a similar thing, not another async library to restore the old functionality.
You can just call it this way, as recommended in the docs.
Example from the docs:
async def foo():
    return 42

task = asyncio.create_task(foo())
done, pending = await asyncio.wait({task})
So your code would become:
await asyncio.wait([
    asyncio.create_task(object.function_inside_class()),
    asyncio.create_task(object.function_inside_class2())
])
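If you don't need the (done, pending) sets that asyncio.wait returns, asyncio.gather is a further alternative; it accepts coroutines directly and emits no deprecation warning:

await asyncio.gather(
    object.function_inside_class(),
    object.function_inside_class2()
)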
I want to speed up some API requests... to that end I tried to figure out how, and copied some code that runs, but when I try my own code it's no longer asynchronous. Maybe someone can find the flaw?
Copied code (from Stack Overflow, I guess):
#!/usr/bin/env python3
import asyncio

@asyncio.coroutine
def func_normal():
    print('A')
    yield from asyncio.sleep(5)
    print('B')
    return 'saad'

@asyncio.coroutine
def func_infinite():
    for i in range(10):
        print("--%d" % i)
    return 'saad2'

loop = asyncio.get_event_loop()
tasks = func_normal(), func_infinite()
a, b = loop.run_until_complete(asyncio.gather(*tasks))
print("func_normal()={a}, func_infinite()={b}".format(**vars()))
loop.close()
My "own" code (I need at the end a list returned and merge the results of all functions):
import asyncio
import time

@asyncio.coroutine
def say_after(start, count, say, yep=True):
    retl = []
    if yep:
        time.sleep(5)
    for x in range(start, count):
        retl.append(x)
    print(say)
    return retl

def main():
    print(f"started at {time.strftime('%X')}")
    loop = asyncio.get_event_loop()
    tasks = say_after(10, 20, "a"), say_after(20, 30, "b", False)
    a, b = loop.run_until_complete(asyncio.gather(*tasks))
    print("func_normal()={a}, func_infinite()={b}".format(**vars()))
    loop.close()
    c = a + b
    # print(c)
    print(f"finished at {time.strftime('%X')}")

main()
Or am I completely wrong, and should I solve this with multithreading? What would be the best way to make API requests that return lists which I then need to merge?
I added a comment to each section that needs improvement, and removed some code to simplify it.
In fact, I didn't find any performance uplift from wrapping range() in a coroutine and using async def; it might be worth it with heavier operations.
import asyncio
import time

# @asyncio.coroutine IS DEPRECATED since python 3.8
@asyncio.coroutine
def say_after(wait=True):
    result = []
    if wait:
        print("I'm sleeping!")
        # This BLOCKs the thread, but releases the GIL so other threads can run.
        # But asyncio runs in ONE thread, so this still harms simultaneity.
        time.sleep(5)
        print("'morning!")
    # A normal for loop is a BLOCKING operation.
    for i in range(5):
        result.append(i)
        print(i, end='')
    print()
    return result

def main():
    start = time.time()
    # The loop argument will be DEPRECATED from python 3.10.
    # Make main() a coroutine, then use asyncio.run(main()).
    # It will run in the asyncio event loop, without explicitly passing the loop.
    loop = asyncio.get_event_loop()
    tasks = say_after(), say_after(False)
    # As we will use asyncio.run(main()) from now on, this should be await-ed.
    a, b = loop.run_until_complete(asyncio.gather(*tasks))
    print(f"Took {time.time() - start:5f}")
    loop.close()

main()
Better way:
import asyncio
import time

async def say_after(wait=True):
    result = []
    if wait:
        print("I'm sleeping!")
        await asyncio.sleep(2)  # 'await' a coroutine version of it instead.
        print("'morning!")

    # wrap iterator in generator - or coroutine
    async def asynchronous_range(end):
        for _i in range(end):
            yield _i

    # use it with async for
    async for i in asynchronous_range(5):
        result.append(i)
        print(i, end='')
    print()
    return result

async def main():
    start = time.time()
    tasks = say_after(), say_after(False)
    a, b = await asyncio.gather(*tasks)
    print(f"Took {time.time() - start:5f}")

asyncio.run(main())
Result
Your code:
DeprecationWarning: "#coroutine" decorator is deprecated since Python 3.8, use "async def" instead
def say_after(wait=True):
I'm sleeping!
'morning!
01234
01234
Took 5.003802
Better async code:
I'm sleeping!
01234
'morning!
01234
Took 2.013863
Note that the fixed code now finishes its job while the other task is sleeping.
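Since the actual goal is speeding up API requests, which are blocking I/O just like time.sleep, one more hedged sketch: keep the blocking call but run it in a worker thread with asyncio.to_thread (Python 3.9+; on older versions loop.run_in_executor does the same job). fetch below is a stand-in for whatever blocking API call you make:

import asyncio
import time

def fetch(name):
    # Stand-in for a blocking API request, e.g. requests.get(...).json()
    time.sleep(2)
    return [name, name]

async def main():
    start = time.time()
    # Each blocking call runs in its own worker thread, so they overlap.
    a, b = await asyncio.gather(
        asyncio.to_thread(fetch, "a"),
        asyncio.to_thread(fetch, "b"),
    )
    print(a + b)  # merged list: ['a', 'a', 'b', 'b']
    print(f"Took {time.time() - start:.2f}s")  # ~2s instead of ~4s

asyncio.run(main())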
I need to create an Observable stream which emits the result of a async coroutine at regular intervals.
intervalRead is a function which returns an Observable, and takes as parameters the interval rate and an async coroutine function fun, which needs to be called at the defined interval.
My first approach was to create an observable with the interval factory method, and then use map to call the coroutine, using from_future to wrap it in an Observable, and then get the value returned by the coroutine:
import asyncio

import rx
from rx import Observable
from rx.operators import map

async def foo():
    await asyncio.sleep(1)
    return 42

def intervalRead(rate, fun) -> Observable:
    loop = asyncio.get_event_loop()
    return rx.interval(rate).pipe(
        map(lambda i: rx.from_future(loop.create_task(fun()))),
    )

async def main():
    obs = intervalRead(5, foo)
    obs.subscribe(
        on_next=lambda item: print(item)
    )

loop = asyncio.get_event_loop()
loop.create_task(main())
loop.run_forever()
Yet the output I get is not the result of the coroutine, but the Observable returned by from_future, emitted at the specified interval:
output: <rx.core.observable.observable.Observable object at 0x033B5650>
How could I get the actual value returned by that Observable? I would expect 42.
My second approach was to create a custom observable:
def intervalRead(rate, fun) -> rx.Observable:
    interval = rx.interval(rate)

    def subs(observer: Observer, scheduler=None):
        loop = asyncio.get_event_loop()

        def on_timer(i):
            task = loop.create_task(fun())
            from_future(task).subscribe(
                on_next=lambda i: observer.on_next(i),
                on_error=lambda e: observer.on_error(e),
                on_completed=lambda: print('coro completed')
            )

        interval.subscribe(on_next=on_timer, on_error=lambda e: print(e))

    return rx.create(subs)
However, on subscription, from_future(task) never emits a value. Why does this happen?
Yet if I write intervalRead like this:

def intervalRead(rate, fun):
    loop = asyncio.get_event_loop()
    task = loop.create_task(fun())
    return from_future(task)

I get the expected result: 42. Obviously this doesn't solve my issue, but it confuses me that it doesn't work in my second approach.
Finally, I experimented with a third approach, using the rx.concurrency CurrentThreadScheduler and scheduling an action periodically with the schedule_periodic method. Yet I'm facing the same issue as with the second approach:
def funWithScheduler(rate, fun):
    loop = asyncio.get_event_loop()
    scheduler = CurrentThreadScheduler()
    subject = rx.subjects.Subject()

    def action(param):
        obs = rx.from_future(loop.create_task(fun())).subscribe(
            on_next=lambda item: subject.on_next(item),
            on_error=lambda e: print(f'error in action {e}'),
            on_completed=lambda: print('action completed')
        )
        obs.dispose()

    scheduler.schedule_periodic(rate, action)
    return subject
Would appreciate any insight into what I am missing, or any other suggestions to accomplish what I need. This is my first project with asyncio and RxPY; I have only used RxJS in the context of an Angular project, so any help is welcome.
Your first example almost works. There are only two changes needed to get it working:
First, the result of from_future is an observable that emits a single item (the value of the future when it completes). So the output of map is a higher-order observable (an observable that emits observables). These child observables can be flattened by using the merge_all operator after map, or by using flat_map instead of map.
Then, the interval operator must schedule its timer on the AsyncIO loop, which is not the case by default: the default scheduler is the TimeoutScheduler, and it spawns a new thread. So in the original code, the task cannot be scheduled on the AsyncIO event loop, because create_task is called from another thread. Using the scheduler parameter in the call to subscribe declares the default scheduler to use for the whole operator chain.
The following code works (42 is printed every 5 seconds):
import asyncio

import rx
import rx.operators as ops
from rx.scheduler.eventloop import AsyncIOScheduler

async def foo():
    await asyncio.sleep(1)
    return 42

def intervalRead(rate, fun) -> rx.Observable:
    loop = asyncio.get_event_loop()
    return rx.interval(rate).pipe(
        ops.map(lambda i: rx.from_future(loop.create_task(fun()))),
        ops.merge_all()
    )

async def main(loop):
    obs = intervalRead(5, foo)
    obs.subscribe(
        on_next=lambda item: print(item),
        scheduler=AsyncIOScheduler(loop)
    )

loop = asyncio.get_event_loop()
loop.create_task(main(loop))
loop.run_forever()