Python asyncio (aiohttp, aiofiles)

I seem to be having a difficult time understanding Python's asyncio. I have not written any code, as all the examples I see are for one-off runs: create a few coroutines, add them to an event loop, then run the loop; they run the tasks, switching between them, done. That does not seem all that helpful for me.
I want to use asyncio so as not to interrupt the operation of my application (using PyQt5). I want to create some functions that, when called, run in the asyncio event loop and then invoke a callback when they are done.
What I imagine is: create a separate thread for asyncio, create the loop, and run it forever. Create some functions getFile(url, fp), get(url), readFile(file), etc. Then in the UI, I have a text box with a submit button; the user enters a URL, clicks submit, and it downloads the file.
But in every example I see, I cannot figure out how to add a coroutine to a running loop, and I do not see how I could do what I want without adding to a running loop.
#!/bin/python3
import asyncio
import aiohttp
import threading

loop = asyncio.get_event_loop()

def async_in_thread(loop):
    asyncio.set_event_loop(loop)
    loop.run_forever()

async def _get(url, callback):
    print("get: " + url)
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            result = await response.text()
            callback(result)
            return

def get(url, callback):
    asyncio.ensure_future(_get(url, callback))

thread = threading.Thread(target=async_in_thread, args=(loop, ))
thread.start()

def stop():
    loop.close()

def callme(data):
    print(data)
    stop()

get("http://google.com", callme)
thread.join()
This is what I imagine, but it does not work.

To add a coroutine to a loop running in a different thread, use asyncio.run_coroutine_threadsafe:
def get(url, callback):
    asyncio.run_coroutine_threadsafe(_get(url, callback), loop)
In general, when you are interacting with the event loop from outside the thread that runs it, you must run everything through either run_coroutine_threadsafe (for coroutines) or loop.call_soon_threadsafe (for functions). For example, to stop the loop, use loop.call_soon_threadsafe(loop.stop). Also note that loop.close() must not be invoked inside a loop callback, so you should place that call in async_in_thread, right after the call to run_forever(), at which point the loop has definitely stopped running.
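Putting both rules together, the shutdown logic in the question's code could look like this (a minimal sketch, reusing the loop and thread defined above):

def async_in_thread(loop):
    asyncio.set_event_loop(loop)
    loop.run_forever()
    loop.close()  # safe here: run_forever() has returned, so the loop is stopped

def stop():
    # schedule loop.stop() from outside the loop's thread
    loop.call_soon_threadsafe(loop.stop)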
Another thing with asyncio is that passing explicit when_done callbacks isn't idiomatic because asyncio exposes the concept of futures (akin to JavaScript promises), which allow attaching callbacks to a not-yet-available result. For example, one could write _get like this:
async def _get(url):
    print("get: " + url)
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            return await response.text()
It doesn't need a callback argument because any interested party can convert it to a task using loop.create_task and use add_done_callback to be notified when the task is complete. For example:
def _get_with_callback(url, callback):
    loop = asyncio.get_event_loop()
    task = loop.create_task(_get(url))
    task.add_done_callback(lambda _fut: callback(task.result()))
In your case you're not dealing with the task directly because your code aims to communicate with the event loop from another thread. However, run_coroutine_threadsafe returns a very useful value - a full-fledged concurrent.futures.Future which you can use to register done callbacks. Instead of accepting a callback argument, you can expose the future object to the caller:
def get(url):
    return asyncio.run_coroutine_threadsafe(_get(url), loop)
Now the caller can choose a callback-based approach:
future = get(url)
# call me when done
future.add_done_callback(some_callback)
# ... proceed with other work ...
or, when appropriate, they can even wait for the result:
# give me the response, I'll wait for it
result = get(url).result()
The latter is by definition blocking, but since the event loop is safely running in a different thread, it is not affected by the blocking call.
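One PyQt5 caveat: add_done_callback invokes the callback in the asyncio thread, not the GUI thread, so the callback must not touch widgets directly. A common workaround, sketched here with a hypothetical Downloader helper, is to relay the result through a Qt signal, which is safe to emit across threads:

from PyQt5.QtCore import QObject, pyqtSignal

class Downloader(QObject):
    finished = pyqtSignal(str)  # queued connection delivers to the GUI thread

downloader = Downloader()
downloader.finished.connect(lambda text: print(len(text)))  # runs in the GUI thread

def on_done(fut):
    # runs in the asyncio thread; emitting a signal is thread-safe
    downloader.finished.emit(fut.result())

get("http://google.com").add_done_callback(on_done)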

Install Quamash to smooth the integration between Qt and asyncio.
The example from the project's README gives an idea of what it looks like:
import sys
import asyncio
import time

from PyQt5.QtWidgets import QApplication, QProgressBar
from quamash import QEventLoop, QThreadExecutor

app = QApplication(sys.argv)
loop = QEventLoop(app)
asyncio.set_event_loop(loop)  # NEW must set the event loop

progress = QProgressBar()
progress.setRange(0, 99)
progress.show()

async def master():
    await first_50()
    with QThreadExecutor(1) as exec:
        await loop.run_in_executor(exec, last_50)

async def first_50():
    for i in range(50):
        progress.setValue(i)
        await asyncio.sleep(.1)

def last_50():
    for i in range(50, 100):
        loop.call_soon_threadsafe(progress.setValue, i)
        time.sleep(.1)

with loop:  ## context manager calls .close() when loop completes, and releases all resources
    loop.run_until_complete(master())
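With the QEventLoop installed as the current event loop, the question's download scenario needs no second thread at all. A rough sketch, where url_box and submit are assumed Qt widgets and the aiohttp usage follows the question's code:

import aiohttp

async def download(url, fp):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            data = await response.read()
    with open(fp, "wb") as f:
        f.write(data)

# Qt and asyncio share one thread here, so plain ensure_future is enough
submit.clicked.connect(lambda: asyncio.ensure_future(download(url_box.text(), "out.bin")))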

Related

Python Process blocking the rest of application

I have a program that basically does two things: it opens a websocket and remains listening for messages, and it starts a video stream in a forever loop.
I was trying to use multiprocessing to manage both things, but one piece stops the other from running.
The app is:
if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(start_client())

async def start_client():
    async with WSClient() as client:
        pass

class WSClient:
    async def __aenter__(self):
        async with websockets.connect(url, max_size=None) as websocket:
            self.websocket = websocket
            await self.on_open()  ## it goes
            p = Process(target = init(self))  # This is the streaming method
            p.start()
            async for message in websocket:
                await on_message(message, websocket)  ## listen for websocket messages
            return self
The init method is:
def init(ws):
    logging.info('Firmware Version: ' + getVersion())
    startStreaming(ws)
    return
Basically, startStreaming has an infinite loop in it.
In this configuration, the stream starts, but the websocket's on_message is never called, because the Process call freezes the rest of the application.
How can I run both methods?
Thanks
In your code, Process(target=init(self)) calls init(self) immediately, in the parent process, and passes its return value to multiprocessing.Process as the target. What you want is for the new process to call init itself, with self as an argument. Here's how you can do that:
p = Process(target=init, args=(self,))
I have to note that you're passing an asynchronous websocket object to your init function. This will likely break, as asyncio objects usually aren't meant to be shared between two threads, let alone two processes. Unless you're somehow recreating the websocket object in the new process and making a new loop there too, what you're actually looking for is how to create an asyncio task.
Assuming startStreaming is already an async function, you should change the init function to this:
async def init(ws):  # note the async
    logging.info('Firmware Version: ' + getVersion())
    await startStreaming(ws)  # note the await
    return
and change the line creating and starting the process to this:
asyncio.create_task(init(self))
This will run your startStreaming function in a new task while you also read incoming messages at (basically) the same time.
Also, I'm not sure what you're trying to do with the async context manager, as everything could just be in a normal async function. If you're interested in using one for learning purposes, I'd suggest checking out contextlib.asynccontextmanager and having your message-reading code inside the async with statement in start_client rather than inside __aenter__, as sketched below.
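A rough sketch of that restructuring (ws_client is a hypothetical replacement for WSClient; url, init, and on_message are assumed from the question):

import contextlib

@contextlib.asynccontextmanager
async def ws_client(url):
    async with websockets.connect(url, max_size=None) as websocket:
        yield websocket

async def start_client():
    async with ws_client(url) as websocket:
        asyncio.create_task(init(websocket))  # start streaming concurrently
        async for message in websocket:
            await on_message(message, websocket)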

How to convert a function in a third party library to be async?

I am using my Raspberry Pi and the pigpio and websockets libraries.
I want my program to run asynchronously (i.e. I will use async def main as the entry point).
The pigpio library expects a synchronous callback function to be called in response to events, which is fine, but from within that callback I want to call another, asynchronous function from the websocket library.
So it would look like:
def sync_cb():  # <- This can not be made async, therefore I can not use await
    [ws.send('test') for ws in connected_ws]  # <- This is async and has to be awaited
Currently I can get it to work with:
def sync_cb():
    asyncio.run(asyncio.wait([ws.send('test') for ws in connected_ws]))
but the docs say this use of asyncio.run is discouraged.
So my synchronous callback needs to call ws.send (also from a third-party library), which is async, from a function that is synchronous.
Another option that works is:
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
loop.run_until_complete(asyncio.gather(*[ws.send(json.dumps(message)) for ws in connected_ws]))
But the three lines of creating and setting an event loop sound like a lot just to run a simple async function.
My questions are:
Is it possible to substitute an async function where a synchronous callback is required (i.e. is there a way to make cb async in this example)?
And what kind of overhead am I incurring by using asyncio.run and asyncio.wait just to call a simple async method (in the list comprehension)?
You could use the run_coroutine_threadsafe function, which returns a concurrent.futures.Future that can be waited on synchronously, to wrap a coroutine into a regular function and call it from synchronous code.
As I understand it, this approach is most appropriate when the sync code (of the third-party lib) is executed in a separate thread, but it can be adapted to single-threaded execution with some modifications.
An example to illustrate the approach:
import asyncio

def async_to_sync(loop, foo):
    def foo_(*args, **kwargs):
        return asyncio.run_coroutine_threadsafe(foo(*args, **kwargs), loop).result()
    return foo_

def sync_code(cb):
    for i in range(10):
        cb(i)

async def async_cb(a):
    print("async callback:", a)

async def main():
    loop = asyncio.get_event_loop()
    await loop.run_in_executor(None, sync_code, async_to_sync(loop, async_cb))

asyncio.run(main())
Output:
async callback: 0
async callback: 1
async callback: 2
...
Is it possible to substitute an async function where a synchronous callback is required
It is possible. You can run an event loop in a separate thread and run async code there, but you have to consider the GIL.
import asyncio
import threading

class Portal:
    def __init__(self, stop_event):
        self.loop = asyncio.get_event_loop()
        self.stop_event = stop_event

    async def _call(self, fn, args, kwargs):
        return await fn(*args, **kwargs)

    async def _stop(self):
        self.stop_event.set()

    def call(self, fn, *args, **kwargs):
        return asyncio.run_coroutine_threadsafe(self._call(fn, args, kwargs), self.loop)

    def stop(self):
        return self.call(self._stop)

def create_portal():
    portal = None

    async def wait_stop():
        nonlocal portal
        stop_event = asyncio.Event()
        portal = Portal(stop_event)
        running_event.set()
        await stop_event.wait()

    def run():
        asyncio.run(wait_stop())

    running_event = threading.Event()
    thread = threading.Thread(target=run)
    thread.start()
    running_event.wait()
    return portal
Usage example:
async def test(msg):
    await asyncio.sleep(0.5)
    print(msg)
    return "HELLO " + msg

# it'll run a new event loop in separate thread
portal = create_portal()
# it'll call `test` in the separate thread and return a Future
print(portal.call(test, "WORLD").result())
portal.stop().result()
In your case:
def sync_cb():
    calls = [portal.call(ws.send, 'test') for ws in connected_ws]
    # if you want to get results from these calls:
    # [c.result() for c in calls]
And, what kind of overhead am I incurring by using asyncio.run and asyncio.wait just to call a simple async method
asyncio.run will create a new event loop and then close it. Most likely, if the callback is not called often, it won't be a problem. But if you use asyncio.run in another callback too, the two won't be able to work concurrently.

Forcing an asyncio coroutine to start

I'm currently writing some unit tests for a system that uses asyncio so I'd like to be able to force an asyncio coroutine to run to an await point. As an example, consider the following:
import asyncio

event = asyncio.Event()

async def test_func():
    print('Test func')
    await event.wait()

async def main():
    w = test_func()
    await asyncio.sleep(0)
    print('Post func')
    event.set()
    print('Post set')
    await w
    print('Post wait')

asyncio.run(main())
If I run this program with Python 3.7 I see the following output
Post func
Post set
Test func
Post wait
I'd like to be able to test the case where the event isn't set before the coroutine starts running - i.e. have the output
Test func
Post func
Post set
Post wait
Is there a way to force the coroutine to start running until it reaches the await point? I've tried using an asyncio.sleep(0) statement, but even if I sleep for a number of seconds, the test_func coroutine doesn't start until await w is hit in main.
If this isn't possible is there another option for creating this test case?
I need to create a task to execute the coroutine, so there is another task for asyncio to schedule. If instead of calling w = test_func() I use w = asyncio.create_task(test_func()) and follow that with await asyncio.sleep(0), I get the behaviour I desire, as shown below. I'm not sure how deterministic the asyncio event loop is at scheduling tasks, but it seems to be working reliably for this example.
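A minimal sketch of that change applied to main (the rest of the program stays as above):

async def main():
    w = asyncio.create_task(test_func())
    await asyncio.sleep(0)  # yield once so the new task can run to its await point
    print('Post func')
    event.set()
    print('Post set')
    await w
    print('Post wait')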

Removing async pollution from Python

How do I remove the async-everywhere insanity in a program like this?
import asyncio

async def async_coro():
    await asyncio.sleep(1)

async def sync_func_1():
    # This is blocking and synchronous
    await async_coro()

async def sync_func_2():
    # This is blocking and synchronous
    await sync_func_1()

if __name__ == "__main__":
    # Async pollution goes all the way to __main__
    asyncio.run(sync_func_2())
I need to have 3 async markers and asyncio.run at the top level just to call one async function. I assume I'm doing something wrong - how can I clean up this code to make it use async less?
FWIW, I'm interested mostly because I'm writing an API using asyncio and I don't want my users to have to think too much about whether their functions need to be def or async def depending on whether they're using a async part of the API or not.
After some research, one answer is to manually manage the event loop:
import asyncio

async def async_coro():
    await asyncio.sleep(1)

def sync_func_1():
    # This is blocking and synchronous
    loop = asyncio.get_event_loop()
    coro = async_coro()
    loop.run_until_complete(coro)

def sync_func_2():
    # This is blocking and synchronous
    sync_func_1()

if __name__ == "__main__":
    # No more async pollution
    sync_func_2()
If you must do that, I would recommend an approach like this:
import asyncio, threading

async def async_coro():
    await asyncio.sleep(1)

_loop = asyncio.new_event_loop()
threading.Thread(target=_loop.run_forever, daemon=True).start()

def sync_func_1():
    # This is blocking and synchronous
    return asyncio.run_coroutine_threadsafe(async_coro(), _loop).result()

def sync_func_2():
    # This is blocking and synchronous
    sync_func_1()

if __name__ == "__main__":
    sync_func_2()
The advantage of this approach, compared to one where the sync functions run the event loop themselves, is that it supports nesting of sync functions. It also runs only a single event loop, so if the underlying library wants to set up e.g. a background task for monitoring, that task will work continuously rather than being spawned anew each time.
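For example, a hypothetical monitoring task scheduled once on _loop keeps running across all subsequent synchronous calls:

async def monitor():
    # hypothetical background task; runs for the life of the daemon thread
    while True:
        await asyncio.sleep(60)
        print("still alive")

asyncio.run_coroutine_threadsafe(monitor(), _loop)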

Python Asyncio run_forever() and Tasks

I adapted this code for using Google Cloud PubSub in Async Python: https://github.com/cloudfind/google-pubsub-asyncio
import asyncio
import datetime
import functools
import os

from google.cloud import pubsub
from google.gax.errors import RetryError
from grpc import StatusCode

async def message_producer():
    """ Publish messages which consist of the current datetime """
    while True:
        await asyncio.sleep(0.1)

async def proc_message(message):
    await asyncio.sleep(0.1)
    print(message)
    message.ack()

def main():
    """ Main program """
    loop = asyncio.get_event_loop()
    topic = "projects/{project_id}/topics/{topic}".format(
        project_id=PROJECT, topic=TOPIC)
    subscription_name = "projects/{project_id}/subscriptions/{subscription}".format(
        project_id=PROJECT, subscription=SUBSCRIPTION)
    subscription = make_subscription(
        topic, subscription_name)

    def create_proc_message_task(message):
        """ Callback handler for the subscription; schedule a task on the event loop """
        print("Task created!")
        task = loop.create_task(proc_message(message))

    subscription.open(create_proc_message_task)

    # Produce some messages to consume
    loop.create_task(message_producer())

    print("Subscribed, let's do this!")
    loop.run_forever()

def make_subscription(topic, subscription_name):
    """ Make a publisher and subscriber client, and create the necessary resources """
    subscriber = pubsub.SubscriberClient()
    try:
        subscriber.create_subscription(subscription_name, topic)
    except:
        pass
    subscription = subscriber.subscribe(subscription_name)
    return subscription

if __name__ == "__main__":
    main()
I basically removed the publishing code and only use the subscription code.
However, initially I did not include the loop.create_task(message_producer()) line. Tasks were being created as they were supposed to, but they never actually ran. Only when I add said line does the code execute properly and all created tasks run. What causes this behaviour?
PubSub is calling the create_proc_message_task callback from a different thread. Since create_task is not thread-safe, it must only be called from the thread that runs the event loop (typically the main thread). To correct the issue, replace loop.create_task(proc_message(message)) with asyncio.run_coroutine_threadsafe(proc_message(message), loop) and message_producer will no longer be needed.
As for why message_producer appeared to fix the code, consider that run_coroutine_threadsafe does two additional things compared to create_task:
It operates in a thread-safe fashion, so the event loop data structures are not corrupted when this is done concurrently.
It ensures that the event loop wakes up at the soonest possible opportunity, so that it can process the new task.
In your case create_task added the task to the loop's runnable queue (without any locking), but failed to ensure the wakeup, because that is not needed when running in the event loop thread. The message_producer then served to force the loop to wake up in regular intervals, which is when it also checks and executes the runnable tasks.
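Concretely, the fix amounts to this change in the question's callback (a sketch; loop and proc_message are as defined above):

def create_proc_message_task(message):
    # called by PubSub from its own thread, so use the thread-safe API
    asyncio.run_coroutine_threadsafe(proc_message(message), loop)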
