Making a logging.Handler with async emit - python

I have a Python log handler that writes using asyncio (it's too much work to write to this particular service any other way). I also want to be able to log messages from background threads, since a few bits of code do that. So my code looks basically like this (minimal version):
import asyncio
import logging

class AsyncEmitLogHandler(logging.Handler):
    def __init__(self):
        self.loop = asyncio.get_running_loop()
        super().__init__()

    def emit(self, record):
        # format() sets record.message as a side effect
        self.format(record)
        asyncio.run_coroutine_threadsafe(
            coro=self._async_emit(record.message),
            loop=self.loop,
        )

    async def _async_emit(self, message):
        await my_async_write_function(message)
Mostly it works fine, but when processes exit I get a lot of warnings like this: "coroutine 'AsyncEmitLogHandler._async_emit' was never awaited"
Any suggestions on a cleaner way to do this? Or some way to catch shutdown and kill pending writes? Or just suppress the warnings?
Note: the full code is [here][1]
[1]: https://github.com/lsst-ts/ts_salobj/blob/c0c6473f7ff7c71bd3c84e8e95b4ad7c28e67721/python/lsst/ts/salobj/sal_log_handler.py

You could keep a reference to each coroutine and override the handler's close() method to clean them up. A general way to manage them is to keep a list in the handler and have close() either call close() on each pending coroutine, or create tasks (futures) from them and call cancel() on each one, as in the sketch below.
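For example, a minimal sketch of the cancel-on-close approach (my_async_write_function is the writer from the question; the _futures list is an illustrative addition, not part of the original code):

import asyncio
import logging

class AsyncEmitLogHandler(logging.Handler):
    def __init__(self):
        self.loop = asyncio.get_running_loop()
        self._futures = []  # pending cross-thread writes
        super().__init__()

    def emit(self, record):
        self.format(record)
        fut = asyncio.run_coroutine_threadsafe(
            self._async_emit(record.message), self.loop)
        self._futures.append(fut)
        # prune completed writes so the list stays small
        self._futures = [f for f in self._futures if not f.done()]

    async def _async_emit(self, message):
        await my_async_write_function(message)

    def close(self):
        # cancel anything still in flight at shutdown
        for fut in self._futures:
            fut.cancel()
        super().close()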

Related

In an event-loop based server (FastAPI), are async/await functions atomic?

I've read so much about async/await, coroutines, and single-threaded, non-blocking servers. But I'm still not 100% sure about this situation, and I need to be.
I'm running FastAPI.
I have an endpoint:
@router.get("/get-next-id")
async def get_next_id():
    return await id_module.get_next_id()
It calls (and awaits) this async function in 'id_module' (I know there are safer ways to generate a sequential ID; this is just an example to demonstrate a possible concurrency issue):
async def get_next_id():
    next_id = None
    with open("id_flat_file.json") as serialized_data_file:
        id_dict = json.load(serialized_data_file)
    next_id = id_dict["id"] + 1
    with open("id_flat_file.json", 'w') as serialized_data_file:
        json.dump({"id": next_id}, serialized_data_file)
    return next_id
Questions:
Is get_next_id() guaranteed to run on the main thread of the event loop?
If so, is it thread safe? That is, is it effectively atomic on that thread? Could it ever be 'paused' and interleaved with another request running the same code on the event loop? In other words, do I have to worry about a race condition in my code and somehow protect (lock) against it? A hypothetical lock is sketched below.
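For what it's worth, if protection did turn out to be necessary, a hypothetical guard with asyncio.Lock might look like this (the _id_lock name and the lock itself are illustrative additions, not part of the original code):

import asyncio
import json

_id_lock = asyncio.Lock()  # hypothetical module-level lock

async def get_next_id():
    # Coroutines on one event loop interleave only at await points,
    # so the lock serializes the whole read-modify-write sequence.
    async with _id_lock:
        with open("id_flat_file.json") as serialized_data_file:
            id_dict = json.load(serialized_data_file)
        next_id = id_dict["id"] + 1
        with open("id_flat_file.json", 'w') as serialized_data_file:
            json.dump({"id": next_id}, serialized_data_file)
        return next_id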

awaiting future never reached, although set_result() is called

My example class uses only two methods,
async def run(): creates an asyncio.Future() and awaits it
def stop(): sets the result of the Future created by the run() method
The idea is to use the Python signal handler to call the stop() function once a signal is received. stop() then sets the result of the Future that run() is waiting for. So far so good, but this does not work: the run() method never notices that the Future is done.
Example code:
import asyncio
import signal

class Foo:
    def __init__(self):
        self._stop = None

    async def run(self):
        print(f"1. starting foo")
        self._stop = asyncio.Future()
        await self._stop
        print(f"4. 'stop' called, canceling running tasks...")

    def stop(self):
        print(f"3. stopping foo")
        self._stop.set_result(None)

f = Foo()
loop = asyncio.new_event_loop()

def signal_handler(_, __):
    print(f"2. signal handler: call Foo.stop()")
    f.stop()

signal.signal(signal.SIGINT, signal_handler)
loop.run_until_complete(f.run())
print(f"5. bye")
and the output is:
1. starting foo
2. signal handler: call Foo.stop()
3. stopping foo
That's it. The fourth print statement is never reached, although the self._stop Future is done after the result is set.
Does anyone have any idea what I am doing wrong and what I am missing?
Any help would be greatly appreciated!
I cannot reproduce your problem, but it is quite possible that it'll happen on different operating systems.
Per loop.add_signal_handler:
Unlike signal handlers registered using signal.signal(), a callback registered with this function is allowed to interact with the event loop.
The usual culprit that causes those issues is the event loop not waking up from outside input.
There are two solutions to your issue:
If you're on Unix, change signal.signal() to loop.add_signal_handler().
If not, try changing f.stop() to loop.call_soon_threadsafe(f.stop). That makes sure the event loop wakes up correctly.
Irrespective of that, you have a couple of other issues arising from using asyncio.new_event_loop() without assigning the loop to the thread or cleaning up correctly. I suggest using asyncio.run() instead, as in the sketch below.
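A minimal sketch combining both suggestions with asyncio.run(), reusing the Foo class from the question (loop.add_signal_handler raises NotImplementedError where it is unsupported, e.g. on Windows, hence the fallback):

import asyncio
import signal

async def main():
    f = Foo()
    loop = asyncio.get_running_loop()
    try:
        # Unix: let the event loop own the signal handler
        loop.add_signal_handler(signal.SIGINT, f.stop)
    except NotImplementedError:
        # Fallback: keep signal.signal(), but hop onto the loop
        # thread-safely so the loop is guaranteed to wake up
        signal.signal(signal.SIGINT,
                      lambda *_: loop.call_soon_threadsafe(f.stop))
    await f.run()

asyncio.run(main())
print("5. bye")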

How can I have a synchronous facade over asyncpg APIs with Python asyncio?

Imagine an asynchronous aiohttp web application that is backed by a PostgreSQL database connected via asyncpg and does no other I/O. How can I have a middle layer hosting the application logic that is not async? (I know I can simply make everything async, but imagine my app has massive application logic, bound only by database I/O, and I cannot touch all of it.)
Pseudo code:
async def handler(request):
    # call into layers over layers of application code, that simply emits SQL
    ...

def application_logic():
    ...
    # This doesn't work, obviously, as await is a syntax
    # error inside synchronous code.
    data = await asyncpg_conn.execute("SQL")
    ...
    # What I want is this:
    data = asyncpg_facade.execute("SQL")
    ...
How can a synchronous façade over asyncpg be built that allows the application logic to make database calls? The recipes floating around, like using asyncio.run() or asyncio.run_coroutine_threadsafe(), do not work in this case, as we're coming from an already asynchronous context. I'd assume this can't be impossible, as there already is an event loop that could in principle run the asyncpg coroutine.
Bonus question: what is the design rationale of making await inside sync a syntax error? Wouldn't it be pretty useful to allow await from any context that originated from a coroutine, so we'd have simple means to decompose an application in functional building blocks?
EDIT Extra bonus: beyond Paul's very good answer, that stays inside the "safe zone", I'd be interested in solutions that avoid blocking the main thread (leading to something more gevent-ish). See also my comment on Paul's answer ...
You need to create a secondary thread where you run your async code. You initialize the secondary thread with its own event loop, which runs forever. Execute each async function by calling run_coroutine_threadsafe(), and calling result() on the returned object. That's an instance of concurrent.futures.Future, and its result() method doesn't return until the coroutine's result is ready from the secondary thread.
Your main thread is then, in effect, calling each async function as if it were a sync function. The main thread doesn't proceed until each function call is finished. BTW it doesn't matter if your sync function is actually running in an event loop context or not.
The calls to result() will, of course, block the main thread's event loop. That can't be avoided if you want to get the effect of running an async function from sync code.
Needless to say, this is an ugly thing to do and it's suggestive of the wrong program structure. But you're trying to convert a legacy program, and it may help with that.
import asyncio
import threading
from datetime import datetime

async def f1(t):
    # sample coroutine: sleep for t seconds, then report and return t
    await asyncio.sleep(t)
    print("Hello", t, datetime.now())
    return t

def main():
    def thr(loop):
        asyncio.set_event_loop(loop)
        loop.run_forever()

    loop = asyncio.new_event_loop()
    t = threading.Thread(target=thr, args=(loop, ), daemon=True)
    t.start()

    print("Hello", datetime.now())
    t1 = asyncio.run_coroutine_threadsafe(f1(1.0), loop).result()
    t2 = asyncio.run_coroutine_threadsafe(f1(2.0), loop).result()
    print(t1, t2)

if __name__ == "__main__":
    main()
>>> Hello 2021-10-26 20:37:00.454577
>>> Hello 1.0 2021-10-26 20:37:01.464127
>>> Hello 2.0 2021-10-26 20:37:03.468691
>>> 1.0 2.0
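Coming back to the pseudo code in the question, a sketch of the asyncpg_facade might look like this (the AsyncpgFacade class and its _call helper are illustrative, not an established API; the connection is created via the secondary loop so that the same loop drives it later):

import asyncio
import threading

import asyncpg  # assumed to be installed

class AsyncpgFacade:
    """Synchronous facade: all asyncpg work runs on one dedicated loop/thread."""

    def __init__(self, dsn):
        self._loop = asyncio.new_event_loop()
        threading.Thread(target=self._run, daemon=True).start()
        # create the connection on the secondary loop, where it will be used
        self._conn = self._call(asyncpg.connect(dsn))

    def _run(self):
        asyncio.set_event_loop(self._loop)
        self._loop.run_forever()

    def _call(self, coro):
        # block the calling (sync) code until the coroutine completes
        return asyncio.run_coroutine_threadsafe(coro, self._loop).result()

    def execute(self, sql):
        return self._call(self._conn.execute(sql))

application_logic() can then call asyncpg_facade.execute("SQL") exactly as in the question, at the cost of blocking its thread while the query runs.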

Run tornado.testing.AsyncTestCase using asyncio event loop

I have an asyncio-based class which I want to unit test. Using tornado.testing.AsyncTestCase this works quite well and easily. However, one specific method of my class uses asyncio.ensure_future to schedule execution of another method. This never finishes in the AsyncTestCase, because the default test runner uses Tornado's KQueueIOLoop event loop, not an asyncio event loop.
class TestSubject:
    def foo(self):
        asyncio.ensure_future(self.bar())

    async def bar(self):
        pass

class TestSubjectTest(AsyncTestCase):
    def test_foo(self):
        t = TestSubject()
        # here be somewhat involved setup with MagicMock and self.stop
        t.foo()
        self.wait()
$ python -m tornado.testing baz.testsubject_test
...
[E 160627 17:48:22 testing:731] FAIL
[E 160627 17:48:22 base_events:1090] Task was destroyed but it is pending!
task: <Task pending coro=<TestSubject.bar() running at ...>>
.../asyncio/base_events.py:362: RuntimeWarning: coroutine 'TestSubject.bar' was never awaited
How can I use a different event loop to run the tests on to ensure my task will actually be executed? Alternatively, how can I make my implementation event loop-independent and cross-compatible?
Turns out to be simple enough...
class TestSubjectTest(AsyncTestCase):
    def get_new_ioloop(self):  # override this method
        return tornado.platform.asyncio.AsyncIOMainLoop()
I was trying this before, but directly returned asyncio.get_event_loop(), which didn't work. Returning Tornado's asyncio loop wrapper does the trick.

Make my own function an asyncio function in Python

I would like to use the asyncio module in Python to run my request tasks in parallel, because my current request tasks run in sequence, which means they block.
I have read the documentation of the asyncio module, and I have written some simple code as follows; however, it doesn't work as I expected.
import asyncio

class Demo(object):
    def demo(self):
        loop = asyncio.get_event_loop()
        tasks = [task1.version(), task2.version()]
        result = loop.run_until_complete(asyncio.wait(tasks))
        loop.close()
        print(result)

class Task():
    @asyncio.coroutine
    def version(self):
        print('before')
        result = yield from differenttask.GetVersion()
        # result = yield from asyncio.sleep(1)
        print('after')
I found that all the examples use asyncio's own functions to do the non-blocking work. How can I make my own function work as a coroutine?
What I want to achieve is that each task fires off its request and, without waiting for the response, switches to the next task. When I tried this, I got RuntimeError: Task got bad yield: 'hostname', where 'hostname' is one item in my expected result.
So, as @AndrewSvetlov said, differenttask.GetVersion() is a regular synchronous function. I have tried the second method suggested in a similar post, the one that says "Keep your synchronous implementation of searching...":
@asyncio.coroutine
def version(self):
    return (yield from asyncio.get_event_loop().run_in_executor(None, self._proxy.GetVersion()))
And it still doesn't work. Now the error is:
Task exception was never retrieved
future: <Task finished coro=<Task.version() done, defined at /root/syi.py:34> exception=TypeError("'dict' object is not callable",)>
I'm not sure I understand it right; please advise.
Change to
@asyncio.coroutine
def version(self):
    return (yield from asyncio.get_event_loop()
            .run_in_executor(None, self._proxy.GetVersion))
Pay attention: self._proxy.GetVersion is not called here; a reference to the function is passed into the loop executor.
Now all I/O performed by GetVersion() is still synchronous but executed in a thread pool.
That may or may not be a benefit for you.
If the whole program uses a thread-pool based solution only, you perhaps need concurrent.futures.ThreadPoolExecutor, not asyncio.
If the greater part of the application is built on top of asynchronous libraries and only a relatively small part uses thread pools, that's fine. A modern-syntax version is sketched below.
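For reference, on newer Python versions the same idea can be written with native async/await syntax, since the @asyncio.coroutine decorator is deprecated (self._proxy.GetVersion is the synchronous call from the question):

import asyncio

class Task:
    async def version(self):
        loop = asyncio.get_running_loop()
        # pass the function itself; the executor calls it in a worker thread
        return await loop.run_in_executor(None, self._proxy.GetVersion)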
