Python asyncio, aioamqp callbacks and pytest

I'm trying to write some tests for some asynchronous Python code using the aioamqp message broker, but pytest and callbacks fail me.
Simply put, when the aioamqp basic_consume() function receives a message and calls the assigned asynchronous callback, inside the callback I can do whatever I like -- reference unassigned variables, assert something outrageous -- and pytest happily passes the test. Clearly an exception gets raised under the hood and the test is interrupted, since the callback function never runs further than the first failing line, but the failure never rises all the way to pytest.
Here's a code snippet to demonstrate:
import aioamqp
import asyncio
import pytest

MQ_HOST = '0.0.0.0'
MQ_PORT = 5672
MQ_LOGIN = 'login'
MQ_PASSWORD = 'password'


class MockMQ:
    def __init__(self):
        self.loop = asyncio.get_event_loop()
        self.transport = None
        self.protocol = None

    async def connect(self):
        try:
            self.transport, self.protocol = await aioamqp.connect(
                host=MQ_HOST, port=MQ_PORT, login=MQ_LOGIN, password=MQ_PASSWORD
            )
            self.channel = await self.protocol.channel()
        except aioamqp.AmqpClosedConnection:
            print('closed connection')
            return

    async def close(self):
        await self.protocol.close()
        self.transport.close()

    async def publish(self, data, queue_name, exchange='', properties=None):
        queue = await self.channel.queue_declare(queue_name)
        await self.channel.publish(data, exchange, queue_name, properties=properties)

    async def consume(self, callback, queue_name):
        await self.channel.basic_consume(callback, queue_name=queue_name)


@pytest.mark.asyncio
async def test_mq():
    """Basic ping-pong test for RabbitMQ."""
    QUEUE_NAME = 'my_queue'

    @pytest.mark.asyncio
    async def callback(channel, body, envelope, properties):
        """This is the callback called when a MQ message is consumed."""
        print('we are here')
        await channel.basic_client_ack(envelope.delivery_tag)
        print(body)  # this gets printed as well
        foo = bar * 2  # this is where we fail
        assert body == b'bar'
        print('we never arrive here')

    mq = MockMQ()
    await mq.connect()
    await mq.consume(callback, QUEUE_NAME)
    await mq.publish(b'foo', QUEUE_NAME)
    await asyncio.sleep(1.0)
    await mq.close()


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(test_mq())
Running this as the main program (or from IPython) correctly results in an exception, since pytest isn't there to swallow it.
What is the proper way of writing tests for pytest in this case? pytest-asyncio does not seem to affect this issue in the least.
EDIT: I might as well add that my dev environment uses Django and pytest-django, but removing it doesn't change the result either.
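One hedged workaround sketch (my own, assuming pytest-asyncio and the MockMQ helper from the snippet above): instead of asserting inside the consumer callback, hand the message, or any exception, back to the test through an asyncio.Future that the test body awaits, so the failure surfaces where pytest can report it:

import asyncio
import pytest


@pytest.mark.asyncio
async def test_mq_with_future():
    """Sketch: surface callback failures to the test via a Future."""
    QUEUE_NAME = 'my_queue'
    result = asyncio.get_running_loop().create_future()

    async def callback(channel, body, envelope, properties):
        await channel.basic_client_ack(envelope.delivery_tag)
        try:
            assert body == b'bar'        # publishing b'foo' makes this fail...
            result.set_result(body)
        except Exception as exc:         # ...and the failure is captured here
            result.set_exception(exc)

    mq = MockMQ()                        # helper class from the question
    await mq.connect()
    await mq.consume(callback, QUEUE_NAME)
    await mq.publish(b'foo', QUEUE_NAME)

    # If the callback failed, wait_for re-raises its exception right here,
    # inside the test body, where pytest can see it.
    assert await asyncio.wait_for(result, timeout=5.0) == b'bar'
    await mq.close()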

Related

cannot perform operation: another operation is in progress in pytest

I want to test some functions that work with asyncpg. If I run one test at a time, it works fine. But if I run several tests at once, all tests except the first crash with the error asyncpg.exceptions._base.InterfaceError: cannot perform operation: another operation is in progress.
Tests:
@pytest.mark.asyncio
async def test_project_connection(superuser_id, project_id):
    data = element_data_random(project_id)
    element_id = (await resolve_element_create(data=data, user_id=superuser_id))["id"]
    project_elements = (await db_projects_element_ids_get([project_id]))[project_id]
    assert element_id in project_elements


@pytest.mark.asyncio
async def test_project_does_not_exist(superuser_id):
    data = element_data_random(str(uuid.uuid4()))
    with pytest.raises(ObjectWithIdDoesNotExistError):
        await resolve_element_create(data=data, user_id=superuser_id)
All functions that work with the DB use the pool and look like:
async def <some_db_func>(*args):
    pool = await get_pool()
    await pool.execute(...)  # or fetch/fetchrow/fetchval
How I get the pool:
db_pool = None


async def get_pool():
    global db_pool

    async def init(con):
        await con.set_type_codec('jsonb', encoder=ujson.dumps, decoder=ujson.loads, schema='pg_catalog')
        await con.set_type_codec('json', encoder=ujson.dumps, decoder=ujson.loads, schema='pg_catalog')

    if not db_pool:
        dockerfiles_dir = os.path.join(src_dir, 'dockerfiles')
        env_path = os.path.join(dockerfiles_dir, 'dev.env')
        try:
            # When code and DB are inside docker containers
            host = 'postgres-docker'
            socket.gethostbyname(host)
        except socket.error:
            # When code is on localhost, but DB is inside a docker container
            host = 'localhost'
        load_dotenv(dotenv_path=env_path)
        db_pool = await asyncpg.create_pool(
            database=os.getenv("POSTGRES_DBNAME"),
            user=os.getenv("POSTGRES_USER"),
            password=os.getenv("POSTGRES_PASSWORD"),
            host=host,
            init=init
        )
    return db_pool
As far as I understand, under the hood asyncpg takes a connection from the pool and runs the request on that connection when you go through the pool, so each request should get its own connection. However, this error still occurs, and it is raised when one connection tries to handle two requests at the same time.
Okay, thanks to @Adelin I realized that I need to run each asynchronous test synchronously. I'm new to asyncio, so I didn't understand it right away and had to find a solution.
It was:
@pytest.mark.asyncio
async def test_...(*args):
    result = await <some_async_func>
    assert result == expected_result
It became:
def test_...(*args):
    async def inner():
        result = await <some_async_func>
        assert result == expected_result
    asyncio.get_event_loop().run_until_complete(inner())
The problem happens because each test function creates its own event loop, which confuses the asyncpg pool about which event loop it belongs to.
You can change the event loop scope from "function" to "session" with the fixture below in conftest.py. You don't need to run the tests sequentially.
import asyncio
import pytest


@pytest.fixture(scope="session")
def event_loop(request):
    loop = asyncio.get_event_loop_policy().new_event_loop()
    yield loop
    loop.close()
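As a follow-up sketch (not part of the original answer), the pool itself can be exposed as a session-scoped fixture that reuses the question's get_pool(), so every test shares one pool bound to that single loop. Depending on the pytest-asyncio version, the decorator may need to be pytest_asyncio.fixture instead of pytest.fixture:

import pytest


@pytest.fixture(scope="session")
async def db_pool(event_loop):
    # Hypothetical fixture: every test that requests `db_pool` shares one
    # asyncpg pool, created on the session-scoped loop defined above.
    pool = await get_pool()   # get_pool() as defined in the question
    yield pool
    await pool.close()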

Why is an asyncio task garbage collected when opening a connection inside it?

I am creating a server which needs to make an external request while responding. To handle concurrent requests I'm using Python's asyncio library. I have followed some examples from the standard library. It seems however that some of my tasks are destroyed, printing Task was destroyed but it is pending! to my terminal. After some debugging and research I found a stackoverflow answer which seemed to explain why.
I have created a minimal example demonstrating this effect below. My question is: in what way should one counteract this effect? Storing a hard reference to the task, for example by storing asyncio.current_task() in a global variable, mitigates the issue. It also seems to work fine if I wrap the future remote_read.read() as await asyncio.wait_for(remote_read.read(), 5). However, I do feel like these solutions are ugly.
# run and visit http://localhost:8080/ in your browser
import asyncio
import gc


async def client_connected_cb(reader, writer):
    remote_read, remote_write = await asyncio.open_connection("google.com", 443, ssl=True)
    await remote_read.read()


async def cleanup():
    while True:
        gc.collect()
        await asyncio.sleep(1)


async def main():
    server = await asyncio.start_server(client_connected_cb, "localhost", 8080)
    await asyncio.gather(server.serve_forever(), cleanup())


asyncio.run(main())
I am running Python 3.10 on macOS 10.15.7.
It looks like, for the time being, the only way is actually keeping a reference manually.
Maybe a decorator is more convenient than having to manually add the code in each async function.
I opted for a class design, so that a class attribute can hold the hard references while the tasks run. (A local variable in the wrapper function would be part of the task-reference cycle, and the garbage collection would trigger all the same):
# run and visit http://localhost:8080/ in your browser
import asyncio
import gc
from functools import wraps
import weakref


class Shielded:
    registry = set()

    def __init__(self, func):
        self.func = func

    async def __call__(self, *args, **kw):
        self.registry.add(task := asyncio.current_task())
        try:
            result = await self.func(*args, **kw)
        finally:
            self.registry.remove(task)
        return result


def _Shielded(func):
    # Used along with the print sequence below to show that the task really is destroyed
    async def wrapper(*args, **kwargs):
        ref = weakref.finalize(asyncio.current_task(), lambda: print("task destroyed"))
        return await func(*args, **kwargs)
    return wrapper


@Shielded
async def client_connected_cb(reader, writer):
    print("at task start")
    # registry.append(asyncio.current_task())
    # I've connected this to a socket in an interactive session; I'd explicitly .close() it for debugging:
    remote_read, remote_write = await asyncio.open_connection("localhost", 8060, ssl=False)
    print("commencing remote read")
    await remote_read.read()
    print("task complete")


async def cleanup():
    while True:
        gc.collect()
        await asyncio.sleep(1)


async def main():
    server = await asyncio.start_server(client_connected_cb, "localhost", 8080)
    await asyncio.gather(server.serve_forever(), cleanup())


asyncio.run(main())
Moreover, I wanted to "really see it", so I created the "fake" _Shielded decorator above, which just logs something when the underlying task gets deleted: with it, "task complete" is indeed never printed.

Closing ClientSession when keyboard interrupt

I am making a Discord bot with Python. When a user types a command, the bot fetches data from a URL and shows it. I use aiohttp for asynchronous HTTP requests, but the discord.py documentation says:
Since it is better to not create a session for every request, you should store it in a variable and then call session.close on it when it needs to be disposed.
So I changed all my code from
async with aiohttp.ClientSession() as session:
    async with session.get('url') as response:
        # something to do
into
# Global variable
session = aiohttp.ClientSession()

async with session.get('url') as response:
    # something to do
All HTTP requests use the globally defined session. But when I run this code and stop it with a keyboard interrupt (Ctrl + C), these warning messages appear:
Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x0000015A45ADBDD8>
Unclosed connector
connections: ['[(<aiohttp.client_proto.ResponseHandler object at 0x0000015A464925E8>, 415130.265)]']
connector: <aiohttp.connector.TCPConnector object at 0x0000015A454B3320>
How can I close the ClientSession when the program is stopped by a keyboard interrupt?
What I tried:
I tried the following, but nothing worked well.
Making a class and closing the session in its __del__. __del__ was not called on keyboard interrupt.
class Session:
    def __init__(self):
        self._session = aiohttp.ClientSession()

    def __del__(self):
        self._session.close()
An infinite loop in main that catches KeyboardInterrupt. The program blocks in bot.run(), so it never reaches that code.
from discord.ext import commands

if __name__ == "__main__":
    bot = commands.Bot()
    bot.run(token)  # blocked
    try:
        while True:
            sleep(1)
    except KeyboardInterrupt:
        session.close()
Closing the session when the bot is disconnected. on_disconnect was not called on keyboard interrupt.
@bot.event
async def on_disconnect():
    await session.close()
Edit: I missed await before session.close(), but that was just a mistake I made while writing this question. Everything I tried still didn't work as expected when written correctly with await.
You must await the closing of a ClientSession object:
await session.close()
Notice that it is a coroutine in the docs here. Your attempt #3 is probably best suited to this problem, as on_disconnect is naturally an async function.
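For illustration only (not from the original answer), one hedged sketch of the same idea is to give the bot ownership of the session and close it in an async shutdown path. The subclass and helper names below are my own; the sketch only relies on discord.py's close() being a coroutine that is run during shutdown, including after Ctrl + C in bot.run():

import aiohttp
from discord.ext import commands


class MyBot(commands.Bot):
    """Sketch: the bot owns one shared ClientSession and closes it on shutdown."""

    session = None  # shared aiohttp.ClientSession, created lazily

    async def get_session(self):
        # Lazily create the shared session inside the running event loop.
        if self.session is None:
            self.session = aiohttp.ClientSession()
        return self.session

    async def close(self):
        # Runs as part of the bot's normal shutdown sequence.
        if self.session is not None:
            await self.session.close()
        await super().close()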
I tried the following code and it seems to work well.
import asyncio
import aiohttp


class Session:
    def __init__(self):
        self._session = aiohttp.ClientSession()

    def __del__(self):
        loop = asyncio.get_event_loop()
        loop.run_until_complete(self.close())

    async def close(self):
        await self._session.close()


session = Session()

Variable in Async function not re-evaluated in while-True loop

I made a dummy server to test my websockets application against. It listens to subscription messages, then gives information about those subscriptions through the socket.
The class' subscriptions attribute is empty at initialisation and should fill up as the listen() function receives subscription messages. However, it seems as though self.subscriptions in talk() is never appended to, leaving it stuck in its infinite while-loop and never transmitting messages.
The problem is solved by adding a line await asyncio.sleep(1) after the for-loop. But why? Shouldn't self.subscriptions be re-evaluated every time the for-loop is started?
Code below:
class DummyServer:
    def __init__(self):
        self.subscriptions = []

    def start(self):
        return websockets.serve(self.handle, 'localhost', 8765)

    async def handle(self, websocket, path):
        self.ws = websocket
        listen_task = asyncio.ensure_future(self.listen())
        talk_task = asyncio.ensure_future(self.talk())
        done, pending = await asyncio.wait(
            [listen_task, talk_task],
            return_when=asyncio.FIRST_COMPLETED
        )
        for task in pending:
            task.cancel()

    async def listen(self):
        while True:
            try:
                msg = await self.ws.recv()
                msg = json.loads(msg)
                await self.triage(msg)  # handles subscriptions
            except Exception as e:
                await self.close()
                break

    async def talk(self):
        while True:
            for s in self.subscriptions:
                dummy_data = {
                    'product_id': s
                }
                try:
                    await self.send(json.dumps(dummy_data))
                except Exception as e:
                    await self.close()
                    break
            await asyncio.sleep(1)  # without this line, no message is ever sent
At the start of your function, subscriptions is empty and the for body is not evaluated. As a result, your coroutine is virtually the same as:
async def talk(self):
    while True:
        pass
The while-loop does not contain a "context switching point", meaning that the asyncio event loop basically hangs there, forever executing the blocking while loop.
Adding await sleep() breaks the magic circle; even await sleep(0) could help.
Clever code should probably make use of asyncio.Condition in combination with self.subscriptions, but this matter goes beyond the scope of your original question.
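For illustration, a rough sketch (my own, not from the answer) of that asyncio.Condition idea: the subscription handler notifies the condition when a subscription arrives, and talk() awaits it instead of spinning. triage() and send() stand in for the question's methods:

import asyncio
import json


class DummyServer:
    def __init__(self):
        self.subscriptions = []
        self.have_subs = asyncio.Condition()  # added for this sketch

    async def triage(self, msg):
        # Hypothetical subscription handler: record the subscription, then wake up talk().
        async with self.have_subs:
            self.subscriptions.append(msg['product_id'])
            self.have_subs.notify_all()

    async def talk(self):
        while True:
            async with self.have_subs:
                # Yields to the event loop until at least one subscription exists.
                await self.have_subs.wait_for(lambda: len(self.subscriptions) > 0)
            for s in self.subscriptions:
                await self.send(json.dumps({'product_id': s}))  # send() as in the question
            await asyncio.sleep(1)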

Python asyncio: unreferenced tasks are destroyed by garbage collector?

I am writing a program that accepts RPC requests over AMQP for executing network requests (CoAP). When processing RPC requests, the aioamqp callback generates tasks that are responsible for network IO. These can be considered background tasks that will run indefinitely, streaming network responses over AMQP (in this case, one RPC request triggers an RPC response plus data streaming).
I noticed that in my original code the network task would be destroyed after seemingly random time intervals (before it was finished), asyncio would then print the following warning "Task was destroyed but it is pending". This issue is similar to the one described here: https://bugs.python.org/issue21163.
For now I have circumvented the issue by storing a hard reference in a module-level list, which prevents the GC from destroying the task object. However, I was wondering if there is a better workaround? Ideally I would want to call await task in the RPC callback, but I noticed that this prevents any further AMQP operations from completing: e.g. creating a new AMQP channel stalls, and receiving RPC requests over AMQP also stalls. I am unsure what is causing this stalling, however (as the callback is itself a coroutine, I would expect awaiting not to stall the entire aioamqp library).
I am posting the source below for the RPC client and server, both are based on the aioamqp/aiocoap examples. In the server, on_rpc_request is the amqp rpc callback and send_coap_obs_request is the networking coroutine that gets destroyed when the 'obs_tasks.append(task)' statement is removed.
client.py:
"""
CoAP RPC client, based on aioamqp implementation of RPC examples from RabbitMQ tutorial
"""
import base64
import json
import uuid
import asyncio
import aioamqp
class CoAPRpcClient(object):
def __init__(self):
self.transport = None
self.protocol = None
self.channel = None
self.callback_queue = None
self.waiter = asyncio.Event()
async def connect(self):
""" an `__init__` method can't be a coroutine"""
self.transport, self.protocol = await aioamqp.connect()
self.channel = await self.protocol.channel()
result = await self.channel.queue_declare(queue_name='', exclusive=True)
self.callback_queue = result['queue']
await self.channel.basic_consume(
self.on_response,
no_ack=True,
queue_name=self.callback_queue,
)
async def on_response(self, channel, body, envelope, properties):
if self.corr_id == properties.correlation_id:
self.response = body
self.waiter.set()
async def call(self, n):
if not self.protocol:
await self.connect()
self.response = None
self.corr_id = str(uuid.uuid4())
await self.channel.basic_publish(
payload=str(n),
exchange_name='',
routing_key='coap_request_rpc_queue',
properties={
'reply_to': self.callback_queue,
'correlation_id': self.corr_id,
},
)
await self.waiter.wait()
await self.protocol.close()
return json.loads(self.response)
async def rpc_client():
coap_rpc = CoAPRpcClient()
request_dict = {}
request_dict_json = json.dumps(request_dict)
print(" [x] Send RPC coap_request({})".format(request_dict_json))
response_dict = await coap_rpc.call(request_dict_json)
print(" [.] Got {}".format(response_dict))
asyncio.get_event_loop().run_until_complete(rpc_client())
server.py:
"""
CoAP RPC server, based on aioamqp implementation of RPC examples from RabbitMQ tutorial
"""
import base64
import json
import sys
import logging
import warnings
import asyncio
import aioamqp
import aiocoap
amqp_protocol = None
coap_client_context = None
obs_tasks = []
AMQP_COAP_NOTIFICATIONS_EXCHANGE_NAME = 'topic_coap'
AMQP_COAP_NOTIFICATIONS_TOPIC_NAME = 'topic'
AMQP_COAP_NOTIFICATIONS_ROUTING_KEY = 'coap.response'
def create_response_dict(coap_request, coap_response):
response_dict = {'request_uri': "", 'code': 0}
response_dict['request_uri'] = coap_request.get_request_uri()
response_dict['code'] = coap_response.code
if len(coap_response.payload) > 0:
response_dict['payload'] = base64.b64encode(coap_response.payload).decode('utf-8')
return response_dict
async def handle_coap_response(amqp_envelope, amqp_properties, coap_request, coap_response):
# create response dict:
response_dict = create_response_dict(coap_request, coap_response)
message = json.dumps(response_dict)
# create new channel:
global amqp_protocol
amqp_channel = await amqp_protocol.channel()
await amqp_channel.basic_publish(
payload=message,
exchange_name='',
routing_key=amqp_properties.reply_to,
properties={
'correlation_id': amqp_properties.correlation_id,
},
)
await amqp_channel.basic_client_ack(delivery_tag=amqp_envelope.delivery_tag)
print(" [.] handle_coap_response() published response: {}".format(response_dict))
def incoming_observation(coap_request, coap_response):
asyncio.async(handle_coap_notification(coap_request, coap_response))
async def handle_coap_notification(coap_request, coap_response):
# create response dict:
response_dict = create_response_dict(coap_request, coap_response)
message = json.dumps(response_dict)
# create new channel:
global amqp_protocol
amqp_channel = await amqp_protocol.channel()
await amqp_channel.exchange(AMQP_COAP_NOTIFICATIONS_EXCHANGE_NAME, AMQP_COAP_NOTIFICATIONS_TOPIC_NAME)
await amqp_channel.publish(message, exchange_name=AMQP_COAP_NOTIFICATIONS_EXCHANGE_NAME, routing_key=AMQP_COAP_NOTIFICATIONS_ROUTING_KEY)
print(" [.] handle_coap_notification() published response: {}".format(response_dict))
async def send_coap_obs_request(amqp_envelope, amqp_properties, request_dict, coap_request):
observation_is_over = asyncio.Future()
try:
global coap_client_context
requester = coap_client_context.request(coap_request)
requester.observation.register_errback(observation_is_over.set_result)
requester.observation.register_callback(lambda data, coap_request=coap_request: incoming_observation(coap_request, data))
try:
print(" [..] Sending CoAP obs request: {}".format(request_dict))
coap_response = await requester.response
except socket.gaierror as e:
print("Name resolution error:", e, file=sys.stderr)
return
except OSError as e:
print("Error:", e, file=sys.stderr)
return
if coap_response.code.is_successful():
print(" [..] Received CoAP response: {}".format(coap_response))
await handle_coap_response(amqp_envelope, amqp_properties, coap_request, coap_response)
else:
print(coap_response.code, file=sys.stderr)
if coap_response.payload:
print(coap_response.payload.decode('utf-8'), file=sys.stderr)
sys.exit(1)
exit_reason = await observation_is_over
print("Observation is over: %r"%(exit_reason,), file=sys.stderr)
finally:
if not requester.response.done():
requester.response.cancel()
if not requester.observation.cancelled:
requester.observation.cancel()
async def on_rpc_request(amqp_channel, amqp_body, amqp_envelope, amqp_properties):
print(" [.] on_rpc_request(): received RPC request: {}".format(amqp_body))
request_dict = {} # hardcoded to vdna.be for SO example
aiocoap_code = aiocoap.GET
aiocoap_uri = "coap://vdna.be/obs"
aiocoap_payload = ""
# as we are ready to send the CoAP request, ack the client already indicating we have received the RPC request
await amqp_channel.basic_client_ack(delivery_tag=amqp_envelope.delivery_tag)
coap_request = aiocoap.Message(code=aiocoap_code, uri=aiocoap_uri, payload=aiocoap_payload)
coap_request.opt.observe = 0
task = asyncio.ensure_future(send_coap_obs_request(amqp_envelope, amqp_properties, request_dict, coap_request))
# we have to keep a hard ref to this task, otherwise the python garbage collector destroyes the task before it is completed. See https://bugs.python.org/issue21163
# this is apparent from the "Task was destroyed but it is pending" exception thrown after random (lengthy) time intervals, probably the time interval is related to when the gc is triggered
# await task # this does not seem to work, as it prevents new amqp operations from executing (e.g. amqp channels do not get created)
# we are actually not interested in waiting for the task anyway, so instead just keep a hard ref to the task in the obs_tasks list
obs_tasks.append(task) # TODO: when do we remove the task from the list?
async def amqp_connect():
try:
(transport, protocol) = await aioamqp.connect('localhost', 5672)
print(" [x] Connected to AMQP broker")
return (transport, protocol)
except aioamqp.AmqpClosedConnection as ex:
print("closed connections: {}".format(ex))
raise ex
async def main():
"""Open AMQP connection to broker, subscribe to coap_request_rpc_queue and setup aiocoap client context """
try:
global amqp_protocol
(amqp_transport, amqp_protocol) = await amqp_connect()
channel = await amqp_protocol.channel()
await channel.queue_declare(queue_name='coap_request_rpc_queue')
await channel.basic_qos(prefetch_count=10, prefetch_size=0, connection_global=False)
await channel.basic_consume(on_rpc_request, queue_name='coap_request_rpc_queue')
print(" [x] Awaiting CoAP request RPC requests")
except aioamqp.AmqpClosedConnection as ex:
print("amqp_connect: closed connections: {}".format(ex))
exit()
global coap_client_context
coap_client_context = await aiocoap.Context.create_client_context()
if __name__ == "__main__":
loop = asyncio.get_event_loop()
loop.set_debug(True)
asyncio.async(main())
loop.run_forever()
When a task is scheduled, its _step callback is scheduled in the loop. That callback maintains a reference to the task through self. I have not checked the code, but I have high confidence that the loop maintains a reference to its callbacks. However, when a task awaits some awaitable or future, the _step callback is not scheduled. In that case, the task adds a done callback that retains a reference to the task, but the loop does not retain references to tasks waiting for futures.
So long as something retains a reference to the future that the task is waiting on, all is well. However, if nothing retains a hard reference to the future, then the future can get garbage collected, and when that happens the task can get garbage collected.
So, I'd look for things that your task calls where the future the task is waiting on might not be referenced.
In general the future needs to be referenced so someone can set its result eventually, so it is very likely a bug if you have unreferenced futures.
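As a small illustration of that advice (a sketch, not from the original answer): the usual pattern is to keep each background task in a collection and let a done-callback drop it again, which also answers the question's TODO about when to remove tasks from obs_tasks:

import asyncio

background_tasks = set()


def spawn(coro):
    """Schedule coro and keep a hard reference to the task until it finishes."""
    task = asyncio.ensure_future(coro)
    background_tasks.add(task)
    # Once the task completes, discard() drops our reference so it can be collected.
    task.add_done_callback(background_tasks.discard)
    return task


# e.g. in on_rpc_request():
#     spawn(send_coap_obs_request(amqp_envelope, amqp_properties, request_dict, coap_request))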
