How can I make a coroutine stop with timeout?
I don't understand why asyncio.wait_for() doesn't work for me.
I have this piece of code (I'm planning to write my own implementation of a telnet client):
def expect(self, pattern, timeout=20):
    if type(pattern) == str:
        pattern = pattern.encode('ascii', 'ignore')
    return self.loop.run_until_complete(asyncio.wait_for(self.asyncxpect(pattern), timeout))
async def asyncxpect(self, pattern):  # receives data in a cumulative way until match is found
    regexp = re.compile(b'(?P<payload>[\s\S]*)(?P<pattern>%s)' % pattern)
    self.buffer = b''
    while True:
        # add timeout
        # add exception handling for unexpectedly closed connections
        data = await self.loop.sock_recv(self.sock, 10000)
        self.buffer += data
        m = re.match(regexp, self.buffer)
        if m:
            payload = m.group('payload')
            match = m.group('pattern')
            return payload, match
As I understood it, this code should at some point (at the await statement) return control to the event loop. I thought that would happen when there is no more data to receive.
And once the event loop has control, it can stop the coroutine with a timeout.
But if the server doesn't send anything useful (that matches), my code just gets stuck in this loop, right at the await point.
I think this is different from this problem, Python asyncio force timeout, because I'm not using blocking statements like time.sleep(n).
When the server closes the connection, sock_recv returns an empty bytearray (b''), indicating end of file. Since you don't handle that condition, your code ends up stuck in an infinite loop processing the same buffer.
To correct it, add something like:
if data == b'':
    break
...after the data = await loop.sock_recv(...) line.
But the above still doesn't explain why wait_for is unable to cancel the rogue coroutine. The problem is that await doesn't mean "pass control to the event loop", as it is sometimes understood. It means "request value from the provided awaitable object, yielding control to the event loop if (and as long as) the object indicates that it does not have a value ready." The if is crucial: if the object does have a value ready when first asked, this value will be used immediately without ever deferring to the event loop. In other words, await doesn't guarantee that the event loop will get a chance to run.
For example, the following coroutine completely blocks the event loop and prevents any other coroutine from ever running, despite its inner loop consisting of nothing but awaiting:
async def busy_loop():
    while True:
        await noop()

async def noop():
    pass
In your example, since the socket does not block at all when it is at end-of-file, the coroutine is never suspended, and (in collusion with the above bug) your coroutine never exits.
To ensure that other tasks get a chance to run, you can add await asyncio.sleep(0) in a loop. This should not be necessary for most code, where requesting IO data will soon result in a wait, at which point the event loop will kick in. (In fact, needing to do so often indicates a design flaw.) In this case it is only in combination with the EOF-handling bug that the code gets stuck.
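Putting the two fixes together, here is a minimal sketch of the corrected receive loop (it reuses the question's self.loop and self.sock attributes; raising ConnectionError on EOF is just one possible choice):

async def asyncxpect(self, pattern):
    # Receive data cumulatively until the pattern matches or the peer closes.
    regexp = re.compile(rb'(?P<payload>[\s\S]*)(?P<pattern>%s)' % pattern)
    self.buffer = b''
    while True:
        data = await self.loop.sock_recv(self.sock, 10000)
        if data == b'':
            # EOF: the server closed the connection; bail out instead of spinning
            raise ConnectionError('connection closed before pattern matched')
        self.buffer += data
        m = regexp.match(self.buffer)
        if m:
            return m.group('payload'), m.group('pattern')

With EOF handled, the coroutine genuinely suspends at sock_recv whenever no data is available, which gives wait_for the opportunity to cancel it when the timeout expires.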
Related
I appreciate that the question I am about to ask is rather broad but, as a newcomer to Python, I am struggling to find the [best] way of doing something which would be trivial in, say, Node.js, and pretty trivial in other environments such as C#.
Let's say that there is a warehouse full of stuff. And let's say that there is a websocket interface onto that warehouse with two characteristics: on client connection it pumps out a full list of the warehouse's current inventory, and it then follows that up with further streaming updates when the inventory changes.
The web is full of examples of how, in Python, you connect to the warehouse and respond to changes in its state. But...
What if I want to connect to two warehouses and do something based on the combined information retrieved separately from each one? And what if I want to do things based on factors such as time, rather than solely being driven by inventory changes and incoming websocket messages?
In all the examples I've seen - and it's beginning to feel like hundreds - there is, somewhere, in some form, a run() or a run_forever() or a run_until_complete() etc. In other words, the I/O may be asynchronous, but there is always a massive blocking operation in the code, and always two fundamental assumptions which don't fit my case: that there will only be one websocket connection, and that all processing will be driven by events sent out by the [single] websocket server.
It's very unclear to me whether the answer to my question is some sort of use of multiple event loops, or of multiple threads, or something else.
To date, experimenting with Python has felt rather like being on the penthouse floor, admiring the quirky but undeniably elegant decor. But then you get in the elevator, press the button marked "parallelism" or "concurrency", and the elevator goes into freefall, eventually depositing you in a basement filled with some pretty ugly and steaming pipes.
... Returning from flowery metaphors back to the technical, the key thing I'm struggling with is the Python equivalent of, say, Node.js code which could be as trivially simple as the following example [left inelegant for simplicity]:
var aggregateState = { ... some sort of representation of combined state ... };

var socket1 = new WebSocket("wss://warehouse1");
socket1.on("message", OnUpdateFromWarehouse);

var socket2 = new WebSocket("wss://warehouse2");
socket2.on("message", OnUpdateFromWarehouse);

function OnUpdateFromWarehouse(message)
{
    ... Take the information and use it to update aggregate state from both warehouses ...
}
Answering my own question, in the hope that it may help other Python newcomers... asyncio seems to be the way to go (though there are gotchas such as the alarming ease with which you can deadlock the event loop).
Assuming the use of an asyncio-friendly websocket module such as websockets, what seems to work is a framework along the following lines - shorn, for simplicity, of logic such as reconnects. (The premise remains a warehouse which sends an initial list of its full inventory, and then sends updates to that initial state.)
class Warehouse:

    def __init__(self, warehouse_url):
        self.warehouse_url = warehouse_url
        self.inventory = {}  # Some description of the warehouse's inventory

    async def destroy(self):
        if self.websocket.open:
            await self.websocket.close()  # Terminates any recv() in wait_for_incoming()
        await self.incoming_message_task  # keep asyncio happy by awaiting the "background" task

    async def start(self):
        try:
            # Connect to the warehouse
            self.websocket = await connect(self.warehouse_url)
            # Get its initial message which describes its full state
            initial_inventory = await self.websocket.recv()
            # Store the initial inventory
            self.process_initial_inventory(initial_inventory)
            # Set up a "background" task for further streaming reads of the web socket
            self.incoming_message_task = asyncio.create_task(self.wait_for_incoming())
            # Done
            return True
        except Exception:
            # Connection failed (or some unexpected error)
            return False

    async def wait_for_incoming(self):
        while self.websocket.open:
            try:
                update_message = await self.websocket.recv()
                asyncio.create_task(self.process_update_message(update_message))
            except Exception:
                # Presumably, socket closure
                pass

    def process_initial_inventory(self, initial_inventory_message):
        # ... Process initial_inventory_message into self.inventory ...
        pass

    async def process_update_message(self, update_message):
        # ... Merge update_message into self.inventory, and fire some sort of
        # event so that the object's creator can detect the change. There seems
        # to be no consensus about what is a pythonic way of implementing events,
        # so I'll declare that - potentially trivial - element as out-of-scope ...
        pass
After completing the initial connection logic, the key thing is setting up a "background" task which repeatedly reads further update messages coming in over the websocket. The code above doesn't include any firing of events, but there are all sorts of ways in which process_update_message() could do this (many of them trivially simple), allowing the object's creator to deal with notifications whenever and however it sees fit. The streaming messages will continue to be received, and any events will continue to be fired, for as long as the object's creator plays nicely with asyncio and participates in co-operative multitasking.
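For instance, one trivially simple option is a callback attribute on the Warehouse (on_change is an invented name here, set to None in __init__ and assigned by the creator):

async def process_update_message(self, update_message):
    # ... merge update_message into self.inventory ...
    if self.on_change is not None:
        self.on_change(self)  # let the creator react to the new state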
With that in place, a connection can be established along the following lines:
async def main():
    warehouse1 = Warehouse("wss://warehouse1")
    if await warehouse1.start():
        # ... Connection succeeded. Update messages will now be processed in the
        # "background", provided that other users of the event loop yield in some way ...
        ...
    else:
        # ... Connection failed ...
        ...

asyncio.run(main())
Multiple warehouses can be initiated in several ways, including doing a create_task(warehouse.start()) on each one and then doing a gather on the tasks to ensure/check that they're all okay.
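For example, a minimal sketch of that approach (the URLs are placeholders):

async def main():
    warehouses = [Warehouse(url) for url in ("wss://warehouse1", "wss://warehouse2")]
    results = await asyncio.gather(*(asyncio.create_task(w.start()) for w in warehouses))
    if not all(results):
        # ... At least one connection failed ...
        ...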
When it's time to quit, to keep asyncio happy, and to stop it complaining about orphaned tasks, and to allow everything to shut down nicely, it's necessary to call destroy() on each warehouse.
But there's one common element which this doesn't cover. Extending the original premise above, let's say that the warehouse also accepts requests from our websocket client, such as "ship X to Y". The success/failure responses to these requests will come in alongside the general update messages; it generally won't be possible to guarantee that the first recv() after the send() of a request will be the response to that request. This complicates process_update_message().
The best answer I've found may or may not be considered "pythonic" because it uses a Future in a way which is strongly analogous to a TaskCompletionSource in .NET.
Let's invent a couple of implementation details; any real-world scenario is likely to look something like this:
We can supply a request_id when submitting an instruction to the warehouse
The success/failure response from the warehouse repeats the request_id back to us (thus also distinguishing command-response messages from inventory-update messages)
The first step is to have a dictionary which maps the ID of pending, in-progress requests to Future objects:
def __init__(self, warehouse_url):
    ...
    self.pending_requests = {}
The definition of a coroutine which sends a request then looks something like this:
async def send_request(self, some_request_definition):
    # Allocate a unique ID for the request
    request_id = <some unique request id>
    # Create a Future for the pending request
    request_future = asyncio.Future()
    # Store the map of the ID -> Future in the dictionary of pending requests
    self.pending_requests[request_id] = request_future
    # Build a request message to send to the server, somehow including the request_id
    request_msg = <some request definition, including the request_id>
    # Send the message
    await self.websocket.send(request_msg)
    # Wait for the future to complete - we're now asynchronously awaiting
    # activity in a separate function
    await asyncio.wait_for(request_future, timeout=None)
    # Return the result of the Future as the return value of send_request()
    return request_future.result()
A caller can create a request and wait for its asynchronous response using something like the following:
some_result = await warehouse.send_request(<some request def>)
The key to making this all work is then to modify and extend process_update_message() to do the following:
Distinguish between request responses versus inventory updates
For the former, extract the request ID (which our invented scenario says gets repeated back to us)
Look up the pending Future for the request
Do a set_result() on it (the value can be anything, depending on what the server's response says). This completes the Future, which releases send_request() and resolves its await.
For example:
async def process_update_message(self, update_message):
    if <some test that update_message is a request response>:
        request_id = <extract the request ID repeated back in update_message>
        # Get the Future for this request ID
        request_future = self.pending_requests[request_id]
        # Create some sort of return value for send_request() based on the response
        return_value = <some result of the request>
        # Complete the Future, causing send_request() to return
        request_future.set_result(return_value)
    else:
        # ... handle inventory updates as before ...
        ...
I've not used sockets with asyncio, but you're likely just looking for asyncio's open_connection
async def socket_activity(address, callback):
    reader, _ = await asyncio.open_connection(address)
    while True:
        message = await reader.read(4096)  # read() with no size would wait for EOF
        if not message:  # empty bytes on EOF
            break  # connection was closed
        await callback(message)
Then add these to the event loop
tasks = []  # keeping a reference prevents these from being garbage collected
for address in ["wss://warehouse1", "wss://warehouse2"]:
    tasks.append(asyncio.create_task(
        socket_activity(address, callback)
    ))
# return tasks  # or work with them
If you want to wait in a coroutine until N operations are complete, you can use .gather()
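For instance, inside a coroutine, building on the tasks list above:

results = await asyncio.gather(*tasks)  # resumes once every socket_activity() has finished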
Alternatively, you may find Tornado does everything you want and more (I based my answer off this one)
Tornado websocket client: how to async on_message? (coroutine was never awaited)
Code:
loop = asyncio.get_event_loop()
# add tasks to the loop
# ...
# and then run the loop
try:
    loop.run_forever()
except KeyboardInterrupt:
    print(loop)
    # Here I need to run a cleanup, I still need to use the event loop
    # Can I still use the event loop here? like:
    loop.run_until_complete(some_cleanup_coro())
When I print the event loop in the except block, I see the output: WindowsSelectorEventLoop, with closed=False, running=False.
Does it mean I can't use the event loop in the except block? If so, how can I run a cleanup coroutine?
The run_until_complete call hangs and never runs, so pressing Ctrl+C does not exit and I have to close the terminal itself.
What's the difference between loop.close() and loop.stop(), and should I call these? The docs say nothing about loop.stop().
My cleanup_coro() mostly does asyncio.open_connection(..) and just sends and receives one message. (From what I see, the message is not sent at all.)
Does it mean I can't use the event loop in the except block?
You can keep using the event loop until it is closed.
So pressing Ctrl+C does not exit and I have to close the terminal itself.
Ctrl+C doesn't work because something keeps running and never gives the loop a chance to handle the signal. For example:
import asyncio

async def go():
    await asyncio.sleep(60)

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(go())
According to your last edit, I think there is a line of code blocking you before you send the message. But for me to tell you what it is, you would need to post the content of your cleanup_coro(). Alternatively, you can use print() to mark what happens where (i.e., which line of code is pending).
What's the difference between loop.close() and loop.stop(), and should I call these? The docs say nothing about loop.stop().
You can think of loop.stop() and loop.close() as "pause this song" and "we're done with this song". If you want to continue the tasks later, use stop(); when you don't need the running tasks at all, you can close the loop. Note: if any task is running, you must stop the loop before closing it.
In your except block, the loop is no longer running, but not closed, so you can start it again (loop.run*). This state occurs when something calls loop.stop. asyncio will do this for you in the case of exceptions that bubble up through the code that started the loop. You can start and stop a loop as often as you’d like.
Without knowing what some_cleanup_coroutine does, or what other coroutines are still scheduled with the event loop, it's hard to say where your code is getting hung.
Calling loop.close will prevent you from starting it again. You’d need to get a new loop (asyncio.new_event_loop) if you want to do more work.
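A minimal sketch of the whole pattern (some_cleanup_coro() here is just a stand-in for real cleanup work):

import asyncio

async def some_cleanup_coro():
    await asyncio.sleep(0)  # placeholder for real cleanup work

loop = asyncio.get_event_loop()
try:
    loop.run_forever()
except KeyboardInterrupt:
    # The loop is stopped but not closed here, so it can be reused for cleanup.
    loop.run_until_complete(some_cleanup_coro())
finally:
    loop.close()  # after close(), the loop cannot be started again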
I have a little test program which does not work unless I do a client.flush() before client.close(). A naive reading of nats/aio/client.py seems to suggest that close() does some kind of flush(), so I don't understand why my test program fails. Is it necessary to always flush() before close()? Example programs don't seem to indicate so.
I can see that close() calls _flush_pending(), but this does something quite different from flush():
flush() calls _send_ping() to send a PING and waits for a PONG response. _send_ping() writes directly to self._io_writer, and then calls _flush_pending().
_flush_pending() pushes a None (anything will do, I guess) into self._flush_queue. This presumably wakes the _flusher() and causes it to write everything in self._pending to self._io_writer.
publish() calls _send_command() to push the messages onto self._pending, and then also calls _flush_pending() to cause the _flusher() to write everything.
The test program:
#!/usr/bin/env python3.5
import asyncio

import nats.aio.client
import nats.aio.errors

async def send_message(loop):
    mq_url = "nats://nats:password@127.0.0.1:4222"
    client = nats.aio.client.Client()
    await client.connect(io_loop=loop, servers=[mq_url])
    await client.publish("test_subject", "test1".encode())
    #await client.flush()
    await client.close()

def main():
    loop = asyncio.get_event_loop()
    loop.run_until_complete(send_message(loop))
    loop.close()

if __name__ == '__main__':
    main()
FWIW, if I send a large number of messages then, at some point (I haven't yet worked out the exact conditions), messages are sent. We noticed the behaviour when processing a collection of files and sending a message for each line in the file: some files were making it through, others weren't, and it turned out to be the larger files (more lines) that made it. So it looks as though some internal buffer fills up and that forces a flush.
Looks like it was probably a bug, which has just been fixed: https://github.com/nats-io/asyncio-nats/pull/35
I have a Python server that waits for a global flag to be set and then exits.
In a few threads, I have code that waits, using zmq.Poller, for a message. It times out, prints a heartbeat message, then waits on the poller for a new message:
def timed_recv(zock, msec=5000.0):
    poller = zmq.Poller()
    poller.register(zock, zmq.POLLIN)
    events = dict(poller.poll(msec))
    data = None
    if events and events.get(zock) == zmq.POLLIN:
        # if a message came in time, read it.
        data = zock.recv()
    return data
So in the above function, I wait for 5 seconds for a message to arrive. If none does, the function returns, the calling loop prints a message and waits for a new message:
while not do_exit():
    timed_recv(zock)
    print "Program still here!"
sys.exit()
do_exit() checks a global flag for exiting.
Now, if the flag is set, there can be up to a 5 second delay between it being set and the loop exiting. How can I poll for both zock input and for the global flag being set, so that the loop exits quickly?
I thought I could add to the poller a file descriptor that is closed when the global flag is set. Does that seem reasonable? It feels kind of hackish.
Is there a better way to wait for both the global flag and POLLIN on zock?
(We are using zmq version 3.0 on Debian.)
Thanks.
The easiest way is to drop the use of a flag and use another 0mq socket to convey a message. The poller can then wait on both 0mq sockets. The message could be just a single byte; its arrival at the poller is the message, not its content.
In doing that, you're heading down the road to Actor Model programming.
It's a whole lot easier if a design sticks to one programming model; mixing things up (e.g. 0mq and POSIX condition variables) invites a lot of problems.
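A minimal sketch of that idea using pyzmq (the socket names and the inproc endpoint are invented for illustration; PUB/SUB is used so one signal can wake several worker threads):

import zmq

ctx = zmq.Context.instance()

# Main thread: a PUB socket; sending one byte on it signals every worker to exit.
exit_pub = ctx.socket(zmq.PUB)
exit_pub.bind("inproc://exit-signal")

# Each worker thread: a SUB socket subscribed to everything, added to the poll set.
exit_sub = ctx.socket(zmq.SUB)
exit_sub.connect("inproc://exit-signal")
exit_sub.setsockopt(zmq.SUBSCRIBE, b"")

EXIT = object()  # sentinel distinguishing shutdown from a plain timeout

def recv_or_exit(zock, exit_sub, msec=5000):
    poller = zmq.Poller()
    poller.register(zock, zmq.POLLIN)
    poller.register(exit_sub, zmq.POLLIN)
    events = dict(poller.poll(msec))
    if events.get(exit_sub) == zmq.POLLIN:
        exit_sub.recv()  # drain the signal byte
        return EXIT
    if events.get(zock) == zmq.POLLIN:
        return zock.recv()
    return None  # timed out; the caller prints its heartbeat

To shut down, the main thread simply does exit_pub.send(b'x'), and every worker's poll wakes immediately instead of waiting out the 5 second timeout.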
I'm making a chat application in Twisted. Suppose my server is designed in such a way that whenever it detects a client online, it sends the client all of its pending messages (messages cached in a Python list on the server while the client was offline), one by one in a while loop, until the list is exhausted. Something like this:
class MyChat(LineReceiver):

    def connectionMade(self):
        self.factory.clients.append(self)
        while True:
            # retrieve first message from a list of pending-messages(queue) of "self"
            msg = self.retrieveFromQueue(self)
            if msg != "empty":
                self.transport.write(msg)
            else:
                break

    def lineReceived(self, line):
        ...

    def connectionLost(self, reason):
        ...

    def retrieveFromQueue(self, who):
        msglist = []
        if who in self.factory.userMessages:
            msglist = self.factory.userMessages[who]
        if msglist != []:
            msg = msglist.pop(0)  # msglist is a list of strings
            self.factory.userMessages[self] = msglist
            return msg
        else:
            return "empty"

factory.userMessages = {}  # dict of lists of incoming messages of users who aren't online
So according to my understanding of Twisted, the while loop will block the main reactor thread, and any interaction from any other client with the server will not be registered by the server. If that's the case, I'd like an alternative approach that will not block the Twisted thread.
Update: There may be 2000-3000 pending messages per user because of the nature of the app.
I think that https://glyph.twistedmatrix.com/2011/11/blocking-vs-running.html addresses this point.
The answer here depends on what exactly self.retrieveFromQueue(self) does. You implied it's something like:
if self.list_of_messages:
    return self.list_of_messages.pop(0)
return b"empty"
If this is the case, then the answer is one thing. On the other hand, if the implementation is something more like:
return self.remote_mq_client.retrieve_queue_item(self.queue_identifier)
then the answer might be something else entirely. However, note that it's the implementation of retrieveFromQueue upon which the answer appears to hinge.
That there is a while loop isn't quite as important. The while loop reflects the fact that (to use Glyph's words), this code is getting work done.
You may decide that the amount of work this loop represents is too great to get done all at one time. If there are hundreds of millions of queued messages, then copying them one by one into the connection's send buffer will probably use a noticeable amount of both time and memory. In this case, you may wish to consider the producer/consumer pattern and its support in Twisted (see the sketch at the end of this answer). This won't make the code any less (or more) "blocking", but it will make it run in shorter stretches at a time.
So the questions to answer here are really:
whether or not retrieveFromQueue blocks
if it does not block, whether or not there will be so many queued messages that processing them all will cause connectionMade to run for so long that other clients notice a disruption in service
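If it turns out to be the simple in-memory list and sheer volume is the problem, here is a minimal sketch of the producer approach mentioned above (PendingMessageProducer is an invented name): a pull producer's resumeProducing() is called by the transport each time its send buffer drains, so the queue is copied out one message at a time instead of all at once in connectionMade().

from zope.interface import implementer
from twisted.internet.interfaces import IPullProducer

@implementer(IPullProducer)
class PendingMessageProducer:
    def __init__(self, protocol, messages):
        self.protocol = protocol
        self.messages = messages  # list of pending message strings

    def resumeProducing(self):
        # Called by the transport whenever its send buffer has drained.
        if self.messages:
            self.protocol.transport.write(self.messages.pop(0))
        else:
            self.protocol.transport.unregisterProducer()

    def stopProducing(self):
        # The connection went away; drop whatever is left.
        self.messages = []

In connectionMade, instead of the while loop, the protocol would do something like:

producer = PendingMessageProducer(self, self.factory.userMessages.get(self, []))
self.transport.registerProducer(producer, streaming=False)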