I'm running into a somewhat perplexing problem in my setup and I'm not sure if it's because I'm doing something wrong with the asyncio socket primitives or there's a deeper issue with my networking setup.
Basically, I have a server (call it sleep-server) that establishes a long-lived TCP socket connection with another server (call it beacon-server). This is done using asyncio.open_connection:
beacon_reader, beacon_writer = await asyncio.open_connection(
    beacon_server_endpoint.ip, BEACON_PORT)
The long-lived connection is kicked off with an initial request/response to the beacon server. Here's that code:
beacon_writer.write(request)
await beacon_writer.drain()

# Read the beacon server's response
try:
    beacon_response = await asyncio.wait_for(beacon_reader.read(4096), timeout=10)
except asyncio.TimeoutError as e:
    print('Timed out waiting for beacon server sleep response')
    raise SleepRequestError(e)

# Do some stuff with the response data
This works completely fine. After that, the application enters a flow where it periodically writes TCP data to the Beacon server on the same socket:
while True:
    data = method_that_gets_some_data()
    beacon_writer.write(data)
    await beacon_writer.drain()
This also works completely fine. The beacon server receives its data and does what it needs to do. During this stage, the beacon server does not write anything back on the socket.
It's only when I try to receive data from the beacon server socket again that things go wrong. Basically, via a separate mechanism the beacon server is told to fetch the existing socket (the one that is currently being written to by the sleep server) and send a message down to it. For business reasons, the beacon server manages its socket connections using gevent and Python's built in socket library, but it's the same stuff, and based on the above flow we already know it works. Here's the code for sending a message back down onto the socket:
# client.connection is just a python Socket object
client.connection.setblocking(0)
client.connection.send(client.wake_up_secret.decode('hex'))
print('sending client shutdown')
client.connection.shutdown(socket.SHUT_WR)
And here's the code on the other server that receives this message:
data = beacon_reader.read()
Based on some logging I can see that the Beacon server is properly performing the send() to the socket. I also see that the sleep-server is getting to the code where it runs beacon_reader.read(). However, the sleep server just blocks there even after the Beacon server has sent down its packet. I've tried playing around with the size parameter in the read() call, even setting it as low as 2 just to see if I could get anything off the socket, but there's nothing. I've also tried removing the connection.shutdown(), which doesn't seem to help either.
What's baffling is that it clearly worked before, since I was able to get an initial response, and it's clearly tied to the correct socket, since the beacon_writer is functioning with no problems. So I'm curious if I'm just doing something stupid with the underlying asyncio library or something.
Related
I'm implementing a file transfer protocol with the following use case:
The server sends the file chunk by chunk inside several frames.
The client might cancel the transfer: for this, it sends a message and disconnects at TCP level.
What happens in that case on the server side (Python running on Windows) is that I catch a ConnectionResetError (this is normal, the client has disconnected the socket) while sending the data to the client. I would like to read the last data sent by the client (the message used to abort the call), but calling mysocket.recv() still raises a ConnectionResetError.
With a Wireshark capture, I can clearly see that the message was properly sent by the client prior to the TCP disconnection.
Any ideas, folks? Thanks!
VR
In order to understand what to do about this situation, you need to understand how a TCP connection is closed (see, e.g. this) and how the socket API relates to a clean shutdown (without fail, see this).
Your client is most likely calling close to terminate the connection. The problem with this is that there may be unread data in the socket receive queue or data arriving shortly from the other end that you will no longer be able to read, which is basically an error condition. To signal to the other end that data sent cannot be delivered to the receiving application, a reset is sent (well, technically, "SHOULD be sent" as per the RFC) and the TCP connection is abnormally terminated.
You might think that enabling SO_LINGER will help (many, many bits have been spilt over this so I won't elaborate further), but it won't solve the problem of unread data by the client causing the reset.
The client needs to instead call shutdown(SHUT_WR) to indicate that it is done sending, and then continue to call recv() until it reads 0 bytes indicating the other side is done sending. You may then call close().
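That sequence can be sketched on a loopback socket pair; the listener, client, and message contents below are illustrative, not from the question:

```python
import socket

# Set up a connected pair on loopback to demonstrate the clean shutdown.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))
listener.listen(1)

client = socket.create_connection(listener.getsockname())
server_side, _ = listener.accept()

client.sendall(b'CANCEL')
client.shutdown(socket.SHUT_WR)   # send FIN: "done sending"; the read side stays open

first = server_side.recv(64)      # the server can still read the cancel message
second = server_side.recv(64)     # then b'': the peer is done sending
print(first, second)
server_side.close()

# The client drains until 0 bytes, so it knows the server saw everything:
while client.recv(64):
    pass
client.close()
listener.close()
```

Because the client's unread-data queue is empty when it finally calls close(), no RST is generated and both sides see an orderly FIN/FIN close.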
Note that the Python 2 socket documentation states that
Depending on the platform, shutting down one half of the connection can also close the opposite half (e.g. on Mac OS X, shutdown(SHUT_WR) does not allow further reads on the other end of the connection).
This sounds like a bug to me. To get around this, you would have to send your cancel message, then keep reading until you get 0 bytes so that you know the server received the cancel message. You may then close the socket.
The Python 3.8 docs make no such disclaimer.
I am implementing this example for using select in a simple echo server. Everything works fine, the client sends the message, receives the echo and disconnects.
This is the code I used for the client:
import socket

ECHOER_PORT = 10000

if __name__ == '__main__':
    sockfd = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sockfd.connect(('localhost', ECHOER_PORT))
    msg = input('Message: ').encode()
    sockfd.send(msg)
    response = sockfd.recv(len(msg))
    print('Response: {}'.format(response))
    sockfd.close()
The issue is with the server (full gist code is here): after sending the echo, select is called one more time, and the current client (which received the echo and disconnected) is returned from select as both readable and writable.
I understand why it's returned as readable, since according to the article:
A readable socket without data available is from a client that has
disconnected, and the stream is ready to be closed.
But my question is: why does it also return as writable?
But my question is: why does it also return as writable?
The main thing you want to have select() do when a client has disconnected is return immediately, so that your code can detect the disconnection-event and handle it ASAP (by closing the socket).
One way to do this is the common way, by having the server select-for-read on every socket, and when a client disconnects, select() will return ready-for-read on that socket, then the program will call recv() on the socket to find out what the data is, recv() will return EOF, and the server will close the socket. That is all well and good.
Now imagine the less-common (but not unheard of) case where the server writes to its client sockets, but doesn't want to read from them. In this case, the server has no need (or desire) to select for ready-to-read on its sockets; it only needs to select for ready-to-write, to know when there is some outgoing-data-buffer-space available to send more data to a client. That server still needs to know when a client has disconnected, though -- which is why the disconnected socket selects as ready-for-write as well, so that a select() that is only watching for ready-for-write can also detect and react to a disconnected socket immediately.
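This behavior is easy to observe on a loopback pair; the setup below is an illustrative sketch, not the asker's server:

```python
import select
import socket

# After the client disconnects, the server-side socket selects as
# readable AND writable, and recv() then returns EOF (b'').
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))
listener.listen(1)

client = socket.create_connection(listener.getsockname())
server_side, _ = listener.accept()
client.close()                      # client disconnects

readable, writable, _ = select.select([server_side], [server_side], [], 5)
is_readable = server_side in readable
is_writable = server_side in writable
eof = server_side.recv(512)         # b'' confirms the disconnect
print(is_readable, is_writable, eof)

server_side.close()
listener.close()
```

Note that a healthy connected socket with free buffer space also selects as writable; the point is that a disconnected one does too, so a write-only select loop still wakes up promptly.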
I am creating a file server in python using ftp sockets where multiple clients send images to a server. This is the method I am using to receive the image on the server side:
while True:
    data = client.recv(512)
    if not data:
        break
    file.write(data)
And this is the method I am using to send the image to the server:
while True:
    data = file.read(512)
    if not data:
        break
    server.send(data)
The problem I am running into is that on the server side the while loop is never exited which means the code is either stuck in the recv call or the if statement is never true. On the client side there are no problems, the loop is exited properly. I've heard that the client side will send something to the server to tell it to stop but the if statement doesn't seem to pick it up. How can I get the server to stop trying to receive without closing the connection?
https://docs.python.org/2/howto/sockets.html#disconnecting
On the client, close the socket to the server. On the server, check whether the recv returned 0 bytes.
Also from the documentation for using a socket:
When a recv returns 0 bytes, it means the other side has closed (or is
in the process of closing) the connection. You will not receive any
more data on this connection. Ever. You may be able to send data
successfully; I’ll talk more about this later.
data will never be empty unless the client closes the connection.
server.close()
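A small loopback sketch of that fix (the helper names and payload here are illustrative): the server's recv loop exits once the client closes its socket, because recv() then returns b''.

```python
import socket
import threading

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))
listener.listen(1)

received = []

def serve():
    conn, _ = listener.accept()
    while True:
        data = conn.recv(512)
        if not data:                # b'' means the client closed
            break
        received.append(data)
    conn.close()

t = threading.Thread(target=serve)
t.start()

client = socket.create_connection(listener.getsockname())
client.sendall(b'image bytes')
client.close()                      # this is what ends the server loop

t.join(5)
listener.close()
print(b''.join(received))
```

If you want to keep the connection open for more transfers, you would instead need some in-band framing (for example, sending the file length first), since TCP itself has no message boundaries.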
I'm going crazy writing a little socket server in python. Everything was working fine, but I noticed that in the case where the client just disappears, the server can't tell. I simulate this by pulling the ethernet cable between the client and server, close the client, then plug the cable back in. The server never hears that the client disconnected and will wait forever, never allowing more clients to connect.
I figured I'd solve this by adding a timeout to the read loop so that it would try and read every 10 seconds. I thought maybe if it tried to read from the socket it would notice the client was missing. But then I realized there really is no way for the server to know that.
So I added a heartbeat. If the server goes 10 seconds without reading, it will send data to the client. However, even this is successful (meaning doesn't throw any kind of exception). So I am able to both read and write to a client that isn't there any more. Is there any way to know that the client is gone without implementing some kind of challenge/response protocol between the client and server? That would be a breaking change in this case and I'd like to avoid it.
Here is the core of my code for this:
def _loop(self):
    command = ""
    while True:
        socket, address = self._listen_socket.accept()
        self._socket = socket
        self._socket.settimeout(10)
        socket.sendall("Welcome\r\n\r\n")
        while True:
            try:
                data = socket.recv(1)
            except timeout:  # Went 10 seconds without data
                pass
            except Exception as e:  # Likely the client closed the connection
                break
            if data:
                command = command + data
                if data == "\n" or data == "\r":
                    if len(command.strip()) > 0:
                        self._parse_command(command.strip(), socket)
                    command = ""
                if data == '\x08':
                    command = command[:-2]
            else:  # Timeout on read
                try:
                    self._socket.sendall("event,heartbeat\r\n")  # Send heartbeat
                except:
                    self._socket.close()
                    break
The sendall for the heartbeat never throws an exception and the recv only throws a timeout (or another exception if the client properly closes the connection under normal circumstances).
Any ideas? Am I wrong that sending to a client that doesn't ACK should eventually generate an exception? (I've tested for several minutes.)
The behavior you are observing is the expected behavior for a TCP socket connection. In particular, in general the TCP stack has no way of knowing that an ethernet cable has been pulled or that the (now physically disconnected) remote client program has shut down; all it knows is that it has stopped receiving acknowledgement packets from the remote peer, and for all it knows the packets could just be getting dropped by an overloaded router somewhere and the issue will resolve itself momentarily. Given that, it does what TCP always does when its packets don't get acknowledged: it reduces its transmission rate and its number-of-packets-in-flight limit, and retransmits the unacknowledged packets in the hope that they will get through this time.
Assuming the server's socket has outgoing data pending, the TCP stack will eventually (i.e. after a few minutes) decide that no data has gone through for a long-enough time, and unilaterally close the connection. So if you're okay with a problem-detection time of a few minutes, the easiest way to avoid the zombie-connection problem is simply to be sure to periodically send a bit of heartbeat data over the TCP connection, as you described. When the TCP stack tries (and repeatedly fails) to get the outgoing data sent-and-acknowledged, that is what eventually will trigger it to close the connection.
If you want something quicker than that, you'll need to implement your own challenge/response system with timeouts (either over the TCP socket, or over a separate TCP socket, or over UDP), but note that in doing so you are likely to suffer from false positives yourself (e.g. you might end up severing a TCP connection that was not actually dead but only suffering from a temporary condition of lost packets due to congestion). Whether or not that's a worthwhile tradeoff depends on what sort of program you are writing. (Note also that UDP has its own issues, particularly if you want your system to work across firewalls, etc)
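One OS-level middle ground worth knowing about (a sketch under the assumption of a Linux host; the TCP_KEEP* constants are platform-specific, hence the guards) is TCP keepalive: the kernel probes an idle connection and drops it if the peer stops acknowledging, without any application-level challenge/response.

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
if hasattr(socket, 'TCP_KEEPIDLE'):    # Linux-only knob
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 10)   # seconds idle before first probe
if hasattr(socket, 'TCP_KEEPINTVL'):
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 5)   # seconds between probes
if hasattr(socket, 'TCP_KEEPCNT'):
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 3)     # failed probes before the kernel kills it
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
print(keepalive_on)
sock.close()
```

With the values above, a dead peer would be detected after roughly 10 + 3*5 seconds, after which blocked socket calls fail with an error; like any timeout scheme, it can still produce false positives under heavy congestion.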
As far as I understand the basics of the client-server model, generally only client may initiate requests; server responds to them. Now I've run into a system where the server sends asynchronous messages back to the client via the same persistent TCP connection whenever it wants. So, a couple of questions:
Is it the right thing to do at all? It seems to really overcomplicate the implementation of a client.
Are there any nice patterns/methodologies I could use to implement a client for such a system in Python? Changing the server is not an option.
Obviously, the client has to watch both the local request queue (i.e. requests to be sent to the server) and the incoming messages from the server. Launching two threads (Rx and Tx) per connection does not feel right to me. Using select() is a major PITA here. Am I missing something?
When dealing with asynchronous I/O in Python I typically use a library such as gevent or eventlet. The objective of these libraries is to allow applications written in a synchronous style to be multiplexed by a back-end reactor.
This basic example demonstrates the launching of two green threads/co-routines/fibers to handle either side of the TCP duplex. The send side of the duplex is listening on an asynchronous queue.
This is all performed within a single hardware thread. Both gevent and eventlet have more substantive examples in their documentation than what I have provided below.
If you run nc -l -p 8000 you will see "012" printed out. As soon as netcat exits, this code will terminate.
from eventlet import connect, sleep, GreenPool
from eventlet.queue import Queue

def handle_i(sock, queue):
    while True:
        data = sock.recv(8)
        if data:
            print(data)
        else:
            queue.put(None)  # <- signal send side of duplex to exit
            break

def handle_o(sock, queue):
    while True:
        data = queue.get()
        if data:
            sock.send(data)
        else:
            break

queue = Queue()
sock = connect(('127.0.0.1', 8000))

gpool = GreenPool()
gpool.spawn(handle_i, sock, queue)
gpool.spawn(handle_o, sock, queue)

for i in range(0, 3):
    queue.put(str(i))
    sleep(1)

gpool.waitall()  # <- waits until nc exits
I believe what you are trying to achieve is a bit similar to JSONP. When sending to the client, wrap the data in a call to a callback method that you know exists on the client.
For example, if you are sending "some data xyz", send it as server.send("callback('some data xyz')");. This suggestion is for JavaScript, because it executes the returned code as if it were called through that method; I believe you can port this theory to Python with some difficulty. But I am not sure, though.
Yes, this is very normal, and the server can also send messages to the client after the connection is made. For example, when you initiate a connection to a telnet server, it sends you a message for the capability exchange, and after that it asks for your username and password.
You could very well use select(), or, if I were in your shoes, I would spawn a separate thread to receive the asynchronous messages from the server and leave the main thread free to do further processing.
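The threaded approach can be sketched as follows; the loopback "fake server", helper name, and message framing here are illustrative assumptions, not part of the original system:

```python
import queue
import socket
import threading

def make_duplex_client(host, port):
    """Connect and start a receiver thread that feeds a queue.

    The main thread keeps the socket for sending requests, while
    server-initiated messages arrive on the returned queue.
    """
    sock = socket.create_connection((host, port))
    incoming = queue.Queue()

    def receiver():
        while True:
            data = sock.recv(4096)
            if not data:            # server closed the connection
                incoming.put(None)
                break
            incoming.put(data)

    threading.Thread(target=receiver, daemon=True).start()
    return sock, incoming

# Loopback demo: a fake server pushes an unsolicited message.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))
listener.listen(1)

def fake_server():
    conn, _ = listener.accept()
    conn.sendall(b'async event!')   # server-initiated message
    conn.close()

threading.Thread(target=fake_server).start()

sock, incoming = make_duplex_client(*listener.getsockname())
parts = []
while True:
    item = incoming.get(timeout=5)
    if item is None:
        break
    parts.append(item)
msg = b''.join(parts)
print(msg)
sock.close()
listener.close()
```

The queue is the synchronization point: the main thread can poll it with get_nowait() between sends, or block on it, without ever touching select().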