I'm trying to write a Perl TCP server and a Python TCP client, and I have the following code now:
import socket
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server_address = ("127.0.0.1", 9000)
sock.connect(server_address)
try:
    message = unicode('Test')
    sock.sendall(message)
    data = sock.recv(1024)
    print data
finally:
    sock.close()
I have noticed that my TCP server (written in Perl) receives the message not after sendall(message), but only after close(). The server works like an echo server: it sends data back to the client after receiving a message. This causes a deadlock: the server never gets the message, and the client never gets a response. What could be the problem? What happens during close() that makes the message reach the server?
I'm going to hazard a guess that this is due to the server's implementation. There are many ways of writing an echo server:
receive bytes in a loop (or async callback) until EOF; as the bytes are received (each loop iteration), echo them without any processing or buffering; when an EOF is found (the inbound stream is closed), close the outbound stream
read lines at a time (assume it is a text protocol), i.e. looking for CR / LF / EOF; when a line is found, return the line - when an EOF is found (the inbound stream is closed), close the outbound stream
read to an EOF; then return everything and close the outbound stream
If the echo server uses the first approach, it will work as expected already - so we can discount that.
For the second approach, you are sending text but no CR / LF, and you haven't closed the stream from client to server (EOF), so the server will never reply to this request. So yes, it will deadlock.
If it is the third approach, then again - unless you close the outbound stream, it will deadlock.
From your answer, it looks like adding a \n "fixes" it. From that, I conclude that your echo-server is line-based. So two solutions, and a third that would work in any scenario:
make the echo-server respond to raw data, rather than lines
add an end-of-line marker
close the outbound stream at the client, i.e. the client-to-server stream (many network APIs allow you to close the outbound and inbound streams separately)
Additionally: ensure Nagle's algorithm is disabled (the TCP_NODELAY socket option, often called NO_DELAY) - this will prevent the bytes sitting at the client for a while, waiting to be composed into a decent sized packet (this applies to 1 & 2, but not 3; having Nagle enabled would add a delay, but will not usually cause a deadlock).
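A minimal Python 3 sketch of options 2 and 3, assuming the server is line-based as guessed above. The names line_echo_server, echo_once and demo are made up for illustration; line_echo_server is only a stand-in for the Perl server, not its actual code, and the port is picked by the OS.

```python
import socket
import threading


def line_echo_server(listener):
    """Hypothetical stand-in for the Perl server: echoes one line at a time."""
    conn, _ = listener.accept()
    with conn, conn.makefile("rb") as reader:
        for line in reader:              # blocks until "\n" or EOF arrives
            conn.sendall(line)


def echo_once(address, message, half_close=False):
    """Send one message and return whatever the server echoes back."""
    with socket.create_connection(address) as sock:
        # Disable Nagle so small writes are not held back at the client.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        if half_close:
            sock.sendall(message)
            sock.shutdown(socket.SHUT_WR)    # option 3: EOF ends the request
        else:
            sock.sendall(message + b"\n")    # option 2: end-of-line marker
        chunks = []
        while True:
            data = sock.recv(1024)
            if not data:                     # server closed its side
                break
            chunks.append(data)
            if not half_close and data.endswith(b"\n"):
                break                        # a full line has come back
        return b"".join(chunks)


def demo(message=b"Test", half_close=False):
    """Run the stand-in server on a free local port and echo one message."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))          # port 0: let the OS choose
    listener.listen(1)
    thread = threading.Thread(target=line_echo_server, args=(listener,), daemon=True)
    thread.start()
    try:
        return echo_once(listener.getsockname(), message, half_close)
    finally:
        thread.join(timeout=2)
        listener.close()
```

Without either the trailing b"\n" or the shutdown() call, the recv() in echo_once would wait forever against a server like this one, which is the deadlock described in the question.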
I'm implementing a file transfer protocol with the following use case:
The server sends the file chunk by chunk inside several frames.
The client might cancel the transfer: for this, it sends a message and disconnects at TCP level.
What happens in that case on the server side (Python running on Windows) is that I catch a ConnectionResetError (this is normal, the client has disconnected the socket) while sending the data to the client. I would like to read the last data sent by the client (the message used to abort the call), but calling mysocket.recv() still raises a ConnectionResetError.
With a Wireshark capture, I can clearly see that the message was properly sent by the client prior to the TCP disconnection.
Any ideas, folks? Thanks!
VR
In order to understand what to do about this situation, you need to understand how a TCP connection is closed (see, e.g. this) and how the socket API relates to a clean shutdown (without fail, see this).
Your client is most likely calling close to terminate the connection. The problem with this is that there may be unread data in the socket receive queue or data arriving shortly from the other end that you will no longer be able to read, which is basically an error condition. To signal to the other end that data sent cannot be delivered to the receiving application, a reset is sent (well, technically, "SHOULD be sent" as per the RFC) and the TCP connection is abnormally terminated.
You might think that enabling SO_LINGER will help (many, many bits have been spilt over this so I won't elaborate further), but it won't solve the problem of unread data by the client causing the reset.
The client needs to instead call shutdown(SHUT_WR) to indicate that it is done sending, and then continue to call recv() until it reads 0 bytes indicating the other side is done sending. You may then call close().
Note that the Python 2 socket documentation states that
Depending on the platform, shutting down one half of the connection can also close the opposite half (e.g. on Mac OS X, shutdown(SHUT_WR) does not allow further reads on the other end of the connection).
This sounds like a bug to me. To get around this, you would have to send your cancel message, then keep reading until you get 0 bytes so that you know the server received the cancel message. You may then close the socket.
The Python 3.8 docs make no such disclaimer.
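A minimal Python 3 sketch of the client-side sequence described above (send, shutdown(SHUT_WR), read until 0 bytes, then close). The function name send_then_drain is made up for illustration, and the socket is assumed to be a connected blocking TCP socket.

```python
import socket


def send_then_drain(sock, message, bufsize=4096):
    """Send a final message, half-close, then read until the peer is done."""
    sock.sendall(message)
    sock.shutdown(socket.SHUT_WR)      # done sending; the peer will see EOF
    received = []
    while True:
        data = sock.recv(bufsize)
        if not data:                   # 0 bytes: the peer is done sending too
            break
        received.append(data)
    sock.close()                       # nothing is left unread, so no reset
    return b"".join(received)
```

Because the client keeps reading until the server closes its side, no unread data is left in the receive queue when close() is called, which is the condition that triggers the reset.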
I am implementing this example for using select in a simple echo server. Everything works fine, the client sends the message, receives the echo and disconnects.
This is the code I used for the client:
import socket
ECHOER_PORT = 10000
if __name__ == '__main__':
    sockfd = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sockfd.connect(('localhost', ECHOER_PORT))
    msg = input('Message: ').encode()
    sockfd.send(msg)
    response = sockfd.recv(len(msg))
    print('Response: {}'.format(response))
    sockfd.close()
The issue is with the server (full gist code is here), after sending the echo, select is called again one more time and the current client (which received the echo and disconnected) is returned from select as both readable and writable.
I understand why it's returned as readable, since according to the article:
A readable socket without data available is from a client that has
disconnected, and the stream is ready to be closed.
But my question is: why is it returned as writable as well?
The main thing you want to have select() do when a client has disconnected is return immediately, so that your code can detect the disconnection-event and handle it ASAP (by closing the socket).
One way to do this is the common way, by having the server select-for-read on every socket, and when a client disconnects, select() will return ready-for-read on that socket, then the program will call recv() on the socket to find out what the data is, recv() will return EOF, and the server will close the socket. That is all well and good.
Now imagine the less-common (but not unheard of) case where the server writes to its client sockets, but doesn't want to read from them. In this case, the server has no need (or desire) to select for ready-to-read on its sockets; it only needs to select for ready-to-write, to know when there is some outgoing-data-buffer-space available to send more data to a client. That server still needs to know when a client has disconnected, though -- which is why the disconnected socket selects as ready-for-write as well, so that a select() that is only watching for ready-for-write can also detect and react to a disconnected socket immediately.
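A small Python 3 sketch that checks this on a loopback connection. The function name readiness_after_peer_close is made up for illustration; whether the socket also shows up as writable is what the explanation above predicts, but it is left to the OS here rather than guaranteed by the code.

```python
import select
import socket


def readiness_after_peer_close():
    """Return (readable, writable, data) for a socket whose peer has closed."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))          # port 0: let the OS choose
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    try:
        client.close()                       # the peer disconnects
        # Wait until the disconnect has actually arrived at the server side.
        select.select([server_side], [], [], 2.0)
        # Now poll both sets without blocking.
        readable, writable, _ = select.select([server_side], [server_side], [], 0)
        data = server_side.recv(1024) if readable else None
        return bool(readable), bool(writable), data
    finally:
        server_side.close()
        listener.close()


if __name__ == "__main__":
    print(readiness_after_peer_close())
```

The recv() on the readable socket returns an empty bytes object, which is the EOF that tells the server to close the socket.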
I'm going crazy writing a little socket server in python. Everything was working fine, but I noticed that in the case where the client just disappears, the server can't tell. I simulate this by pulling the ethernet cable between the client and server, close the client, then plug the cable back in. The server never hears that the client disconnected and will wait forever, never allowing more clients to connect.
I figured I'd solve this by adding a timeout to the read loop so that it would try and read every 10 seconds. I thought maybe if it tried to read from the socket it would notice the client was missing. But then I realized there really is no way for the server to know that.
So I added a heartbeat. If the server goes 10 seconds without reading, it will send data to the client. However, even this is successful (meaning doesn't throw any kind of exception). So I am able to both read and write to a client that isn't there any more. Is there any way to know that the client is gone without implementing some kind of challenge/response protocol between the client and server? That would be a breaking change in this case and I'd like to avoid it.
Here is the core of my code for this:
def _loop(self):
    command = ""
    while True:
        socket, address = self._listen_socket.accept()
        self._socket = socket
        self._socket.settimeout(10)
        socket.sendall("Welcome\r\n\r\n")
        while True:
            try:
                data = socket.recv(1)
            except timeout:  # Went 10 seconds without data
                data = ""  # nothing was read; fall through to the heartbeat branch
            except Exception as e:  # Likely the client closed the connection
                break
            if data:
                command = command + data
                if data == "\n" or data == "\r":
                    if len(command.strip()) > 0:
                        self._parse_command(command.strip(), socket)
                    command = ""
                if data == '\x08':
                    command = command[:-2]
            else:  # Timeout on read
                try:
                    self._socket.sendall("event,heartbeat\r\n")  # Send heartbeat
                except:
                    self._socket.close()
                    break
The sendall for the heartbeat never throws an exception and the recv only throws a timeout (or another exception if the client properly closes the connection under normal circumstances).
Any ideas? Am I wrong that sending to a client that doesn't ACK should eventually generate an exception? (I've tested for several minutes.)
The behavior you are observing is the expected behavior for a TCP socket connection. In particular, in general the TCP stack has no way of knowing that an ethernet cable has been pulled or that the (now physically disconnected) remote client program has shut down; all it knows is that it has stopped receiving acknowledgement packets from the remote peer, and for all it knows the packets could just be getting dropped by an overloaded router somewhere and the issue will resolve itself momentarily. Given that, it does what TCP always does when its packets don't get acknowledged: it reduces its transmission rate and its number-of-packets-in-flight limit, and retransmits the unacknowledged packets in the hope that they will get through this time.
Assuming the server's socket has outgoing data pending, the TCP stack will eventually (i.e. after a few minutes) decide that no data has gone through for a long-enough time, and unilaterally close the connection. So if you're okay with a problem-detection time of a few minutes, the easiest way to avoid the zombie-connection problem is simply to be sure to periodically send a bit of heartbeat data over the TCP connection, as you described. When the TCP stack tries (and repeatedly fails) to get the outgoing data sent-and-acknowledged, that is what eventually will trigger it to close the connection.
If you want something quicker than that, you'll need to implement your own challenge/response system with timeouts (either over the TCP socket, or over a separate TCP socket, or over UDP), but note that in doing so you are likely to suffer from false positives yourself (e.g. you might end up severing a TCP connection that was not actually dead but only suffering from a temporary condition of lost packets due to congestion). Whether or not that's a worthwhile tradeoff depends on what sort of program you are writing. (Note also that UDP has its own issues, particularly if you want your system to work across firewalls, etc.)
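A minimal Python 3 sketch of such a challenge/response probe over the existing TCP socket. The function name peer_is_alive and the PING/PONG messages are hypothetical; a real protocol would need the client to answer them, which is the breaking change mentioned in the question.

```python
import socket


def peer_is_alive(sock, timeout=5.0, ping=b"PING\n", pong=b"PONG\n"):
    """Send `ping` and report whether `pong` comes back in time.

    The timeout applies to each socket operation, not to the probe as a whole.
    """
    previous = sock.gettimeout()
    sock.settimeout(timeout)
    try:
        sock.sendall(ping)
        reply = b""
        while len(reply) < len(pong):
            chunk = sock.recv(len(pong) - len(reply))
            if not chunk:                # orderly close by the peer
                return False
            reply += chunk
        return reply == pong
    except OSError:                      # includes socket.timeout and resets
        return False
    finally:
        sock.settimeout(previous)
```

A False result only means that no answer arrived within the timeout, so a short timeout trades faster detection for the false positives described above.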
I am trying to get a python program and Allegro Common Lisp program to communicate over sockets. For now, I am trying to set up a Lisp server that listens for connections, get a python client to connect to the server, and then send a simple message from the client to the server. The Lisp server looks as follows:
(let ((socket (socket:make-socket :connect :passive
                                  :type :stream
                                  :address-family :internet
                                  :local-port 45676)))
  (format t "opened up socket ~A for connections~%" socket)
  ;; now wait for connections and accept
  (let ((client (socket:accept-connection socket
                                          :wait t)))
    (when client
      ;; we've got a new connection from client
      (format t "got a new connection from ~A~%" client)
      ;; now wait for data from client
      (loop until (listen client) do
        (format t "first character seen is ~A~%" (peek-char t client))
        (let ((data (read-line client)))
          (format t "data received: ~A~%" data))))))
The python client looks as follows:
import socket
import time
s = socket.socket (socket.AF_INET, socket.SOCK_STREAM)
s.connect (('', 45676))
time.sleep (1) # works if uncommented, does not work if commented
s.sendall ("(list A B)")
s.close ()
Since I want to have multiple messages pass over this stream, I listen for incoming data on the server side and then echo it. However, I noticed a problem when I ran this with the sleep command commented out. The output simply looked like this:
opened up socket #<MULTIVALENT stream socket waiting for connection
at */45676 # #x207b0cd2> for connections
got a new connection from #<MULTIVALENT stream socket connected from
localhost/45676 to localhost/60582 #
#x207b34f2>
In other words, it did not actually echo the data (in my case, "(list A B)"). If I uncommented the sleep command (to introduce some delay between the connection initiation and sending of the data), the output looked like this:
opened up socket #<MULTIVALENT stream socket waiting for connection
at */45676 # #x207b0bea> for connections
got a new connection from #<MULTIVALENT stream socket connected from
localhost/45676 to localhost/60572 #
#x207b340a>
data received: (list A B)
I'm not sure why this is the case. Does anyone have a solution for this? Is it a bad idea to reuse the same socket connection for multiple exchanges of data? If I remove the entire loop macro call (and thus make it a one-time exchange) the data is received without any problems and is echoed properly
EDIT 1: The last statement is true even with the sleep command commented.
I want to connect two programs via TCP. My main program is written with Qt and needs to talk to another program written in Python. I think about using TCP sockets and Google's protobuf to exchange the messages. In Qt, I use a QTcpSocket that accepts the connection and reads from the stream, as soon as its readyRead-Signal is triggered. In python, I also use a tcp-socket and send messages.
This works very well, as long as neither side is killed. Currently, the Python side is sending messages to the C++ side (socket.send(str(id)+"\n")). After every send, I check for exceptions (connection reset by peer, broken pipe, ...) to see if the message was received.
If I kill the C++ program, the next message sent from the Python client triggers no exception, but is obviously not received. The message after that triggers the exception, but the previous message is lost.
After a bit of experimenting, I found that sending an empty message (socket.send("\n")) after each message solves the problem. I do now
try:
    s.send(str(id)+"\n");
    s.send("\n")
    sleep(0.5)
except socket.error,v:
    print "FAILed to send",id,v[0],v[1]
and receive the exception as soon as the C++-Peer is killed (calling s.send(str(id)+"\n\n") however does not help).
Finally, my question is: Is this a reliable way to check if my message was received?
I don't want to switch to UDP as I don't want to implement my own ACK-messages for each message.
This is my first time I use sockets with python and C++ and can't really explain why my approach works, so I'm a bit uncomfortable using it.
Can someone tell me a bit more? I guess that the Python socket expects an ACK for the first send(str(id)+"\n") after sending the send("\n") and then realizes that the pipe is broken. Is this correct?
When a TCP connection is broken by the remote peer, your TCP socket will become ready-for-read, and then when you try to recv() from it, recv() will return 0.
Of course if your sending program is only calling send() (the way your Python program is), then it won't notice what's going on with the socket's recv-side, and you end up with the problem you described.
On the other hand, you don't want to just blindly call recv() either, because if recv() is called and the remote peer hasn't sent any data, recv() will block waiting for data and unless the remote peer ever actually sends some, you'll have a deadlock.
The simplest way to deal with that is to use select() to multiplex your I/O, so that your Python script can know when it's appropriate to call send() and/or recv(). Something like this:
import socket
import select
[...]
while True:
    socketsToReadFrom = [s]
    if you_still_have_more_data_to_send:
        socketsToWriteTo = [s]
    else:
        socketsToWriteTo = []  # select() requires sequences, so use an empty list rather than None
    # This select() call will block until there's something to do
    socketsReadyForRead, socketsReadyForWrite, junk = select.select(socketsToReadFrom, socketsToWriteTo, [])
    if s in socketsReadyForRead:
        readBytes = s.recv(1024)
        if len(readBytes) > 0:
            print("Read %i bytes from remote peer!" % len(readBytes))
        else:
            print("Remote peer closed the TCP Connection!!")
            break
    if s in socketsReadyForWrite:
        s.send(some_more_data)
As far as verifying whether your message was received, that's a bit tricky since TCP (and the network stack) do a fair amount of pipelining/buffering. In particular, a successful return from send() only tells you that your data has been handed off to your local TCP stack's outgoing-data buffer; it doesn't mean that the data has arrived at the remote peer already. If you really want a "receipt" that the remote peer has already processed the data, you'll have to have the remote peer send back some kind of acknowledgement. Note that under TCP that level of sophistication is often unnecessary though, since barring a network or hardware failure (or the remote peer closing the TCP connection), you can be fairly sure that the TCP stack will get your data there eventually; e.g. if a packet got dropped, the TCP stack will resend it automatically. Data loss will only occur if the network connectivity stops working for an extended period (e.g. several minutes), at which point the TCP stack will give up and close the TCP connection.
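A small Python 3 sketch of the difference between a successful send() and an actual receipt, using a loopback connection. The function name demo_send_vs_receipt and the "ACK <length>" reply format are made up for illustration; any acknowledgement both sides agree on would do.

```python
import socket


def demo_send_vs_receipt():
    """Return (bytes queued by send(), acknowledgement sent back by the peer)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))          # port 0: let the OS choose
    listener.listen(1)
    sender = socket.create_connection(listener.getsockname())
    receiver, _ = listener.accept()
    try:
        # send() succeeds as soon as the bytes are in the local TCP buffer;
        # the receiver has not called recv() at this point.
        queued = sender.send(b"hello")

        # Only a reply from the peer shows that it processed the data.
        data = receiver.recv(1024)
        receiver.sendall(b"ACK " + str(len(data)).encode())

        sender.settimeout(2.0)               # do not wait forever for the receipt
        receipt = sender.recv(1024)
        return queued, receipt
    finally:
        sender.close()
        receiver.close()
        listener.close()


if __name__ == "__main__":
    print(demo_send_vs_receipt())
```

In the question's setup, the C++ side would play the receiver role and send the acknowledgement, and the Python side would treat a missing acknowledgement within the timeout as a failed delivery.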