Explain socket buffers please - python

I was trying to find examples about socket programming and came upon this script:
http://stacklessexamples.googlecode.com/svn/trunk/examples/networking/mud.py
When reading through this script I found this line:
listenSocket.listen(5)
As I understand it, it reads 5 bytes from the buffer and then does stuff with it...
But what happens if more than 5 bytes were sent by the other end?
Elsewhere in that script it checks input against 4 commands and sees if there is \r\n in the string. Don't commands like "look" plus \r\n make up more than 5 bytes?
Alan

The following is applicable to sockets in general, but it should help answer your specific question about using sockets from Python.
socket.listen() is used on a server socket to listen for incoming connection requests.
The parameter passed to listen is called the backlog: it is the number of connections the socket should accept and hold in a pending queue until you call accept(). It applies to connections that arrive at your server socket between the time you call listen() and the time you finish a matching call to accept().
So, in your example you're setting the backlog to 5 connections.
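Note: if you set your backlog to 5 connections, further connections (the 6th, 7th, etc.) that arrive while the queue is full will not be queued. Depending on the platform, the connecting socket either receives a connection error (something like a "host actively refused the connection" message) or has its attempt ignored, in which case it retries and eventually times out.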
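To make the difference between the backlog and the number of bytes read concrete, here is a minimal sketch. It assumes a loopback server on an OS-chosen port, and read_line is a hypothetical helper (not taken from mud.py). The client sends "look\r\n", which is 6 bytes, to a socket that called listen(5).

```python
import socket

def read_line(conn):
    """Collect bytes until a CRLF arrives or the peer closes the connection."""
    buf = b""
    while b"\r\n" not in buf:
        chunk = conn.recv(1024)  # 1024 is the per-call byte limit; unrelated to listen()
        if not chunk:            # empty result: peer closed before finishing the line
            break
        buf += chunk
    return buf

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port (assumption for the demo)
server.listen(5)                # backlog of pending connections, not a byte count
port = server.getsockname()[1]

client = socket.create_connection(("127.0.0.1", port))
client.sendall(b"look\r\n")     # 6 bytes, more than the backlog value of 5

conn, addr = server.accept()
line = read_line(conn)
print(len(line), line)

conn.close()
client.close()
server.close()
```

The size of each read is controlled by the argument to recv(), and a full command may need several recv() calls, which is why the sketch loops until the terminator shows up.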

This might help you understand the code: http://www.amk.ca/python/howto/sockets/

The argument 5 to listenSocket.listen isn't the number of bytes to read or buffer, it's the backlog:
socket.listen(backlog)
Listen for connections made to the socket. The backlog argument specifies the maximum number of queued connections and should be at least 1; the maximum value is system-dependent (usually 5).

Related

Python - Read remaining data from socket after TCP RST

I'm implementing a file transfer protocol with the following use case:
The server sends the file chunk by chunk inside several frames.
The client might cancel the transfer: for this, it sends a message and disconnects at TCP level.
What happens in that case on the server side (Python running on Windows) is that I catch a ConnectionResetException (this is normal, the client has disconnected the socket) while sending the data to the client. I want to read the last data sent by the client (the message used to abort the call), but calling mysocket.recv() still raises a ConnectionResetException.
With a Wireshark capture, I can clearly see that the message was properly sent by the client prior to the TCP disconnection.
Any ideas, folks? Thanks!
VR
In order to understand what to do about this situation, you need to understand how a TCP connection is closed (see, e.g. this) and how the socket API relates to a clean shutdown (without fail, see this).
Your client is most likely calling close to terminate the connection. The problem with this is that there may be unread data in the socket receive queue or data arriving shortly from the other end that you will no longer be able to read, which is basically an error condition. To signal to the other end that data sent cannot be delivered to the receiving application, a reset is sent (well, technically, "SHOULD be sent" as per the RFC) and the TCP connection is abnormally terminated.
You might think that enabling SO_LINGER will help (many, many bits have been spilt over this so I won't elaborate further), but it won't solve the problem of unread data by the client causing the reset.
The client needs to instead call shutdown(SHUT_WR) to indicate that it is done sending, and then continue to call recv() until it reads 0 bytes indicating the other side is done sending. You may then call close().
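A minimal sketch of that shutdown sequence, assuming both ends run in one process over loopback. The b"CANCEL" message, the b"chunk-1" frame and the drain helper are hypothetical stand-ins for your protocol.

```python
import socket

def drain(sock):
    """Read until recv() returns an empty result, i.e. the peer is done sending."""
    data = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            return data
        data += chunk

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))          # OS-chosen port, for the demo only
listener.listen(1)
client = socket.create_connection(listener.getsockname())
server_side, _ = listener.accept()

# The server has already sent a frame that the client has not read yet.
server_side.sendall(b"chunk-1")

# Client: send the cancel message, then half-close instead of calling close().
client.sendall(b"CANCEL")
client.shutdown(socket.SHUT_WR)          # "I am done sending", reads still work

# Server: can read the cancel message, then sees end-of-stream and closes.
server_got = drain(server_side)
server_side.close()

# Client: keeps reading until the server is done, and only then closes.
client_got = drain(client)
client.close()
listener.close()
```

Because the client drains its receive queue before closing, there is no unread data left to trigger a reset, and the server gets to read the cancel message.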
Note that the Python 2 socket documentation states that
Depending on the platform, shutting down one half of the connection can also close the opposite half (e.g. on Mac OS X, shutdown(SHUT_WR) does not allow further reads on the other end of the connection).
This sounds like a bug to me. To get around this, you would have to send your cancel message, then keep reading until you get 0 bytes so that you know the server received the cancel message. You may then close the socket.
The Python 3.8 docs make no such disclaimer.

How does a Python listening socket get set up?

When you set up a simple TCP listening socket using the Python 'socket' module, what do the different steps involved actually do?
The code I'm talking about looks like this:
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(('localhost', 50000))
s.listen(1)
conn, addr = s.accept()
The s = ... seems pretty straightforward: you are expressing your intent to create an IPv4 TCP socket, without having done anything yet.
What I'm curious about is this:
What does it mean to bind to a socket, without listening?
How does limiting the number of unaccepted connections using listen(n) work?
If you have listen(1), you're in the middle of dealing with the first connection you accepted, and a second client tries to connect, is the second client waiting for the SYN-ACK? Or does the 3 way handshake happen, and he's waiting for actual data?
What happens if a third client tries to connect - does he immediately get a TCP RST?
Does setting the number of unaccepted connections here set some option in the kernel to indicate how many connections it should accept? Or is this all handled in Python?
How can you be listening without accepting? What does it mean to accept a connection?
Every article I've come across seems to just assume these steps make sense to everyone, without explaining what exactly it is that each one does. They just use generic terms like
listen() starts listening for connections
bind() binds to a socket
accept() just accepts the connection
Defining a word by using that word in the definition is kind of a dumb way to explain something.
It's basically a 1-to-1 mapping of the POSIX C calls, so I'm including links to the man pages so that you can read their explanations and the corresponding C code:
socket creates a communication endpoint by means of a file descriptor in the namespace of the address family you specified, but assigns neither address nor port.
bind assigns an address and port to said socket. If you pass port 0, the system picks a free port for you; if you request a port for which you do not have the privilege (like < 1024 for a non-root user), the call fails with a permission error.
listen makes the specific socket, and hence its address and port, a passive one, meaning that it will take incoming connections that you then pick up with the accept call. To handle multiple connections one after the other, you specify a backlog to hold them; connections that arrive while you're handling one get appended. Once the backlog is full, the system handles further attempts in a platform-dependent way, typically one that makes the client retry, such as withholding the SYN-ACK or ignoring the final ACK of the handshake.
As usual you can find someone explaining the previous to you a lot better.
accept then creates a new non-listening socket associated with a new file descriptor that you then use for communication with said connecting party.
accept also works as a director for your flow of execution, effectively blocking further progress until a connection is actually available in the queue for it to take (the calling thread sleeps in the kernel while it waits). The only way around that is to declare the socket non-blocking, in which case accept returns immediately with an error when no connection is pending.
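The steps above can be sketched as follows. This is an illustration on loopback with an OS-chosen port; the variable names accepted_early and same_descriptor are made up for the demo.

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # endpoint only, no address yet
s.bind(("127.0.0.1", 0))       # port 0 asks the OS for a free port
host, port = s.getsockname()   # the address and port that bind() assigned
s.listen(1)                    # socket is now passive; the kernel queues connections

# Non-blocking accept with nothing queued fails at once instead of waiting.
s.setblocking(False)
try:
    s.accept()
    accepted_early = True
except BlockingIOError:
    accepted_early = False

# The kernel completes the handshake before accept() is ever called.
client = socket.create_connection((host, port))

s.setblocking(True)
conn, addr = s.accept()        # a new socket dedicated to this one peer

same_descriptor = conn.fileno() == s.fileno()

client.close()
conn.close()
s.close()
```

The listening socket s is never used to exchange data; accept() hands back a separate socket with its own file descriptor for each peer, and s stays available for the next connection.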

Spawning more than 5 client requests on socket

If we bind a server socket like this:
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind((host,port))
server.listen(5)
and use something like select() to loop over each client connection again and again, exchanging messages until the client closes it, while that loop (a for loop here) runs concurrently, we can make the server-client or client-client message exchange concurrent. Can we?
But the problem, as I've read, is that the server cannot enqueue more than 5 clients to handle one by one.
What methods are there to run multiple such server instances, provided that additional server processes start to listen if and only if the number of queued clients reaches 5?
When you receive a connection you can spawn a thread or process to handle that connection.
On the main thread, go back to listening for another connection.
The 5 is the length of the list of connections that are on hold.
It is similar to a switchboard operator.
The limit of 5 you are concerned about is the size of the listener backlog queue. This is how many connections the system will hold in abeyance before it starts rejecting new connections. When you accept a connection, room is freed on that queue. So as long as you accept your connections in a timely manner this is not really a concern under normal load conditions. (BTW 5 is on the low side of things. IIRC the default max per process on Linux, for instance, is 128.)
You probably misunderstood the function of the backlog argument.
The limit of 5 only applies to connections that have not yet been accepted.
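As a rough sketch of the thread-per-connection approach, here is a server with listen(5) serving 8 clients that are all connected at the same time. The handle and serve functions, the upper-casing reply and the count of 8 are assumptions made for the demo.

```python
import socket
import threading

def handle(conn):
    """Serve one client: read a message, reply in upper case, then close."""
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())

def serve(server, count):
    """Accept `count` connections, handing each one to its own thread."""
    for _ in range(count):
        conn, _addr = server.accept()   # each accept() frees a backlog slot
        threading.Thread(target=handle, args=(conn,), daemon=True).start()

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))           # OS-chosen port, for the demo only
server.listen(5)
address = server.getsockname()

NUM_CLIENTS = 8                          # deliberately more than the backlog
acceptor = threading.Thread(target=serve, args=(server, NUM_CLIENTS), daemon=True)
acceptor.start()

# Open every client connection before any of them is answered.
clients = [socket.create_connection(address) for _ in range(NUM_CLIENTS)]
for i, c in enumerate(clients):
    c.sendall(b"client %d" % i)

replies = []
for c in clients:
    replies.append(c.recv(1024))
    c.close()

acceptor.join()
server.close()
```

The backlog only bounds connections waiting to be accepted, so a server that keeps calling accept() promptly can hold far more than 5 established connections.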

Right way to do TCP connection between python and Qt?

I want to connect two programs via TCP. My main program is written with Qt and needs to talk to another program written in Python. I'm thinking about using TCP sockets and Google's protobuf to exchange the messages. In Qt, I use a QTcpSocket that accepts the connection and reads from the stream as soon as its readyRead signal is triggered. In Python, I also use a TCP socket and send messages.
This works very well, as long as neither side is killed. Currently, the Python side is sending messages to the C++ side (socket.send(str(id)+"\n")). After every send, I check for exceptions (connection reset by peer, broken pipe, ...) to see if the message was received.
If I kill the C++ program, the next message sent from the Python client triggers no exception, but is obviously not received. The message after that triggers the exception, but the previous message is lost.
After a bit of experimenting, I found that sending an empty message (socket.send("\n")) after each message solves the problem. I do now
try:
    s.send(str(id)+"\n")
    s.send("\n")
    sleep(0.5)
except socket.error,v:
    print "FAILed to send",id,v[0],v[1]
and receive the exception as soon as the C++-Peer is killed (calling s.send(str(id)+"\n\n") however does not help).
Finally, my question is: Is this a reliable way to check if my message was received?
I don't want to switch to UDP as I don't want to implement my own ACK-messages for each message.
This is the first time I have used sockets with Python and C++, and I can't really explain why my approach works, so I'm a bit uncomfortable using it.
Can someone tell me a bit more? I guess that the Python socket expects an ACK for the first send(str(id)+"\n") after sending the send("\n") and then realizes that the pipe is broken. Is this correct?
When a TCP connection is closed by the remote peer, your TCP socket will become ready-for-read, and then when you try to recv() from it, recv() will return an empty result (0 bytes).
Of course if your sending program is only calling send() (the way your Python program is), then it won't notice what's going on with the socket's recv-side, and you end up with the problem you described.
On the other hand, you don't want to just blindly call recv() either, because if recv() is called and the remote peer hasn't sent any data, recv() will block waiting for data, and unless the remote peer eventually sends some, you'll have a deadlock.
The simplest way to deal with that is to use select() to multiplex your I/O, so that your Python script can know when it's appropriate to call send() and/or recv(). Something like this:
import socket
import select
[...]
while True:
    socketsToReadFrom = [s]
    if you_still_have_more_data_to_send:
        socketsToWriteTo = [s]
    else:
        socketsToWriteTo = []  # select() needs a sequence here, not None
    # This select() call will block until there's something to do
    socketsReadyForRead, socketsReadyForWrite, junk = select.select(socketsToReadFrom, socketsToWriteTo, [])
    if s in socketsReadyForRead:
        readBytes = s.recv(1024)
        if len(readBytes) > 0:
            print("Read %i bytes from remote peer!" % len(readBytes))
        else:
            print("Remote peer closed the TCP Connection!!")
            break
    if s in socketsReadyForWrite:
        s.send(some_more_data)
As far as verifying whether your message was received, that's a bit tricky since TCP (and the network stack) do a fair amount of pipelining/buffering. In particular, a successful return from send() only tells you that your data has been handed off to your local TCP stack's outgoing-data buffer; it doesn't mean that the data has arrived at the remote peer already. If you really want a "receipt" that the remote peer has already processed the data, you'll have to have the remote peer send back some kind of acknowledgement. Note that under TCP that level of sophistication is often unnecessary though, since barring a network or hardware failure (or the remote peer closing the TCP connection), you can be fairly sure that the TCP stack will get your data there eventually; e.g. if a packet got dropped, the TCP stack will resend it automatically. Data loss will only occur if the network connectivity stops working for an extended period (e.g. several minutes), at which point the TCP stack will give up and close the TCP connection.
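If you do want such a receipt, one possible shape is sketched below. It assumes newline-terminated messages and a made-up "ACK " reply format; receiver stands in for the remote peer and is told to exit after one message to imitate the peer being killed. None of these names come from the original programs.

```python
import socket
import threading

def recv_line(sock):
    """Read up to and including a newline; returns b"" if the peer closed."""
    buf = b""
    while not buf.endswith(b"\n"):
        chunk = sock.recv(1024)
        if not chunk:
            break
        buf += chunk
    return buf

def receiver(conn, limit):
    """Hypothetical peer: acknowledge `limit` messages, then go away."""
    with conn:
        for _ in range(limit):
            line = recv_line(conn)
            if not line:
                break
            conn.sendall(b"ACK " + line)

def send_with_ack(sock, message, timeout=2.0):
    """Send one message and report whether the peer confirmed it."""
    sock.settimeout(timeout)
    try:
        sock.sendall(message + b"\n")
        return recv_line(sock) == b"ACK " + message + b"\n"
    except OSError:            # covers timeouts, resets and broken pipes
        return False

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))   # OS-chosen port, for the demo only
listener.listen(1)
sock = socket.create_connection(listener.getsockname())
conn, _ = listener.accept()
peer = threading.Thread(target=receiver, args=(conn, 1), daemon=True)
peer.start()

first = send_with_ack(sock, b"42")    # peer is alive and answers
peer.join()                           # peer has now closed its end
second = send_with_ack(sock, b"43")   # send() may still succeed, but no ACK arrives

sock.close()
listener.close()
```

The second call shows the point made above: the failure is detected because the acknowledgement never arrives, not because send() itself reports anything.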

Python doesn't detect a closed socket until the second send

When I close the socket on one end of a connection, the other end gets an error the second time it sends data, but not the first time:
import socket
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("localhost", 12345))
server.listen(1)
client = socket.create_connection(("localhost",12345))
sock, addr = server.accept()
sock.close()
client.sendall("Hello World!") # no error
client.sendall("Goodbye World!") # error happens here
I've tried setting TCP_NODELAY, using send instead of sendall, checking the fileno(), I can't find any way to get the first send to throw an error or even to detect afterwards that it failed. EDIT: calling sock.shutdown before sock.close doesn't help. EDIT #2: even adding a time.sleep after closing and before writing doesn't matter. EDIT #3: checking the byte count returned by send doesn't help, since it always returns the number of bytes in the message.
So the only solution I can come up with if I want to detect errors is to follow each sendall with a client.sendall("") which will raise an error. But this seems hackish. I'm on a Linux 2.6.x so even if a solution only worked for that OS I'd be happy.
This is expected, and it is how the TCP/IP APIs are implemented (so it's similar in pretty much all languages and on all operating systems).
The short story is that you cannot do anything to guarantee that a send() call returns an error directly if that send() call somehow cannot deliver data to the other end. send/write calls just deliver the data to the TCP stack, and it's up to the TCP stack to deliver it when it can.
TCP is also just a transport protocol. If you need to know whether your application "messages" have reached the other end, you need to implement that yourself (some form of ACK) as part of your application protocol; there's no free lunch.
However, if you read() from a socket, you can get notified immediately when an error occurs, or when the other end closes the socket. You usually need to do this in some form of multiplexing event loop (that is, using select/poll or some other I/O multiplexing facility).
Just note that you cannot read() from a socket to learn whether the most recent send/write succeeded. Here are a few cases showing why (but it's the cases one doesn't think about that always get you):
Several write() calls got buffered up due to network congestion, or because the TCP window was closed (perhaps a slow reader), and then the other end closes the socket or a hard network error occurs. You can't tell whether it was the last write that didn't get through or a write you did 30 seconds ago.
A network error occurs, or a firewall silently drops your packets (no ICMP replies are generated). You will have to wait until TCP times out the connection to get an error, which can take many seconds and usually several minutes.
TCP is busy doing retransmission as you call send, and maybe those retransmissions generate an error (really the same as the first case).
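For the scenario in the question, a read-side check before sending could look like the sketch below. peer_has_closed is a hypothetical helper, and it only reports a close that has already reached the local TCP stack; as the cases above explain, it says nothing about whether earlier sends were delivered.

```python
import select
import socket

def peer_has_closed(sock, timeout=0):
    """Return True if the peer's close (or a reset) is already visible locally."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return False                      # nothing to read, so no close seen yet
    try:
        # MSG_PEEK looks at pending input without consuming real data.
        return sock.recv(1, socket.MSG_PEEK) == b""
    except ConnectionError:
        return True                       # reset by peer

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))             # OS-chosen port, for the demo only
server.listen(1)
client = socket.create_connection(server.getsockname())
sock, addr = server.accept()

before_close = peer_has_closed(client)    # peer still open

sock.close()
# Allow up to one second for the close to arrive before checking.
after_close = peer_has_closed(client, timeout=1.0)

if not after_close:
    client.sendall(b"Hello World!")       # only send while the peer looks alive

client.close()
server.close()
```

In a real program this check would normally live inside the select/poll loop rather than being called ad hoc before each send.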
As per the docs, try calling sock.shutdown() before the call to sock.close().
