python socket and recv() returning empty data


I have a C program that connects to a Python server, sends a short string (fewer than about 100 characters) and then closes the socket. It does this periodically.
The python server accepts connection, spawns a thread, and in that thread calls:
data = sock.recv(4096)
data often turns out to be empty.
After reading through the Python documentation and some of the Stack Overflow posts (thanks guys!), I think the problem is that the C program opens, writes to, and closes the socket so quickly that, by the time the Python server has accepted the connection and spawned the thread, recv() returns no data, as documented.
The problem is, I don't know a workaround for this. I have very little control over the C program. Is there a way to tell Python to buffer the message for recv() even if the other side closes the connection?
(caveat: I haven't verified my hunch yet on wireshark, but the logs in both programs strongly indicate the c-program closes before recv() is even called for most of the time.)
thanks.
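One way to check the hunch without Wireshark is to reproduce the open/write/close pattern on loopback. The following is only a minimal sketch, assuming the client closes cleanly rather than resetting the connection; `recv_all` and `demo` are hypothetical names, and the Python client here merely stands in for the C program.

```python
import socket
import time


def recv_all(conn: socket.socket, bufsize: int = 4096) -> bytes:
    """Read until the peer closes its side (recv() returns b'')."""
    chunks = []
    while True:
        chunk = conn.recv(bufsize)
        if not chunk:  # b'' means the peer has finished sending
            break
        chunks.append(chunk)
    return b"".join(chunks)


def demo() -> bytes:
    # Listener on an ephemeral loopback port (illustrative setup only).
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)

    # Stand-in for the C client: connect, write, close immediately.
    client = socket.create_connection(server.getsockname())
    client.sendall(b"short message")
    client.close()

    time.sleep(0.2)  # the server is deliberately slow to accept
    conn, _ = server.accept()
    try:
        return recv_all(conn)
    finally:
        conn.close()
        server.close()


if __name__ == "__main__":
    print(demo())
```

If the bytes still arrive in a test like this, the empty reads in the real setup probably have another cause, for example reading only the final empty recv() or a client that resets the connection, so the Wireshark capture is still worth doing.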

Related

TCP - single connection, how to not block main application

I open a single TCP connection to a gateway PC. My application will send messages to the gateway which will process the payload and pass on the message to another computer (based on payload) for processing.
For example, I send message A over the TCP connection which will be routed to computer A for response. But I may also need to send message B which goes to computer B.
Currently I simply use send(messageA) and then use recv() to wait for the response. The downside is that the recv() will block which means I can't send message B until something is received (and I can't do any other tasks).
I have read about the following options but am confused as to which is best for my use case.
Make the socket non-blocking. I send message A, call recv() (let's assume there is some delay in processing such that nothing is received immediately; so code moves on), move on to send message B and again call recv(). Now, it could be A or B that responds, which I can handle. But I need to call recv() again since only one response received so far; but what if computer A is down and never responds -- at some point I need to decide to stop calling recv(), right? On what basis would I do this?
Set a timeout on the socket. Again, send message A but assume computer A is down, so the code will wait for timeout before moving on which is wasted time.
Use select. Since I have only one socket, I don't think that helps here; plus, I understand select will block unless a timeout is set, so is it any different, in this case, from the option above?
Use multithreading. Have one thread to process the main application and do the sending. And another thread that just calls recv() in an infinite loop (or a long timeout) that calls a callback whenever data is available. But then if the connection is closed from the main thread, will the recv thread cause an exception or hang?
I am really not sure what best practices are or the pitfalls of the options above. Which would be best option or is there another option?
(I'm using Python, in case it makes a difference).
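The multithreading option could look roughly like the sketch below. It is a minimal illustration, not a recommendation over the other options: `start_receiver`, `on_data` and `demo` are hypothetical names, and `socket.socketpair()` stands in for the real gateway connection. The reader thread ends when recv() returns zero bytes or raises `OSError`; how a blocked recv() reacts to a local close() can vary by platform, so that case is only handled defensively here.

```python
import socket
import threading
from typing import Callable


def start_receiver(sock: socket.socket,
                   on_data: Callable[[bytes], None]) -> threading.Thread:
    """Run recv() in a background thread and hand each chunk to on_data."""

    def loop() -> None:
        while True:
            try:
                chunk = sock.recv(4096)
            except OSError:
                break  # socket was closed or reset underneath us
            if not chunk:
                break  # peer closed the connection
            on_data(chunk)

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread


def demo() -> bytes:
    # socketpair() stands in for the real gateway connection.
    app_side, gateway_side = socket.socketpair()
    received = []
    reader = start_receiver(app_side, received.append)

    app_side.sendall(b"message A")    # main thread is free to keep sending
    gateway_side.recv(4096)           # the "gateway" reads the request
    gateway_side.sendall(b"reply A")  # ... and answers some time later
    gateway_side.close()              # EOF ends the reader loop

    reader.join(timeout=2)
    app_side.close()
    return b"".join(received)


if __name__ == "__main__":
    print(demo())
```

Because TCP is a byte stream, the callback receives arbitrary chunks rather than whole messages, so matching replies to message A or B would still need framing defined by the gateway protocol.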

Python - Read remaining data from socket after TCP RST

I'm implementing a file transfer protocol with the following use case:
The server sends the file chunk by chunk inside several frames.
The client might cancel the transfer: for this, it sends a message and disconnects at TCP level.
What happens in that case on the server side (Python running on Windows) is that I catch a ConnectionResetError (this is normal, the client has disconnected the socket) while sending data to the client. I would like to read the last data sent by the client (the message used to abort the call), but calling mysocket.recv() still raises a ConnectionResetError.
With a Wireshark capture, I can clearly see that the message was properly sent by the client prior to the TCP disconnection.
Any ideas, folks? Thanks!
VR

In order to understand what to do about this situation, you need to understand how a TCP connection is closed (see, e.g. this) and how the socket API relates to a clean shutdown (without fail, see this).
Your client is most likely calling close to terminate the connection. The problem with this is that there may be unread data in the socket receive queue or data arriving shortly from the other end that you will no longer be able to read, which is basically an error condition. To signal to the other end that data sent cannot be delivered to the receiving application, a reset is sent (well, technically, "SHOULD be sent" as per the RFC) and the TCP connection is abnormally terminated.
You might think that enabling SO_LINGER will help (many, many bits have been spilt over this so I won't elaborate further), but it won't solve the problem of unread data by the client causing the reset.
The client needs to instead call shutdown(SHUT_WR) to indicate that it is done sending, and then continue to call recv() until it reads 0 bytes indicating the other side is done sending. You may then call close().
Note that the Python 2 socket documentation states that
Depending on the platform, shutting down one half of the connection can also close the opposite half (e.g. on Mac OS X, shutdown(SHUT_WR) does not allow further reads on the other end of the connection).
This sounds like a bug to me. To get around this, you would have to send your cancel message, then keep reading until you get 0 bytes so that you know the server received the cancel message. You may then close the socket.
The Python 3.8 docs make no such disclaimer.
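The client-side sequence described above (send, shutdown(SHUT_WR), read until 0 bytes, close) can be sketched as follows. This is a minimal sketch under assumptions: `send_and_close_gracefully` and `demo` are hypothetical names, the `b"CANCEL"` payload is made up, and `socket.socketpair()` replaces the real TCP connection so the example runs on its own.

```python
import socket
import threading


def send_and_close_gracefully(sock: socket.socket, message: bytes) -> bytes:
    """Send a final message, half-close, then drain until the peer is done."""
    sock.sendall(message)
    sock.shutdown(socket.SHUT_WR)  # "I am done sending"
    leftover = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:              # 0 bytes: the peer has finished too
            break
        leftover.append(chunk)
    sock.close()
    return b"".join(leftover)


def demo():
    client, server = socket.socketpair()
    server.sendall(b"chunk-1chunk-2")  # file data the client has not read yet
    seen = {}

    def serve() -> None:
        data = b""
        while True:
            chunk = server.recv(4096)
            if not chunk:              # client half-closed: request complete
                break
            data += chunk
        seen["cancel"] = data
        server.close()

    worker = threading.Thread(target=serve)
    worker.start()
    drained = send_and_close_gracefully(client, b"CANCEL")
    worker.join()
    return drained, seen["cancel"]


if __name__ == "__main__":
    print(demo())
```

Draining the unread chunks before close() is the point of the exercise: the client no longer closes with unread data in its receive queue, which is the condition that triggers the reset.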

Proper way to close tcp sockets in python

I am currently working on a server + client combo in Python and I'm using TCP sockets. From networking classes I know that a TCP connection should be closed step by step: first one side signals that it wants to close the connection and waits for confirmation, then the other side does the same. After that, the socket can be safely closed.
I've seen the function socket.shutdown(flag) in the Python documentation, but I don't see how it could be used in this standard, theoretical method of closing a TCP socket. As far as I know, it just blocks reading, writing, or both.
What is the best, most correct way to close TCP socket in python? Are there standard functions for closing signals or do I need to implement them myself?

shutdown is useful when you have to signal the remote client that no more data is being sent. You can specify in the shutdown() parameter which half-channel you want to close.
Most commonly, you want to close the TX half-channel by calling shutdown(1), i.e. shutdown(socket.SHUT_WR). At the TCP level, this sends a FIN packet, and the remote end will receive 0 bytes if it is blocking on read(), but the remote end can still send data back, because the RX half-channel is still open.
Some application protocols use this to signal the end of the message. Some other protocols find the EOM based on data itself. For example, in an interactive protocol (where messages are exchanged many times) there may be no opportunity, or need, to close a half-channel.
In HTTP, shutdown(1) is one method that a client can use to signal that an HTTP request is complete. But the HTTP protocol itself embeds data that makes it possible to detect where a request ends, so multiple-request HTTP connections are still possible.
I don't think that calling shutdown() before close() is always necessary, unless you need to explicitly close a half-channel. If you want to cease all communication, close() does that too. Calling shutdown() and forgetting to call close() is worse because the file descriptor resources are not freed.
From Wikipedia: "On SVR4 systems use of close() may discard data. The use of shutdown() or SO_LINGER may be required on these systems to guarantee delivery of all data." This means that, if you have outstanding data in the output buffer, a close() could discard this data immediately on a SVR4 system. Linux, BSD and BSD-based systems like Apple are not SVR4 and will try to send the output buffer in full after close(). I am not sure if any major commercial UNIX is still SVR4 these days.
Again using HTTP as an example, an HTTP client running on SVR4 would not lose data using close(), because it keeps the connection open after the request in order to get the response. An HTTP server under SVR4 would have to be more careful, calling shutdown(2) before close() after sending the whole response, because part of the response could still be in the output buffer.
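To make the half-channel idea concrete, here is a small sketch in which the client uses shutdown(1) as its end-of-message marker and the server replies afterwards, then calls shutdown(2) before close(). The names `handle` and `demo` and the `echo:` reply format are invented for illustration, and `socket.socketpair()` is used instead of a real TCP connection.

```python
import socket
import threading


def handle(conn: socket.socket) -> None:
    """Read a request terminated by the client's half-close, then reply."""
    request = b""
    while True:
        chunk = conn.recv(4096)
        if not chunk:                  # client called shutdown(SHUT_WR)
            break
        request += chunk
    conn.sendall(b"echo:" + request)   # our TX half-channel is still open
    conn.shutdown(socket.SHUT_RDWR)    # same as shutdown(2)
    conn.close()                       # always release the descriptor


def demo() -> bytes:
    client, server = socket.socketpair()
    worker = threading.Thread(target=handle, args=(server,))
    worker.start()

    client.sendall(b"hello")
    client.shutdown(socket.SHUT_WR)    # same as shutdown(1)
    reply = b""
    while True:
        chunk = client.recv(4096)      # RX half-channel still works
        if not chunk:
            break
        reply += chunk
    worker.join()
    client.close()
    return reply


if __name__ == "__main__":
    print(demo())
```

Using the named constants (socket.SHUT_RD, socket.SHUT_WR, socket.SHUT_RDWR) rather than 0, 1 and 2 makes the intent easier to read.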

According to the Python documentation:
Strictly speaking, you’re supposed to use shutdown on a socket before
you close it. The shutdown is an advisory to the socket at the other
end. Depending on the argument you pass it, it can mean “I’m not going
to send anymore, but I’ll still listen”, or “I’m not listening, good
riddance!”. Most socket libraries, however, are so used to programmers
neglecting to use this piece of etiquette that normally a close is the
same as shutdown(); close(). So in most situations, an explicit
shutdown is not needed.
I think the most correct way to close a TCP connection is to call shutdown before close, because close is not atomic, and that can cause bugs. Suppose you call close without shutdown and the data was not sent to the server correctly; Python closes the connection at the same time, the server can't reply to the client, and the socket at the other end may hang indefinitely.

Non-blocking sockets in Python

I was looking at the socket module of the Python standard library and I noticed the function socket.setblocking. The documentation mentions that setting a socket to non-blocking mode means that an error is raised if the data cannot be sent through the socket immediately or if no data is available when trying to read from the socket.
I'm having trouble understanding use cases in which this function might be useful. I'm working on a Linux machine (just in case the answer is OS dependent).
Thanks!

When a socket is in blocking mode, a call such as recv() waits on that socket until data arrives or, if a timeout was set, until the timeout expires, at which point it raises an error. While it is waiting on the socket, your program cannot do anything else. Sometimes you don't want blocking to occur.
A good use case for this might be when you are sending and receiving messages in a single-threaded program using multiple sockets. You don't want to block on one socket while waiting to send or receive messages; rather, you may want to check whether there are messages to send or receive on each of the sockets, so you would want no blocking time, or limited blocking time, while you loop through the sockets. This provides a more in-depth discussion of Python sockets.
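A minimal sketch of that single-threaded, multi-socket pattern follows. It assumes `select.select()` is used to find the sockets that are ready; `poll_once` and `demo` are hypothetical names, and the two `socket.socketpair()` pairs stand in for real connections.

```python
import select
import socket


def poll_once(socks, timeout=0.0):
    """Return {socket: bytes} for every socket that has data right now."""
    readable, _, _ = select.select(socks, [], [], timeout)
    ready = {}
    for s in readable:
        try:
            ready[s] = s.recv(4096)
        except BlockingIOError:
            pass  # nothing to read after all; try again on the next pass
    return ready


def demo():
    a_local, a_remote = socket.socketpair()
    b_local, b_remote = socket.socketpair()
    for s in (a_local, b_local):
        s.setblocking(False)

    # With no data queued, a non-blocking recv() raises instead of waiting.
    try:
        a_local.recv(4096)
        raised = False
    except BlockingIOError:
        raised = True

    b_remote.sendall(b"from B")
    ready = poll_once([a_local, b_local], timeout=1.0)
    result = (raised, list(ready.values()))

    for s in (a_local, a_remote, b_local, b_remote):
        s.close()
    return result


if __name__ == "__main__":
    print(demo())
```

Only the socket that actually has data is read, so a silent peer on one connection never stalls the others.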

pyserial - possible to write to serial port from thread a, do blocking reads from thread b?

I tried googling this, couldn't find an answer, searched here, couldn't find an answer. Has anyone looked into whether it's thread safe to write to a Serial() object (pyserial) from thread a and do blocking reads from thread b?
I know how to use thread synchronization primitives and thread-safe data structures, and in fact my current form of this program has a thread dedicated to reading/writing on the serial port and I use thread-safe data structures to coordinate activities in the app.
My app would benefit greatly if I could write to the serial port from the main thread (and never read from it), and read from the serial port using blocking reads in the second thread (and never write to it). If someone really wants me to go into why this would benefit the app I can add my reasons. In my mind there would be just one instance of Serial() and even while thread B sits in a blocking read on the Serial object, thread A would be safe to use write methods on the Serial object.
Anyone know whether the Serial class can be used this way?
EDIT: It occurs to me that the answer may be platform-dependent. If you have any experience with a platform like this, it'd be good to know which platform you were working on.
EDIT: There's only been one response but if anyone else has tried this, please leave a response with your experience.

I have done this with pyserial. Reading from one thread and writing from another should not cause problems in general, since there isn't really any kind of resource arbitration problem. Serial ports are full duplex, so reading and writing can happen completely independently and at the same time.

I've used pyserial in this way on Linux (and Windows) with no problems!

I would recommend changing Thread B from "blocking read" to "non-blocking read/write". Thread B would become your serial port "daemon".
Thread A could run at full speed for a responsive user interface or perform any real-time operation.
Thread A would write a message to Thread B instead of trying to write directly to the serial port. If the size/frequency of the messages is low, a simple shared buffer for the message itself and a flag to indicate that a new message is present would work. If you need higher performance, you should use a queue (a ring buffer). This is implemented simply using an array large enough to accumulate many messages to be sent, plus two pointers. The write pointer is updated only by Thread A. The read pointer is updated only by Thread B.
Thread B would grab the message and send it to the serial port. The serial port should use the timeout feature so that the serial read function releases the CPU, allowing you to poll the shared buffer and, if any new message is present, send it to the serial port. I would use a sleep at that point to limit the CPU time used by Thread B. Then you can make Thread B loop back to the serial read function. If the serial port timeout is not working right, for example if the USB-RS232 cable gets unplugged, the sleep is what keeps the thread from spinning at full CPU.
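The daemon-thread layout could be sketched like this. Note the substitutions: the hand-rolled array with two pointers is replaced by the standard library's `queue.Queue`, and `FakePort` is a hypothetical stand-in for a `serial.Serial` object opened with a timeout, so the example runs without hardware. `serial_daemon`, `on_data` and `demo` are likewise illustrative names.

```python
import queue
import threading


class FakePort:
    """Hypothetical stand-in for serial.Serial (opened with a timeout)."""

    def __init__(self):
        self.written = []

    def write(self, data: bytes) -> int:
        self.written.append(data)
        return len(data)

    def read(self, size: int = 1) -> bytes:
        return b""  # behaves like a read that timed out with no data


def serial_daemon(port, outbox: queue.Queue,
                  stop: threading.Event, on_data) -> None:
    """Thread B: the only thread that touches the port."""
    while not stop.is_set():
        data = port.read(64)       # returns once the port timeout expires
        if data:
            on_data(data)
        try:
            while True:            # flush everything Thread A has queued
                message = outbox.get_nowait()
                port.write(message)
                outbox.task_done()
        except queue.Empty:
            pass
        stop.wait(0.01)            # short sleep to cap CPU use


def demo():
    port = FakePort()
    outbox = queue.Queue()
    stop = threading.Event()
    worker = threading.Thread(
        target=serial_daemon, args=(port, outbox, stop, lambda data: None))
    worker.start()

    # Thread A only enqueues; it never calls port.write() itself.
    outbox.put(b"AT\r\n")
    outbox.put(b"ATZ\r\n")
    outbox.join()                  # wait until Thread B has written both
    stop.set()
    worker.join()
    return port.written


if __name__ == "__main__":
    print(demo())
```

With a real port, the `stop.wait()` call is what prevents a busy loop if read() starts returning immediately, as in the unplugged-cable case mentioned above.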
