python recv losing first bytes of data

I have a problem when receiving messages sent over an SSL socket. On rare occasions I lose the first few bytes of data in the message. I am pretty certain this is somehow a speed problem, since it only seems to happen when two messages are sent in rapid succession (1-2 milliseconds apart). I am running the receiving code in a separate thread, with minimal code that dumps the messages into a queue as they arrive.
queue = Queue()
...
def read_feed(session_key, hostname, port, ssl_socket):
    ''' READ whatever is coming on the stream '''
    while (1):
        try:
            output = ssl_socket.recv(2048) # Message size always < 2048
        except (ConnectionResetError, OSError):
            logger.info("Connecting feed")
            try:
                ssl_socket.connect((hostname, port))
            except ValueError: # Something's wrong, disconnect and do a new round
                ssl_socket.close()
            else:
                cmd = {"cmd":"login", "args":{"session_key":session_key}}
                data = str.encode(json.dumps(cmd) + "\n")
                num_bytes = ssl_socket.send(data)
        else:
            queue.put(output)
...
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
ssl_socket = ssl.wrap_socket(s)
...
t3 = threading.Thread(target=read_feed, name = 'Read Feed', args=(session_key, hostname, port, ssl_socket))
t3.start()
At first I suspected that the other running threads were stealing too much CPU time, so that the network buffer filled up before this thread got a chance to run, but I have tried a multi-core machine and the problem persists.
In essence, this should be the only code running when I am connected?
while (1):
    output = ssl_socket.recv(2048) # Message size always < 2048
    queue.put(output)
Or am I making the wrong assumptions here? Maybe the try:/except: construct is costly, or is the queue.put method slow and I should use something else? Or maybe Python is not the right tool for the job?
Any suggestions on how to improve the code so that I don't lose those few precious first bytes?

Related

How do I gracefully close a socket with a persistent HTTP connection?

I'm writing a very simple client in Python that fetches an HTML page from the WWW. This is the code I've come up with so far:
import socket
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(("www.mywebsite.com", 80))
sock.send(b"GET / HTTP/1.1\r\nHost:www.mywebsite.com\r\n\r\n")
while True:
    chunk = sock.recv(1024) # (1)
    if len(chunk) == 0:
        break
    print(chunk)
sock.close()
The problem is: since an HTTP/1.1 connection is persistent by default, the code gets stuck at # (1) waiting for more data from the server once the transmission is over.
I know I can solve this by a) adding the Connection: close request header, or by b) setting a timeout to the socket. A non-blocking socket here would not help, as the select() syscall would still hang (unless I set a timeout on it, but that's just another form of case b)).
So is there another way to do it, while keeping the connection persistent?

As has already been said in the comments, there's a lot to consider if you're trying to write an all-singing, all-dancing HTTP processor. However, if you're just practising with sockets then consider this.
Let's assume that you know how the response will end. For example, if we do essentially what you're doing in your code to the main Google page, we know that the header section of the response will end with '\r\n\r\n'. So, what we can do is just read 1 byte at a time and look out for that terminating sequence.
This code will NOT give you the full Google main page because, as you will see, the response is chunked - and that's a whole new ball game.
Having said all of that, you may find this instructive:
import socket
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    sock.connect(('www.google.com', 80))
    sock.send(b'GET / HTTP/1.1\r\nHost:www.google.com\r\n\r\n')
    end = [b'\r', b'\n', b'\r', b'\n']
    d = []
    while d[-len(end):] != end:
        d.append(sock.recv(1))
    print(''.join(b.decode() for b in d))
finally:
    sock.close()
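To go one step further while keeping the connection persistent, the client can rely on the framing HTTP itself provides: read up to the blank line that ends the headers, then read exactly as many body bytes as the Content-Length header announces. The following is only a minimal sketch under stated assumptions: the server sends a Content-Length header (chunked transfer encoding is not handled), only one request is outstanding at a time, and read_http_response is a hypothetical helper name, not part of any library.

```python
import socket

def read_http_response(sock):
    """Read one HTTP response framed by Content-Length (hypothetical helper).

    Assumes the server sends a Content-Length header; chunked
    transfer encoding is deliberately not handled in this sketch.
    """
    buf = b""
    # Read until the blank line that terminates the header block.
    while b"\r\n\r\n" not in buf:
        chunk = sock.recv(1024)
        if not chunk:
            raise ConnectionError("connection closed before headers ended")
        buf += chunk
    head, _, body = buf.partition(b"\r\n\r\n")

    # Look for Content-Length (header names are case-insensitive).
    length = None
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("no Content-Length header; cannot frame the body")

    # Keep reading until the whole body has arrived; the socket stays open.
    while len(body) < length:
        chunk = sock.recv(length - len(body))
        if not chunk:
            raise ConnectionError("connection closed mid-body")
        body += chunk
    return head, body
```

Because the function returns as soon as the announced number of bytes has arrived, the same socket can be reused for the next request instead of waiting for the server to close it.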

How can I get the Whole Message Without Closing Connection -- Resulting in Possibility to Get Multiple Messages?

I currently have a server at public IP: IP. I can use this Client Code to connect to the Server and establish a connection between the two.
Client.py
import socket
import select
import multiprocessing

class ClientAdminConnection:
    # Port 80
    def __init__(self):
        self.ports = [80]
        print("Ports: {}".format(self.ports))
        self.sockets = []
        self.createConnectionToServer()
        # Multithread Listening
        processes = []
        for socket in self.sockets:
            print("Creating Process")
            process = multiprocessing.Process(target=self.startListeningForCommands, args=(socket,))
            processes.append(process)
        for process in processes:
            print("Starting Process")
            process.start()

    def createConnectionToServer(self):
        for port in self.ports:
            client_admin_port = ('IP', port)
            client_admin_socket = socket.socket()
            # Connect to Server
            client_admin_socket.connect(client_admin_port)
            print("Connected to {}".format(client_admin_port))
            self.sockets.append(client_admin_socket)

    def startListeningForCommands(self, socket):
        print("Listening")
        wholeMsg = ""
        while True:
            data = socket.recv(1024)
            if not data:
                print("No data")
                break
            else:
                dataBit = data.decode("utf-8")
                print("There is Data: {}".format(dataBit))
                wholeMsg += dataBit
        print("Message: {}".format(wholeMsg))
I'd like to be able to send multiple messages and have the client listening after I .connect()
Currently, I can get it where I can read the data bit. But, I'd like to be able to send 1 message - see the entire thing (even if longer than 1024 bytes) and then send another message. Right now - I can only see the dataBits that come in. I never get to see wholeMsg. Generally, most of my messages are small - but I would like to use an arbitrary size and still see the entire message every time one is sent over the connection.
Current Results:
Ports: [80]
Connected to ('34.207.93.146', 80)
Creating Process
Starting Process
Listening
There is Data: Mary Had A Little Lamb
Mary Had A Little Lamb is sent from the server.
What I Expect:
Ports: [80]
Connected to ('34.207.93.146', 80)
Creating Process
Starting Process
Listening
There is Data: Mary Had A Little Lamb
Message: Mary Had A Little Lamb *THIS SECTION IS MISSING

TCP is a stream-oriented connection. It is not packet-oriented. A single transmission might be chopped into several pieces, or concatenated with other unread data. YOU have to establish a way to know when a message is finished, such as a \n. That way, the receiver keeps reading, and appending to the message buffer, until it sees the \n.
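A minimal sketch of the newline-terminated approach, using the standard library's socket.makefile() so that the buffering and line reassembly are done by a file object. It assumes the sender ends every message with a newline character and encodes text as UTF-8, and that the socket is in blocking mode; iter_messages is a hypothetical name chosen for illustration.

```python
import socket

def iter_messages(sock):
    """Yield newline-terminated messages from a connected socket.

    Assumes every message ends with a newline character and is
    UTF-8 text; both are assumptions of this sketch.
    """
    # makefile() wraps the socket in a buffered file object, so
    # split and merged TCP segments are reassembled into lines.
    with sock.makefile("r", encoding="utf-8", newline="\n") as reader:
        for line in reader:
            if line.endswith("\n"):
                yield line[:-1]  # a complete message, delimiter removed
            # An unterminated tail at end-of-stream is incomplete; drop it.
```

The loop ends only when the peer closes the connection, so the client can keep receiving message after message, whatever their length, on the same connection.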

I appreciate Tim's answer, but wanted to highlight some simple concepts that I was missing as a new socket-eer in hopes of anyone else trying to understand.
def startListeningForCommands(self, socket):
    print("Listening")
    wholeMsg = ""
    while True:
        data = socket.recv(1024)
        if not data:
            print("No data")
            break
        else:
            dataBit = data.decode("utf-8")
            print("There is Data: {}".format(dataBit))
            wholeMsg += dataBit
The above won't work in the sense that wholeMsg will never get printed, because
while True:
    data = socket.recv(1024)
will keep handing out the incoming data in pieces of up to 1024 bytes for as long as data arrives. It is up to the programmer to determine when a whole message has been received. This is where headers and footers come into play (to my understanding). As long as the connection is open, recv() blocks until data arrives; it returns an empty result only when the other side closes the connection, so the "No data" branch is never reached while the server keeps the connection open.
For example, you would get:
Listening
"There is Data: I am the first message"
"There is Data: I am the second message"
If there were more than 1024 bytes, say 1200 bytes, it would be
"There is Data: 'First 1024 bytes'"
"There is Data: 'Next 176 bytes'"
"There is Data: I am the second message"
It is up to you to determine where the start of one message is and the end of that message. How you do that... I am still researching. Just wanted to help out anyone else struggling with that concept.

What is a proper endless socket server loop in Python

I am a Python newbie and my first task is to create a small server program that will forward events from a network unit to a rest api.
The overall structure of my code seems to work, but I have one problem. After I receive the first package, nothing happens. Is something wrong with my loop such that new packages (from the same client) aren't accepted?
Packages look something like this: EVNTTAG 20190219164001132%0C%3D%E2%80h%90%00%00%00%01%CBU%FB%DF ... not that it matters, but I'm sharing just for clarity.
My code (I skipped the irrelevant init of rest etc. but the main loop is the complete code):
# Configure TAGP listener
ipaddress = ([l for l in ([ip for ip in socket.gethostbyname_ex(socket.gethostname())[2] if not ip.startswith("127.")][:1], [[(s.connect(('8.8.8.8', 53)), s.getsockname()[0], s.close()) for s in [socket.socket(socket.AF_INET, socket.SOCK_DGRAM)]][0][1]]) if l][0][0])
server_name = ipaddress
server_address = (server_name, TAGPListenerPort)
print ('starting TAGP listener on %s port %s' % server_address)
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.bind(server_address)
sock.listen(1)
sensor_data = {'tag': 0}
# Listen for TAGP data and forward events to ThingsBoard
try:
    while True:
        data = ""
        connection, client_address = sock.accept()
        data = str(connection.recv(1024))
        if data.find("EVNTTAG") != -1:
            timestamp = ((data.split())[1])[:17]
            tag = ((data.split())[1])[17:]
            sensor_data['tag'] = tag
            client.publish('v1/devices/me/telemetry', json.dumps(sensor_data), 1)
            print (data)
except KeyboardInterrupt:
    # Close socket server (TAGP)
    connection.shutdown(1)
    connection.close()
    # Close client to ThingsBoard
    client.loop_stop()
    client.disconnect()

There are multiple issues with your code:
First of all, you need a loop over what the client sends. You first call connection, client_address = sock.accept() and you now have a client. But in the next iteration of the loop you call .accept() again, overwriting your old connection with a new client. If there is no new client this simply waits forever. And that's what you observe.
So this can be fixed like this:
while True:
    conn, addr = sock.accept()
    while True:
        data = conn.recv(1024)
but this code has another issue: no new client can connect until the old one disconnects (well, at the moment it just loops indefinitely regardless of whether the client is alive or not, we'll deal with it later). To overcome it you can use threads (or async programming) and process each client independently. For example:
from threading import Thread

def client_handler(conn):
    while True:
        data = conn.recv(1024)

while True:
    conn, addr = sock.accept()
    t = Thread(target=client_handler, args=(conn,))
    t.start()
Async programming is harder and I'm not gonna address it here. Just be aware that there are multiple advantages of async over threads (you can google those).
Now each client has its own thread and the main thread only worries about accepting connections. Things happen concurrently. So far so good.
Let's focus on the client_handler function. What you misunderstand is how sockets work. This:
data = conn.recv(1024)
does not read 1024 bytes from the buffer. It actually reads up to 1024 bytes with 0 being possible as well. Even if you send 1024 bytes it can still read say 3. And when you receive a buffer of length 0 then this is an indication that the client disconnected. So first of all you need this:
def client_handler(conn):
    while True:
        data = conn.recv(1024)
        if not data:
            break
Now the real fun begins. Even if data is nonempty it can be of arbitrary length between 1 and 1024. Your data can be chunked and may require multiple .recv calls. And no, there is nothing you can do about it. Chunking can happen due to some other proxy servers or routers or network lag or cosmic radiation or whatever. You have to be prepared for it.
So in order to work with that correctly you need a proper framing protocol. For example you have to somehow know how big is the incoming packet (so that you can answer the question "did I read everything I need?"). One way to do that is by prefixing each frame with (say) 2 bytes that combine into total length of the frame. The code may look like this:
def client_handler(conn):
    while True:
        chunk = conn.recv(1) # read first byte
        if not chunk:
            break
        size = ord(chunk)
        chunk = conn.recv(1) # read second byte
        if not chunk:
            break
        size += (ord(chunk) << 8)
Now you know that the incoming buffer will be of length size. With that you can loop to read everything:
def handle_frame(conn, frame):
    # frame is a bytes object, so search for a bytes pattern
    if frame.find(b"EVNTTAG") != -1:
        pass # do your stuff here now

def client_handler(conn):
    while True:
        chunk = conn.recv(1)
        if not chunk:
            break
        size = ord(chunk)
        chunk = conn.recv(1)
        if not chunk:
            break
        size += (ord(chunk) << 8)
        # recv until everything is read
        frame = b''
        while size > 0:
            chunk = conn.recv(size)
            if not chunk:
                return
            frame += chunk
            size -= len(chunk)
        handle_frame(conn, frame)
IMPORTANT: this is just an example of handling a protocol that prefixes each frame with its length. Note that the client has to be adjusted as well. You either have to define such protocol or if you have a given one you have to read the spec and try to understand how framing works. For example this is done very differently with HTTP. In HTTP you read until you meet \r\n\r\n which signals the end of headers. And then you check Content-Length or Transfer-Encoding headers (not to mention hardcore things like protocol switch) to determine next action. This gets quite complicated though. I just want you to be aware that there are other options. Nevertheless framing is necessary.
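Since the client has to be adjusted as well, here is a rough sketch of what the sending side of such a length-prefixed protocol could look like, together with an equivalent receiver built on struct. The names send_frame, recv_exactly and recv_frame are hypothetical; the only thing taken from the answer is the wire format, a 2-byte length with the low byte first followed by the payload.

```python
import socket
import struct

def send_frame(sock, payload):
    """Send payload prefixed with its length as 2 bytes, low byte first."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too large for a 2-byte length prefix")
    # "<H" is a little-endian unsigned 16-bit integer, which matches
    # size = first_byte + (second_byte << 8) on the receiving side.
    sock.sendall(struct.pack("<H", len(payload)) + payload)

def recv_exactly(sock, n):
    """Receive exactly n bytes, or return None if the peer disconnects."""
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            return None
        data += chunk
    return data

def recv_frame(sock):
    """Receive one frame written by send_frame, or None on disconnect."""
    header = recv_exactly(sock, 2)
    if header is None:
        return None
    (size,) = struct.unpack("<H", header)
    return recv_exactly(sock, size)
```

A 2-byte prefix caps a frame at 65535 bytes; a protocol that needs larger messages would have to pick a wider length field on both sides.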
Also, network programming is hard. I'm not gonna dive into things like security (e.g. against DDoS) and performance. The code above should be treated as an extreme simplification, not production ready. I advise using some existing software.

Python: Multithreaded socket server runs endlessly when client stops unexpectedly

I have created a multithreaded socket server to connect many clients to the server using Python. If a client stops unexpectedly due to an exception, the server runs nonstop. Is there a way to kill that particular thread alone in the server and keep the rest running?
Server:
class ClientThread(Thread):
    def __init__(self,ip,port):
        Thread.__init__(self)
        self.ip = ip
        self.port = port
        print("New server socket thread started for " + ip + ":" + str(port))
    def run(self):
        while True :
            try:
                message = conn.recv(2048)
                dataInfo = message.decode('ascii')
                print("recv:::::"+str(dataInfo)+"::")
            except:
                print("Unexpected error:", sys.exc_info()[0])
                Thread._stop(self)

tcpServer = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcpServer.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
tcpServer.bind((TCP_IP, 0))
tcpServer.listen(10)
print("Port:"+ str(tcpServer.getsockname()[1]))
threads = []
while True:
    print( "Waiting for connections from clients..." )
    (conn, (ip,port)) = tcpServer.accept()
    newthread = ClientThread(ip,port)
    newthread.start()
    threads.append(newthread)
for t in threads:
    t.join()
Client:
def Main():
    s = socket.socket(socket.AF_INET,socket.SOCK_STREAM)
    s.connect((host,int(port)))
    while True:
        try:
            message = input("Enter Command")
            s.send(message.encode('ascii'))
        except Exception as ex:
            logging.exception("Unexpected error:")
            break
            s.close()

Sorry about a very, very long answer but here goes.
There are quite a few issues with your code. First of all, your client does not actually close the socket, as s.close() will never get executed. Your loop is interrupted at break and anything that follows it will be ignored. So change the order of these statements for the sake of good programming, but it has nothing to do with your problem.
Your server code is wrong in quite a few ways. As it is currently written, it never exits. Your threads also do not work right. I have fixed your code so that it is a working, multithreaded server, but it still does not exit as I have no idea what would be the trigger to make it exit. But let us start from the main loop:
while True:
    print( "Waiting for connections from clients..." )
    (conn, (ip,port)) = tcpServer.accept()
    newthread = ClientThread(conn, ip,port)
    newthread.daemon = True
    newthread.start()
    threads.append(newthread) # Do we need this?
for t in threads:
    t.join()
I have added passing of conn to your client thread, the reason of which becomes apparent in a moment. However, your while True loop never breaks, so you will never enter the for loop where you join your threads. If your server is meant to be run indefinitely, this is not a problem at all. Just remove the for loop and this part is fine. You do not need to join threads just for the sake of joining them. Joining threads only allows your program to block until a thread has finished executing.
Another addition is newthread.daemon = True. This makes your threads daemon threads, which means they will exit as soon as your main thread exits. Now your server responds to Ctrl+C even when there are active connections.
If your server is meant to be never ending, there is also no need to store threads in your main loop to threads list. This list just keeps growing as a new entry will be added every time a client connects and disconnects, and this leaks memory as you are not using the threads list for anything. I have kept it as it was there, but there still is no mechanism to exit the infinite loop.
Then let us move on to your thread. If you want to simplify the code, you can replace the run part with a function. There is no need to subclass Thread in this case, but this works so I have kept your structure:
class ClientThread(Thread):
    def __init__(self,conn, ip,port):
        Thread.__init__(self)
        self.ip = ip
        self.port = port
        self.conn = conn
        print("New server socket thread started for " + ip + ":" + str(port))
    def run(self):
        while True :
            try:
                message = self.conn.recv(2048)
                if not message:
                    print("closed")
                    try:
                        self.conn.close()
                    except:
                        pass
                    return
                try:
                    dataInfo = message.decode('ascii')
                    print("recv:::::"+str(dataInfo)+"::")
                except UnicodeDecodeError:
                    print("non-ascii data")
                    continue
            except socket.error:
                print("Unexpected error:", sys.exc_info()[0])
                try:
                    self.conn.close()
                except:
                    pass
                return
First of all, we store conn in self.conn. Your version used the global conn variable, which caused unexpected results when you had more than one connection to the server. conn is actually a new socket created for the client connection by accept, and it is unique to each connection. This is how servers differentiate between client connections. They listen on a known port, but when the server accepts a connection, accept creates another socket for that particular connection and returns it (the new socket still uses the same local port). This is why we need to pass it to the thread and then read from self.conn instead of the global conn.
Your server "hung" upon client connection errors as there was no mechanism to detect this in your loop. If the client closes the connection, socket.recv() does not raise an exception but returns an empty bytes object. This is the condition you need to detect. I am fairly sure you do not even need try/except here, but it does not hurt, as long as you name the exception you are expecting. In this case catching everything with a bare except is just wrong. You also have another statement there that can raise exceptions. If your client sends something that cannot be decoded with the ascii codec, you would get UnicodeDecodeError (try this without error handling here: telnet to your server port, copy-paste some Hebrew or Japanese into the connection and see what happens). If you just caught everything and treated it as a socket error, you would now enter the thread-ending part of the code just because you could not parse a message. Typically we just ignore "illegal" messages and carry on. I have added this. If you want to shut down the connection upon receiving a "bad" message, just add self.conn.close() and return to this exception handler as well.
Then when you really are encountering a socket error - or the client has closed the connection, you will need to close the socket and exit the thread. You will call close() on the socket - encapsulating it in try/except as you do not really care if it fails for not being there anymore.
And when you want to exit your thread, you just return from your run() loop. When you do this, your thread exits orderly. As simple as that.
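For comparison, the function-based variant mentioned earlier (no Thread subclass) might look roughly like the sketch below. The names handle_client and serve_forever are hypothetical, and the handler only prints what it receives; the point is that returning from the target function is all it takes to end that one thread while the others keep running.

```python
import socket
import threading

def handle_client(conn, addr):
    """Per-connection worker; returning from it ends the thread."""
    with conn:  # closes the socket however the function exits
        while True:
            try:
                data = conn.recv(2048)
            except OSError:
                return  # socket error: give up on this client only
            if not data:
                return  # empty result: the client closed the connection
            print("recv from", addr, data)

def serve_forever(server_sock):
    """Accept clients forever, one daemon thread per connection."""
    while True:
        conn, addr = server_sock.accept()
        threading.Thread(
            target=handle_client, args=(conn, addr), daemon=True
        ).start()
```

serve_forever expects a socket that is already bound and listening, such as tcpServer in the question.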
Then there is yet another potential problem, if you are not only printing the messages but are parsing them and doing something with the data you receive. This I do not fix but leave this to you.
TCP sockets transmit data, not messages. When you build a communication protocol, you must not assume that when your recv returns, it will return a single message. When your recv() returns something, it can mean one of five things:
1. The client has closed the connection and nothing is returned
2. There is exactly one full message and you receive that
3. There is only a partial message. Either because you read the socket before the client had transmitted all data, or because the client sent more than 2048 bytes (even if your client never sends over 2048 bytes, a malicious client would definitely try this)
4. There are more than one messages waiting and you received them all
5. As 4, but the last message is partial.
Most socket programming mistakes are related to this. The programmer expects 2 to happen (as you do now) but they do not cater for 3-5. You should instead analyse what was received and act accordingly. If there seems to be less data than a full message, store it somewhere and wait for more data to appear. When more data appears, concatenate these and see if you now have a full message. And when you have parsed a full message from this buffer, inspect the buffer to see if there is more data there - the first part of the next message or even more full messages if your client is fast and server is slow. If you process a message and then wipe the buffer, you might have wiped also bytes from your next message.
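The store, concatenate and re-inspect procedure described above can be sketched as a small buffer class. This is only an illustration under an assumed protocol in which each message ends with a newline byte; MessageBuffer and its feed method are hypothetical names, and the delimiter is not something the original code defines.

```python
class MessageBuffer:
    """Accumulate received bytes and hand back only complete messages.

    Assumes messages end with a newline byte; that delimiter is an
    assumption of this sketch, not part of the original protocol.
    """

    def __init__(self, delimiter=b"\n"):
        self.delimiter = delimiter
        self.pending = b""  # bytes of a not-yet-complete message

    def feed(self, data):
        """Add whatever recv() returned; return the complete messages."""
        self.pending += data
        # Everything before the last delimiter is complete; the rest
        # (possibly empty) is kept for the next call.
        *messages, self.pending = self.pending.split(self.delimiter)
        return messages
```

In the thread's loop this would be used as for msg in buf.feed(message): ..., with one buffer per connection. A real server would also want to cap the size of pending so that a client that never sends the delimiter cannot make it grow without bound.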

When/why to use s.shutdown(socket.SHUT_WR)?

I have just started learning python network programming. I was reading Foundations of Python Network Programming and could not understand the use of s.shutdown(socket.SHUT_WR) where s is a socket object.
Here is the code(where sys.argv[2] is the number of bytes user wants to send, which is rounded off to a multiple of 16) in which it is used:
import socket, sys
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
HOST = '127.0.0.1'
PORT = 1060
if sys.argv[1:] == ['server']:
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind((HOST, PORT))
    s.listen(1)
    while True:
        print 'Listening at', s.getsockname()
        sc, sockname = s.accept()
        print 'Processing up to 1024 bytes at a time from', sockname
        n = 0
        while True:
            message = sc.recv(1024)
            if not message:
                break
            sc.sendall(message.upper()) # send it back uppercase
            n += len(message)
            print '\r%d bytes processed so far' % (n,),
            sys.stdout.flush()
        print
        sc.close()
        print 'Completed processing'
elif len(sys.argv) == 3 and sys.argv[1] == 'client' and sys.argv[2].isdigit():
    bytes = (int(sys.argv[2]) + 15) // 16 * 16 # round up to // 16
    message = 'capitalize this!' # 16-byte message to repeat over and over
    print 'Sending', bytes, 'bytes of data, in chunks of 16 bytes'
    s.connect((HOST, PORT))
    sent = 0
    while sent < bytes:
        s.sendall(message)
        sent += len(message)
        print '\r%d bytes sent' % (sent,),
        sys.stdout.flush()
    print
    s.shutdown(socket.SHUT_WR)
    print 'Receiving all the data the server sends back'
    received = 0
    while True:
        data = s.recv(42)
        if not received:
            print 'The first data received says', repr(data)
        received += len(data)
        if not data:
            break
        print '\r%d bytes received' % (received,),
    s.close()
else:
    print >>sys.stderr, 'usage: tcp_deadlock.py server | client <bytes>'
And this is the explanation that the author provides which I am finding hard to understand:
Second, you will see that the client makes a shutdown() call on the socket after it finishes sending its transmission. This solves an important problem: if the server is going to read forever until it sees end-of-file, then how will the client avoid having to do a full close() on the socket and thus forbid itself from doing the many recv() calls that it still needs to make to receive the server’s response? The solution is to “half-close” the socket—that is, to permanently shut down communication in one direction but without destroying the socket itself—so that the server can no longer read any data, but can still send any remaining reply back in the other direction, which will still be open.
My understanding of what it will do is that it will prevent the client application from sending any further data, and thus will also prevent the server side from attempting to read any more data.
What I can't understand is why it is used in this program, and in what situations I should consider using it in my own programs.

My understanding of what it will do is that it will prevent the client
application from further sending the data and thus will also prevent
the server side from further attempting to read any data.
Your understanding is correct.
What I cant understand is that why is it used in this program …
As your own statement suggests, without the client's s.shutdown(socket.SHUT_WR) the server would not stop waiting for data, but would instead block in its sc.recv(1024) forever, because no connection termination request would be sent to the server.
Since the server would then never get to its sc.close(), the client for its part would also not stop waiting for data, but would instead block in its s.recv(42) forever, because no connection termination request would be sent from the server.
Reading this answer to "close vs shutdown socket?" might also be enlightening.

The explanation is half-baked, it applies only to this specific code and overall I would vote with all-fours that this is bad practice.
Now to understand why that is, you need to look at the server code. The server loops on message = sc.recv(1024), which blocks until some data is available and then returns up to 1024 bytes. It processes that data (makes it upper-case), sends it back, and calls recv again. The problem is that nothing in the data tells the server when the client has finished sending, so the server cannot tell "no more data yet" from "no more data ever".
To resolve this you need to tell the server that there is no more data coming its way, so that sc.recv(1024) returns an empty result and the loop ends. You do this by shutting down the socket in one direction.
You do not want to fully close the socket, because then the server would not be able to send you the reply.
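A compact Python 3 sketch of the same half-close pattern may make the sequence easier to follow. The function names upper_server and request_upper are hypothetical, and the sketch assumes a payload small enough that the client can send all of it before it starts reading; the book's example exists precisely because larger transfers can deadlock when neither side reads.

```python
import socket
import threading

def upper_server(conn):
    """Echo everything back uppercased until the peer signals end-of-stream."""
    with conn:
        while True:
            message = conn.recv(1024)
            if not message:  # peer called shutdown(SHUT_WR) or close()
                break
            conn.sendall(message.upper())

def request_upper(sock, payload):
    """Send payload, half-close, then read the reply until the server closes."""
    sock.sendall(payload)
    sock.shutdown(socket.SHUT_WR)  # no more writes; reads still work
    reply = b""
    while True:
        data = sock.recv(42)
        if not data:
            break
        reply += data
    return reply
```

Without the shutdown call, upper_server would never see the empty result that ends its loop, and request_upper would in turn wait forever for the server to close its side.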
