Why does a looping Python TCP receiver receive the message only partially?

I have a server that sends some messages to a client. The print(trades) statement shows that file reader reads the entire csv correctly:
def send_past_trades(self):
    with open('OTC_trade_records.csv',newline='') as f:
        connectionSocket, addr = self.client
        trades = f.read()
        #print(trades)
        connectionSocket.send(trades.encode())
My client receiver is like this:
msg = b""
while(True):
    print("Batch receiving")
    tmp = client_socket.recv(4096)
    msg += tmp
    if len(tmp) < 4096:
        print(len(tmp))
        break
msg = msg.decode()
print(msg)
The message is always partial. "Batch receiving" is printed once, and when the break statement is reached, the length of the last chunk is 1228.
Also, this code works fine on my local system. The problem occurs only when I run the server program on a remote machine. Is it possible that the server machine interferes with the message?
Note: I tried other ways to solve the problem, such as sending the data in a loop in chunks of 1024 bytes. I still receive partial messages.

The problem is here:
if len(tmp) < 4096:
    print(len(tmp))
    break
The bufsize argument of recv(bufsize) is only the maximum number of bytes to receive. recv returns fewer bytes whenever fewer are available at that moment, so a short read does not mean the message is complete. On a local connection the data is usually already available when recv is called, which is why the code appears to work there.
I suggest defining a simple communication protocol that describes the structure of a message as a header plus a payload. The header must contain the payload size. This lets you parse the incoming TCP stream and know exactly how many bytes to expect, so you can keep receiving until that many bytes have arrived.
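A client will look like this:
import struct

def recv_exact(connection, count):
    # recv may return fewer bytes than requested, so loop until we have them all
    data = b''
    while len(data) < count:
        chunk = connection.recv(min(4096, count - len(data)))
        if not chunk:
            raise ConnectionError('connection closed before the full message arrived')
        data += chunk
    return data

# Receive the header
header = recv_exact(connection, 8)
(length,) = struct.unpack('>Q', header) # Parse payload length
# Receive the payload
payload = recv_exact(connection, length)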
Server:
import struct
with open('OTC_trade_records.csv',newline='') as f:
    connectionSocket, addr = self.client
    trades = f.read().encode() # encode first so the length is counted in bytes
    length = struct.pack('>Q', len(trades))
    connectionSocket.sendall(length)
    connectionSocket.sendall(trades)
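The header-plus-payload idea can be tried end to end without a remote machine. The following is only a sketch under stated assumptions: the helper names (recv_exact, send_message, recv_message, demo) and the sample CSV contents are made up for illustration, and socket.socketpair() stands in for the real client/server connection.

```python
import socket
import struct
import threading

HEADER = struct.Struct(">Q")  # 8-byte unsigned big-endian length, as in the answer above


def recv_exact(sock, count):
    """Read exactly `count` bytes, or raise if the peer closes early."""
    chunks = []
    remaining = count
    while remaining:
        chunk = sock.recv(min(remaining, 4096))
        if not chunk:
            raise ConnectionError("socket closed with %d bytes still expected" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_message(sock, payload):
    """Send the length header followed by the payload bytes."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_message(sock):
    """Read one header, then exactly as many payload bytes as it announces."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


def demo():
    """Send a payload far larger than one recv() buffer over a local socket pair."""
    server_side, client_side = socket.socketpair()
    payload = ("id,price\n" + "1,100.5\n" * 50000).encode()  # made-up CSV data
    sender = threading.Thread(target=send_message, args=(server_side, payload))
    sender.start()
    received = recv_message(client_side)
    sender.join()
    server_side.close()
    client_side.close()
    return received == payload, len(received)
```

The receiver never looks at how large an individual recv() result is; it only counts bytes against the length announced in the header.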

Related

Python: how do I hold all socket transfer while waiting for data?

I'm writing a file transfer program between two sockets.
The sender has the usual game:
initiate file send
pick file to send, send filesize over socket.
pick recipient
confirm
Then the server reaches out to the recipient:
do you want to accept file of this filesize?
wait for y or n.
Recipient:
yes
receive.
and all the edge cases therein.
I am running a send thread and receive thread on both socket and server.
on the server I'm doing something like this.
import threading
accept_thread = Thread(target=accept)
send_thread = Thread(target=send)
def accept():
    while True:
        data = client.recv(BUFF)
        # break if no data
        if not data: break
        # if command is /sendfile
        if data == b'/sendfile':
            # get filesize
            filesize = client.recv(BUFF)
            # get recipient
            recipient = client.recv(BUFF)
            # check if recipient is in address list.
            if recipient_exists(recipient):
                sender_socket.send(b'valid')
                ask = recipient_socket.send(b'Do you want to accept?')
                if ask == 'n':
                    sender_socket.send(b'rejected')
                else:
                    data = sender_socket.recv(BUFF)
                    recipient_socket.send(data)
        msg = data
        clients.send(msg.encode())
Needless to say, things somehow go out of whack. I keep getting my sendfile tagged onto the end of whatever I send. For instance, if I send a text file, the last bytes will always be '/sendfile'.
And if the sender types anything when they're not supposed to, then that gets added to the file transfer as well.
What do I need to use in order to make this work? I tried putting a thread lock before the receive.
Edit: Here is my send and receive:
def xfer_file(self, path, server_socket):
    """Opens file at path, sends to server_socket."""
    with open(path, 'rb') as f:
        server_socket.sendfile(f, 0, self.filesize)
def receive_file(self, data, BUFFSIZE, filesize, client):
    """Download file transfer from server."""
    try:
        data = client.recv(BUFFSIZE)
        bytes_recd = len(data)
        print("STARTING=========")
        # target_path = self._set_target_path()
        with open('image(2).jpg', 'wb') as f:
            while bytes_recd < int(filesize):
                f.write(data)
                data = client.recv(BUFFSIZE)
                bytes_recd = bytes_recd + len(data)
                print(bytes_recd)
                print(data)
                if data == b'':
                    raise RuntimeError("socket connection broken")
    except ValueError:
        pass
    print("FINISHED==========")

Using delimiters in a Python TCP stream

I am working on a program using TCP protocol to collect ADS-B messages from an antenna. Since I am new to Python, I used the following scripts to establish connection. The problem is that I receive several messages at the same time (since TCP is stream-oriented). I would like to separate each message using a "\n" delimiter for instance (each message has "#" at the beginning and ";" at the end and the length varies). I have no idea of how to tell Python to separate each message like this, do you have any idea ? Thanks a lot
Python version 3.7.6, Anaconda, Windows 10
import socketserver
class MyTCPHandler(socketserver.BaseRequestHandler):
    """
    # The request handler class for our server.
    # It is instantiated once per connection to the server, and must
    # override the handle() method to implement communication to the
    # client.
    # """
    def handle(self):
        # self.rfile is a file-like object created by the handler;
        # we can now use e.g. readline() instead of raw recv() calls
        self.data = self.rfile.readline().strip()
        print("{} wrote:".format(self.client_address[0]))
        print(self.data)
        # Likewise, self.wfile is a file-like object used to write back
        # to the client
        self.wfile.write(self.data.upper())
if __name__ == "__main__":
    print ("Server online")
    HOST, PORT = "localhost", 10100
    # Create the server, binding to localhost on port 10002
    with socketserver.TCPServer((HOST, PORT), MyTCPHandler) as server:
        # Activate the server; this will keep running until you
        # interrupt the program with Ctrl-C
        server.serve_forever()
import socket
import sys
def tcp_client():
    HOST, PORT = "192.168.2.99", 10002
    data = " ".join(sys.argv[1:])
    # Create a socket (SOCK_STREAM means a TCP socket)
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Connect to server and send data
        sock.connect((HOST, PORT))
        while True :
            sock.sendall(bytes(data + "\n", "utf-8"))
            # Receive data from the server
            received = str(sock.recv(1024), "utf-8")
            print("{}".format(received))

You could try using:
buffer = b""
while True:
    received = sock.recv(1024)
    if not received: # connection closed
        break
    buffer += received
    # everything before the last '\n' is complete; the tail may be a partial message
    *messages, buffer = buffer.split(b'\n')
    for message in messages:
        print('One message: ', message.decode("utf-8").strip())
If the messages are separated by \n, you can split the stream as shown above, keeping the incomplete tail in the buffer until the rest of that message arrives. If your messages end with a different terminator (such as ";"), just change the delimiter.
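As a rough illustration of delimiter-based splitting, the sketch below wraps the buffering in a generator. The names iter_messages and demo, the sample "#...;" messages, and the use of socket.socketpair() in place of the antenna feed are all assumptions made for the example.

```python
import socket


def iter_messages(sock, delimiter=b"\n", bufsize=1024):
    """Yield complete delimiter-terminated messages from a stream socket."""
    buffer = b""
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:  # peer closed the connection
            break
        buffer += chunk
        # Everything before the last delimiter is complete; the tail may be partial.
        *complete, buffer = buffer.split(delimiter)
        for message in complete:
            yield message
    if buffer:
        yield buffer  # trailing data that never received a delimiter


def demo():
    sender, receiver = socket.socketpair()
    # Two sends that deliberately do not line up with message boundaries.
    sender.sendall(b"#AAA;\n#BB")
    sender.sendall(b"B;\n#CCC;\n")
    sender.close()
    # A tiny bufsize forces messages to be reassembled from several reads.
    messages = list(iter_messages(receiver, bufsize=4))
    receiver.close()
    return messages
```

Working on bytes and decoding each message afterwards avoids decoding errors when a multi-byte character happens to be split across two reads.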

I would suggest the following approach:
Assuming your messages can't be more than 4 GB long, send the length packed into exactly 4 bytes, and then send the data itself. The other side then always knows how much to read: read exactly 4 bytes, unpack them into a length, then read exactly that many bytes:
import struct
def send_one_message(sock, data):
    length = len(data)
    sock.sendall(struct.pack('!I', length))
    sock.sendall(data)
def recv_one_message(sock):
    lengthbuf = recvall(sock, 4)
    if lengthbuf is None: # connection closed before a full header arrived
        return None
    length, = struct.unpack('!I', lengthbuf)
    return recvall(sock, length)
That's almost a complete protocol. The only problem is that Python doesn't have a recvall counterpart to sendall, but you can write it yourself:
def recvall(sock, count):
    buf = b''
    while count:
        newbuf = sock.recv(count)
        if not newbuf: return None
        buf += newbuf
        count -= len(newbuf)
    return buf
More detailed description here

Python3 socket, random partial result on socket receive

I've written a basic client/server interface using Python sockets. Only the relevant parts of the code are quoted here; the full scripts are:
SERVER: https://github.com/mydomo/ble-presence/blob/master/server.py
CLIENT: https://github.com/mydomo/ble-presence/blob/master/clients/DOMOTICZ/ble-presence/plugin.py
The issue appears after the script has been running for some hours and the result list has grown. Sometimes the reply is exactly as it should be, other times it is cut off and incomplete. It's random, as if the socket were closed early for no reason or the reply were not fully read.
Can you please help me?
SERVER:
def client_thread(conn, ip, port, MAX_BUFFER_SIZE = 32768):
    # the input is in bytes, so decode it
    input_from_client_bytes = conn.recv(MAX_BUFFER_SIZE)
    # MAX_BUFFER_SIZE is how big the message can be
    # this is test if it's too big
    siz = sys.getsizeof(input_from_client_bytes)
    if siz >= MAX_BUFFER_SIZE:
        print("The length of input is probably too long: {}".format(siz))
    # decode input and strip the end of line
    input_from_client = input_from_client_bytes.decode("utf8").rstrip()
    res = socket_input_process(input_from_client)
    #print("Result of processing {} is: {}".format(input_from_client, res))
    vysl = res.encode("utf8") # encode the result string
    conn.sendall(vysl) # send it to client
    conn.close() # close connection
##########- END FUNCTION THAT HANDLE SOCKET'S TRANSMISSION -##########
def start_server():
    global soc
    soc = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # this is for easy starting/killing the app
    soc.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    #print('Socket created')
    try:
        soc.bind((socket_ip, socket_port))
        # print('Socket bind complete')
    except socket.error as msg:
        # print('Bind failed. Error : ' + str(sys.exc_info()))
        sys.exit()
    #Start listening on socket
    soc.listen(10)
    #print('Socket now listening')
    # for handling task in separate jobs we need threading
    #from threading import Thread
    # this will make an infinite loop needed for
    # not reseting server for every client
    while (not killer.kill_now):
        conn, addr = soc.accept()
        ip, port = str(addr[0]), str(addr[1])
        #print('Accepting connection from ' + ip + ':' + port)
        try:
            Thread(target=client_thread, args=(conn, ip, port)).start()
        except:
            print("Terible error!")
            import traceback
            traceback.print_exc()
    soc.close()
CLIENT:
soc = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
SERV_ADDR = str(Parameters["Address"])
SERV_PORT = int(Parameters["Port"])
soc.connect((SERV_ADDR, SERV_PORT))
if BATTERY_REQUEST == True:
    clients_input = str(BATTERY_DEVICE_REQUEST)
else:
    clients_input = "beacon_data"
soc.send(clients_input.encode()) # we must encode the string to bytes
result_bytes = soc.recv(32768) # the number means how the response can be in bytes
result_string = result_bytes.decode("utf8") # the return will be in bytes, so decode

recv() does not guarantee that the full message arrives in a single call, so you have to call recv() repeatedly until you have the whole message.
If recv() returns an empty bytes object, the other side has closed the connection.
With this while loop you can read the full stream from the peer into data (the loop ends only once the peer closes or shuts down its sending side):
data = b'' # recv() returns bytes
while True:
    try:
        chunk = conn.recv(4096) # some 2^n number
        if not chunk: # chunk == b''
            break
        data += chunk
    except socket.error:
        conn.close()
        break
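Reading until recv() returns nothing only works if the sender signals that it has finished. One way to do that on a request/response connection, sketched here under assumptions, is for the sender to half-close with shutdown(socket.SHUT_WR). The names read_until_eof and demo are invented for the example, and socket.socketpair() stands in for a real client and server.

```python
import socket


def read_until_eof(conn, bufsize=4096):
    """Collect everything the peer sends until it closes its sending side."""
    chunks = []
    while True:
        chunk = conn.recv(bufsize)
        if not chunk:  # empty bytes: the peer has finished sending
            break
        chunks.append(chunk)
    return b"".join(chunks)


def demo():
    client, server = socket.socketpair()

    # Client: send the request, then half-close so the server sees end-of-stream
    # while the client can still receive the reply.
    client.sendall(b"beacon_data")
    client.shutdown(socket.SHUT_WR)

    # Server: read the whole request, reply, and close to end the reply.
    request = read_until_eof(server)
    server.sendall(request.upper())
    server.close()

    reply = read_until_eof(client)
    client.close()
    return request, reply
```

Without the shutdown call the server's read loop in this sketch would block forever, because the client never closes its side before waiting for the reply.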

TCP is a streaming protocol, meaning it has no concept of what constitutes a complete message. You have to implement your own message protocol layer on top of TCP to make sure you send and receive complete messages. You are responsible for buffering data received until you have a complete message, and you have to define what a complete message is. Some options:
Send fixed length messages.
Send a fixed number of bytes representing the length of the message, then the message.
Separate messages with a sentinel byte.
Then, call recv and accumulate the results until you have a complete message in the buffer.
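The first option, fixed-length messages, might look roughly like the sketch below. The record layout (a 4-byte id plus a 2-byte value) and the names RECORD, recv_exact, iter_records and demo are hypothetical, chosen only to make the example concrete.

```python
import socket
import struct

RECORD = struct.Struct("!IH")  # hypothetical fixed-size record: 4-byte id + 2-byte value


def recv_exact(sock, count):
    """Accumulate recv() results until `count` bytes are buffered; None on EOF."""
    buf = bytearray()
    while len(buf) < count:
        chunk = sock.recv(count - len(buf))
        if not chunk:
            return None
        buf += chunk
    return bytes(buf)


def iter_records(sock):
    """Yield one unpacked record per complete fixed-length message."""
    while True:
        raw = recv_exact(sock, RECORD.size)
        if raw is None:
            return
        yield RECORD.unpack(raw)


def demo():
    sender, receiver = socket.socketpair()
    data = b"".join(RECORD.pack(i, i * 2) for i in range(3))
    # Send in 5-byte pieces so that no send lines up with a 6-byte record.
    for start in range(0, len(data), 5):
        sender.sendall(data[start:start + 5])
    sender.close()
    records = list(iter_records(receiver))
    receiver.close()
    return records
```

Because every message has the same size, the receiver needs no header or delimiter; it only has to keep accumulating until a full record is buffered.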

TCP sockets unable to send messages in a burst

Hi I have multiple systems communicating via message using TCP connections.
My send function looks like the following
def _send(self, message, dest):
    self.sendLock.acquire()
    message = pickle.dumps(message)
    #sending length
    message_length = len(message)
    self.outChan[dest].send('<MESSAGELENGTH>%s</MESSAGELENGTH>'
                            % str(message_length))
    for message_i in range(0, message_length, 1024):
        self.outChan[dest].send(message[:1024])
        message = message[1024:]
    self.sendLock.release()
And the receive thread looks like this:
def readlines(self, sock):
    while True:
        msg = ''
        opTag = '<MESSAGELENGTH>'
        clTag = '</MESSAGELENGTH>'
        while not all(tag in msg for tag in (opTag, clTag)):
            msg = sock.recv(1024)
        msglen = int(msg.split(clTag)[0].split(opTag)[1])
        msg = msg.split(clTag)[1]
        while len(msg) < msglen:
            msg += sock.recv(msglen-len(msg))
        self.rec.put(pickle.loads(msg))
After a message is read from self.rec, a confirmation message is sent to the sender.
I have implemented my own buffer to control the traffic in the network. At any moment, at most MAX_BUFFER_SIZE messages have been sent without a confirmation received.
Here is the problem: when the program starts, it sends MAX_BUFFER_SIZE messages without waiting for confirmation, but only a few of these MAX_BUFFER_SIZE messages are received.
In one of the simulations with MAX_BUFFER_SIZE = 5, a total of 100 messages were sent, and m2, m3 and m4 were not received. All other messages were received (in the order they were sent).
I suspect the error is in the initial sending burst, but I am unable to figure out the exact problem.

There are a few errors in the receive thread:
While inspecting the received data for the opening and closing tags, you are not appending to the part already received, but overwriting it.
After detecting the message length, you are losing any subsequent messages whose closing tag has already been received but not yet analyzed.
You are possibly putting several messages together into self.rec.
Here is a corrected form, with comments explaining the changes:
def readlines(self, sock):
    msg = '' # initialize outside, since otherwise the remainder of the previous message would be lost
    opTag = '<MESSAGELENGTH>' # no need to repeat this in each iteration
    clTag = '</MESSAGELENGTH>' # no need to repeat this in each iteration
    while True:
        while not all(tag in msg for tag in (opTag, clTag)):
            msg += sock.recv(1024) # += rather than =
        msglen = int(msg.split(clTag)[0].split(opTag)[1])
        msg = msg.split(clTag, 1)[1] # split just once, starting from the left
        while len(msg) < msglen:
            msg += sock.recv(msglen-len(msg))
        self.rec.put(pickle.loads(msg[:msglen])) # handle just one message
        msg = msg[msglen:] # prepare for handling future messages
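The buffering logic can also be separated from the socket, which makes it easy to check against arbitrary chunk boundaries. The sketch below is a Python 3, bytes-based take on the same tag framing. The function names extract_messages, frame and demo are invented for illustration, and it assumes the stream always starts at a message header.

```python
import pickle

OPEN_TAG = b"<MESSAGELENGTH>"
CLOSE_TAG = b"</MESSAGELENGTH>"


def extract_messages(buffer):
    """Return (list of complete payloads, unconsumed remainder of buffer)."""
    messages = []
    while True:
        start = buffer.find(OPEN_TAG)
        end = buffer.find(CLOSE_TAG)
        if start != 0 or end == -1:
            break  # header not fully received yet
        length = int(buffer[len(OPEN_TAG):end])
        body_start = end + len(CLOSE_TAG)
        if len(buffer) < body_start + length:
            break  # payload not fully received yet
        messages.append(buffer[body_start:body_start + length])
        buffer = buffer[body_start + length:]  # keep the rest for later messages
    return messages, buffer


def frame(obj):
    """Build one framed message the way the sender in the question does."""
    payload = pickle.dumps(obj)
    return OPEN_TAG + str(len(payload)).encode() + CLOSE_TAG + payload


def demo():
    stream = frame("m1") + frame({"n": 2}) + frame([3, 3, 3])
    received, pending = [], b""
    # Feed the stream in 7-byte pieces to imitate arbitrary recv() boundaries.
    for i in range(0, len(stream), 7):
        pending += stream[i:i + 7]
        complete, pending = extract_messages(pending)
        received.extend(pickle.loads(m) for m in complete)
    return received, pending
```

In a receive thread, each recv() result would be appended to the pending buffer and passed through extract_messages, so a burst of several messages in one read is handled the same way as one message spread over many reads.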

Python Socket Receive Large Amount of Data

When I try to receive larger amounts of data, it gets cut off and I have to press enter to get the rest of the data. At first I was able to increase the amount a little, but it still won't receive all of it. As you can see, I have increased the buffer on conn.recv(), but it still doesn't get all of the data; it cuts off at a certain point. I have to press enter on my raw_input in order to receive the rest of the data. Is there any way I can get all of the data at once? Here's the code.
port = 7777
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.bind(('0.0.0.0', port))
sock.listen(1)
print ("Listening on port: "+str(port))
while 1:
    conn, sock_addr = sock.accept()
    print "accepted connection from", sock_addr
    while 1:
        command = raw_input('shell> ')
        conn.send(command)
        data = conn.recv(8000)
        if not data: break
        print data,
    conn.close()

TCP/IP is a stream-based protocol, not a message-based protocol. There's no guarantee that every send() call by one peer results in a single recv() call by the other peer receiving the exact data sent—it might receive the data piece-meal, split across multiple recv() calls, due to packet fragmentation.
You need to define your own message-based protocol on top of TCP in order to differentiate message boundaries. Then, to read a message, you continue to call recv() until you've read an entire message or an error occurs.
One simple way of sending a message is to prefix each message with its length. Then to read a message, you first read the length, then you read that many bytes. Here's how you might do that:
import struct
def send_msg(sock, msg):
    # Prefix each message with a 4-byte length (network byte order)
    msg = struct.pack('>I', len(msg)) + msg
    sock.sendall(msg)
def recv_msg(sock):
    # Read message length and unpack it into an integer
    raw_msglen = recvall(sock, 4)
    if not raw_msglen:
        return None
    msglen = struct.unpack('>I', raw_msglen)[0]
    # Read the message data
    return recvall(sock, msglen)
def recvall(sock, n):
    # Helper function to recv n bytes or return None if EOF is hit
    data = bytearray()
    while len(data) < n:
        packet = sock.recv(n - len(data))
        if not packet:
            return None
        data.extend(packet)
    return data
Then you can use the send_msg and recv_msg functions to send and receive whole messages, and they won't have any problems with packets being split or coalesced on the network level.

You can use it as: data = recvall(sock)
def recvall(sock):
    BUFF_SIZE = 4096 # 4 KiB
    data = b''
    while True:
        part = sock.recv(BUFF_SIZE)
        data += part
        if len(part) < BUFF_SIZE:
            # either 0 or end of data
            break
    return data

The accepted answer is fine, but it will be really slow with big files: strings are immutable, so every use of + creates a new object. Collecting the chunks in a list and joining them once at the end is more efficient.
This should work better:
fragments = []
while True:
    chunk = s.recv(10000)
    if not chunk:
        break
    fragments.append(chunk)
print "".join(fragments)

Most of the answers describe some sort of recvall() method. If your bottleneck when receiving data is building the byte array in a loop, I benchmarked three approaches to allocating the received data in the recvall() method:
Byte string method:
arr = b''
while len(arr) < msg_len:
    arr += sock.recv(max_msg_size)
List method:
fragments = []
while True:
    chunk = sock.recv(max_msg_size)
    if not chunk:
        break
    fragments.append(chunk)
arr = b''.join(fragments)
Pre-allocated bytearray method:
arr = bytearray(msg_len)
view = memoryview(arr)
pos = 0
while pos < msg_len:
    # recv_into writes directly into the buffer and returns the number of bytes read
    n = sock.recv_into(view[pos:], min(max_msg_size, msg_len - pos))
    if n == 0:
        raise ConnectionError("socket closed before msg_len bytes arrived")
    pos += n

You may need to call conn.recv() multiple times to receive all the data. Calling it a single time is not guaranteed to bring in all the data that was sent, due to the fact that TCP streams don't maintain frame boundaries (i.e. they only work as a stream of raw bytes, not a structured stream of messages).
See this answer for another description of the issue.
Note that this means you need some way of knowing when you have received all of the data. If the sender will always send exactly 8000 bytes, you could count the number of bytes you have received so far and subtract that from 8000 to know how many are left to receive; if the data is variable-sized, there are various other methods that can be used, such as having the sender send a number-of-bytes header before sending the message, or if it's ASCII text that is being sent you could look for a newline or NUL character.

Disclaimer: There are very rare cases in which you really need to do this. If possible, use an existing application layer protocol or define your own, e.g. precede each message with a fixed-length integer indicating the length of the data that follows, or terminate each message with a '\n' character. (Adam Rosenfield's answer does a really good job of explaining that.)
With that said, there is a way to read all of the data available on a socket. However, it is a bad idea to rely on this kind of communication, as it introduces the risk of losing data. Use this solution with extreme caution and only after reading the explanation below.
def recvall(sock):
    BUFF_SIZE = 4096
    data = bytearray()
    while True:
        packet = sock.recv(BUFF_SIZE)
        if not packet: # Important!!
            break
        data.extend(packet)
    return data
Now the if not packet: line is absolutely critical!
Many answers here suggested using a condition like if len(packet) < BUFF_SIZE:, which is broken and will most likely cause you to stop reading prematurely and lose data. It wrongly assumes that one send on one end of a TCP socket corresponds to one receive of the same number of bytes on the other end. It does not. There is a very good chance that sock.recv(BUFF_SIZE) will return a chunk smaller than BUFF_SIZE even if there's still data waiting to be received. There is a good explanation of the issue here and here.
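The failure of the len(packet) < BUFF_SIZE check can be reproduced locally. The sketch below is only an illustration: demo is an invented name, the payload sizes are arbitrary, and socket.socketpair() stands in for a network connection where the second part of a message arrives later than the first.

```python
import socket

BUFF_SIZE = 4096


def demo():
    sender, receiver = socket.socketpair()

    sender.sendall(b"a" * 10)            # first part of the message
    first = receiver.recv(BUFF_SIZE)     # returns whatever is available right now
    # A loop using `if len(packet) < BUFF_SIZE: break` would stop here,
    # even though the sender has not finished.
    would_stop_early = len(first) < BUFF_SIZE

    sender.sendall(b"b" * 20)            # the rest arrives later
    sender.close()                       # closing marks the real end of the data

    data = bytearray(first)
    while True:
        packet = receiver.recv(BUFF_SIZE)
        if not packet:                   # only an empty result means end-of-stream
            break
        data.extend(packet)
    receiver.close()
    return would_stop_early, len(first), len(data)
```

In this run the short first read carries at most 10 of the 30 bytes, so a length-based stop condition would have dropped the rest, while the empty-result check collects everything.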
By using the above solution you are still risking data loss if the other end of the connection is writing data slower than you are reading. You may just simply consume all data on your end and exit when more is on the way. There are ways around it that require the use of concurrent programming, but that's another topic of its own.

A variation using a generator function (which I consider more pythonic):
def recvall(sock, buffer_size=4096):
    buf = sock.recv(buffer_size)
    while buf:
        yield buf
        if len(buf) < buffer_size: break
        buf = sock.recv(buffer_size)
# ...
with socket.create_connection((host, port)) as sock:
    sock.sendall(command)
    response = b''.join(recvall(sock))

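You can do it using serialization:
import json
def recvall(conn):
    data = b""
    while True:
        chunk = conn.recv(1024)
        if not chunk:
            raise ConnectionError("connection closed before a complete JSON document arrived")
        data += chunk # accumulate, since one recv() may hold only part of the document
        try:
            return json.loads(data)
        except ValueError:
            continue # incomplete JSON so far; keep reading
def sendall(conn, data):
    conn.sendall(json.dumps(data).encode("utf-8"))
NOTE: If you want to share a file using the code above, you need to encode it to / decode it from base64.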

I think this question has been pretty well answered, but I just wanted to add a method using Python 3.8 and the new assignment expression (walrus operator) since it is stylistically simple.
import socket
host = "127.0.0.1"
port = 31337
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind((host,port))
s.listen()
con, addr = s.accept()
msg_list = []
while (walrus_msg := con.recv(3)) != b'\r\n':
    msg_list.append(walrus_msg)
print(msg_list)
In this case, up to 3 bytes are received from the socket and immediately assigned to walrus_msg. Once the socket receives a b'\r\n', the loop breaks. Each walrus_msg is added to msg_list, which is printed after the loop breaks. This script is basic but was tested and works with a telnet session.
NOTE: The parentheses around (walrus_msg := con.recv(3)) are needed. Without them, while walrus_msg := con.recv(3) != b'\r\n': assigns the result of the comparison (True) to walrus_msg instead of the actual data from the socket.

Modifying Adam Rosenfield's code:
import sys
def send_msg(sock, msg):
    # msg is bytes; the header is the payload length in ASCII digits, followed by ":"
    package = str(len(msg)).encode() + b":" + msg
    sock.sendall(package)
def recv_msg(sock):
    try:
        header = b""
        while not header.endswith(b":"):
            byte = sock.recv(1) # one byte at a time, so no payload bytes are consumed
            if not byte:
                return None
            header += byte
        size_of_package = int(header[:-1])
        message = b""
        while len(message) < size_of_package:
            chunk = sock.recv(size_of_package - len(message))
            if not chunk:
                return None
            message += chunk
        return message
    except OverflowError:
        return "OverflowError."
    except:
        print("Unexpected error:", sys.exc_info()[0])
        raise
I would, however, heavily encourage using the original approach.

For anyone else who's looking for an answer in cases where you don't know the length of the packet in advance.
Here's a simple solution that reads 4096 bytes at a time and stops when fewer than 4096 bytes were received. However, it will not work when the total length of the data received is an exact multiple of 4096 bytes: it will then call recv() again and hang. It can also stop early, because recv() may return fewer than 4096 bytes while more data is still in transit.
def recvall(sock):
    data = b''
    bufsize = 4096
    while True:
        packet = sock.recv(bufsize)
        data += packet
        if len(packet) < bufsize:
            break
    return data

This code reads up to 1024*32 (= 32768) bytes, in at most 32 iterations, from the data sent by the server:
jsonString = bytearray()
for _ in range(32):
    packet = clisocket.recv(1024)
    if not packet:
        break
    jsonString.extend(packet)
The data ends up in the jsonString variable.

Plain and simple:
data = b''
while True:
    data_chunk = client_socket.recv(1024)
    if data_chunk:
        data+=data_chunk
    else:
        break
