Python socket server receiving duplicate data

I have a Raspberry Pi running a server that communicates with a server on my laptop. I've done this through simple socket connections in Python.
Like so:
server
import socket

HOST = "192.168.0.115"
PORT = 5001
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind((HOST, PORT))
s.listen(1)
conn, addr = s.accept()
while conn:
    data = conn.recv(1024)
    print data
    if not data:
        break
conn.close()
client
# methods of the client class (the class statement itself is not shown)
def __init__(self, addr, port, timeout):
    self.addr = addr
    self.port = port
    self.timeout = timeout
    self.socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    self.socket.connect((self.addr, self.port))

# in a different file the call to send is made, passes list object as a parameter
def send(self, data):
    if self.socket:
        self.socket.send(str(data))
Setting a print statement in the client code results in (x,y,z), which is the expected output. However, the data received on the server appears in this pattern:
(x,y,z)(x,y,z)
(x,y,z)
(x,y,z)
(x,y,z)(x,y,z)
...
Why is the data being received as duplicates? Is it a property of TCP? If so, how can I counter this so that each string is received on its own, exactly as it was sent?

TCP sends data as a stream. You can write data to that stream, and you can receive data from that stream. The important thing is that while it can be the case that one send corresponds to one receive, that's not at all guaranteed to be the case. You can send a big chunk and receive it in a bunch of smaller chunks, or you can send a bunch of smaller chunks and receive a big chunk, or anything in-between. You're in the latter situation.
To solve this, you need to layer some sort of framing protocol on top of TCP. There are two primary ways you could do this:
Prefix each message with the length of the message and then read that many bytes.
Put a delimiter between each message.
For your purposes (sending plain text without newlines), the latter with a newline as a delimiter would probably be fine. You can probably figure out how to do that, but essentially, do this repeatedly:
Receive some data.
Append that data to a buffer.
Search for a newline and process the buffer up to that point. Repeat this step until there are no more newlines.
If all you're doing is printing, you can replace all that by just looping, receiving from the socket then writing to stdout.
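The buffering steps above can be sketched as follows. This is a minimal sketch in Python 3, assuming every message is UTF-8 text terminated by a newline; LineBuffer and feed are hypothetical names, not part of the code in the question.

```python
class LineBuffer:
    """Accumulates raw bytes and hands back complete newline-terminated messages."""

    def __init__(self):
        self._buffer = b""

    def feed(self, data):
        """Add freshly received bytes; return the list of complete messages."""
        self._buffer += data
        # Everything before the last newline is complete; the tail is kept
        # in the buffer until the rest of that message arrives.
        *lines, self._buffer = self._buffer.split(b"\n")
        return [line.decode("utf-8") for line in lines]


if __name__ == "__main__":
    buf = LineBuffer()
    # Simulate two sends merged into one recv, then one message split in two.
    for chunk in (b"(1,2,3)\n(4,5,6)\n", b"(7,8", b",9)\n"):
        for message in buf.feed(chunk):
            print(message)
```

On the server, each result of conn.recv(1024) would be passed to feed, and the client would append a newline to every message before sending it.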

Related

About the issue that the character string is broken in TCP/IP communication between different machines

I tried TCP/IP communication between the same machine and TCP/IP communication between different machines.
First of all, I tried communication within the same Windows machine. The server and client code used is:
TCP_server.py
import socket

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.bind(('', 50001))
    s.listen(1)
    while True:
        conn, addr = s.accept()
        with conn:
            while True:
                data = conn.recv(30000)
                if not data:
                    break
                if len(data.decode('utf-8')) < 35:
                    print("error")
                    break
                print(data.decode('utf-8')+"\n")
TCP_client.py
# -*- coding : UTF-8 -*-
import socket

target_ip = "192.168.1.5"
target_port = 50001
buffer_size = 4096
tcp_client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp_client.connect((target_ip,target_port))
message = b'123456789101112131415161718192021222324252627282930\n'
while True:
    tcp_client.send(message)
The IP address of my Windows machine is 192.168.1.5, so the above code works. And it executed successfully without any error. The printed string is shown in the image below.
But when I tried to communicate between a Mac and Windows using the exact same code, I had a problem. I used the Mac for the client and Windows for the server. The character string output on the server side is as follows.
As you can see from the image above, the string is usually printed normally, but sometimes a line break is inserted and the character string is divided.
And my server-side code says that if the number of characters is less than 35, it will print error. However, error is not printed in this execution result. In other words, the string is not being split across two communications; rather, line breaks are being inserted within one communication.
Is it possible to avoid this problem? Do I always have to be aware of line breaks when sending from another machine over TCP/IP?
I'm only using Python in this sample, but I had a similar problem using iOS's Swift for client-side code. So I would like to know a general solution.

There is no line break added by transmission of the data. This line break is instead added by the server code:
print(data.decode('utf-8')+"\n")
The print itself emits a line break, and you then add another one explicitly.
In general, you are assuming that each send has a matching recv. This assumption is wrong. TCP is a byte stream, not a message stream: the payloads from multiple send calls might be merged together to reduce the overhead of sending, and a single "message" might also be split across several recv calls.
This is especially true when sending traffic between machines, since the bandwidth between the machines is less than the local bandwidth and the MTU of the data layer is also much smaller.
Given that, you first have to collect your "messages" on the server side. Only after you've got a complete "message" (whatever that is in your case) should you call decode('utf-8'). Otherwise your code might crash when trying to decode a character that has a multi-byte UTF-8 encoding but whose bytes have not all been received yet.
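To illustrate the decoding problem, here is a small sketch with no network involved. The two byte chunks stand in for two recv results, and the split position is picked by hand so that it falls in the middle of a multi-byte character.

```python
message = "na\u00efve\n".encode("utf-8")    # U+00EF is encoded as two bytes: 0xC3 0xAF
first, second = message[:3], message[3:]   # split between those two bytes

try:
    first.decode("utf-8")
    decoded_early = True
except UnicodeDecodeError:
    # Decoding a chunk that ends in the middle of a character fails.
    decoded_early = False

# Collecting the bytes first and decoding the complete message works.
complete = (first + second).decode("utf-8")
print(decoded_early, repr(complete))
```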

Python file transfer (tcp socket), problem with slow network

I set up a secure socket using Tor and SOCKS, but I'm facing a problem when sending large amounts of data.
Sender:
socket.send(message.encode())
Receiver:
chunks = []
while 1:
    part = connection.recv(4096)
    chunks.append(part.decode())
    if len(part) < 4096:
        break
response = "".join(chunks)
Since the network speed is not consistent, a single pass through the loop doesn't always fill the 4096-byte buffer, so the loop breaks and I don't receive the full data.
Lowering the buffer size doesn't seem to be an option, because the "packet" size can sometimes be as low as 20 bytes.

TCP can split your data into any number of pieces it wants, so the receiving end of a socket should never rely on the size of the chunks it receives. You have to devise another mechanism for detecting the end of a message or the end of a file.
If you are going to send only one blob and then close the socket, then on the server side you just read until recv returns an empty (falsy) value:
while True:
    data = sock.recv(1024)
    if data:
        print(data)
        # continue
    else:
        sock.close()
        break
If you are going to send multiple messages, you have to decide what the separator between them will be. For text protocols it is a good idea to use a line ending. You can then enjoy the power of Twisted's LineReceiver protocol and others.
If you are using a binary protocol, it's common practice to prefix each message with its size as a byte/word/dword.

Try using the struct module ("import struct") to pass the length of the incoming data to the receiver first. That way the receiving end knows exactly how much data to receive. In this example bytes are being sent over the socket. I've borrowed the examples from my GitHub upload, github.com/nsk89/netcrypt, for reference, and cut the encryption steps and the sending of a serialised dictionary out of the send function.
Edit: I should also clarify that when you send data over the socket, especially if you're sending multiple messages, they all sit in the stream as one long message. Not every message is 4096 bytes in length. If one is 2048 bytes long and the next is 4096, and you receive 4096 bytes into your buffer, you'll receive the first message plus half of the next message, or hang completely waiting for more data that doesn't exist.
import struct

# pack the length of the data into the first four bytes of the stream;
# >I means a big-endian (network byte order) unsigned 32-bit integer
data_to_send = struct.pack('>I', len(data_to_send)) + data_to_send
socket_object.sendall(data_to_send) # transport data

def recv_message(socket_object):
    raw_msg_length = recv_all(socket_object, 4) # receive first 4 bytes of data in stream
    if not raw_msg_length:
        return None
    # unpack first 4 bytes using network byte order to retrieve incoming message length
    msg_length = struct.unpack('>I', raw_msg_length)[0]
    return recv_all(socket_object, msg_length) # recv rest of stream up to message length

def recv_all(socket_object, num_bytes):
    data = b''
    while len(data) < num_bytes: # while amount of data recv is less than message length passed
        packet = socket_object.recv(num_bytes - len(data)) # recv remaining bytes/message
        if not packet:
            return None
        data += packet
    return data

By the way, there is no need to decode every part before combining them; combine all the parts into one chunk and then decode that chunk.
For your situation, the better way is to use two steps.
Step 1: the sender sends the size of the message; the receiver takes this size and gets ready to receive the message.
Step 2: the sender sends the message; the receiver combines the data if necessary.
Sender
# Step 1
socket.send( str(len(message.encode())).encode() )
# Step 2
socket.send(message.encode("utf-8"))
Receiver
# Step 1
message_size = connection.recv(1024)
print("Will receive message size:", message_size.decode())
# Step 2
received_size = 0
received_data = b''
while received_size < int(message_size.decode()):
    part = connection.recv(1024)
    received_size += len(part)
    received_data += part
else:
    print(received_data.decode("utf-8", "ignore"))
    print("message receive done ....", received_size)
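One caveat with the two-step scheme: because TCP is a stream, the first recv(1024) can return the size digits together with the start of the message, or only some of the digits. Below is a minimal sketch of one way around that, assuming the size is sent as decimal text terminated by a newline. The helper names frame and unframe are hypothetical.

```python
def frame(message):
    """Return the bytes to send: decimal payload size, newline, payload."""
    payload = message.encode("utf-8")
    return str(len(payload)).encode("ascii") + b"\n" + payload


def unframe(buffer):
    """Split complete messages off the front of buffer.

    Returns (messages, remainder). The remainder holds any incomplete
    tail and should be prepended to the next received chunk.
    """
    messages = []
    while True:
        header, sep, rest = buffer.partition(b"\n")
        if not sep:
            break  # size header not fully received yet
        size = int(header)
        if len(rest) < size:
            break  # payload not fully received yet
        messages.append(rest[:size].decode("utf-8"))
        buffer = rest[size:]
    return messages, buffer
```

The receiver would keep the returned remainder and call unframe(remainder + connection.recv(1024)) each time new data arrives.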

TCP socket reads out of turn

I am using TCP with Python sockets, transferring data from one computer to another. However, the recv command reads more than it should on the server side, and I could not find the issue.
client.py
while rval:
    image_string = frame.tostring()
    sock.sendall(image_string)
    rval, frame = vc.read()
server.py
while True:
    image_string = ""
    while len(image_string) < message_size:
        data = conn.recv(message_size)
        image_string += data
The length of the message is 921600 (message_size), so it is sent with sendall. However, when I print the length of the received messages, the lengths are sometimes wrong and sometimes correct.
921600
921600
921923 # wrong
922601 # wrong
921682 # wrong
921600
921600
921780 # wrong
As you see, the wrong arrivals have no pattern. As I use TCP, I expected more consistency; however, it seems the buffers are mixed up and the server is somehow receiving part of the next message, therefore producing a longer message. What is the issue here?
I tried to add just the relevant part of the code. I can add more if you wish, but the code performs well on localhost and fails on two computers, so there should be no errors besides the transmitting part.
Edit1: I inspected this question a bit. It mentions that all send commands in the client may not be received by a single recv in the server, but I could not understand how to apply this in practice.

TCP is a stream protocol. There is ABSOLUTELY NO CONNECTION between the sizes of the chunks of data you send, and the chunks of data you receive. If you want to receive data of a known size, it's entirely up to you to only request that much data: you're currently requesting the total length of the data each time, which is going to try to read too much except in the unlikely event of the entire data being retrieved by the first .recv() call. Basically, you need to do something like data = conn.recv(message_size - len(image_string)) to reflect the fact that the amount of remaining data is decreasing.
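A minimal sketch of that idea in Python 3, assuming every message has the same known size. recv_exact is a hypothetical helper name, and socket.socketpair() stands in for the real client/server connection.

```python
import socket


def recv_exact(conn, size):
    """Read exactly size bytes from conn, or raise if the peer closes early."""
    data = bytearray()
    while len(data) < size:
        # Ask only for what is still missing, so the start of the next
        # message stays in the socket for the next call.
        chunk = conn.recv(size - len(data))
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        data += chunk
    return bytes(data)


if __name__ == "__main__":
    message_size = 8
    a, b = socket.socketpair()
    a.sendall(b"AAAAAAAA" + b"BBBBBBBB")  # two messages back to back
    print(recv_exact(b, message_size))
    print(recv_exact(b, message_size))
    a.close()
    b.close()
```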

Think of TCP as a raw stream of bytes. It is your responsibility to track where you are in the stream and interpret it correctly. Buffer what you read and only extract what you currently need.
Here's an (untested) class to illustrate:
class Buffer:
    def __init__(self,socket):
        self.socket = socket
        self.buffer = b''
    def recv_exactly(self,count):
        # Could return less if socket closes early...
        while len(self.buffer) < count:
            data = self.socket.recv(4096)
            if not data: break
            self.buffer += data
        ret,self.buffer = self.buffer[:count],self.buffer[count:]
        return ret
The recv always requests the same amount of data and queues it in a buffer. recv_exactly only returns the number of bytes requested and leaves any extra in the buffer.

Unpacking UDP packets python

I am receiving UDP packets over wifi running a simple python script on a PC. The server and the PC are in the same subnet.
The server is sending 15 uint_8 (4 bytes each) every 20 ms or so. The data received seems to be corrupted (non Hex values). Any feedback why this could be happening greatly appreciated.
For example I get something like this,
'\xb3}fC\xb7v\t>\xc8X\xd2=g\x8e1\xbf\xe6D3\xbf\x00\x00\x13\xc3\xc8g\x1b#\xc2\x12\xb2B\x01\x000=\x02\xc0~?\x01\x00\x94<\x00\x00\x00\x00\x00\x00\x00\x00\x00
#\x9c\xbe\xac\xc9V#', ('192.168.4.1', 4097))
The script is attached here.
from socket import *
import time

HOST = '192.168.4.10'
PORT = 9048
address = (HOST, PORT)
client_socket = socket(AF_INET, SOCK_DGRAM) #Set Up the Socket
client_socket.bind((HOST, PORT))
client_socket.settimeout(5) #only wait 5 second for a response, otherwise timeout
while(1): #Main Loop
    single_var = client_socket.recvfrom(1024)
    print single_var #Print the response from Arduino
    time.sleep(10/1000000.0) # sleep 10 microseconds (float division; 10/1000000 is 0 in Python 2)

The print statement doesn't know that you want hex output, so it displays byte values that have valid character representations as characters. If you want to print it as hex bytes, see the solution in "Print a string as hex bytes".
i.e. do:
print ":".join("{:02x}".format(ord(c)) for c in single_var[0])  # recvfrom returns (data, address); format only the data
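For reference, here is a Python 3 sketch of the same idea. recvfrom returns a (data, address) pair, so only the data part is formatted. The unpack_floats helper is hypothetical and rests on an assumption the question does not confirm: that the payload is a sequence of 4-byte little-endian floats.

```python
import struct


def to_hex(data):
    """Format a bytes object as colon-separated hex pairs."""
    return ":".join("{:02x}".format(b) for b in data)


def unpack_floats(data):
    """Interpret data as little-endian 32-bit floats (assumed layout)."""
    count = len(data) // 4
    return struct.unpack("<{}f".format(count), data[:count * 4])


if __name__ == "__main__":
    # Synthetic payload standing in for the data part of recvfrom().
    payload = struct.pack("<3f", 1.0, 2.5, -3.0)
    print(to_hex(payload))
    print(unpack_floats(payload))
```

If the device actually sends integers or big-endian values, only the format string passed to struct.unpack would need to change.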

Send raw ethernet packet with data field length in type field

I'm trying to send a raw ethernet frame with the length of my data written in the type field. This should be a valid ethernet frame. My code for this looks like this:
ethData = "foobar"
proto = len(ethData)
if proto < 46:
    proto = 46
soc = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, proto)
soc.bind((iface, proto))
For some reason I can't read the packet on the other end, and I wonder why. I try to capture this packet in the interrupt handler of my wireless driver, so either the packet is dropped by my hardware directly or it doesn't get sent at all. The question is why.
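For context, the frame layout being described can be sketched in Python 3 as plain bytes: 6 bytes of destination MAC, 6 bytes of source MAC, a 2-byte big-endian length field, then the payload padded to the 46-byte minimum. build_frame is a hypothetical helper and the MAC addresses in the usage note are placeholders. As far as I know, the length field carries the unpadded data length, which differs from the snippet above where the value is raised to 46, so treat that detail as an assumption.

```python
import struct

MIN_PAYLOAD = 46  # minimum Ethernet payload size in bytes


def build_frame(dst_mac, src_mac, payload):
    """Build an Ethernet frame (without FCS) whose type field holds a length."""
    if len(payload) > 1500:
        raise ValueError("values above 1500 are read as an EtherType, not a length")
    # Assumption: the field carries the unpadded payload length.
    header = dst_mac + src_mac + struct.pack("!H", len(payload))
    # Pad short payloads with zero bytes up to the minimum size.
    return header + payload.ljust(MIN_PAYLOAD, b"\x00")
```

For example, build_frame(b"\xff" * 6, b"\x02\x00\x00\x00\x00\x01", b"foobar") yields a 60-byte frame. Actually sending it still needs a raw socket (for example AF_PACKET on Linux, with root privileges), which is left out here.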

Sorry, my fault. I just parsed the wrong portion of the packet and didn't get any output. My bad. The packet gets there just like it is supposed to.
