What am I doing wrong with sendline() in my twisted server? - python

I'm trying to send the contents of a file to the client one line at a time
so that the client (written in Objective-C) can process each line individually.
However, the client's log shows that the data sent from the server all arrives
as one line, and it is apparently too large, so it gets cut off midway through
a line, which crashes the client because of the unexpected syntax.
Is there something I'm doing on the server (written in Python with Twisted) that is
causing the lines not to be sent separately?
Here is the particular code in the server that is holding me up at the moment.
def sendLine(self, line):
    self.transport.write(line + '\r\n')

def updateShiftList(self):
    #open the datesRequested file for the appropriate store and load the dates into a list
    fob = open('stores/'+self.storeName+'/requests/datesRequested','r')
    DATES_REQUESTED = fob.read()
    datesRequested = DATES_REQUESTED.split('\n')
    #open each date file that is listed in datesRequested
    for date in datesRequested:
        if os.path.isfile('stores/'+self.storeName+'/requests/' + date):
            fob2 = open('stores/'+self.storeName+'/requests/' + date,'r')
            #load the file into memory and split the individual requests up
            THE_REQUESTS = fob2.read()
            thedaysRequests = THE_REQUESTS.split('\n')
            for oneRequest in thedaysRequests:
                if len(oneRequest) > 4:
                    print "*)[*_-b4.New_REQUEST:"+oneRequest
                    self.sendLine('*)[*_-b4.New_REQUEST:'+oneRequest)
            fob2.close()
    fob.close()
So frustrating, and I'm sure it's something easy. Thanks.

This question concerns a topic that is frequently raised. There are a number of questions on Stack Overflow about the same issue:
TCP messages getting coalesced
gen_tcp smushed messages
Broken TCP messages
How to separate TCP socket messages in node.js
etc.
TCP sends an ordered, reliable stream of data.
Ordering means that bytes sent first arrive first.
Reliability means bytes sent will be delivered, or the connection will break. Data is not silently dropped.
Streaming is what mostly concerns your question. A stream is not divided up into separate messages. It consists of bytes, and the "boundary" between those bytes can move arbitrarily.
If you send "hello, " and then you send "world", the boundary between those two strings in the stream may disappear. The peer may receive "hello, world", or "h", "ello, world", or "he", "ll", "o,", " w", "or", "ld".
This is the very reason people use line-oriented protocols. The "\r\n" at the end of each logical message lets the receiver buffer and split the stream up into those original logical messages.
For a deeper dive, I recommend this video of a recent PyCon presentation: http://pyvideo.org/speaker/417/glyph
This all points towards the other end of your connection, the ObjC client, as the source of your misbehavior.
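The buffering and splitting that the receiver has to do can be sketched as follows. This is only an illustration in Python 3 (your client is Objective-C, but the logic carries over unchanged); the class name `LineBuffer` and its `feed` method are made up for this example and are not part of Twisted or any other library.

```python
class LineBuffer:
    """Accumulates raw bytes and hands back only complete lines."""

    def __init__(self, delimiter=b"\r\n"):
        self.delimiter = delimiter
        self._pending = b""  # bytes received so far that do not yet end a line

    def feed(self, data):
        """Add newly received bytes and return the lines they complete."""
        self._pending += data
        # Everything before the last delimiter is a complete line; whatever
        # follows it stays buffered until more data arrives.
        *lines, self._pending = self._pending.split(self.delimiter)
        return lines


# Simulate a stream whose chunk boundaries do not match the line boundaries.
buf = LineBuffer()
received = []
for chunk in [b"first li", b"ne\r\nsecond line\r", b"\nthird"]:
    received.extend(buf.feed(chunk))
```

After the loop, `received` holds the two complete lines, and the partial `third` stays in the buffer until its terminating `\r\n` shows up. The client should call something like `feed` with whatever each read returns, and process only the lines it gets back.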

Related

About the issue that the character string is broken in TCP/IP communication between different machines

I tried TCP/IP communication between the same machine and TCP/IP communication between different machines.
First of all, I tried communication within the same Windows machine. The server and client code used is:
TCP_server.py
import socket

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.bind(('', 50001))
    s.listen(1)
    while True:
        conn, addr = s.accept()
        with conn:
            while True:
                data = conn.recv(30000)
                if not data:
                    break
                if len(data.decode('utf-8')) < 35:
                    print("error")
                    break
                print(data.decode('utf-8')+"\n")
TCP_client.py
# -*- coding : UTF-8 -*-
import socket

target_ip = "192.168.1.5"
target_port = 50001
buffer_size = 4096
tcp_client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp_client.connect((target_ip,target_port))
message = b'123456789101112131415161718192021222324252627282930\n'
while True:
    tcp_client.send(message)
The IP address of my Windows machine is 192.168.1.5, so the above code works. And it executed successfully without any error. The printed string is shown in the image below.
But when I tried to communicate between a Mac and Windows using the exact same code, I had a problem. I used a Mac for the client and Windows for the server. The character string output on the server side is as follows.
As you can see from the image above, the string is usually printed correctly, but sometimes a line break is inserted and the string is split.
My server-side code prints error if the number of characters is less than 35. However, error is not printed in this run. In other words, the data is not arriving as two separate transmissions; instead, line breaks are being inserted within a single transmission.
Is it possible to avoid this problem? Do I always have to be aware of line breaks when sending from another machine over TCP/IP?
I'm only using Python in this sample, but I had a similar problem using iOS's Swift for client-side code. So I would like to know a general solution.
There is no line break added by transmission of the data. This line break is instead added by the server code:
print(data.decode('utf-8')+"\n")
The print itself emits a line break, and then you add another one on top of it.
In general you are assuming that each send has a matching recv. This assumption is wrong. TCP is a byte stream, not a message stream: the payloads from multiple send calls might be merged together to reduce the overhead of sending, and a single "message" might also be split across several recv calls.
This is especially true when sending traffic between machines since the bandwidth between the machines is less than the local bandwidth and the MTU of the data layer is also much smaller.
Given that, you first have to collect your "messages" on the server side. Only after you have a complete "message" (whatever that is in your case) should you call decode('utf-8'). Otherwise your code might crash when trying to decode a character that has a multi-byte UTF-8 encoding but whose bytes have not all been received yet.
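To make the decoding hazard concrete, here is a small sketch with no networking at all. It assumes the message is terminated by a single `\n` (as in the question) and simulates a stream that happens to be cut in the middle of a two-byte character. The sample text and the variable names are invented for the illustration.

```python
message = "naïve café\n".encode("utf-8")

# Pretend the network delivered the message in two pieces, cutting the
# two-byte encoding of "ï" in half.
split_at = message.index("ï".encode("utf-8")) + 1
chunks = [message[:split_at], message[split_at:]]

# Decoding a chunk as soon as it arrives fails on the truncated character.
try:
    chunks[0].decode("utf-8")
    early_decode_failed = False
except UnicodeDecodeError:
    early_decode_failed = True

# Buffering the raw bytes and decoding only complete messages is safe.
buffer = b""
decoded = []
for chunk in chunks:
    buffer += chunk
    while b"\n" in buffer:
        line, buffer = buffer.split(b"\n", 1)
        decoded.append(line.decode("utf-8"))
```

The first attempt raises `UnicodeDecodeError`, while the buffered version yields the whole line once its terminator has arrived. The ASCII-only test message in the question would never trigger this, which is why the problem tends to appear later with real data.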

Python socket putting data of multiple send into one receive buffer

I am trying to make a simple image-sharing app in Python so that images can be shared from clients to a server. I am doing that using socket and sending images in the form of numpy arrays. My problem is this: when the app loads, I have made it so that 6 images or fewer (if fewer than 6 are present for an account) are sent from the server to the client. Each image is stored on the server with a unique name identifying it. This name is of variable length, is decided by the user, and is sent before the image array is sent. The shape of the image is also sent, so that the client can reproduce the image from the bytes. But as the name length is not fixed, I am reading with a buffer size of 10. If the name is shorter than that, the read also picks up the shape of the array that I send afterwards. How do I fix this? Here is the sending image code:
def send_multiple_imgs(send_socket, imgs):
    num = len(imgs)
    send_socket.send(str(num).encode())
    for img in imgs:
        print(img)
        send_socket.send(img.encode())
        send_img(send_socket,imgs[img])
part of send_img function:
def send_img(send_socket,img):
    send_socket.send(str(img.shape).encode())
    .....
The later part is not important, I think.
Here is the receiving part:
def receive_multiple_img(recv_socket):
    num = int(recv_socket.recv(1).decode())
    imgs = {}
    for i in range(num):
        img_name = recv_socket.recv(10).decode()
        print(img_name)
        imgs[img_name] = recieve_image(recv_socket)
    return imgs
What is happening is, I have an image named 'ds' of shape (200,200,4), but the img_name reads:
'ds(200, 20' and then it messes up the further sending and receiving as well. How do I fix this? I am using TCP protocol:
s = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
I am new to networking in Python, so please bear with me if this is a silly question.
TCP, unlike UDP, is a stream protocol. That means that a single send() doesn't correspond to a single read(). A call to send() simply places the data to be sent into the TCP send buffer, and a call to read() simply returns the bytes from the TCP receive buffer. The bytes in the buffer could have come from a single send() on the other side or a hundred.
This is a fairly common misunderstanding that leads to many bugs. Here's me explaining this again, and again, and again, and again.
If you want to send() several separate messages or pieces of data, the reader must be able to tell the messages apart. There are many ways to do that, such as using fixed-length messages, prefixing the message's length before each message, or using a delimiter.
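Of the options above, length-prefixing is probably the easiest fit for a variable-length name followed by a shape. Here is a minimal sketch, assuming a 4-byte big-endian length header; the helper names `send_msg`, `recv_msg` and `recv_exactly` are made up for this example, and `socket.socketpair()` merely stands in for a real client/server connection.

```python
import socket
import struct

HEADER = struct.Struct("!I")  # 4-byte unsigned length, network byte order


def recv_exactly(sock, count):
    """Read exactly count bytes, or raise if the peer closes early."""
    chunks = []
    remaining = count
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise EOFError("connection closed with %d bytes missing" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_msg(sock, payload):
    """Send one message: its length first, then the bytes themselves."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_msg(sock):
    """Receive one message that was sent with send_msg."""
    (length,) = HEADER.unpack(recv_exactly(sock, HEADER.size))
    return recv_exactly(sock, length)


# Demo: the name and the shape travel as two separate messages.
a, b = socket.socketpair()
send_msg(a, "ds".encode())
send_msg(a, str((200, 200, 4)).encode())
name = recv_msg(b).decode()
shape = recv_msg(b).decode()
a.close()
b.close()
```

With this framing the receiver gets `ds` and `(200, 200, 4)` as two distinct values regardless of how TCP chunks the bytes. Note the use of `sendall()` rather than `send()`: plain `send()` may transmit only part of the data and returns how many bytes it actually sent.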

TCP socket reads out of turn

I am using TCP with Python sockets, transferring data from one computer to another. However, the recv command reads more than it should on the server side, and I could not find the issue.
client.py
while rval:
    image_string = frame.tostring()
    sock.sendall(image_string)
    rval, frame = vc.read()
server.py
while True:
    image_string = ""
    while len(image_string) < message_size:
        data = conn.recv(message_size)
        image_string += data
The length of the message is 921600 (message_size), so it is sent with sendall. However, when I print the lengths of the messages as they are received, they are sometimes wrong and sometimes correct.
921600
921600
921923 # wrong
922601 # wrong
921682 # wrong
921600
921600
921780 # wrong
As you see, the wrong arrivals have no pattern. As I use TCP, I expected more consistency; however, it seems the buffers are mixed up and somehow receive part of the next message, therefore producing a longer message. What is the issue here?
I tried to add just the relevant part of the code; I can add more if you wish. The code performs well on localhost but fails across two computers, so there should be no errors besides the transmitting part.
Edit1: I looked at this question, which mentions that all the send commands in the client may not be received by a single recv in the server, but I could not understand how to apply this in practice.
TCP is a stream protocol. There is ABSOLUTELY NO CONNECTION between the sizes of the chunks of data you send, and the chunks of data you receive. If you want to receive data of a known size, it's entirely up to you to only request that much data: you're currently requesting the total length of the data each time, which is going to try to read too much except in the unlikely event of the entire data being retrieved by the first .recv() call. Basically, you need to do something like data = conn.recv(message_size - len(image_string)) to reflect the fact that the amount of remaining data is decreasing.
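For frames as large as 921600 bytes, one way to apply that advice is to fill a preallocated buffer with `recv_into()`, which avoids repeatedly concatenating strings. This is a sketch only: `recv_frame` is a name made up for the example, and the demo uses `socket.socketpair()` with tiny 8-byte "frames" in place of the real connection.

```python
import socket


def recv_frame(conn, size):
    """Fill a preallocated buffer with exactly size bytes from conn."""
    frame = bytearray(size)
    view = memoryview(frame)
    received = 0
    while received < size:
        # Ask only for what is still missing, so the start of the next
        # frame is left in the socket for the next call.
        n = conn.recv_into(view[received:], size - received)
        if n == 0:
            raise EOFError("connection closed after %d of %d bytes" % (received, size))
        received += n
    return bytes(frame)


# Demo: two frames sent back to back are still read as two frames.
left, right = socket.socketpair()
left.sendall(b"A" * 8 + b"B" * 8)
first = recv_frame(right, 8)
second = recv_frame(right, 8)
left.close()
right.close()
```

In the server loop from the question, the inner `while` would then collapse to a single call along the lines of `image_string = recv_frame(conn, message_size)`.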
Think of TCP as a raw stream of bytes. It is your responsibility to track where you are in the stream and interpret it correctly. Buffer what you read and only extract what you currently need.
Here's an (untested) class to illustrate:
class Buffer:
    def __init__(self,socket):
        self.socket = socket
        self.buffer = b''
    def recv_exactly(self,count):
        # Could return less if socket closes early...
        while len(self.buffer) < count:
            data = self.socket.recv(4096)
            if not data: break
            self.buffer += data
        ret,self.buffer = self.buffer[:count],self.buffer[count:]
        return ret
The recv always requests the same amount of data and queues it in a buffer. recv_exactly only returns the number of bytes requested and leaves any extra in the buffer.

Raw load found, how to access?

To start off, I have read through other answers about Raw in scapy on here; however, none have been useful. Maybe I am just doing something wrong, and that's what has brought me here today.
So, for starters, I have a pcap file, which started out corrupted with some retransmissions; I believe I have gotten it back together correctly.
It contains Radiotap header, IEEE 802.11 (dot11), logical-link control, IPv4, UDP, and DNS.
To my understanding, the UDP packets being transmitted hold this raw data; however, due to some recent quirks, maybe the raw data is in Radiotap/raw.
Using scapy, I'm iterating through the packets, and when a packet with the Raw layer is found, I am using the .show() function of scapy to view it.
As such, I can see that there is a raw load available
###[ Raw ]###
\load \
|###[ Raw ]###
| load = '#\x00\x00\x00\xff\xff\xff\xff\xff\xff\x10h?'
So, I suppose my question is: how can I capture this payload and recover whatever it may be? To my knowledge the load is supposed to be an image file; however, I have trouble believing that, so I assume I have misstepped somewhere.
Here is the code I'm using to achieve the above result
from scapy.all import *
from scapy.utils import *
pack = rdpcap('/home/username/Downloads/new.pcap')
for packet in pack:
    if packet.getlayer(Raw):
        print '[+] Found Raw' + '\n'
        l = packet.getlayer(Raw)
        rawr = Raw(l)
        rawr.show()
Any help, or insight for further reading would be appreciated, I am new to scapy and no expert in packet dissection.
*Side note: previously I had tried (using separate code and a server) to replay the packets and send them to myself, to no avail. However, I feel that's due to my lack of knowledge about receiving UDP packets.
UPDATES - I have now tested my pcap file with a scapy reassembler, and I've confirmed I have no fragmented packets, or anything of the sort, so I assume all should go smoothly...
Upon opening my pcap in wireshark, I can see that there are retransmissions, but I'm not sure how much that will affect my goals since no fragmentation occurred?
Also, I have tried getlayer(Raw).load; if I use print on it, I get some gibberish on the screen. I'm assuming it's the data for my would-be image; however, I now need to get it into a usable format.
You can do:
data = packet[Raw].load
You should be able to access the field in this way:
l = packet.getlayer(Raw).load
Using Scapy’s interactive shell I was successful doing this:
pcap = rdpcap('sniffed_packets.pcap')
s = pcap.sessions()
for key, value in s.iteritems():
    # Looking for telnet sessions
    if ':23' in key:
        for v in value:
            try:
                v.getlayer(Raw).load
            except AttributeError:
                pass
If you are trying to get the load part of the packet only, you can try :
def handle_pkt(pkt):
    if TCP in pkt and pkt[TCP].dport == 5201:
        #print("got a packet")
        print(pkt[IP])
        load_part = pkt[IP].load
        print("Load#",load_part)
        pkt.show2()
        sys.stdout.flush()

python socket: sending and receiving 16 bytes

See edits below.
I have two programs that communicate through sockets. I'm trying to send a block of data from one to the other. This has been working with some test data, but is failing with others.
s.sendall('%16d' % len(data))
s.sendall(data)
print(len(data))
sends to
size = int(s.recv(16))
recvd = ''
while size > len(recvd):
    data = s.recv(1024)
    if not data:
        break
    recvd += data
print(size, len(recvd))
At one end:
s = socket.socket()
s.connect((server_ip, port))
and the other:
c = socket.socket()
c.bind(('', port))
c.listen(1)
s,a = c.accept()
In my latest test, I sent a 7973903 byte block and the receiver reports size as 7973930.
Why is the data block received off by 27 bytes?
Any other issues?
Python 2.7 or 2.5.4 if that matters.
EDIT: Aha - I'm probably reading past the end of the send buffer. If remaining bytes is less than 1024, I should only read the number of remaining bytes. Is there a standard technique for this sort of data transfer? I have the feeling I'm reinventing the wheel.
EDIT2: I'm screwing up by reading the next file in the series. I'm sending file1 and the last block is 997 bytes. Then I send file2, so the recv(1024) at the end of file1 reads the first 27 bytes of file2.
I'll start another question on how to do this better.
Thanks everyone. Asking and reading comments helped me focus.
First, the line
size = int(s.recv(16))
might read less than 16 bytes — it is unlikely, I will grant, but possible depending on how the network buffers align. The recv() call argument is a maximum value, a limit on how much data you are willing to receive. But you might only receive one byte. The operating system will generally give you control back once at least one byte has arrived, maybe (depending on the OS and on how busy the CPU is) after waiting another few milliseconds in case a second packet arrives with some further data, so that it only has to wake you up once instead of twice.
So you would want to say instead (to do the simplest possible loop; other variants are possible):
data = ''
while len(data) < 16:
    more = s.recv(16 - len(data))
    if not more:
        raise EOFError()
    data += more
This is indeed a wheel nearly everyone re-invents because it is so often needed. And your own code needs it a second time: your while loop needs its recv() to count down, asking for smaller and smaller limits until finally it has received exactly the number of bytes that were promised, and no more.
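If you would rather not write the countdown loop yourself, one option in the standard library is to wrap the socket with `makefile("rb")`: on a blocking socket, the resulting buffered reader's `read(n)` keeps reading until it has `n` bytes or hits end of stream. The sketch below assumes Python 3 (where the data is `bytes`) and keeps the 16-character length header from the question; `send_block` and `recv_block` are names made up for the example, and `socket.socketpair()` stands in for the real connection.

```python
import socket


def send_block(sock, data):
    """Send a 16-character decimal length header followed by the payload."""
    sock.sendall(b"%16d" % len(data))
    sock.sendall(data)


def recv_block(reader):
    """Read one header-plus-payload block from a buffered binary reader."""
    header = reader.read(16)
    if len(header) < 16:
        raise EOFError("connection closed inside the length header")
    size = int(header)
    payload = reader.read(size)
    if len(payload) < size:
        raise EOFError("connection closed inside the payload")
    return payload


# Demo: two blocks sent back to back, as in the file1/file2 case.
a, b = socket.socketpair()
reader = b.makefile("rb")
send_block(a, b"x" * 997)
send_block(a, b"y" * 27)
first = recv_block(reader)
second = recv_block(reader)
reader.close()
a.close()
b.close()
```

One caveat with this design: the reader buffers ahead, so once you create it, do all further reads on that connection through the reader rather than mixing in direct `recv()` calls on the socket.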
