Receive big list through TCP Sockets - Python - python

Which would be the best way to receive a big list through TCP Sockets ?
My code looks like this.
When u have to receive a big list, that doesn't work obviously.
print 'connection from', client_address
while True:
try:
data = pickle.loads(connection.recv(8192))
except EOFError:
print 'no more data from', client_address
break

The best way to do this is to transform the socket into a file object. You can do this with connection.makefile(). Then, instead of calling pickle.loads() -- which expects a complete byte string containing the entire pickled object -- call pickle.load(connection.makefile()). This way, you let the pickle module handle reading the entire "file". It will call the object's read function repeatedly until it has received all the expected data.
That basically assumes that the entire remaining part of the stream is to be unpickled (which appears to be what you're trying to do so it sounds like it should work). Otherwise, you may need to wrap your own pseudo-file-object around the stream that has some independent knowledge of the end of the pickled object.

Related

Python socket putting data of multiple send into one receive buffer

I am trying to make a simple image-sharing app in Python such that images can be shared from clients to a server. I am doing that using socket and sending images in form of numpy arrays. My problem is that when the app loads, I have made such that 6 images or less(if less than 6 present for an account) are sent from server to the client. Each image is stored in the server with a unique name identifying it. This name is of variable length and is decided by the user and is sent before the image array is sent. Also, the shape of the image is sent to reproduce it from the bytes at the client. But as the name length is not fixed, I am reading 10 buffer size. But if the name is smaller, it is reading the shape of the are that I am sending later as well. How do I fix this? Here is the sending image code:
def send_multiple_imgs(send_socket, imgs):
num = len(imgs)
send_socket.send(str(num).encode())
for img in imgs:
print(img)
send_socket.send(img.encode())
send_img(send_socket,imgs[img])
part of send_img function:
def send_img(send_socket,img):
send_socket.send(str(img.shape).encode())
.....
The later part is not important, I think.
Here is the receiving part:
def receive_multiple_img(recv_socket):
num = int(recv_socket.recv(1).decode())
imgs = {}
for i in range(num):
img_name = recv_socket.recv(10).decode()
print(img_name)
imgs[img_name] = recieve_image(recv_socket)
return imgs
What is happening is, I have an image named 'ds' of shape (200,200,4), but the img_name reads:
'ds(200, 20' and then it messes up the further sending and receiving as well. How do I fix this? I am using TCP protocol:
s = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
I am new to networking in python. So, please consider if this is a silly question
TCP, unlike UDP, is a stream protocol. That means that a single send() doesn't correspond to a single read(). A call to send() simply places the data to be sent into the TCP send buffer, and a call to read() simply returns the bytes from the TCP receive buffer. The bytes in the buffer could have come from a single send() on the other side or a hundered.
This is a fairy common misunderstanding that leads to many bugs. Here's me explaining this again, and again, and again, and again.
If you want to send() several separate messages or pieces of data, the reader must be able to tell the messages apart. There are many ways to do that, such as using fixed-length messages, prefixing the message's length before each message, or using a delimiter.

TCP socket reads out of turn

I am using TCP with Python sockets, transfering data from one computer to another. However the recv command reads more than it should in the serverside, I could not find the issue.
client.py
while rval:
image_string = frame.tostring()
sock.sendall(image_string)
rval, frame = vc.read()
server.py
while True:
image_string = ""
while len(image_string) < message_size:
data = conn.recv(message_size)
image_string += data
The length of the message is 921600 (message_size) so it is sent with sendall, however when recieved, when I print the length of the arrived messages, the lengths are sometimes wrong, and sometimes correct.
921600
921600
921923 # wrong
922601 # wrong
921682 # wrong
921600
921600
921780 # wrong
As you see, the wrong arrivals have no pattern. As I use TCP, I expected more consistency, however it seems the buffers are mixed up and somehow recieving a part of the next message, therefore producing a longer message. What is the issue here ?
I tried to add just the relevant part of the code, I can add more if you wish, but the code performs well on localhost but fails on two computers, so there should be no errors besides the transmitting part.
Edit1: I inspected this question a bit, it mentions that all send commands in the client may not be recieved by a single recv in the server, but I could not understand how to apply this to practice.
TCP is a stream protocol. There is ABSOLUTELY NO CONNECTION between the sizes of the chunks of data you send, and the chunks of data you receive. If you want to receive data of a known size, it's entirely up to you to only request that much data: you're currently requesting the total length of the data each time, which is going to try to read too much except in the unlikely event of the entire data being retrieved by the first .recv() call. Basically, you need to do something like data = conn.recv(message_size - len(image_string)) to reflect the fact that the amount of remaining data is decreasing.
Think of TCP as a raw stream of bytes. It is your responsibility to track where you are in the stream and interpret it correctly. Buffer what you read and only extract what you currently need.
Here's an (untested) class to illustrate:
class Buffer:
def __init__(self,socket):
self.socket = socket
self.buffer = b''
def recv_exactly(self,count):
# Could return less if socket closes early...
while len(self.buffer) < count:
data = self.socket.recv(4096)
if not data: break
self.buffer += data
ret,self.buffer = self.buffer[:count],self.buffer[count:]
return ret
The recv always requests the same amount of data and queues it in a buffer. recv_exactly only returns the number of bytes requested and leaves any extra in the buffer.

Raw load found, how to access?

To start off, I have read through other raw answers pertaining to scapy on here, however none have been useful, maybe I am just doing something wrong and thats what has brought me here today.
So, for starters, I have a pcap file, which started corrupted with some retransmissions, to my belief I have gotten it back to gether correctly.
It contains Radiotap header, IEEE 802.11 (dot11), logical-link control, IPv4, UDP, and DNS.
To my understanding, the udp packets being transmitted hold this raw data, however, do to a some recent quirks, maybe the raw is in Radiotap/raw.
Using scapy, I'm iterating through the packets, and when a packet with the Raw layer is found, I am using the .show() function of scapy to view it.
As such, I can see that there is a raw load available
###[ Raw ]###
\load \
|###[ Raw ]###
| load = '#\x00\x00\x00\xff\xff\xff\xff\xff\xff\x10h?'
So, I suppose my question is, how can I capture this payload to receive whatever this may be, To my knowledge the load is supposed to be an image file, however I have trouble believing such, so I assume I have misstepped somewhere.
Here is the code I'm using to achieve the above result
from scapy.all import *
from scapy.utils import *
pack = rdpcap('/home/username/Downloads/new.pcap')
for packet in pack:
if packet.getlayer(Raw):
print '[+] Found Raw' + '\n'
l = packet.getlayer(Raw)
rawr = Raw(l)
rawr.show()
Any help, or insight for further reading would be appreciated, I am new to scapy and no expert in packet dissection.
*Side note, previously I had tried (using separate code and server) to replay the packets and send them to myself, to no avail. However I feel thats due to my lack of knowledge in receipt of UDP packets.
UPDATES - I have now tested my pcap file with a scapy reassembler, and I've confirmed I have no fragmented packets, or anything of the sort, so I assume all should go smoothly...
Upon opening my pcap in wireshark, I can see that there are retransmissions, but I'm not sure how much that will affect my goals since no fragmentation occurred?
Also, I have tried the getlayer(Raw).load, if I use print on it I get some gibberish to the screen, I'm assuming its the data to my would-be-image, however I need to now get it into a usable format.
You can do:
data = packet[Raw].load
You should be able to access the field in this way:
l = packet.getlayer(Raw).load
Using Scapy’s interactive shell I was successful doing this:
pcap = rdpcap('sniffed_packets.pcap')
s = pcap.sessions()
for key, value in s.iteritems():
# Looking for telnet sessions
if ':23' in key:
for v in value:
try:
v.getlayer(Raw).load
except AttributeError:
pass
If you are trying to get the load part of the packet only, you can try :
def handle_pkt(pkt):
if TCP in pkt and pkt[TCP].dport == 5201:
#print("got a packet")
print(pkt[IP])
load_part = pkt[IP].load
print("Load#",load_part)
pkt.show2()
sys.stdout.flush()

ftp sending python bytesio stream

I want to send a file with python ftplib, from one ftp site to another, so to avoid file read/write processees.
I create a BytesIO stream:
myfile=BytesIO()
And i succesfully retrieve a image file from ftp site one with retrbinary:
ftp_one.retrbinary('RETR P1090080.JPG', myfile.write)
I can save this memory object to a regular file:
fot=open('casab.jpg', 'wb')
fot=myfile.readvalue()
But i am not able to send this stream via ftp with storbinary. I thought this would work:
ftp_two.storbinary('STOR magnafoto.jpg', myfile.getvalue())
But doesnt. i get a long error msg ending with 'buf = fp.read(blocksize)
AttributeError: 'str' object has no attribute 'read'
I also tried many absurd combinations, but with no success. As an aside, I am also quite puzzled with what I am really doing with myfoto.write. Shouldnt it be myfoto.write() ?
I am also quite clueless to what this buffer thing does or require. Is what I want too complicated to achieve? Should I just ping pong the files with an intermediate write/read in my system? Ty all
EDIT: thanks to abanert I got things straight. For the record, storbinary arguments were wrong and a myfile.seek(0) was needed to 'rewind' the stream before sending it. This is a working snippet that moves a file between two ftp addresses without intermediate physical file writes:
import ftplib as ftp
from io import BytesIO
ftp_one=ftp.FTP(address1, user1, pass1)
ftp_two=ftp.FTP(address2, user2, pass2)
myfile=BytesIO()
ftp_one.retrbinary ('RETR imageoldname.jpg', myfile.write)
myfile.seek(0)
ftp_two.storbinary('STOR imagenewname.jpg', myfile)
ftp_one.close()
ftp_two.close()
myfile.close()
The problem is that you're calling getvalue(). Just don't do that:
ftp_two.storbinary('STOR magnafoto.jpg', myfile)
storbinary requires a file-like object that it can call read on.
Fortunately, you have just such an object, myfile, a BytesIO. (It's not entirely clear from your code what the sequence of things is here—if this doesn't work as-is, you may need to myfile.seek(0) or create it in a different mode or something. But a BytesIO will work with storbinary unless you do something wrong.)
But instead of passing myfile, you pass myfile.getvalue(). And getvalue "Returns bytes containing the entire contents of the buffer."
So, instead of giving storbinary a file-like object that it can call read on, you're giving it a bytes object, which is of course the same as str in Python 2.x, and you can't call read on that.
For your aside:
As an aside, I am also quite puzzled with what I am really doing with myfoto.write. Shouldnt it be myfoto.write() ?
Look at the docs. The second parameter isn't a file, it's a callback function.
The callback function is called for each block of data received, with a single string argument giving the data block.
What you want is a function that appends each block of data to the end of myfoto. While you could write your own function to do that:
def callback(block_of_data):
myfoto.write(block_of_data)
… it should be pretty obvious that this function does exactly the same thing as the myfoto.write method. So, you can just pass that method itself.
If you don't understand about bound methods, see Method Objects in the tutorial.
This flexibility, as weird as it seems, lets you do something even better than downloading the whole file into a buffer to send to another server. You can actually open the two connections at the same time, and use callbacks to send each buffer from the source server to the destination server as it's received, without ever storing anything more than one buffer.
But, unless you really need that, you probably don't want to go through all that complexity.
In fact, in general, ftplib is kind of low-level. And it has some designs (like the fact that storbinary takes a file, while retrbinary takes a callback) that make total sense at that low level but seem very odd from a higher level. So, you may want to look at some of the higher-level libraries by doing a search at PyPI.

Reading bytes from network socket in Python

I want to write a Python program that gets data from a network socket and then scans the data looking for particular sequences of data.
The 'getting from the network' bit works fine, and I can dump the retrieved data to a file with no problem, but trying to get Python to actually scan the data one byte at a time is just not working.
Whenever I put code in to try and work things in the 'for byte' loop, I don't get anything much to happen.
When I run the program below, the size of byte.out is usually twice the size of buf.out, which I think is a major symptom pointing to what has gone wrong. If the inner loop were really dealing with the data byte by byte, I would expect both output files to be the same size.
My feeling is that there is something wrong with "for byte in chr(buf):" but I really don't know what to put here.
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
fh1 = open("buf.out", 'wb')
fh2 = open("byte.out", 'wb')
s.connect(("obscured.url", 9999))
s.send('GET /xx HTTP/1.1\nHost obscured.url:9999\n\n')
for i in range(10):
buf = s.recv(1024)
for byte in chr(buf):
print >>fh2, byte
print >>fh1, buf
s.close
chr(buf) should give a TypeError. Use
for byte in buf:

Categories

Resources