See edits below.
I have two programs that communicate through sockets. I'm trying to send a block of data from one to the other. This has been working with some test data, but is failing with others.
s.sendall('%16d' % len(data))
s.sendall(data)
print(len(data))
sends to
size = int(s.recv(16))
recvd = ''
while size > len(recvd):
    data = s.recv(1024)
    if not data:
        break
    recvd += data
print(size, len(recvd))
At one end:
s = socket.socket()
s.connect((server_ip, port))
and the other:
c = socket.socket()
c.bind(('', port))
c.listen(1)
s,a = c.accept()
In my latest test, I sent a 7973903 byte block and the receiver reports size as 7973930.
Why is the data block received off by 27 bytes?
Any other issues?
Python 2.7 or 2.5.4 if that matters.
EDIT: Aha - I'm probably reading past the end of the send buffer. If the number of remaining bytes is less than 1024, I should only read that many. Is there a standard technique for this sort of data transfer? I have the feeling I'm reinventing the wheel.
EDIT2: I'm screwing up by reading the next file in the series. I'm sending file1 and the last block is 997 bytes. Then I send file2, so the recv(1024) at the end of file1 reads the first 27 bytes of file2.
I'll start another question on how to do this better.
Thanks everyone. Asking and reading comments helped me focus.
First, the line
size = int(s.recv(16))
might read less than 16 bytes — it is unlikely, I will grant, but possible depending on how the network buffers align. The recv() call argument is a maximum value, a limit on how much data you are willing to receive. But you might only receive one byte. The operating system will generally give you control back once at least one byte has arrived, maybe (depending on the OS and on how busy the CPU is) after waiting another few milliseconds in case a second packet arrives with some further data, so that it only has to wake you up once instead of twice.
So you would want to say instead (to do the simplest possible loop; other variants are possible):
data = ''
while len(data) < 16:
    more = s.recv(16 - len(data))
    if not more:
        raise EOFError()
    data += more
This is indeed a wheel nearly everyone re-invents because it is so often needed. And your own code needs it a second time: your while loop needs its recv() to count down, asking for smaller and smaller limits until finally it has received exactly the number of bytes that were promised, and no more.
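For example, a counting-down version of the question's receive loop might look like this (a sketch in the same Python 2 style as the question, untested; size, s and recvd are the names from the question):
recvd = ''
while len(recvd) < size:
    chunk = s.recv(min(1024, size - len(recvd)))  # never ask for more than is still owed
    if not chunk:
        raise EOFError('socket closed with %d bytes still missing' % (size - len(recvd)))
    recvd += chunk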
Related
I am trying to make a simple image-sharing app in Python, where images can be shared from clients to a server. I am doing that using socket and sending the images as numpy arrays. When the app loads, the server sends 6 images (or fewer, if the account has fewer than 6) to the client. Each image is stored on the server under a unique name identifying it. This name has a variable length, is chosen by the user, and is sent before the image array. The shape of the image is also sent so the client can reproduce it from the bytes. But since the name length is not fixed, I am reading with a buffer size of 10, and if the name is shorter than that, the read also picks up part of the shape data that I send afterwards. How do I fix this? Here is the sending code:
def send_multiple_imgs(send_socket, imgs):
    num = len(imgs)
    send_socket.send(str(num).encode())
    for img in imgs:
        print(img)
        send_socket.send(img.encode())
        send_img(send_socket, imgs[img])
part of send_img function:
def send_img(send_socket, img):
    send_socket.send(str(img.shape).encode())
    .....
The later part is not important, I think.
Here is the receiving part:
def receive_multiple_img(recv_socket):
    num = int(recv_socket.recv(1).decode())
    imgs = {}
    for i in range(num):
        img_name = recv_socket.recv(10).decode()
        print(img_name)
        imgs[img_name] = recieve_image(recv_socket)
    return imgs
What is happening is that I have an image named 'ds' with shape (200, 200, 4), but img_name reads:
'ds(200, 20' and then it messes up the further sending and receiving as well. How do I fix this? I am using the TCP protocol:
s = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
I am new to networking in Python, so please bear with me if this is a silly question.
TCP, unlike UDP, is a stream protocol. That means that a single send() doesn't correspond to a single read(). A call to send() simply places the data to be sent into the TCP send buffer, and a call to read() simply returns the bytes from the TCP receive buffer. The bytes in the buffer could have come from a single send() on the other side or a hundred.
This is a fairly common misunderstanding that leads to many bugs. Here's me explaining this again, and again, and again, and again.
If you want to send() several separate messages or pieces of data, the reader must be able to tell the messages apart. There are many ways to do that, such as using fixed-length messages, prefixing each message with its length, or using a delimiter.
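As one illustration of the length-prefix option applied to the question above (a sketch only, with a hypothetical helper recv_exact, not a drop-in fix): the variable-length image name can itself be sent behind a small fixed-size length field, so the receiver never reads into the shape data that follows.
import struct

def recv_exact(sock, count):
    # Keep calling recv() until exactly `count` bytes have arrived.
    data = b''
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise EOFError('socket closed before the full message arrived')
        data += chunk
    return data

# Sender: a 2-byte big-endian length, then the name itself.
name_bytes = img.encode()
send_socket.sendall(struct.pack('>H', len(name_bytes)) + name_bytes)

# Receiver: read the fixed 2-byte prefix, then exactly that many bytes.
(name_len,) = struct.unpack('>H', recv_exact(recv_socket, 2))
img_name = recv_exact(recv_socket, name_len).decode()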
I set up a secure socket using Tor and SOCKS, but I'm facing a problem when sending a large amount of data.
Sender:
socket.send(message.encode())
Receiver:
chunks = []
while 1:
    part = connection.recv(4096)
    chunks.append(part.decode())
    if len(part) < 4096:
        break
response = "".join(chunks)
Since the network speed is not consistent, in a loop I don't always fill the 4096-byte buffer, so the loop breaks and I don't receive the full data.
Lowering the buffer size doesn't seem like an option because the "packet" size can sometimes be as low as 20 bytes.
TCP can split your data into as many pieces as it wants, so on the other end of a socket you should never rely on the size of the chunk returned by a single recv(). You have to use another mechanism for detecting the end of a message or end of file.
If you are going to send only one blob and then close the socket, then on the server side you just read until you get a falsy value (empty data):
while True:
    data = sock.recv(1024)
    if data:
        print(data)
        # continue
    else:
        sock.close()
        break
If you are going to send multiple messages, you have to decide what the separator between them will be. For text protocols it is a good idea to use a line ending. You can then enjoy the power of Twisted's LineReceiver protocol and others.
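As a rough sketch of the delimiter approach (my illustration, not part of the original answer): buffer incoming bytes until the separator appears, and keep any surplus for the next message.
def recv_line(sock, leftover=b''):
    # Accumulate data until a newline separator shows up.
    buf = leftover
    while b'\n' not in buf:
        chunk = sock.recv(4096)
        if not chunk:
            raise EOFError('socket closed before a complete line arrived')
        buf += chunk
    line, _, rest = buf.partition(b'\n')
    return line, rest  # the message, plus any bytes belonging to the next one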
If you are doing a binary protocol, it's common practice to preface each message with a size byte/word/dword.
Try using the struct module to pass the length of the incoming data to the receiver first ("import struct"). That way the receiving end knows exactly how much data to receive. In this example bytes are being sent over the socket. The examples here are borrowed from my GitHub upload github.com/nsk89/netcrypt for reference; I've cut the encryption steps out of the send function, as well as its sending of a serialised dictionary.
Edit: I should also clarify that when you send data over the socket, especially if you're sending multiple messages, they all sit in the stream as one long message. Not every message is 4096 bytes in length. If one is 2048 bytes, the next is 4096, and you receive 4096 into your buffer, you'll get the first message plus half of the next one, or hang completely waiting for data that doesn't exist.
data_to_send = struct.pack('>I', len(data_to_send)) + data_to_send # pack the length of data in the first four bytes of data stream, >I indicates internet byte order
socket_object.sendall(data_to_send) # transport data
def recv_message(socket_object):
    raw_msg_length = recv_all(socket_object, 4)  # receive first 4 bytes of data in stream
    if not raw_msg_length:
        return None
    # unpack first 4 bytes using network byte order to retrieve incoming message length
    msg_length = struct.unpack('>I', raw_msg_length)[0]
    return recv_all(socket_object, msg_length)  # recv rest of stream up to message length

def recv_all(socket_object, num_bytes):
    data = b''
    while len(data) < num_bytes:  # while amount of data recv is less than message length passed
        packet = socket_object.recv(num_bytes - len(data))  # recv remaining bytes/message
        if not packet:
            return None
        data += packet
    return data
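A quick usage sketch pairing the two halves (hypothetical; conn stands for any connected socket object):
payload = b'example payload'
conn.sendall(struct.pack('>I', len(payload)) + payload)  # same 4-byte length prefix as above

# On the other side:
message = recv_message(conn)  # returns b'example payload', or None if the peer closed early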
By the way, there is no need to decode every part before combining them into a chunk; combine all the parts into one chunk and then decode that chunk.
For your situation, the better way is to use two steps.
Step 1: the sender sends the size of the message, and the receiver takes this size and gets ready to receive the message.
Step 2: the sender sends the message, and the receiver combines the data if necessary.
Sender
# Step 1
socket.send( str(len(message.encode())).encode() )
# Step 2
socket.send(message.encode("utf-8"))
Receiver
# Step 1
message_size = connection.recv(1024)
print("Will receive message size:",message_size.decode())
# Step 2
recevied_size = 0
recevied_data = b''
while recevied_size < int(message_size.decode()):
    part = connection.recv(1024)
    recevied_size += len(part)
    recevied_data += part
else:
    print(recevied_data.decode("utf-8", "ignore"))
    print("message receive done ....", recevied_size)
I am using TCP with Python sockets, transferring data from one computer to another. However, the recv command reads more than it should on the server side, and I could not find the issue.
client.py
while rval:
    image_string = frame.tostring()
    sock.sendall(image_string)
    rval, frame = vc.read()
server.py
while True:
    image_string = ""
    while len(image_string) < message_size:
        data = conn.recv(message_size)
        image_string += data
The length of the message is 921600 (message_size), so it is sent with sendall. However, when I print the length of the arrived messages on the receiving side, the lengths are sometimes wrong and sometimes correct.
921600
921600
921923 # wrong
922601 # wrong
921682 # wrong
921600
921600
921780 # wrong
As you see, the wrong arrivals have no pattern. As I use TCP, I expected more consistency; however, it seems the buffers are mixed up and I am somehow receiving a part of the next message, therefore producing a longer message. What is the issue here?
I tried to add just the relevant part of the code; I can add more if you wish. The code performs well on localhost but fails between two computers, so there should be no errors besides the transmitting part.
Edit1: I inspected this question a bit. It mentions that all send commands in the client may not be received by a single recv in the server, but I could not understand how to apply this in practice.
TCP is a stream protocol. There is ABSOLUTELY NO CONNECTION between the sizes of the chunks of data you send, and the chunks of data you receive. If you want to receive data of a known size, it's entirely up to you to only request that much data: you're currently requesting the total length of the data each time, which is going to try to read too much except in the unlikely event of the entire data being retrieved by the first .recv() call. Basically, you need to do something like data = conn.recv(message_size - len(image_string)) to reflect the fact that the amount of remaining data is decreasing.
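In terms of the code from the question, a minimal corrected sketch of the server loop (untested; bytes literals are used here, plain strings behave the same way under Python 2) would be:
while True:
    image_string = b""
    while len(image_string) < message_size:
        data = conn.recv(message_size - len(image_string))  # only request what is still missing
        if not data:
            break  # peer closed the connection early
        image_string += data
    # image_string now holds exactly message_size bytes (unless the socket closed early)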
Think of TCP as a raw stream of bytes. It is your responsibility to track where you are in the stream and interpret it correctly. Buffer what you read and only extract what you currently need.
Here's an (untested) class to illustrate:
class Buffer:
    def __init__(self, socket):
        self.socket = socket
        self.buffer = b''

    def recv_exactly(self, count):
        # Could return less if socket closes early...
        while len(self.buffer) < count:
            data = self.socket.recv(4096)
            if not data: break
            self.buffer += data
        ret, self.buffer = self.buffer[:count], self.buffer[count:]
        return ret
The recv always requests the same amount of data and queues it in a buffer. recv_exactly only returns the number of bytes requested and leaves any extra in the buffer.
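A hypothetical usage sketch, tying it back to the 921600-byte frames from the question:
reader = Buffer(conn)
while True:
    frame = reader.recv_exactly(921600)  # message_size from the question
    if len(frame) < 921600:
        break  # socket closed before a complete frame arrived
    # ... process exactly one frame here ...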
I'm trying to send files (images and text) over sockets in Python. I don't want to create a new connection every time because the code is writing lots of files (>100) in a short amount of time, and I don't want to build up that many connections while they wait to close. So before each chunk of the file is sent, I send the length of the chunk first. When I run it, it gives me a ValueError on length = int(s.recv(4)), showing a string from the file and saying that it cannot be converted to an int. Here is the part of my code that sends and receives one file:
Sending:
#Connect s and open file f
s.setblocking(1)
buf = 4096
while True:
    msg = f.read(buf)
    length = str(len(msg))
    if len(length) < 4: length = "0"*(4-len(length)) + length
    s.sendall(length)
    if length == "0000": break
    s.sendall(msg)
    if len(msg) != buf: break
Receiving:
#Connect s and open file f
while True:
    length = int(s.recv(4))
    if length == 0: break
    f.write(s.recv(length))
    if length < buf: break
Running on Windows 8.
If you are sending large files, then (depending on your router) the data might be split into shorter packets, causing you to accidentally read the length header of the next chunk while you still haven't received the entire earlier file. You should make sure with a while loop that you got the entire length of data; if not, keep requesting the rest with s.recv(length - len(what_i_got_so_far)).
For example, when I send myself a picture of a few MB over the LAN, the router cuts the packets to around 25 KB, so I have to call recv a lot of times even though I only used one send.
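Concretely, for the 4-digit-length protocol in the question, a sketch of such a loop (untested, Python 2 style to match the code above, with a hypothetical helper recv_exactly) could be:
def recv_exactly(sock, count):
    # Call recv() until `count` bytes have been collected, or the peer closes.
    data = ""
    while len(data) < count:
        part = sock.recv(count - len(data))
        if not part:
            raise EOFError("connection closed mid-chunk")
        data += part
    return data

buf = 4096  # chunk size used by the sender
while True:
    length = int(recv_exactly(s, 4))  # the length header is always exactly 4 bytes
    if length == 0:
        break
    f.write(recv_exactly(s, length))
    if length < buf:
        break  # a short final chunk ends the file, as in the original protocol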
I hope this helps.
I'm trying to use a computer connected to an Arduino (which is itself connected to some 5V voltmeters) to "fake" an old school stereo VU meter. My goal is to have the computer that is playing the audio file analyze the signal and send the amplitude information to the Arudino via a serial connection to be displayed on the voltmeters.
I'm using MPD to render and send the audio to a USB DAC (ODAC). MPD is also outputting to a FIFO, which I read from using a Python script. I read from the FIFO in 4096 byte chunks, then use the audioop library to split that chunk/sample into a left and right channel and compute the maximum amplitude of each channel.
Here's the problem - I'm getting swamped with data. I'm guessing my math is wrong or that I don't understand how a FIFO works (or maybe both). MPD is outputting everything in 44100:16:2 format - I thought that meant that it would be writing out 44,100 4-byte samples per second. So if I'm grabbing 4096 byte chunks, I should expect about 43 chunks per second. But I'm getting far more than that (over 100) and the number of chunks I get per second doesn't change if I up my chunk size. For example, if I double my chunk size to 8192, I still get roughly the same number of chunks per second. So clearly I'm doing something wrong, but I don't know what it is. Anyone have any thoughts?
Here is the relevant portion of my mpd.conf file:
audio_output {
    type "fifo"
    name "my_fifo"
    path "/tmp/mpd.fifo"
    format "44100:16:2"
}
And here is the Python script:
import os
import audioop
import time
import errno
import math
#Open the FIFO that MPD has created for us
#This represents the sample (44100:16:2) that MPD is currently "playing"
fifo = os.open('/tmp/mpd.fifo', os.O_RDONLY)
while 1:
    try:
        rawStream = os.read(fifo, 4096)
    except OSError as err:
        if err.errno == errno.EAGAIN or err.errno == errno.EWOULDBLOCK:
            rawStream = None
        else:
            raise
    if rawStream:
        leftChannel = audioop.tomono(rawStream, 2, 1, 0)
        rightChannel = audioop.tomono(rawStream, 2, 0, 1)
        stereoPeak = audioop.max(rawStream, 2)
        leftPeak = audioop.max(leftChannel, 2)
        rightPeak = audioop.max(rightChannel, 2)
        leftDB = 20 * math.log10(leftPeak) - 74
        rightDB = 20 * math.log10(rightPeak) - 74
        print(rightPeak, leftPeak, rightDB, leftDB)
Answering my own question. It turns out that, regardless of how many bytes I specified should be read, os.read() was returning 2048 bytes. What that means is that the second parameter that os.read() takes is the maximum number of bytes it will read - but there's no guarantee that that many bytes will actually be read. I had thought that by leaving out the NONBLOCK option when I opened the FIFO that the os.read() call would wait around until it received an end of file or the number of bytes specified. But that's not the case. To get around this issue, my code now checks the length of the byte string returned by os.read() and - if that length is less than my specified chunk size - will wait to grab the next chunk(s) and then will concatenate all the chunks together so that I have a chunk size that matches my target before I move on to processing the data.
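A sketch of that accumulate-and-concatenate fix (the helper name is hypothetical, and the chunk size follows the question):
CHUNK_SIZE = 4096  # 1024 frames of 16-bit stereo audio at 44100 Hz

def read_full_chunk(fd, count):
    # os.read() may return fewer bytes than requested, so keep reading
    # and concatenating until a full chunk is collected (or the writer closes).
    buf = b''
    while len(buf) < count:
        piece = os.read(fd, count - len(buf))
        if not piece:
            break  # end of file: the FIFO's writer has gone away
        buf += piece
    return buf

rawStream = read_full_chunk(fifo, CHUNK_SIZE)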