How to continuously read data from socket in python?

The problem is that I don't know how many bytes I will receive from the socket, so I am just trying to loop.
buffer = ''
while True:
    data, addr = sock.recvfrom(1024)
    buffer += data
    print buffer
As I understand it, recvfrom returns only the specified number of bytes and then discards the other data. Is it possible to somehow continuously read this data into the buffer variable?

On a stream (TCP) socket it won't discard the data; it will just return the remaining data in the next iteration. The basic approach in your code is correct.
The only thing I would change is a clause to break the loop:
buffer = ''
while True:
    data = sock.recv(1024)
    if data:
        buffer += data
        print buffer
    else:
        break
According to the documentation, an empty string signifies that the connection has been closed.
If this code still does not work then it would be good to show us how you are setting up your socket.
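For reference, here is a minimal Python 3 sketch of the same loop, assuming a connected TCP socket whose peer closes the connection when it is done. The function name read_until_closed and the default buffer size are my own choices, not something from the question.

```python
import socket

def read_until_closed(sock, bufsize=1024):
    """Collect everything the peer sends until it closes its side."""
    chunks = []
    while True:
        data = sock.recv(bufsize)
        if not data:
            # b'' means the peer closed the connection
            break
        chunks.append(data)
    # Joining once at the end avoids repeated copying of a growing buffer
    return b"".join(chunks)
```

Note that in Python 3 recv() returns bytes, so the buffer has to be bytes as well (decode it afterwards if you need text).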

I suggest using readline as the buffer and use "\n" as the separator between lines:
# read one line from the socket
def buffered_readLine(socket):
    line = ""
    while True:
        part = socket.recv(1)
        if not part or part == "\n":
            # stop at the end of the line, or when the peer closes the connection
            break
        line += part
    return line
This is helpful when you want to buffer without closing the socket.
sock.recv(1024) will hang when there is no data being sent unless you close the socket on the other end.
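If calling recv(1) for every byte turns out to be too slow, one possible alternative is to let the standard library do the buffering. This is only a sketch for Python 3, assuming a blocking, connected stream socket; read_lines is a hypothetical helper name.

```python
import socket

def read_lines(sock):
    """Yield newline-terminated lines from a connected stream socket."""
    # makefile() wraps the socket in a buffered file object, so readline()
    # does not need one recv() call per byte.
    with sock.makefile("rb") as reader:
        while True:
            line = reader.readline()
            if not line:
                # EOF: the peer closed the connection
                break
            yield line.rstrip(b"\r\n")
```

Each line is yielded as soon as its "\n" arrives, so the other end does not have to close the socket for you to get data.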

Related

What is a proper endless socket server loop in Python

I am a Python newbie and my first task is to create a small server program that will forward events from a network unit to a rest api.
The overall structure of my code seems to work, but I have one problem. After I receive the first package, nothing happens. Is something wrong with my loop such that new packages (from the same client) aren't accepted?
Packages look something like this: EVNTTAG 20190219164001132%0C%3D%E2%80h%90%00%00%00%01%CBU%FB%DF ... not that it matters, but I'm sharing just for clarity.
My code (I skipped the irrelevant init of rest etc. but the main loop is the complete code):
# Configure TAGP listener
ipaddress = ([l for l in ([ip for ip in socket.gethostbyname_ex(socket.gethostname())[2] if not ip.startswith("127.")][:1], [[(s.connect(('8.8.8.8', 53)), s.getsockname()[0], s.close()) for s in [socket.socket(socket.AF_INET, socket.SOCK_DGRAM)]][0][1]]) if l][0][0])
server_name = ipaddress
server_address = (server_name, TAGPListenerPort)
print ('starting TAGP listener on %s port %s' % server_address)
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.bind(server_address)
sock.listen(1)
sensor_data = {'tag': 0}
# Listen for TAGP data and forward events to ThingsBoard
try:
    while True:
        data = ""
        connection, client_address = sock.accept()
        data = str(connection.recv(1024))
        if data.find("EVNTTAG") != -1:
            timestamp = ((data.split())[1])[:17]
            tag = ((data.split())[1])[17:]
            sensor_data['tag'] = tag
            client.publish('v1/devices/me/telemetry', json.dumps(sensor_data), 1)
        print (data)
except KeyboardInterrupt:
    # Close socket server (TAGP)
    connection.shutdown(1)
    connection.close()
    # Close client to ThingsBoard
    client.loop_stop()
    client.disconnect()

There are multiple issues with your code:
First of all, you need a loop over what the client sends. You first call connection, client_address = sock.accept() and you now have a client. But in the next iteration of the loop you call .accept() again, overwriting your old connection with a new client. If there is no new client, this simply waits forever, and that's what you observe.
So this can be fixed like this:
while True:
    conn, addr = sock.accept()
    while True:
        data = conn.recv(1024)
but this code has another issue: no new client can connect until the old one disconnects (well, at the moment it just loops indefinitely regardless of whether the client is alive or not, we'll deal with it later). To overcome it you can use threads (or async programming) and process each client independently. For example:
from threading import Thread
def client_handler(conn):
    while True:
        data = conn.recv(1024)
while True:
    conn, addr = sock.accept()
    t = Thread(target=client_handler, args=(conn,))
    t.start()
Async programming is harder and I'm not gonna address it here. Just be aware that there are multiple advantages of async over threads (you can google those).
Now each client has its own thread and the main thread only worries about accepting connections. Things happen concurrently. So far so good.
Let's focus on the client_handler function. What you misunderstand is how sockets work. This:
data = conn.recv(1024)
does not read 1024 bytes from the buffer. It actually reads up to 1024 bytes with 0 being possible as well. Even if you send 1024 bytes it can still read say 3. And when you receive a buffer of length 0 then this is an indication that the client disconnected. So first of all you need this:
def client_handler(conn):
    while True:
        data = conn.recv(1024)
        if not data:
            break
Now the real fun begins. Even if data is nonempty it can be of arbitrary length between 1 and 1024. Your data can be chunked and may require multiple .recv calls. And no, there is nothing you can do about it. Chunking can happen due to some other proxy servers or routers or network lag or cosmic radiation or whatever. You have to be prepared for it.
So in order to work with that correctly you need a proper framing protocol. For example you have to somehow know how big is the incoming packet (so that you can answer the question "did I read everything I need?"). One way to do that is by prefixing each frame with (say) 2 bytes that combine into total length of the frame. The code may look like this:
def client_handler(conn):
    while True:
        chunk = conn.recv(1) # read first byte
        if not chunk:
            break
        size = ord(chunk)
        chunk = conn.recv(1) # read second byte
        if not chunk:
            break
        size += (ord(chunk) << 8)
Now you know that the incoming buffer will be of length size. With that you can loop to read everything:
def handle_frame(conn, frame):
    # frame is a bytes object, so search for a bytes literal
    if frame.find(b"EVNTTAG") != -1:
        pass # do your stuff here now
def client_handler(conn):
    while True:
        chunk = conn.recv(1)
        if not chunk:
            break
        size = ord(chunk)
        chunk = conn.recv(1)
        if not chunk:
            break
        size += (ord(chunk) << 8)
        # recv until everything is read
        frame = b''
        while size > 0:
            chunk = conn.recv(size)
            if not chunk:
                return
            frame += chunk
            size -= len(chunk)
        handle_frame(conn, frame)
IMPORTANT: this is just an example of handling a protocol that prefixes each frame with its length. Note that the client has to be adjusted as well. You either have to define such protocol or if you have a given one you have to read the spec and try to understand how framing works. For example this is done very differently with HTTP. In HTTP you read until you meet \r\n\r\n which signals the end of headers. And then you check Content-Length or Transfer-Encoding headers (not to mention hardcore things like protocol switch) to determine next action. This gets quite complicated though. I just want you to be aware that there are other options. Nevertheless framing is necessary.
Also, network programming is hard. I'm not gonna dive into things like security (e.g. against DDoS) and performance. The code above should be treated as an extreme simplification, not production ready. I advise using some existing software.
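To show both sides of such a length-prefixed protocol together, here is a rough Python 3 sketch. It assumes the same made-up framing as above (a 2-byte length, low byte first, followed by the payload); the names recv_exact, send_frame and recv_frame are hypothetical and not part of any existing device protocol.

```python
import socket
import struct

def recv_exact(conn, n):
    """Read exactly n bytes, or return None if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            return None
        buf.extend(chunk)
    return bytes(buf)

def send_frame(conn, payload):
    """Prefix payload with its length as 2 bytes, low byte first."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too large for a 2-byte length prefix")
    conn.sendall(struct.pack("<H", len(payload)) + payload)

def recv_frame(conn):
    """Return the next frame's payload, or None on disconnect."""
    header = recv_exact(conn, 2)
    if header is None:
        return None
    (size,) = struct.unpack("<H", header)
    return recv_exact(conn, size)
```

In a client handler you would then call recv_frame(conn) in a loop and stop when it returns None. Again, this only works if the sending side actually uses send_frame (or an equivalent), which is an assumption here.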

Sockets Python 3.5: Socket server hangs forever on file receive

I'm trying to write a Python program that can browse directories and grab files w/ sockets if the client connects to the server. The browsing part works fine, it prints out all directories of the client.
Here's a part of the code:
with clientsocket:
    print('Connected to: ', addr)
    while True:
        m = input("Command > ")
        clientsocket.send(m.encode('utf-8'))
        data = clientsocket.recv(10000)
        if m == "exit":
            clientsocket.close()
        if m.split()[0] == 'get':
            inp = input("Filename > ")
            while True:
                rbuf = clientsocket.recv(8192)
                if not rbuf:
                    break
                d = open(inp, "ab")
                d.write(rbuf)
                d.close()
        elif data.decode('utf-8').split()[0] == "LIST":
            print(data.decode('utf-8'))
        if not data:
            break
However, the problem lies in here:
if m.split()[0] == 'get':
    inp = input("Filename > ")
    while True:
        rbuf = clientsocket.recv(8192)
        if not rbuf:
            break
It seems to be stuck in an infinite loop. What's more interesting is that the file I'm trying to receive is 88.3kb, but what the file returns is 87kb while it's in the loop, which is very close...
I tried receiving a python script at one time as well (without the loop) and it works fine.
Here's some of the client code:
while True:
    msg = s.recv(1024).decode('utf-8')
    if msg.split()[0] == "list":
        dirs = os.listdir(msg.split()[1])
        string = ''
        for dira in dirs:
            string += "LIST " + dira + "\n"
        s.send(string.encode('utf-8'))
    elif msg == "exit":
        break
    else:
        #bit that sends the file
        with open(msg.split()[1], 'rb') as r:
            s.sendall(r.read())
So my question is, why is it getting stuck in an infinite loop if I have it set up to close when there is no data, and how can I fix this?
I'm sort of new to network programming in general, so forgive me if I miss something obvious.
Thanks!

I think I know what the problem is, but I may be wrong. It has happened to me several times that the entire message is not received in one recv call, even if I specify the correct length. However, you never reach the end of the stream, so your program keeps waiting for more data, which never arrives.
Try this:
Sending file:
#bit that sends the file
with open(msg.split()[1], 'rb') as r:
    data = r.read()
    # check data length in bytes and send it to client
    data_length = len(data)
    # sendall keeps sending until every byte has been written
    s.sendall(data_length.to_bytes(4, 'big'))
    s.sendall(data)
s.shutdown(socket.SHUT_RDWR)
s.close()
Receiving the file:
# check expected message length
remaining = int.from_bytes(clientsocket.recv(4), 'big')
d = open(inp, "wb")
while remaining:
    # until there are bytes left...
    # fetch remaining bytes or 4096 (whichever is smaller)
    rbuf = clientsocket.recv(min(remaining, 4096))
    if not rbuf:
        # connection closed before the whole file arrived
        break
    remaining -= len(rbuf)
    # write to file
    d.write(rbuf)
d.close()

There are several issues with your code.
First:
clientsocket.send(m.encode('utf-8'))
data = clientsocket.recv(10000)
This causes part of the file to be loaded into the data variable when you issue the get statement. That's why you don't get the full file.
Now this:
while True:
    rbuf = clientsocket.recv(8192)
    if not rbuf:
        break
    ...
You indeed load full file but the client never closes the connection (it goes to s.recv() after sending the file) so if statement is never satisfied. Thus this loop gets blocked on the clientsocket.recv(8192) part after downloading the file.
So the problem is that you have to somehow notify the downloader that you've sent all the data even though the connection is still open. There are several ways to do that:
You calculate the size of the file and send it as the first few bytes. For example, say the content of the file is ala ma kota. These are 11 bytes and thus you send \x0bala ma kota. Now the receiver knows that the first byte is the size and will interpret it as such. Of course a one-byte header isn't much (you would only be able to send files of at most 255 bytes), so normally you go for, say, 4 bytes. So now your protocol between client and server is: the first 4 bytes are the size of the file. Thus our initial file would be sent as \x00\x00\x00\x0bala ma kota. The drawback of this solution is that you have to know the size of the content before sending it.
You mark the end of the stream. So you pick a particular character, say X, and you read the stream until you find X. If you do, then you know that the other side sent everything it has. The drawback is that if X is in the content then you have to escape it, i.e. additional content processing (and interpretation on the other side) is needed.
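Both options can be tried out on plain bytes without any socket involved. The snippet below is just an illustration for Python 3: the 4-byte big-endian header, the marker byte X and the backslash as escape byte are assumptions picked for the example, and encode/decode are hypothetical helper names.

```python
content = b"ala ma kota"

# Option 1: a 4-byte big-endian length header followed by the content.
framed = len(content).to_bytes(4, "big") + content
size = int.from_bytes(framed[:4], "big")
body = framed[4:4 + size]

# Option 2: an end marker, with escaping for markers inside the content.
END = b"X"    # hypothetical end-of-stream marker
ESC = b"\\"   # hypothetical escape byte

def encode(data):
    """Escape ESC and END inside data, then append the end marker."""
    escaped = data.replace(ESC, ESC + ESC).replace(END, ESC + END)
    return escaped + END

def decode(stream):
    """Return (content, rest) for the first complete message in stream."""
    out = bytearray()
    i = 0
    while i < len(stream):
        byte = stream[i:i + 1]
        if byte == ESC:
            # take the byte after the escape literally
            out += stream[i + 1:i + 2]
            i += 2
        elif byte == END:
            return bytes(out), stream[i + 1:]
        else:
            out += byte
            i += 1
    raise ValueError("no end marker found")
```

With option 1 the receiver always knows how many bytes are still missing; with option 2 it has to inspect every byte, but the sender does not need to know the size up front.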

How to detect disconnection in python, without sending data

Ok, I have a socket and I'm handling one line at a time and logging it. The code below works great for that. The cout function is what I use to send data to the log. The for loop I use so I can process one line at a time.
socket.connect((host, port))
readbuffer = ""
while True:
    readbuffer = readbuffer+socket.recv(4096).decode("UTF-8")
    cout("RECEIVING: " + readbuffer) #Logs the buffer
    temp = str.split(readbuffer, "\n")
    readbuffer=temp.pop( )
    for line in temp:
        #Handle one line at a time.
Then I ran into a problem: when the server disconnected me, all of a sudden I had a massive file full of the word "RECEIVING: ". I know this is because when a Python socket disconnects, the socket starts receiving blank data constantly.
I have tried inserting:
if "" == readbuffer:
    print("It disconnected!")
    break
All that did was immediately break the loop and say that it disconnected, even on a successful connection.
I also know that I can detect a disconnection by sending data, but I can't do that, because anything I send gets broadcast to all the other clients on the server, and this is meant to debug those clients, so it would interfere.
What do I do? Thank you in advance.

You need to check the result of recv() separately from the readbuffer:
while True:
    c = socket.recv(4096)
    if not c: break # no more data: the peer closed the connection
    readbuffer = readbuffer + c.decode("UTF-8")
    ...
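Put together with the line splitting from the question, the whole loop could look roughly like this in Python 3. This is a sketch under the assumption that lines are separated by "\n" and encoded as UTF-8; iter_lines is a hypothetical name.

```python
import socket

def iter_lines(sock, bufsize=4096):
    """Yield complete text lines until the peer disconnects."""
    readbuffer = b""
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:
            # b'' means the server closed the connection
            break
        readbuffer += chunk
        # Everything before the last "\n" is complete; keep the rest buffered
        *lines, readbuffer = readbuffer.split(b"\n")
        for line in lines:
            yield line.decode("utf-8")
    if readbuffer:
        # trailing text that was never terminated by a newline
        yield readbuffer.decode("utf-8")
```

The buffer is kept as bytes and only complete lines are decoded, so a multi-byte character that happens to be split across two recv() calls does not raise a decoding error.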

Python Socket Receive Large Amount of Data

When I try to receive larger amounts of data it gets cut off and I have to press enter to get the rest of the data. At first I was able to increase it a little bit but it still won't receive all of it. As you can see I have increased the buffer on the conn.recv() but it still doesn't get all of the data. It cuts it off at a certain point. I have to press enter on my raw_input in order to receive the rest of the data. Is there anyway I can get all of the data at once? Here's the code.
port = 7777
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.bind(('0.0.0.0', port))
sock.listen(1)
print ("Listening on port: "+str(port))
while 1:
    conn, sock_addr = sock.accept()
    print "accepted connection from", sock_addr
    while 1:
        command = raw_input('shell> ')
        conn.send(command)
        data = conn.recv(8000)
        if not data: break
        print data,
    conn.close()

TCP/IP is a stream-based protocol, not a message-based protocol. There's no guarantee that every send() call by one peer results in a single recv() call by the other peer receiving the exact data sent—it might receive the data piece-meal, split across multiple recv() calls, due to packet fragmentation.
You need to define your own message-based protocol on top of TCP in order to differentiate message boundaries. Then, to read a message, you continue to call recv() until you've read an entire message or an error occurs.
One simple way of sending a message is to prefix each message with its length. Then to read a message, you first read the length, then you read that many bytes. Here's how you might do that:
import struct
def send_msg(sock, msg):
    # Prefix each message with a 4-byte length (network byte order)
    msg = struct.pack('>I', len(msg)) + msg
    sock.sendall(msg)
def recv_msg(sock):
    # Read message length and unpack it into an integer
    raw_msglen = recvall(sock, 4)
    if not raw_msglen:
        return None
    msglen = struct.unpack('>I', raw_msglen)[0]
    # Read the message data
    return recvall(sock, msglen)
def recvall(sock, n):
    # Helper function to recv n bytes or return None if EOF is hit
    data = bytearray()
    while len(data) < n:
        packet = sock.recv(n - len(data))
        if not packet:
            return None
        data.extend(packet)
    return data
Then you can use the send_msg and recv_msg functions to send and receive whole messages, and they won't have any problems with packets being split or coalesced on the network level.

You can use it as: data = recvall(sock)
def recvall(sock):
    BUFF_SIZE = 4096 # 4 KiB
    data = b''
    while True:
        part = sock.recv(BUFF_SIZE)
        data += part
        if len(part) < BUFF_SIZE:
            # either 0 or end of data
            break
    return data

The accepted answer is fine, but it will be really slow with big files: a string is immutable, so a new object is created every time you use the + sign. Using a list as a stack structure is more efficient.
This should work better:
fragments = []
while True:
    chunk = s.recv(10000)
    if not chunk:
        break
    fragments.append(chunk)
print "".join(fragments)

Most of the answers describe some sort of recvall() method. If your bottleneck when receiving data is creating the byte array in a for loop, I benchmarked three approaches of allocating the received data in the recvall() method:
Byte string method:
arr = b''
while len(arr) < msg_len:
    arr += sock.recv(max_msg_size)
List method:
fragments = []
while True:
    chunk = sock.recv(max_msg_size)
    if not chunk:
        break
    fragments.append(chunk)
arr = b''.join(fragments)
Pre-allocated bytearray method:
arr = bytearray(msg_len)
pos = 0
while pos < msg_len:
    chunk = sock.recv(min(max_msg_size, msg_len - pos))
    if not chunk:
        break
    # advance by the number of bytes actually received, which may be fewer than requested
    arr[pos:pos+len(chunk)] = chunk
    pos += len(chunk)
Results:

You may need to call conn.recv() multiple times to receive all the data. Calling it a single time is not guaranteed to bring in all the data that was sent, due to the fact that TCP streams don't maintain frame boundaries (i.e. they only work as a stream of raw bytes, not a structured stream of messages).
See this answer for another description of the issue.
Note that this means you need some way of knowing when you have received all of the data. If the sender will always send exactly 8000 bytes, you could count the number of bytes you have received so far and subtract that from 8000 to know how many are left to receive; if the data is variable-sized, there are various other methods that can be used, such as having the sender send a number-of-bytes header before sending the message, or if it's ASCII text that is being sent you could look for a newline or NUL character.

Disclaimer: There are very rare cases in which you really need to do this. If possible use an existing application layer protocol or define your own eg. precede each message with a fixed length integer indicating the length of data that follows or terminate each message with a '\n' character. (Adam Rosenfield's answer does a really good job at explaining that)
With that said, there is a way to read all of the data available on a socket. However, it is a bad idea to rely on this kind of communication as it introduces the risk of losing data. Use this solution with extreme caution and only after reading the explanation below.
def recvall(sock):
    BUFF_SIZE = 4096
    data = bytearray()
    while True:
        packet = sock.recv(BUFF_SIZE)
        if not packet: # Important!!
            break
        data.extend(packet)
    return data
Now the if not packet: line is absolutely critical!
Many answers here suggested using a condition like if len(packet) < BUFF_SIZE: which is broken and will most likely cause you to close your connection prematurely and lose data. It wrongly assumes that one send on one end of a TCP socket corresponds to one receive of the same number of bytes on the other end. It does not. There is a very good chance that sock.recv(BUFF_SIZE) will return a chunk smaller than BUFF_SIZE even if there's still data waiting to be received. There is a good explanation of the issue here and here.
By using the above solution you are still risking data loss if the other end of the connection is writing data slower than you are reading. You may just simply consume all data on your end and exit when more is on the way. There are ways around it that require the use of concurrent programming, but that's another topic of its own.

A variation using a generator function (which I consider more pythonic):
def recvall(sock, buffer_size=4096):
    buf = sock.recv(buffer_size)
    while buf:
        yield buf
        if len(buf) < buffer_size: break
        buf = sock.recv(buffer_size)
# ...
with socket.create_connection((host, port)) as sock:
    sock.sendall(command)
    response = b''.join(recvall(sock))

You can do it using Serialization
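import json
from socket import *
def recvall(conn):
    data = b""
    while True:
        part = conn.recv(1024)
        if not part:
            # connection closed before a complete JSON document arrived
            return None
        data += part
        try:
            return json.loads(data.decode("utf-8"))
        except ValueError:
            # the JSON text is still incomplete, keep reading
            continue
def sendall(conn, data):
    conn.sendall(json.dumps(data).encode("utf-8"))
NOTE: If you want to share a file using the code above you need to encode / decode it into base64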

I think this question has been pretty well answered, but I just wanted to add a method using Python 3.8 and the new assignment expression (walrus operator) since it is stylistically simple.
import socket
host = "127.0.0.1"
port = 31337
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind((host,port))
s.listen()
con, addr = s.accept()
msg_list = []
while (walrus_msg := con.recv(3)) != b'\r\n':
    msg_list.append(walrus_msg)
print(msg_list)
In this case, up to 3 bytes are received from the socket and immediately assigned to walrus_msg. Once the socket receives a b'\r\n' it breaks the loop. Each walrus_msg is added to msg_list, which is printed after the loop breaks. This script is basic but was tested and works with a telnet session.
NOTE: The parentheses around (walrus_msg := con.recv(3)) are needed. Without them, while walrus_msg := con.recv(3) != b'\r\n': evaluates walrus_msg to True instead of the actual data on the socket.

Modifying Adam Rosenfield's code:
import sys
def send_msg(sock, msg):
    size_of_package = len(msg) # number of bytes in the message (sys.getsizeof would return the object's memory size)
    package = str(size_of_package)+":"+ msg #Create our package size,":",message
    sock.sendall(package)
def recv_msg(sock):
    try:
        header = sock.recv(2)#Magic, small number to begin with.
        while ":" not in header:
            header += sock.recv(2) #Keep looping, picking up two bytes each time
        size_of_package, separator, message = header.partition(":")
        # the header reads may already have picked up the start of the message
        while len(message) < int(size_of_package):
            chunk = sock.recv(int(size_of_package) - len(message))
            if not chunk:
                break
            message += chunk
        return message
    except OverflowError:
        return "OverflowError."
    except:
        print "Unexpected error:", sys.exc_info()[0]
        raise
I would, however, heavily encourage using the original approach.

For anyone else who's looking for an answer in cases where you don't know the length of the packet prior.
Here's a simple solution that reads 4096 bytes at a time and stops when less than 4096 bytes were received. However, it will not work in cases where the total length of the packet received is exactly 4096 bytes - then it will call recv() again and hang.
def recvall(sock):
    data = b''
    bufsize = 4096
    while True:
        packet = sock.recv(bufsize)
        data += packet
        if len(packet) < bufsize:
            break
    return data

This code reads up to 1024*32 (= 32768) bytes, in at most 32 iterations, from the data sent by the server:
jsonString = bytearray()
for _ in range(32):
    packet = clisocket.recv(1024)
    if not packet:
        break
    jsonString.extend(packet)
The data ends up in the jsonString variable.

Plain and simple:
data = b''
while True:
    data_chunk = client_socket.recv(1024)
    if data_chunk:
        data+=data_chunk
    else:
        break

programs hangs during socket interaction

I have two programs, sendfile.py and recvfile.py that are supposed to interact to send a file across the network. They communicate over TCP sockets. The communication is supposed to go something like this:
sender =====filename=====> receiver
sender <===== 'ok' ======= receiver
or
sender <===== 'no' ======= receiver
if ok:
sender ====== file ======> receiver
I've got
The sender and receiver code is here:
Sender:
import sys
from jmm_sockets import *
if len(sys.argv) != 4:
    print "Usage:", sys.argv[0], "<host> <port> <filename>"
    sys.exit(1)
s = getClientSocket(sys.argv[1], int(sys.argv[2]))
try:
    f = open(sys.argv[3])
except IOError, msg:
    print "couldn't open file"
    sys.exit(1)
# send filename
s.send(sys.argv[3])
# receive 'ok'
buffer = None
response = str()
while 1:
    buffer = s.recv(1)
    if buffer == '':
        break
    else:
        response = response + buffer
if response == 'ok':
    print 'receiver acknowledged receipt of filename'
    # send file
    s.send(f.read())
elif response == 'no':
    print "receiver doesn't want the file"
# cleanup
f.close()
s.close()
Receiver:
from jmm_sockets import *
s = getServerSocket(None, 16001)
conn, addr = s.accept()
buffer = None
filename = str()
# receive filename
while 1:
    buffer = conn.recv(1)
    if buffer == '':
        break
    else:
        filename = filename + buffer
print "sender wants to send", filename, "is that ok?"
user_choice = raw_input("ok/no: ")
if user_choice == 'ok':
    # send ok
    conn.send('ok')
    #receive file
    data = str()
    while 1:
        buffer = conn.recv(1)
        if buffer=='':
            break
        else:
            data = data + buffer
    print data
else:
    conn.send('no')
conn.close()
I'm sure I'm missing something here in the sorts of a deadlock, but don't know what it is.

With blocking sockets, which are the default and I assume are what you're using (can't be sure since you're using a mysterious module jmm_sockets), the recv method is blocking -- it will not return an empty string when it has "nothing more to return for the moment", as you seem to assume.
You could work around this, for example, by sending an explicit terminator character (that must never occur within a filename), e.g. '\xff', after the actual string you want to send, and waiting for it at the other end as the indication that all the string has now been received.

TCP is a streaming protocol. It has no concept of message boundaries. For a blocking socket, recv(n) will return a zero-length string only when the sender has closed the socket or explicitly called shutdown(SHUT_WR). Otherwise it can return a string from one to n bytes in length, and will block until it has at least one byte to return.
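To make the shutdown(SHUT_WR) case concrete, here is a small Python 3 sketch of a "half-close": the sender signals that its request is complete while keeping the socket open for the reply. The helper names (read_to_eof, request_reply, answer_ok) are made up for this illustration, and it assumes a simple one-request, one-reply exchange.

```python
import socket
import threading

def read_to_eof(sock, bufsize=4096):
    """Read until recv() returns b'', i.e. the peer finished sending."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)

def request_reply(sock, request):
    """Send a request, signal end-of-request, then read the whole reply."""
    sock.sendall(request)
    # Half-close: we will send nothing more, but we can still receive
    sock.shutdown(socket.SHUT_WR)
    return read_to_eof(sock)

def answer_ok(conn):
    """Hypothetical receiver: read the whole request, then answer b'ok'."""
    read_to_eof(conn)
    conn.sendall(b"ok")
    conn.close()

if __name__ == "__main__":
    sender, receiver = socket.socketpair()
    worker = threading.Thread(target=answer_ok, args=(receiver,))
    worker.start()
    print(request_reply(sender, b"report.txt"))
    worker.join()
    sender.close()
```

This only works for a single message in each direction. As soon as one side needs to send a second message (such as the file after the 'ok'), you need one of the framing schemes instead.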
It is up to you to design a protocol to determine when you have a complete message. A few ways are:
Use a fixed-length message.
Send a fixed-length message indicating the total message length, followed by the variable portion of the message.
Send the message, followed by a unique termination message that will never occur in the message.
Another issue you may face is that send() is not guaranteed to send all the data. The return value indicates how many bytes were actually sent, and it is the sender's responsibility to keep calling send with the remaining message bytes until they are all sent. You may rather use the sendall() method.
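If you want to see what sendall() saves you from writing, a manual version of that loop might look like the following Python 3 sketch. The name send_all is hypothetical, and the zero-byte check is a defensive assumption on my part.

```python
import socket

def send_all(sock, data):
    """Keep calling send() until every byte has been handed to the OS."""
    view = memoryview(data)   # slicing a memoryview does not copy the data
    total = 0
    while total < len(view):
        sent = sock.send(view[total:])
        if sent == 0:
            raise ConnectionError("socket connection broken")
        total += sent
    return total
```

In practice sock.sendall(data) does the same thing and is the simpler choice.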
