Pickle EOFError: Ran out of input when recv from a socket - python

I am running a very simple python (3.x) client-server program (both locally on my PC) for a school project (not intended for the real world) which just sends messages back-and-forth (like view customers, add customer, delete customer, etc... real basic).
Sometimes the data can be multiple records, which I had stored as namedtuples (it just made sense), and then I went down the path of using pickle to transfer them.
So for example on the client I do something like this:
s.send(message.encode('utf-8'))
pickledResponse = s.recv(4096);
response = pickle.loads(pickledResponse)
Now every so often I get the following error:
response = pickle.loads(pickledResponse)
EOFError: Ran out of input
My fear is that this has something to do with my socket (TCP) transfer and maybe somehow I am not getting all the data in time for my pickle.loads - make sense? If not I am really lost as to why this would be happening so inconsistently.
However, even if I am right I am not sure how to fix it (quickly), I was considering dropping pickle and just using strings (but couldn't this suffer from the same fate)? Does anyone have any suggestions?
Really, my messages are pretty basic: usually just a command and some small data like "1=John", where 1 is the FIND command and "John" is its argument, and it returns the record (name, age, etc.) of John (as a namedtuple, but honestly this isn't mandatory).
Any suggestions or help would be much appreciated, looking for a quick fix...

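The problem with your code is that recv(4096), when used on a TCP socket, may return fewer bytes than the sender wrote in one call. TCP is a byte stream without message boundaries, so a single recv() can return only part of the pickle, and pickle.loads() then fails on the incomplete data.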
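The easy solution is to prefix each message with its length. For sending:
import pickle
import struct
packet = pickle.dumps(foo)
length = struct.pack('!I', len(packet))  # 4-byte unsigned length, network byte order
packet = length + packet
sock.sendall(packet)  # assuming sock is your connected socket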
Then for receiving:
import struct
buf = b''
while len(buf) < 4:
    buf += sock.recv(4 - len(buf))  # sock is the connected socket, not the socket module
length = struct.unpack('!I', buf)[0]
# now recv until at least length bytes are received,
# then slice the first length bytes and unpickle them.
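The two fragments above can be combined into a pair of helpers. This is only a minimal sketch: the names recv_exactly, send_msg and recv_msg are made up for illustration, and socket.socketpair() stands in for a real client/server connection so the example runs in one process.

```python
import pickle
import socket
import struct


def recv_exactly(sock, count):
    """Read exactly count bytes, or raise if the peer closes early."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:  # b'' means the connection was closed
            raise ConnectionError('connection closed before the full message arrived')
        chunks.append(chunk)
        remaining -= len(chunk)
    return b''.join(chunks)


def send_msg(sock, obj):
    """Pickle obj and send it with a 4-byte big-endian length prefix."""
    payload = pickle.dumps(obj)
    sock.sendall(struct.pack('!I', len(payload)) + payload)


def recv_msg(sock):
    """Receive one length-prefixed pickle and return the unpickled object."""
    (length,) = struct.unpack('!I', recv_exactly(sock, 4))
    return pickle.loads(recv_exactly(sock, length))


# Demo: two messages sent back to back are still received as two messages.
left, right = socket.socketpair()
with left, right:
    send_msg(left, {'name': 'John', 'age': 42})
    send_msg(left, ['second', 'message'])
    first = recv_msg(right)
    second = recv_msg(right)
```

Keep in mind that unpickling data from an untrusted peer can execute arbitrary code, so this is only reasonable for something like a local school project.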
However, the Python standard library already supports message-oriented sockets that carry pickles, namely the multiprocessing.connection module. Its Connection objects send and receive pickled objects with ease using Connection.send and Connection.recv respectively.
Thus you can code your server as
from multiprocessing.connection import Listener
PORT = 1234
server_sock = Listener(('localhost', PORT))
conn = server_sock.accept()
unpickled_data = conn.recv()
and client as
from multiprocessing.connection import Client
client = Client(('localhost', 1234))
client.send(['hello', 'world'])
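To try both halves in a single script, the server side can run in a background thread. This is a rough sketch, not part of the original answer: the helper serve_once and the results list are illustrative, and port 0 is used so the operating system picks a free port.

```python
import threading
from multiprocessing.connection import Client, Listener


def serve_once(listener, results):
    """Accept one connection, read one object, and send a reply."""
    with listener.accept() as conn:
        request = conn.recv()          # a whole unpickled object, never a fragment
        results.append(request)
        conn.send({'echo': request})


listener = Listener(('127.0.0.1', 0))  # port 0: let the OS choose a free port
results = []
worker = threading.Thread(target=serve_once, args=(listener, results))
worker.start()

with Client(listener.address) as client:
    client.send(['hello', 'world'])
    reply = client.recv()

worker.join()
listener.close()
```

Each send() on one side corresponds to exactly one recv() on the other, which is the property a raw TCP socket does not give you.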

For receiving everything the server sends until it closes its side of the connection try this:
import json
import socket
from functools import partial
def main():
    message = 'Test'
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.connect(('127.0.0.1', 9999))
        sock.sendall(message.encode('utf-8'))
        sock.shutdown(socket.SHUT_WR)
        json_response = b''.join(iter(partial(sock.recv, 4096), b''))
    response = json.loads(json_response.decode('utf-8'))
    print(response)
if __name__ == '__main__':
    main()
I've used sendall() because send() has the same "problem" as recv(): it's not guaranteed that everything is sent. send() returns the number of bytes actually sent, and the programmer has to make sure that this matches the length of the argument and, if not, send the rest until everything is out. After sending, the writing side of the connection is closed (shutdown()) so the server knows there is no more data coming from the client. After that, all data from the server is received until the server closes its side of the connection, which makes the recv() call return an empty bytes object.
Here is a suitable socketserver.TCPServer for the client:
import json
from socketserver import StreamRequestHandler, TCPServer
class Handler(StreamRequestHandler):
    def handle(self):
        print('Handle request...')
        message = self.rfile.read().decode('utf-8')
        print('Received message:', message)
        self.wfile.write(
            json.dumps(
                {'name': 'John', 'age': 42, 'message': message}
            ).encode('utf-8')
        )
        print('Finished request.')
def main():
    address = ('127.0.0.1', 9999)
    try:
        print('Start server at', address, '...')
        server = TCPServer(address, Handler)
        server.serve_forever()
    except KeyboardInterrupt:
        print('Stopping server...')
if __name__ == '__main__':
    main()
It reads the complete data from the client and puts it into a JSON-encoded response with some other, fixed items. Instead of the low-level socket operations it makes use of the more convenient file-like objects (rfile and wfile) that the StreamRequestHandler offers for reading from and writing to the connection. The connection is closed by the TCPServer after the handle() method finishes.

Related

Python sockets, how to receive only last message instead of the whole pending buffer?

I've got a server side socket that sends a message very frequently (it updates the same message with new data). The client side is processing that information as soon as the message is received.
My problem is that while the client is processing the message, the server side might have already sent a few more messages.
How could the client receive only the last message and drop all the pending ones?
This is the code I use for my client side. I made it non-blocking hoping it would solve my problem, but it didn't, so I don't know if it is even needed now.
import select
import socket
import time
class MyClient:
    def __init__(self):
        self.PORT = 5055
        self.SERVER = socket.gethostbyname(socket.gethostname())
        self.client = self.connect()
        self.client.setblocking(False)
    def connect(self):
        client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        client.connect((self.SERVER, self.PORT))
        return client
client = MyClient()
while True:
    inputs = [client.client]
    while inputs:
        readable, writable, exceptional = select.select(inputs, [], inputs, 0.5)
        for s in readable:
            data = client.client.recv(2048)
            print(data)
            time.sleep(1)
There is no concept of a message in TCP in the first place, which also means that TCP cannot have any semantic for "last message". Instead TCP is a byte stream. One cannot rely on a 1:1 relation between a single send on one side and a single recv on the other side.
It is only possible to read data from the stream, not to skip data. Skipping data must be done by reading the data and throwing these away. To get the last message an application level protocol must first define what a message means and then the application logic must read data, detect message boundaries and then throw away all messages except the last.
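As an illustration of that last point, here is a small sketch that assumes the application protocol ends every message with a newline (an assumption; the question does not say how messages are delimited). The helpers drain and split_last_message are hypothetical names. The idea is to read everything that is pending, then keep only the newest complete message plus any trailing partial one.

```python
import socket


def drain(sock, bufsize=2048):
    """Read everything currently pending on a non-blocking socket."""
    chunks = []
    while True:
        try:
            chunk = sock.recv(bufsize)
        except BlockingIOError:
            break                      # nothing more to read right now
        if not chunk:
            break                      # peer closed the connection
        chunks.append(chunk)
    return b''.join(chunks)


def split_last_message(buffer, delimiter=b'\n'):
    """Return (last_complete_message, leftover); message is None if none is complete."""
    complete, found, leftover = buffer.rpartition(delimiter)
    if not found:
        return None, buffer
    # complete may hold several messages; keep only the newest one
    return complete.rsplit(delimiter, 1)[-1], leftover


# Demo: the "server" has sent three full updates and part of a fourth.
server_side, client_side = socket.socketpair()
client_side.setblocking(False)
server_side.sendall(b'v1\nv2\nv3\nv4-par')
pending = drain(client_side)
last, leftover = split_last_message(pending)
server_side.close()
client_side.close()
```

The leftover bytes have to be kept and prepended to whatever is read next, because they are the beginning of a message that has not fully arrived yet.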

Recvall with while loop doesn't work between two devices in python

I have the following problem: I want a server to send the contents of a text file when requested to do so. I have written a server script which sends the contents to the client, and a client script which receives all the contents with a recvall loop. The recvall works fine when I run the server and client on the same device for testing.
But when I run the client on a different device in the same Wi-Fi network to receive the text file contents from the server device, the recvall doesn't work and I only receive the first 1460 bytes of the text.
server script
import socket
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("", 5000))
server.listen(5)
def send_file(client):
    read_string = open("textfile", "rb").read() #6 kilobyte large textfile
    client.send(read_string)
while True:
    client, data = server.accept()
    connect_data = client.recv(1024)
    if connect_data == b"send_string":
        send_file(client)
    else:
        pass
client script
import socket
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("192.168.1.10", 5000))
connect_message = client.send(b"send_string")
receive_data = ""
while True: # the recvall loop
    receive_data_part = client.recv(1024).decode()
    receive_data += receive_data_part
    if len(receive_data_part) < 1024:
        break
print(receive_data)
recv(1024) means to receive at least 1 and at most 1024 bytes. If the connection has closed, you receive 0 bytes, and if something goes wrong, you get an exception.
TCP is a stream of bytes. It doesn't try to keep the bytes from any given send together for the recv. When you make the call, if the TCP endpoint has some data, you get that data.
In the client, you assume that anything less than 1024 bytes must be the last bit of data. Not so. You can receive partial buffers at any time. It's a bit subtle on the server side, but you make the same mistake there by assuming that you'll receive exactly the command b"send_string" in a single call.
You need some sort of a protocol that tells receivers when they've gotten the right amount of data for an action. There are many ways to do this, so I can't really give you the answer. But this is why there are protocols out there like zeromq, xmlrpc, http, etc...
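One of the simplest such protocols is "the sender closes the connection when it is done", which is enough for a one-shot file transfer. The sketch below assumes that convention (the server in the question would have to close the client socket after sending). recv_until_closed and send_and_close are illustrative names, and socket.socketpair() replaces the two devices.

```python
import socket
import threading


def recv_until_closed(sock, bufsize=1024):
    """Collect data until recv() returns b'', i.e. the peer closed its side."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:
            break
        chunks.append(chunk)           # a short chunk is NOT a sign of the end
    return b''.join(chunks)


def send_and_close(sock, data):
    """Send everything, then close so the receiver sees end-of-stream."""
    with sock:
        sock.sendall(data)             # sendall, unlike send, loops until all is sent


sender, receiver = socket.socketpair()
payload = b'line of text\n' * 500      # about 6 kilobytes, like the file in the question
worker = threading.Thread(target=send_and_close, args=(sender, payload))
worker.start()
with receiver:
    received = recv_until_closed(receiver)
worker.join()
text = received.decode()               # decode once, after all bytes have arrived
```

Decoding only after joining the chunks also avoids splitting a multi-byte character across two recv() calls.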

How can I continually send data without shutdown socket connection in python

I wrote a Python client to communicate with a server. Each time I finish sending data, I have to call sock.shutdown(socket.SHUT_WR), otherwise the server does not respond. But after calling sock.shutdown(socket.SHUT_WR), I have to reconnect with sock.connect((HOST, PORT)), otherwise I cannot send data to the server. So how can I keep the connection alive without closing it?
My sample code is as follows:
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect((HOST, PORT))
sock.sendall(data)
sock.shutdown(socket.SHUT_WR)
received = sock.recv(1024)
while len(received)>0:
    received = sock.recv(1024)
sock.sendall(newdata) # this would throw exception
The server-side code is as follows:
def handle(self):
    cur_thread = threading.current_thread()
    while True:
        self.data = self.rfile.read(bufsiz=100)
        if not self.data:
            print 'receive none!'
            break
        try:
            data = self.data
            print 'Received data, length: %d' % len(data)
            self.wfile.write('get received data\n')
        except Exception:
            print 'exception!!'
You didn't show any server side code but I suspect it simply reads bytes until it gets none anymore.
You can't do this as you found out, because then the only way to tell the server the message is complete is by killing the connection.
Instead you'll have to add some form of framing in your protocol. Possible approaches include a designated stop character that the server recognises (such as a single newline character, or perhaps a 0-byte), sending frames in fixed sizes that your client and server agree upon, or send the frame size first as a network encoded integer followed by exactly the specified number of bytes. The server then first reads the integer and then exactly that same number of bytes from the socket.
That way you can leave the connection open and send multiple messages.
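As a sketch of the first option (a stop character), the example below uses a newline as the frame delimiter. The function names serve_lines and request are made up for illustration, and socket.socketpair() stands in for the real client/server connection. Two requests travel over the same connection without any shutdown() in between.

```python
import socket
import threading


def serve_lines(conn):
    """Answer every newline-terminated request until the client closes."""
    with conn, conn.makefile('rb') as rfile:
        for line in rfile:             # yields one complete line at a time
            conn.sendall(b'got: ' + line)


def request(sock, rfile, text):
    """Send one newline-terminated request and read the one-line reply."""
    sock.sendall(text.encode('utf-8') + b'\n')
    return rfile.readline().rstrip(b'\n').decode('utf-8')


client_sock, server_sock = socket.socketpair()
worker = threading.Thread(target=serve_lines, args=(server_sock,))
worker.start()

with client_sock, client_sock.makefile('rb') as rfile:
    first = request(client_sock, rfile, 'hello')
    second = request(client_sock, rfile, 'again')   # same connection, no reconnect

worker.join()
```

A delimiter only works if it can never appear inside a message; for arbitrary binary payloads the length-prefix variant is usually the safer choice.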

Python broadcasting message to all clients in a socket

I have made a simple chat server using threads like the following:
#-*- coding:utf-8 -*-
import _thread as thread
import time
import socket
def now():
    return time.asctime(time.localtime())
def handleclient(connection, ADDR):
    sod = str(ADDR)
    msg = sod+"joined the chat"
    msg2 = msg.encode("utf-8")
    connection.sendall(msg2)
    while True:
        recieved = connection.recv(1024)
        adsf = recieved.decode("utf-8")
        print(now(),"(%s):%s" % (ADDR, recieved))
        output = "%s:%s"%(ADDR, recieved.decode("utf-8"))
        message = output.encode("utf-8")
        connection.sendall(message)
if __name__ == "__main__":
    addr = ("", 8080)
    r =socket.socket()
    print("socket object created at", now())
    r.bind(addr)
    r.listen(5)
    while True:
        print("Waiting for clients...")
        connection, ADDR = r.accept()
        print("We have connection from ", ADDR)
        thread.start_new_thread(handleclient, (connection, ADDR))
However, it looks like sendall isn't working: it sends the message only to the person who sent it. How can I make it send to all clients?
There is no built-in call for what you're trying to do because, as pointed out in the comments, sendall() means "definitely send all my bytes and keep trying until you have," not "send these bytes to lots of clients."
You will want to use either UDP multicast (if you're on a relatively reliable network which supports it, such as a LAN or corporate WAN), or you will simply need to send explicitly to every connected client. The other alternative is peer-to-peer: send to several clients and instruct those clients to send to more clients until all clients are taken care of. Obviously this requires more coding. :)
You may have a look at ZeroMQ, which provides high-level facilities over sockets by implementing several patterns (publish/subscribe, push/pull, etc.).
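For the "send explicitly to every connected client" option, one possible shape is a shared registry of connections protected by a lock. This is a sketch under assumptions: register, unregister and broadcast are illustrative names, and socket pairs stand in for accepted chat connections. In the chat server, handleclient would register its connection on entry and call broadcast instead of connection.sendall.

```python
import socket
import threading

clients = set()
clients_lock = threading.Lock()        # handler threads share the registry


def register(conn):
    with clients_lock:
        clients.add(conn)


def unregister(conn):
    with clients_lock:
        clients.discard(conn)


def broadcast(message):
    """Send message (bytes) to every registered connection."""
    with clients_lock:
        targets = list(clients)        # copy so the lock is not held while sending
    for conn in targets:
        try:
            conn.sendall(message)
        except OSError:
            unregister(conn)           # drop clients whose connection is broken


# Demo: two socket pairs play the role of two connected chat clients.
server_a, client_a = socket.socketpair()
server_b, client_b = socket.socketpair()
register(server_a)
register(server_b)
broadcast(b'hello everyone\n')
got_a = client_a.recv(1024)
got_b = client_b.recv(1024)
for sock in (server_a, client_a, server_b, client_b):
    sock.close()
```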

Correct multiprocessing to treat UDP in Python

I am trying to implement a simple UDP client and server. Server should receive a message and return a transformed one.
My main technique for the server is to listen for UDP messages in a loop, then spawn a multiprocessing.Process for each incoming message and send the reply within each Process instance:
class InputProcessor(Process):
    ...
    def run(self):
        output = self.process_input()
        self.sock.sendto(output, self.addr) # send a reply
if __name__ == "__main__":
    print "serving at %s:%s" % (UDP_IP, UDP_PORT)
    sock = socket.socket(socket.AF_INET, # Internet
                         socket.SOCK_DGRAM) # UDP
    sock.bind((UDP_IP,UDP_PORT))
    while True:
        data, addr = sock.recvfrom(1024) # buffer size is 1024 bytes
        print "received message: %s from %s:%s" % (data, addr[0], addr[1])
        p = InputProcessor(sock, data, addr)
        p.start()
In test client, I do something like this:
def send_message(ip, port, data):
    sock = socket.socket(socket.AF_INET, # Internet
                         socket.SOCK_DGRAM) # UDP
    print "sending: %s" % data
    sock.sendto(data, (ip, port))
    sock.close()
for i in xrange(SECONDS*REQUESTS_PER_SECOND):
    data = generate_data()
    p = multiprocessing.Process(target=send_message, args=(UDP_IP,
                                                           UDP_PORT,
                                                           data))
    p.start()
    time.sleep(1/REQUESTS_PER_SECOND)
The problem I am having with the code above is that when REQUESTS_PER_SECOND becomes higher than a certain value (~50), it seems some client processes receive responses destined for different processes, i.e. process #1 receives the response for process #2, and vice versa.
Please criticize my code as much as possible, because I am new to network programming and may be missing something obvious. Maybe it would even be better for some reason to use Twisted; however, I am highly interested in understanding the internals. Thanks.
As per previous answer, I think that the main reason is that there is a race condition at the UDP port for the clients. I do not see receiving at the client code, but presumably it is similar to the one in server part. What I think happens in concrete terms is that for values under 50 requests / second, the request - response roundtrip gets completed and the client exits. When more requests arrive, there may be multiple processes blocking to read the UDP socket, and then it is probably nondeterministic which client process receives the incoming message. If the network latency is going to be larger in the real setting, this limit will be hit sooner.
Thanks a lot, guys! It seems I've found why my code failed before. I was using multiprocessing.Manager().dict() within the client to check whether the results from the server are correct. However, I didn't use any locks to wrap a set of write operations to that dict(), and thus got a lot of errors even though the output from the server was correct.
In short, the client was checking the server's correct responses incorrectly.
