Python multiprocessing and networking on Windows

I'm trying to implement a TCP 'echo server'.
Simple stuff:
Client sends a message to the server.
Server receives the message.
Server converts the message to uppercase.
Server sends the modified message to the client.
Client prints the response.
It worked well, so I decided to parallelize the server; make it so that it could handle multiple clients at a time.
Since most Python interpreters have a GIL, multithreading won't cut it.
I had to use multiprocessing... And boy, this is where things went downhill.
I'm using Windows 10 x64 and the WinPython suite with Python 3.5.2 x64.
My idea is to create a socket, initialize it (bind and listen), create sub-processes and pass the socket to the children.
But for the love of me... I can't make this work; my subprocesses die almost instantly.
Initially I had some issues 'pickling' the socket...
So I googled a bit and thought this was the issue.
So I tried passing my socket through a multiprocessing queue, through a pipe, and, as my last attempt, 'forkpickling' it and passing it as a bytes object during process creation.
Nothing works.
Can someone please shed some light here?
Tell me what's wrong?
Maybe the whole idea (sharing sockets) is bad... And if so, PLEASE tell me how I can achieve my initial objective: enabling my server to ACTUALLY handle multiple clients at once (on Windows) (don't tell me about threading, we all know Python's threading won't cut it ¬¬)
It's also worth noting that no files are created by the debug function.
No process lived long enough to run it, I believe.
The typical output of my server code is (the only difference between runs is the process numbers):
Server is running...
Degree of parallelism: 4
Socket created.
Socket bound to: ('', 0)
Process 3604 is alive: True
Process 5188 is alive: True
Process 6800 is alive: True
Process 2844 is alive: True
Press ctrl+c to kill all processes.
Process 3604 is alive: False
Process 3604 exit code: 1
Process 5188 is alive: False
Process 5188 exit code: 1
Process 6800 is alive: False
Process 6800 exit code: 1
Process 2844 is alive: False
Process 2844 exit code: 1
The children died...
Why god?
WHYYyyyyy!!?!?!?
The server code:
# Imports
import socket
import packet
import sys
import os
from time import sleep
import multiprocessing as mp
import pickle
import io

# Constants
DEGREE_OF_PARALLELISM = 4
DEFAULT_HOST = ""
DEFAULT_PORT = 0

def _parse_cmd_line_args():
    arguments = sys.argv
    if len(arguments) == 1:
        return DEFAULT_HOST, DEFAULT_PORT
    else:
        raise NotImplementedError()

def debug(data):
    pid = os.getpid()
    with open('C:\\Users\\Trauer\\Desktop\\debug\\' + str(pid) + '.txt', mode='a',
              encoding='utf8') as file:
        file.write(str(data) + '\n')

def handle_connection(client):
    client_data = client.recv(packet.MAX_PACKET_SIZE_BYTES)
    debug('received data from client: ' + str(len(client_data)))
    response = client_data.upper()
    client.send(response)
    debug('sent data to client: ' + str(response))

def listen(picklez):
    debug('started listen function')
    pid = os.getpid()
    server_socket = pickle.loads(picklez)
    debug('acquired socket')
    while True:
        debug('Sub process {0} is waiting for connection...'.format(str(pid)))
        client, address = server_socket.accept()
        debug('Sub process {0} accepted connection {1}'.format(str(pid),
                                                               str(client)))
        handle_connection(client)
        client.close()
        debug('Sub process {0} finished handling connection {1}'.
              format(str(pid), str(client)))

if __name__ == "__main__":
    # Since most python interpreters have a GIL, multithreading won't cut
    # it... Oughta bust out some process, yo!
    host_port = _parse_cmd_line_args()
    print('Server is running...')
    print('Degree of parallelism: ' + str(DEGREE_OF_PARALLELISM))
    server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print('Socket created.')
    server_socket.bind(host_port)
    server_socket.listen(DEGREE_OF_PARALLELISM)
    print('Socket bound to: ' + str(host_port))
    buffer = io.BytesIO()
    mp.reduction.ForkingPickler(buffer).dump(server_socket)
    picklez = buffer.getvalue()
    children = []
    for i in range(DEGREE_OF_PARALLELISM):
        child_process = mp.Process(target=listen, args=(picklez,))
        child_process.daemon = True
        child_process.start()
        children.append(child_process)
        while not child_process.pid:
            sleep(.25)
        print('Process {0} is alive: {1}'.format(str(child_process.pid),
                                                 str(child_process.is_alive())))
    print()

    kids_are_alive = True
    while kids_are_alive:
        print('Press ctrl+c to kill all processes.\n')
        sleep(1)
        exit_codes = []
        for child_process in children:
            print('Process {0} is alive: {1}'.format(str(child_process.pid),
                                                     str(child_process.is_alive())))
            print('Process {0} exit code: {1}'.format(str(child_process.pid),
                                                      str(child_process.exitcode)))
            exit_codes.append(child_process.exitcode)
        if all(exit_codes):
            # Why do they die so young? :(
            print('The children died...')
            print('Why god?')
            print('WHYYyyyyy!!?!?!?')
            kids_are_alive = False
edit: fixed the signature of "listen". My processes still die instantly.
edit2: User cmidi pointed out that this code does work on Linux; so my question is: how can I make this work on Windows?

You can directly pass a socket to a child process. multiprocessing registers a reduction for this, for which the Windows implementation uses the following DupSocket class from multiprocessing.resource_sharer:
class DupSocket(object):
    '''Picklable wrapper for a socket.'''
    def __init__(self, sock):
        new_sock = sock.dup()
        def send(conn, pid):
            share = new_sock.share(pid)
            conn.send_bytes(share)
        self._id = _resource_sharer.register(send, new_sock.close)

    def detach(self):
        '''Get the socket. This should only be called once.'''
        with _resource_sharer.get_connection(self._id) as conn:
            share = conn.recv_bytes()
            return socket.fromshare(share)
This calls the Windows socket share method, which returns the protocol info buffer from calling WSADuplicateSocket. It registers with the resource sharer to send this buffer over a connection to the child process. The child in turn calls detach, which receives the protocol info buffer and reconstructs the socket via socket.fromshare.
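As a quick demonstration that this works, here is a condensed sketch of your server that passes the listening socket directly as a Process argument (my own adaptation with an arbitrary fixed port; this is not code from your post):

import socket
import multiprocessing as mp

DEGREE_OF_PARALLELISM = 4

def listen(server_socket):
    # Each child receives a working duplicate of the listening socket
    # via the DupSocket reduction described above.
    while True:
        client, address = server_socket.accept()
        client.send(client.recv(4096).upper())
        client.close()

if __name__ == '__main__':
    server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_socket.bind(('', 12345))
    server_socket.listen(DEGREE_OF_PARALLELISM)
    children = [mp.Process(target=listen, args=(server_socket,), daemon=True)
                for _ in range(DEGREE_OF_PARALLELISM)]
    for child in children:
        child.start()
    # The parent must stay alive: the resource-sharer thread that hands
    # the duplicated socket to each child runs in this process.
    for child in children:
        child.join()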
It's not directly related to your problem, but I recommend that you redesign the server to instead call accept in the main process, which is the way this is normally done (e.g. by Python's socketserver.ForkingTCPServer class). Pass the resulting (conn, address) tuple to the first available worker over a multiprocessing queue, which is shared by all of the workers in the process pool. Or consider using a multiprocessing.Pool with apply_async.
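For example, a minimal sketch of that accept-in-the-parent design, under assumptions of my own (a fixed port, four workers); the SimpleQueue is a deliberate choice because, in CPython, its put() pickles in the calling thread rather than in a feeder thread, so the parent can safely close its copy of the client socket right after handing it over:

import socket
import multiprocessing as mp

NUM_WORKERS = 4

def worker(queue):
    while True:
        client, address = queue.get()   # blocks until the parent hands over a client
        client.send(client.recv(4096).upper())
        client.close()

if __name__ == '__main__':
    queue = mp.SimpleQueue()
    workers = [mp.Process(target=worker, args=(queue,), daemon=True)
               for _ in range(NUM_WORKERS)]
    for w in workers:
        w.start()
    server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_socket.bind(('', 12345))
    server_socket.listen(NUM_WORKERS)
    while True:
        client, address = server_socket.accept()   # accept only in the parent
        queue.put((client, address))               # uses the socket reduction
        client.close()                             # the worker holds its own duplicate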

def listen(), the target of your child processes, does not take any arguments, but you are providing the serialized socket as an argument (args=(picklez,)) to the child process. This causes an exception in the child process, which exits immediately:
TypeError: listen() takes no arguments (1 given)
def listen(picklez) should solve the problem; this will provide one argument to the target of your child processes.

Related

Multithreaded TCP socket

I'm trying to create a threaded TCP socket server that can handle multiple socket requests at a time.
To test it, I launch several threads on the client side to see if my server can handle it. The first socket is printed successfully, but I get a [Errno 32] Broken pipe for the others.
I don't know how to avoid it.
import threading
import socketserver
import graphitesend

class ThreadedTCPRequestHandler(socketserver.BaseRequestHandler):
    def handle(self):
        data = self.request.recv(1024)
        if data != "":
            print(data)

class ThreadedTCPServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True

    def __init__(self, host, port):
        socketserver.ThreadingTCPServer.__init__(self, (host, port), ThreadedTCPRequestHandler)

    def stop(self):
        self.server_close()
        self.shutdown()

    def start(self):
        threading.Thread(target=self._on_started).start()

    def _on_started(self):
        self.serve_forever()

def client(g):
    g.send("test", 1)

if __name__ == "__main__":
    HOST, PORT = "localhost", 2003
    server = ThreadedTCPServer(HOST, PORT)
    server.start()
    g = graphitesend.init(graphite_server=HOST, graphite_port=PORT)
    threading.Thread(target=client, args=(g,)).start()
    threading.Thread(target=client, args=(g,)).start()
    threading.Thread(target=client, args=(g,)).start()
    threading.Thread(target=client, args=(g,)).start()
    threading.Thread(target=client, args=(g,)).start()
    threading.Thread(target=client, args=(g,)).start()
    threading.Thread(target=client, args=(g,)).start()
    server.stop()
It's a little bit difficult to determine what exactly you're expecting to happen, but I think the proximate cause is that you aren't giving your clients time to run before killing the server.
When you construct a Thread object and call its start method, you're creating a thread, and getting it ready to run. It will then be placed on the "runnable" task queue on your system, but it will be competing with your main thread and all your other threads (and indeed all other tasks on the same machine) for CPU time.
Your multiple threads (main plus others) are also likely being serialized by the python interpreter's GIL (Global Interpreter Lock -- assuming you're using the "standard" CPython) which means they may not have even gotten "out of the gate" yet.
But then you're shutting down the server with server_close() before they've had a chance to send anything. That's consistent with the "Broken Pipe" error: your remaining clients are attempting to write to a socket that has been closed by the "remote" end.
You should collect the thread objects as you create them and put them in a list (so that you can reference them later). When you're finished creating and starting all of them, then go back through the list and call the .join method on each thread object. This will ensure that the thread has had a chance to finish. Only then should you shut down the server. Something like this:
threads = []
for n in range(7):
    th = threading.Thread(target=client, args=(g,))
    th.start()
    threads.append(th)

# All threads created. Wait for them to finish.
for th in threads:
    th.join()

server.stop()
One other thing to note is that all of your clients are sharing the same single connection to send to the server, so that your server will never create more than one thread: as far as it's concerned, there is only a single client. You should probably move the graphitesend.init into the client function if you actually want separate connections for each client.
(Disclaimer: I know nothing about graphitesend except what I could glean in a 15 second glance at the first result in google; I'm assuming it's basically just a wrapper around a TCP connection.)
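If you do want one connection per client, a hypothetical variant of the client function might look like the following (assuming graphitesend.init can be called per-thread exactly as in the question's main block):

def client():
    # Hypothetical per-client connection: each thread opens its own
    # socket to the server instead of sharing the single module-level one.
    g = graphitesend.init(graphite_server=HOST, graphite_port=PORT)
    g.send("test", 1)

threads = [threading.Thread(target=client) for _ in range(7)]
for th in threads:
    th.start()
for th in threads:
    th.join()
server.stop()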

How to terminate a running Thread in python?

I am using a socket in this code to connect to another machine. I want to terminate the thread when I get a message from the other machine, but how do I terminate a thread in Python?
I referred to many SO questions and found that there is no method in Python to close a thread. Can anyone tell me an alternative way to close the thread?
code:
from threading import Thread
import time
import socket

def background(arg):
    global thread
    thread = Thread(target=arg)
    thread.start()

def display():
    for i in range(0, 20):
        print(i)
        time.sleep(5)

background(display)
s = socket.socket()
s.bind((ip, 6500))
s.listen(5)
print("listening")
val, addr = s.accept()
cmd = val.recv(1024)
if cmd == "Terminate Process":
    print("Connected")
    thread.close()
    print("Process Closed")
Error:
AttributeError: 'Thread' object has no attribute 'close'
Short answer:
thread.join()
The rule of thumb is: don't kill threads (note that in some environments this may not even be possible, e.g. standard C++11 threads). Let the thread fetch the information and terminate itself. Controlling threads from other threads leads to hard to maintain and debug code.
E.g.
SHOULD_TERMINATE = False

def display():
    for i in range(0, 20):
        print(i)
        time.sleep(5)
        if SHOULD_TERMINATE:
            return

thread = Thread(target=display)
thread.start()

# some other code

if cmd == "Terminate Process":
    SHOULD_TERMINATE = True
    thread.join()
This is of course heavily simplified. Your code can be further refined with event objects (instead of .sleep) or thread pools.
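For instance, here is a sketch of the same loop using an Event instead of a bare flag plus sleep (my adaptation; the original answer only names the technique). Event.wait doubles as an interruptible sleep, so the thread notices the request promptly:

import threading

stop_event = threading.Event()

def display():
    for i in range(0, 20):
        print(i)
        # Sleeps up to 5 seconds, but returns True immediately
        # if stop_event.set() is called in the meantime.
        if stop_event.wait(timeout=5):
            return

thread = threading.Thread(target=display)
thread.start()
# ... later, when the "Terminate Process" command arrives:
stop_event.set()
thread.join()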

Python3: Wait for Daemon to finish iteration

I'm writing a python script that will start a local fileserver, and while that server is alive it will be writing to a file every 30 seconds. I would like to have the server and writer function running synchronously so I made the writer function a daemon thread... My main question is, since this daemon thread will quit once the server is stopped, if the daemon is in the middle of writing to a file will it complete that operation before exiting? It would be really bad to be left with 1/2 a file. Here's the code, but the actual file it will be writing is about 3k lines of JSON, hence the concern.
import http.server
import socketserver
from time import sleep
from threading import Thread

class Server:
    def __init__(self):
        self.PORT = 8000
        self.Handler = http.server.SimpleHTTPRequestHandler
        self.httpd = socketserver.TCPServer(("", self.PORT), self.Handler)
        print("Serving at port", self.PORT)

    def run(self):
        try:
            self.httpd.serve_forever()
        except KeyboardInterrupt:
            print("Server stopped")

def test():
    while True:
        with open('test', mode='w') as file:
            file.write('testing...')
        print('file updated')
        sleep(5)

if __name__ == "__main__":
    t = Thread(target=test, daemon=True)
    t.start()
    server = Server()
    server.run()
It looks like you may have made an incorrect decision in making the writer thread daemonic.
Making a thread daemonic does not mean it will run in parallel with the server; it is still affected by the GIL.
If you want true parallelism, you'll have to use multiprocessing.
From the Python docs:
Daemon threads are abruptly stopped at shutdown. Their resources (such as open files, database transactions, etc.) may not be released properly. If you want your threads to stop gracefully, make them non-daemonic and use a suitable signalling mechanism such as an Event.
So daemon threads are only suitable for tasks that make sense solely in the context of the main thread and that don't matter once the main thread has stopped working. File I/O, particularly data saving, is not suitable for a daemon thread.
So it looks like the most obvious and logical solution would be to make the writer thread non-daemonic.
Then, even if the main thread exits, the Python process won't be ended until all non-daemonic threads have finished. This allows for file I/O to complete and exit safely.
Explanation of daemonic threads in Python can be found here
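Concretely, a sketch of the non-daemonic variant of the question's writer (my adaptation, using the Event mechanism the docs quote suggests; Server is the class from the question):

from threading import Thread, Event

stop = Event()

def test():
    while not stop.is_set():
        with open('test', mode='w') as file:
            file.write('testing...')   # a whole write completes before the next check
        print('file updated')
        stop.wait(5)                    # interruptible replacement for sleep(5)

if __name__ == "__main__":
    t = Thread(target=test)             # non-daemonic by default
    t.start()
    server = Server()
    server.run()                        # returns when the server is interrupted
    stop.set()
    t.join()                            # wait for any in-progress write to finish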

Implementing a single thread server/daemon (Python)

I am developing a server (daemon).
The server has one "worker thread". The worker thread runs a queue of commands. When the queue is empty, the worker thread is paused (but does not exit, because it should preserve certain state in memory). To keep exactly one copy of the state in memory, I need exactly one worker thread (not several and not zero) running at all times.
Requests are added to the end of this queue when a client connects to a Unix socket and sends a command.
After the command is issued, it is added to the queue of commands of the worker thread. After it is added to the queue, the server replies something like "OK". There should not be a long pause between the server receiving a command and its "OK" reply. However, running the commands in the queue may take some time.
The main "work" of the worker thread is split into small (taking relatively little time) chunks. Between chunks, the worker thread inspects ("eats" and empties) the queue and continues to work based on the data extracted from the queue.
How to implement this server/daemon in Python?
This is sample code with internet sockets, easily replaced with Unix domain sockets. It takes whatever you write to the socket, passes it as a "command" to the worker, and responds OK as soon as it has queued the command. The single worker simulates a lengthy task as a series of short chunks (ten one-second sleeps in this example), checking the queue for new commands between chunks. You can queue as many tasks as you want: you receive OK immediately, and the worker prints each command from the queue as it gets to it.
import Queue, threading, socket
from time import sleep

class worker(threading.Thread):
    def __init__(self, q):
        super(worker, self).__init__()
        self.qu = q

    def run(self):
        while True:
            new_task = self.qu.get(True)
            print new_task
            i = 0
            while i < 10:
                print "working ..."
                sleep(1)
                i += 1
                try:
                    another_task = self.qu.get(False)
                    print another_task
                except Queue.Empty:
                    pass

task_queue = Queue.Queue()
w = worker(task_queue)
w.daemon = True
w.start()

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.bind(('localhost', 4200))
sock.listen(1)
try:
    while True:
        conn, addr = sock.accept()
        data = conn.recv(32)
        task_queue.put(data)
        conn.sendall("OK")
        conn.close()
except:
    sock.close()

Non-blocking multiprocessing.connection.Listener?

I use multiprocessing.connection.Listener for communication between processes, and it works like a charm for me. Now I would really love my main loop to do something else between commands from the client. Unfortunately listener.accept() blocks execution until a connection from the client process is established.
Is there a simple way of managing a non-blocking check for multiprocessing.connection? A timeout? Or shall I use a dedicated thread?
# Simplified code:
from multiprocessing.connection import Listener

def mainloop():
    listener = Listener(address=('localhost', 6000), authkey=b'secret')
    while True:
        conn = listener.accept()  # <--- This blocks!
        msg = conn.recv()
        print('got message: %r' % msg)
        conn.close()
One solution that I found (although it might not be the most "elegant") is to use conn.poll (documentation). poll returns True if the Listener has new data and, most importantly, is non-blocking if no argument is passed to it. I'm not 100% sure this is the best way to do it, but I've had success with running listener.accept() only once, and then using the following pattern to repeatedly get input (if there is any available):
from multiprocessing.connection import Listener

def mainloop():
    running = True
    listener = Listener(address=('localhost', 6000), authkey=b'secret')
    conn = listener.accept()
    msg = ""
    while running:
        while conn.poll():
            msg = conn.recv()
            print(f"got message: {msg}")
            if msg == "EXIT":
                running = False
        # Other code can go here
        print(f"I can run too! Last msg received was {msg}")
    conn.close()
The 'while' in the conditional statement can be replaced with 'if,' if you only want to get a maximum of one message at a time. Use with caution, as it seems sort of 'hacky,' and I haven't found references to using conn.poll for this purpose elsewhere.
You can run the blocking function in a thread:
conn = await loop.run_in_executor(None, listener.accept)
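Fleshed out a little (the asyncio scaffolding here is my assumption; the answer above gives only the run_in_executor line), that approach might look like this on Python 3.7+:

import asyncio
from multiprocessing.connection import Listener

async def mainloop():
    loop = asyncio.get_running_loop()
    listener = Listener(address=('localhost', 6000), authkey=b'secret')
    while True:
        # accept() still blocks, but only a thread-pool thread; the event
        # loop stays free to run other coroutines in the meantime.
        conn = await loop.run_in_executor(None, listener.accept)
        msg = await loop.run_in_executor(None, conn.recv)
        print('got message: %r' % msg)
        conn.close()

asyncio.run(mainloop())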
I've not used the Listener object myself- for this task I normally use multiprocessing.Queue; doco at the following link:
https://docs.python.org/2/library/queue.html#Queue.Queue
That object can be used to send and receive any pickle-able object between Python processes with a nice API; I think you'll be most interested in:
in process A
.put('some message')
in process B
.get_nowait() # will raise Queue.Empty if nothing is available- handle that to move on with your execution
The only limitation with this is you'll need to have control of both Process objects at some point in order to be able to allocate the queue to them- something like this:
import time
from Queue import Empty
from multiprocessing import Queue, Process

def receiver(q):
    while 1:
        try:
            message = q.get_nowait()
            print 'receiver got', message
        except Empty:
            print 'nothing to receive, sleeping'
            time.sleep(1)

def sender(q):
    while 1:
        message = 'some message'
        q.put('some message')
        print 'sender sent', message
        time.sleep(1)

some_queue = Queue()

process_a = Process(
    target=receiver,
    args=(some_queue,)
)
process_b = Process(
    target=sender,
    args=(some_queue,)
)

process_a.start()
process_b.start()

print 'ctrl + c to exit'
try:
    while 1:
        time.sleep(1)
except KeyboardInterrupt:
    pass

process_a.terminate()
process_b.terminate()

process_a.join()
process_b.join()
Queues are nice because you can actually have as many consumers and as many producers for that exact same Queue object as you like (handy for distributing tasks).
I should point out that just calling .terminate() on a Process is bad form- you should use your shiny new messaging system to pass a shutdown message or something of that nature.
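A sketch of that messaging-based shutdown, using a sentinel value (my choice of mechanism; the paragraph above only says to pass a shutdown message), shown in Python 3:

SHUTDOWN = None                      # sentinel telling a worker to exit

def receiver(q):
    while True:
        message = q.get()            # block until something arrives
        if message is SHUTDOWN:
            break                    # exit cleanly instead of being terminate()d
        print('receiver got', message)

# In the parent, instead of process_a.terminate():
#   some_queue.put(SHUTDOWN)
#   process_a.join()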
The multiprocessing module comes with a nice feature called Pipe(). It is a nice way to share resources between two processes (I've never tried more than two). With Python 3.8 came the shared memory feature in the multiprocessing module, but I have not really tested that, so I cannot vouch for it.
You will use the pipe function something like this:
from multiprocessing import Pipe, Process
.....

def sending(conn):
    message = 'some message'
    # perform some code
    conn.send(message)
    conn.close()

receiver, sender = Pipe()
p = Process(target=sending, args=(sender,))
p.start()
print(receiver.recv())  # prints "some message"
p.join()
With this you should be able to have separate processes running independently until you reach the point where you need input from one of them. If the other process has not sent its data yet, you can sleep or halt, or use a loop to keep checking until the other process finishes its task and sends the result over:
while not receiver.poll():   # poll() checks without blocking or consuming the message
    time.sleep(5)
result = receiver.recv()
This keeps polling until the other process is done running and sends the result. This is also about 2-3 times faster than a Queue. Although a queue is also a good option, personally I do not use it.
