I am developing a server (daemon).
The server has one "worker thread". The worker thread runs a queue of commands. When the queue is empty, the worker thread pauses (but does not exit, because it must preserve certain state in memory). To have exactly one copy of that state in memory, exactly one worker thread (never zero, never several) must be running at all times.
Requests are added to the end of this queue when a client connects to a Unix socket and sends a command.
When a command is received, it is added to the worker thread's command queue, and as soon as it has been added, the server replies with something like "OK". There should be no long pause between the server receiving a command and its "OK" reply; however, running the commands in the queue may take some time.
The main "work" of the worker thread is split into small (taking relatively little time) chunks. Between chunks, the worker thread inspects ("eats" and empties) the queue and continues to work based on the data extracted from the queue.
How to implement this server/daemon in Python?
This is sample code with internet sockets, easily replaced with Unix domain sockets. It takes whatever you write to the socket, passes it as a "command" to the worker, and responds "OK" as soon as it has queued the command. The single worker simulates a lengthy task as ten one-second chunks; between chunks it does a non-blocking get on the queue, so you can queue as many tasks as you want, receive "OK" immediately for each one, and see the worker print the queued commands as it gets to them.
import Queue, threading, socket
from time import sleep

class worker(threading.Thread):
    def __init__(self, q):
        super(worker, self).__init__()
        self.qu = q

    def run(self):
        while True:
            new_task = self.qu.get(True)
            print new_task
            i = 0
            while i < 10:
                print "working ..."
                sleep(1)
                i += 1
                try:
                    another_task = self.qu.get(False)
                    print another_task
                except Queue.Empty:
                    pass

task_queue = Queue.Queue()
w = worker(task_queue)
w.daemon = True
w.start()

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.bind(('localhost', 4200))
sock.listen(1)
try:
    while True:
        conn, addr = sock.accept()
        data = conn.recv(32)
        task_queue.put(data)
        conn.sendall("OK")
        conn.close()
except:
    sock.close()
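For reference, switching to a Unix domain socket only changes how the listening socket is created and bound; the accept/recv/put/sendall loop stays the same. A minimal sketch (the socket path is a hypothetical choice):

import os, socket

SOCKET_PATH = "/tmp/workerd.sock"   # hypothetical path for the daemon's socket

# Remove a stale socket file left over from a previous run, if any.
try:
    os.unlink(SOCKET_PATH)
except OSError:
    pass

sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
sock.bind(SOCKET_PATH)
sock.listen(1)
# ...same accept loop as above...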
Related
I'm trying to create a threaded TCP socket server that can handle multiple socket requests at a time.
To test it, I launch several threads on the client side to see if my server can handle it. The first socket is printed successfully, but I get [Errno 32] Broken pipe for the others.
I don't know how to avoid it.
import threading
import socketserver
import graphitesend

class ThreadedTCPRequestHandler(socketserver.BaseRequestHandler):
    def handle(self):
        data = self.request.recv(1024)
        if data != "":
            print(data)

class ThreadedTCPServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True

    def __init__(self, host, port):
        socketserver.ThreadingTCPServer.__init__(self, (host, port), ThreadedTCPRequestHandler)

    def stop(self):
        self.server_close()
        self.shutdown()

    def start(self):
        threading.Thread(target=self._on_started).start()

    def _on_started(self):
        self.serve_forever()

def client(g):
    g.send("test", 1)

if __name__ == "__main__":
    HOST, PORT = "localhost", 2003
    server = ThreadedTCPServer(HOST, PORT)
    server.start()
    g = graphitesend.init(graphite_server=HOST, graphite_port=PORT)
    threading.Thread(target=client, args=(g,)).start()
    threading.Thread(target=client, args=(g,)).start()
    threading.Thread(target=client, args=(g,)).start()
    threading.Thread(target=client, args=(g,)).start()
    threading.Thread(target=client, args=(g,)).start()
    threading.Thread(target=client, args=(g,)).start()
    threading.Thread(target=client, args=(g,)).start()
    server.stop()
It's a little bit difficult to determine what exactly you're expecting to happen, but I think the proximate cause is that you aren't giving your clients time to run before killing the server.
When you construct a Thread object and call its start method, you're creating a thread, and getting it ready to run. It will then be placed on the "runnable" task queue on your system, but it will be competing with your main thread and all your other threads (and indeed all other tasks on the same machine) for CPU time.
Your multiple threads (main plus others) are also likely being serialized by the python interpreter's GIL (Global Interpreter Lock -- assuming you're using the "standard" CPython) which means they may not have even gotten "out of the gate" yet.
But then you're shutting down the server with server_close() before they've had a chance to send anything. That's consistent with the "Broken Pipe" error: your remaining clients are attempting to write to a socket that has been closed by the "remote" end.
You should collect the thread objects as you create them and put them in a list (so that you can reference them later). When you're finished creating and starting all of them, then go back through the list and call the .join method on each thread object. This will ensure that the thread has had a chance to finish. Only then should you shut down the server. Something like this:
threads = []
for n in range(7):
    th = threading.Thread(target=client, args=(g,))
    th.start()
    threads.append(th)

# All threads created. Wait for them to finish.
for th in threads:
    th.join()

server.stop()
One other thing to note is that all of your clients are sharing the same single connection to send to the server, so that your server will never create more than one thread: as far as it's concerned, there is only a single client. You should probably move the graphitesend.init into the client function if you actually want separate connections for each client.
(Disclaimer: I know nothing about graphitesend except what I could glean in a 15 second glance at the first result in google; I'm assuming it's basically just a wrapper around a TCP connection.)
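If you do want one server-side handler thread per client, one possibility (a sketch only; I'm assuming graphitesend.init opens a new TCP connection each time it is called, which I have not verified) is to create the sender inside client, so each client thread gets its own connection:

def client():
    # Hypothetical per-client connection: each thread opens its own sender,
    # so the ThreadingTCPServer spawns one handler thread per client.
    g = graphitesend.init(graphite_server=HOST, graphite_port=PORT)
    g.send("test", 1)

# launched as before, but without the shared `g` argument:
# threading.Thread(target=client).start()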
I'm trying to implement a tcp 'echo server'.
Simple stuff:
Client sends a message to the server.
Server receives the message
Server converts message to uppercase
Server sends modified message to client
Client prints the response.
It worked well, so I decided to parallelize the server; make it so that it can handle multiple clients at a time.
Since most Python interpreters have a GIL, multithreading won't cut it.
I had to use multiprocessing... And boy, this is where things went downhill.
I'm using Windows 10 x64 and the WinPython suite with Python 3.5.2 x64.
My idea is to create a socket, initialize it (bind and listen), create subprocesses, and pass the socket to the children.
But for the love of me... I can't make this work, my subprocesses die almost instantly.
Initially I had some issues 'pickling' the socket...
So I googled a bit and thought this was the issue.
So I tried passing my socket through a multiprocessing queue, through a pipe, and, as my last attempt, 'forkpickling' it and passing it as a bytes object during process creation.
Nothing works.
Can someone please shed some light here?
Tell me what's wrong?
Maybe the whole idea (sharing sockets) is bad... And if so, PLEASE tell me how I can achieve my initial objective: enabling my server to ACTUALLY handle multiple clients at once (on Windows) (don't tell me about threading, we all know python's threading won't cut it ¬¬)
It's also worth noting that no files are created by the debug function.
No process lived long enough to run it, I believe.
The typical output of my server code is (only difference between runs is the process numbers):
Server is running...
Degree of parallelism: 4
Socket created.
Socket bount to: ('', 0)
Process 3604 is alive: True
Process 5188 is alive: True
Process 6800 is alive: True
Process 2844 is alive: True
Press ctrl+c to kill all processes.
Process 3604 is alive: False
Process 3604 exit code: 1
Process 5188 is alive: False
Process 5188 exit code: 1
Process 6800 is alive: False
Process 6800 exit code: 1
Process 2844 is alive: False
Process 2844 exit code: 1
The children died...
Why god?
WHYYyyyyy!!?!?!?
The server code:
# Imports
import socket
import packet
import sys
import os
from time import sleep
import multiprocessing as mp
import pickle
import io

# Constants
DEGREE_OF_PARALLELISM = 4
DEFAULT_HOST = ""
DEFAULT_PORT = 0

def _parse_cmd_line_args():
    arguments = sys.argv
    if len(arguments) == 1:
        return DEFAULT_HOST, DEFAULT_PORT
    else:
        raise NotImplemented()

def debug(data):
    pid = os.getpid()
    with open('C:\\Users\\Trauer\\Desktop\\debug\\' + str(pid) + '.txt', mode='a',
              encoding='utf8') as file:
        file.write(str(data) + '\n')

def handle_connection(client):
    client_data = client.recv(packet.MAX_PACKET_SIZE_BYTES)
    debug('received data from client: ' + str(len(client_data)))
    response = client_data.upper()
    client.send(response)
    debug('sent data from client: ' + str(response))

def listen(picklez):
    debug('started listen function')
    pid = os.getpid()
    server_socket = pickle.loads(picklez)
    debug('acquired socket')
    while True:
        debug('Sub process {0} is waiting for connection...'.format(str(pid)))
        client, address = server_socket.accept()
        debug('Sub process {0} accepted connection {1}'.format(str(pid),
              str(client)))
        handle_connection(client)
        client.close()
        debug('Sub process {0} finished handling connection {1}'.
              format(str(pid), str(client)))

if __name__ == "__main__":
    # Since most python interpreters have a GIL, multithreading won't cut
    # it... Oughta bust out some process, yo!
    host_port = _parse_cmd_line_args()
    print('Server is running...')
    print('Degree of parallelism: ' + str(DEGREE_OF_PARALLELISM))
    server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print('Socket created.')
    server_socket.bind(host_port)
    server_socket.listen(DEGREE_OF_PARALLELISM)
    print('Socket bount to: ' + str(host_port))

    buffer = io.BytesIO()
    mp.reduction.ForkingPickler(buffer).dump(server_socket)
    picklez = buffer.getvalue()

    children = []
    for i in range(DEGREE_OF_PARALLELISM):
        child_process = mp.Process(target=listen, args=(picklez,))
        child_process.daemon = True
        child_process.start()
        children.append(child_process)
        while not child_process.pid:
            sleep(.25)
        print('Process {0} is alive: {1}'.format(str(child_process.pid),
              str(child_process.is_alive())))
    print()

    kids_are_alive = True
    while kids_are_alive:
        print('Press ctrl+c to kill all processes.\n')
        sleep(1)

        exit_codes = []
        for child_process in children:
            print('Process {0} is alive: {1}'.format(str(child_process.pid),
                  str(child_process.is_alive())))
            print('Process {0} exit code: {1}'.format(str(child_process.pid),
                  str(child_process.exitcode)))
            exit_codes.append(child_process.exitcode)
        if all(exit_codes):
            # Why do they die so young? :(
            print('The children died...')
            print('Why god?')
            print('WHYYyyyyy!!?!?!?')
            kids_are_alive = False
edit: fixed the signature of "listen". My processes still die instantly.
edit2: User cmidi pointed out that this code does work on Linux; so my question is: how can I make this work on Windows?
You can directly pass a socket to a child process. multiprocessing registers a reduction for this, for which the Windows implementation uses the following DupSocket class from multiprocessing.resource_sharer:
class DupSocket(object):
    '''Picklable wrapper for a socket.'''
    def __init__(self, sock):
        new_sock = sock.dup()
        def send(conn, pid):
            share = new_sock.share(pid)
            conn.send_bytes(share)
        self._id = _resource_sharer.register(send, new_sock.close)

    def detach(self):
        '''Get the socket. This should only be called once.'''
        with _resource_sharer.get_connection(self._id) as conn:
            share = conn.recv_bytes()
            return socket.fromshare(share)
This calls the Windows socket share method, which returns the protocol info buffer from calling WSADuplicateSocket. It registers with the resource sharer to send this buffer over a connection to the child process. The child in turn calls detach, which receives the protocol info buffer and reconstructs the socket via socket.fromshare.
It's not directly related to your problem, but I recommend that you redesign the server to instead call accept in the main process, which is the way this is normally done (e.g. by Python's socketserver.ForkingTCPServer). Pass the resulting (conn, address) tuple to the first available worker over a multiprocessing.Queue, which is shared by all of the workers in the process pool. Or consider using a multiprocessing.Pool with apply_async.
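For illustration, here is a rough sketch of that design (my own sketch, not code from the question: the port, worker count, and uppercase-echo handler are placeholders). It relies on multiprocessing being able to pickle the connected socket when it goes through the queue, which is the DupSocket mechanism described above:

import multiprocessing as mp
import socket

def worker(conn_queue):
    # Each worker blocks here until the parent hands it an accepted connection.
    while True:
        conn, address = conn_queue.get()
        data = conn.recv(4096)
        conn.send(data.upper())   # placeholder handler: echo in uppercase
        conn.close()

if __name__ == "__main__":
    conn_queue = mp.Queue()
    workers = []
    for _ in range(4):            # 4 workers, matching DEGREE_OF_PARALLELISM
        p = mp.Process(target=worker, args=(conn_queue,))
        p.daemon = True
        p.start()
        workers.append(p)

    server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_socket.bind(("", 50007))   # hypothetical port
    server_socket.listen(5)
    while True:
        # accept() stays in the parent; the connected socket is duplicated for
        # the worker by multiprocessing's reduction machinery when queued.
        # (A real server should also close the parent's copy once the worker
        # has taken over the connection.)
        conn, address = server_socket.accept()
        conn_queue.put((conn, address))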
def listen(), the target of your child processes, does not take any argument, but you are providing the serialized socket as an argument (args=(picklez,)) to the child process. This causes an exception in the child process, which then exits immediately:
TypeError: listen() takes no arguments (1 given)
def listen(picklez) should solve the problem; this will provide one argument for the target of your child processes.
I'm making a simple multi-threaded port scanner. It scans all ports on a host and returns the open ports. The trouble is interrupting the scan. A scan takes a long time to complete, and sometimes I wish to kill the program with C-c while in the middle of a scan. The trouble is that the scan won't stop. The main thread is locked on queue.join() and oblivious to KeyboardInterrupt until all data from the queue has been processed, which unblocks the main thread and lets the program exit gracefully. All my threads are daemonized, so when the main thread dies they should die with it.
I tried using the signal lib with no success. Subclassing threading.Thread and adding a method for graceful termination didn't work either... The main thread just won't receive KeyboardInterrupt while executing queue.join().
import threading, sys, Queue, socket

queue = Queue.Queue()

def scan(host):
    while True:
        port = queue.get()
        if port > 999 and port % 1000 == 0:
            print port
        try:
            #sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            #sock.settimeout(2) #you need timeout or else it will try to connect forever!
            #sock.connect((host, port))
            #----OR----
            sock = socket.create_connection((host, port), timeout=2)
            sock.send('aaa')
            data = sock.recv(100)
            print "Port {} open, message: {}".format(port, data)
            sock.shutdown()
            sock.close()
            queue.task_done()
        except:
            queue.task_done()

def main(host):
    #populate queue
    for i in range(1, 65536):
        queue.put(i)
    #spawn worker threads
    for port in range(100):
        t = threading.Thread(target=scan, args=(host,))
        t.daemon = True
        t.start()

if __name__ == '__main__':
    host = ""
    #does input exist?
    try:
        host = sys.argv[1]
    except:
        print "No argument was recivied!"
        exit(1)
    #is input sane?
    try:
        host = socket.gethostbyname(host)
    except:
        print "Adress does not exist"
        exit(2)
    #execute main program and wait for scan to complete
    main(host)
    print "Post main() call!"
    try:
        queue.join()
    except KeyboardInterrupt:
        print "C-C"
        exit(3)
EDIT:
I have found a solution using the time module.
import time   # this import needs to be added at the top of the script

#execute main program and wait for scan to complete
main(host)
#a little trick: queue.join() makes the main thread immune to KeyboardInterrupt,
#so poll queue.empty() with time.sleep() instead.
#queue.empty() is "unreliable", so it may return True a bit earlier than intended;
#once it does, queue.join() is executed to confirm that all data was processed.
#Not a true solution: you can't interrupt the main thread near the end of the scan
#(once queue.empty() returns True).
try:
    while True:
        if queue.empty() == False:
            time.sleep(1)
        else:
            break
except KeyboardInterrupt:
    print "Alas poor port scanner..."
    exit(1)
queue.join()
You made your threads daemons already, but you need to keep your main thread alive while the daemon threads are running; here's how to do that: Cannot kill Python script with Ctrl-C
When you create the threads, add them to a list of running threads, and when dealing with Ctrl-C, send a kill signal to each thread on the list. That way you are actively cleaning up rather than relying on it being done for you.
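One way to implement that "kill signal" (a sketch, not the poster's exact code) is a shared threading.Event that the workers check between ports, combined with a polling loop in the main thread so that KeyboardInterrupt is actually delivered; the rest of the poster's script (main(), queue population, daemon threads) stays the same:

import threading, time, socket, Queue

queue = Queue.Queue()
stop_event = threading.Event()   # set from the main thread on Ctrl-C

def scan(host):
    while not stop_event.is_set():
        try:
            port = queue.get(timeout=1)   # time out so the stop flag gets re-checked
        except Queue.Empty:
            continue
        try:
            sock = socket.create_connection((host, port), timeout=2)
            print "Port {} open".format(port)
            sock.close()
        except (socket.error, socket.timeout):
            pass
        finally:
            queue.task_done()

# In the main thread (after main(host) has populated the queue and started
# the daemon threads), poll instead of blocking in queue.join() so that
# KeyboardInterrupt is seen, and tell the workers to stop on Ctrl-C.
try:
    while not queue.empty():
        time.sleep(0.5)
    queue.join()
except KeyboardInterrupt:
    stop_event.set()
    exit(1)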
I was reading an article on Python multi-threading using Queues and have a basic question.
Based on the print statements, 5 threads are started as expected. So, how does the queue work?
1. The threads are started initially; when the queue is populated with an item, does a thread get restarted and start processing that item?
2. If we use the queue system and the threads process the queue item by item, where is the improvement in performance? Isn't it similar to serial processing, i.e. one by one?
import Queue
import threading
import urllib2
import datetime
import time

hosts = ["http://yahoo.com", "http://google.com", "http://amazon.com",
         "http://ibm.com", "http://apple.com"]

queue = Queue.Queue()

class ThreadUrl(threading.Thread):
    def __init__(self, queue):
        threading.Thread.__init__(self)
        print 'threads are created'
        self.queue = queue

    def run(self):
        while True:
            #grabs host from queue
            print 'thread startting to run'
            now = datetime.datetime.now()
            host = self.queue.get()
            #grabs urls of hosts and prints first 1024 bytes of page
            url = urllib2.urlopen(host)
            print 'host=%s ,threadname=%s' % (host, self.getName())
            print url.read(20)
            #signals to queue job is done
            self.queue.task_done()

start = time.time()

if __name__ == '__main__':
    #spawn a pool of threads, and pass them queue instance
    print 'program start'
    for i in range(5):
        t = ThreadUrl(queue)
        t.setDaemon(True)
        t.start()
    #populate queue with data
    for host in hosts:
        queue.put(host)
    #wait on the queue until everything has been processed
    queue.join()
    print "Elapsed Time: %s" % (time.time() - start)
A queue is similar to a list container, but with internal locking to make it a thread-safe way to communicate data.
What happens when you start all of your threads is that they all block on the self.queue.get() call, waiting to pull an item from the queue. When an item is put into the queue from your main thread, one of the threads will become unblocked and receive the item. It can then continue to process it until it finishes and returns to a blocking state.
All of your threads can run concurrently because they are all able to receive items from the queue. This is where you would see your improvement in performance. While urlopen and read in one thread are waiting on I/O, another thread can do work. The queue object's job is simply to manage the locking and to hand items off to the callers.
I am developing a client-server application where, whenever a new client connects, the server spawns a new process using the multiprocessing module. Its target is a function that takes the socket and does I/O. The problem I have is: once the TCP connection between the client and the server-side process is closed, how/where do I put the .join() call to end the child process? Also, do I need to do any waitpid in the parent process, like in C?
Server code:
def new_client(conn_socket):
while True:
message = conn_socket.recv(BUFFER_SIZE)
conn_socket.send(message)
#just echo the message
#how to check to see if the TCP connection is still alive?
#put the .join() here??
def main():
#create the socket
server_socket = socket(AF_INET,SOCK_STREAM)
#bind the socket to the local ip address on a specific port and listen
server_port = 12000
server_socket.bind(('',server_port))
server_socket.listen(1)
#enter in a loop to accept client connections
while True:
connection_socket, client_address = server_socket.accept()
#create a new process with the new connection_socket
new_process = Process(target = new_client, args = (connection_socket,))
new_process.start()
#put the .join() here or what??
if __name__ == '__main__':
main()
Also, for this setup would it be more beneficial to use threads (the threading module) or stay with processes? The server code is being developed for heavy usage on a server with "average" specs (how would I optimize this setup?).
You need to check the return value of recv. In Python, an empty result (zero bytes) means the peer has closed the connection cleanly; errors raise an exception (socket.error / OSError) rather than returning a negative value as they would in C.
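For example, the handler could look roughly like this (a sketch of that check, using the names from your code):

def new_client(conn_socket):
    while True:
        message = conn_socket.recv(BUFFER_SIZE)
        if not message:
            # recv returned b'': the client closed its end of the connection
            break
        conn_socket.send(message)   # just echo the message back
    conn_socket.close()
    # the function returns here, so the child process exits on its own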
And the join call should be in the process that creates the sub-process. However, be careful, because join without an argument will block the calling process until the sub-process is done. Put the processes in a list, and at regular intervals call join with a small timeout.
Edit: The simplest approach is, at the end of the infinite accept loop, to iterate over the list of processes and check each one's is_alive(). If a process is not alive, call join on it and remove it from the list.
Something like:
all_processes = []

while True:
    connection_socket, client_address = server_socket.accept()
    #create a new process with the new connection_socket
    new_process = Process(target=new_client, args=(connection_socket,))
    new_process.start()
    # Add process to our list
    all_processes.append(new_process)

    # Join all dead processes
    for proc in all_processes:
        if not proc.is_alive():
            proc.join()
    # And remove them from the list
    all_processes = [proc for proc in all_processes if proc.is_alive()]
Note that purging of old processes will only happen when we get a new connection. This can take some time, depending on whether new connections arrive often or not. You could make the listening socket non-blocking and use e.g. select with a timeout to know whether there are new connections, so that the purging happens at more regular intervals even if there are no new connections.
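Here is a rough sketch of that variant, reusing the names from the code above (the 5-second timeout is an arbitrary choice):

import select

all_processes = []
while True:
    # Wait up to 5 seconds for a new connection instead of blocking in accept().
    readable, _, _ = select.select([server_socket], [], [], 5.0)
    if server_socket in readable:
        connection_socket, client_address = server_socket.accept()
        new_process = Process(target=new_client, args=(connection_socket,))
        new_process.start()
        all_processes.append(new_process)

    # Reap finished children on every pass, whether or not a client connected.
    for proc in all_processes:
        if not proc.is_alive():
            proc.join()
    all_processes = [proc for proc in all_processes if proc.is_alive()]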