python multithreading a server - python

I've been working on this server in Python for a while, and it seemed to work, but I don't think the threading works properly. Things seem to happen sequentially: the first client to connect gets ALL of the information before the next client begins. Is this a problem with this threading interface for servers? Should I change it, and if so, how? Here is some example code:
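import socket
import thread

# host, board, and serveClient are defined elsewhere in the program (not shown)
port = 8123
print port
backlog = 5
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind((host, port))
s.listen(backlog)
while 1:
    client, address = s.accept()
    print "client received at " + address[0]
    # Hand each accepted connection to its own thread
    thread.start_new_thread(serveClient, (client, address, board))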

If you want to reuse the standard library, check out the SocketServer module. It includes a ThreadingMixIn class that makes your server multithreaded.
An example test (from the above documentation) is this code:
if __name__ == "__main__":
    # Port 0 means to select an arbitrary unused port
    HOST, PORT = "localhost", 0
    server = ThreadedTCPServer((HOST, PORT), ThreadedTCPRequestHandler)
    ip, port = server.server_address
    # Start a thread with the server -- that thread will then start one
    # more thread for each request
    server_thread = threading.Thread(target=server.serve_forever)
    # Exit the server thread when the main thread terminates
    server_thread.daemon = True
    server_thread.start()
    print "Server loop running in thread:", server_thread.name
    client(ip, port, "Hello World 1")
    client(ip, port, "Hello World 2")
    client(ip, port, "Hello World 3")
    server.shutdown()
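The snippet above uses ThreadedTCPServer, ThreadedTCPRequestHandler, and client without showing their definitions. A minimal sketch of what they could look like follows, written for Python 3 (where the module is named socketserver); the handler body, the reply format, and the run_demo helper are assumptions for illustration, not a copy of the documentation.

```python
import socket
import socketserver
import threading


class ThreadedTCPRequestHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # Runs in a fresh thread for every accepted connection
        data = self.request.recv(1024)
        thread_name = threading.current_thread().name
        self.request.sendall(thread_name.encode("ascii") + b": " + data)


class ThreadedTCPServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    # Let handler threads die with the process instead of blocking exit
    daemon_threads = True


def client(ip, port, message):
    # Connect, send one message, and return the server's reply
    with socket.create_connection((ip, port)) as sock:
        sock.sendall(message.encode("ascii"))
        return sock.recv(1024).decode("ascii")


def run_demo():
    # Port 0 asks the OS for an arbitrary unused port
    server = ThreadedTCPServer(("localhost", 0), ThreadedTCPRequestHandler)
    ip, port = server.server_address
    server_thread = threading.Thread(target=server.serve_forever, daemon=True)
    server_thread.start()
    try:
        return [client(ip, port, "Hello World %d" % i) for i in (1, 2, 3)]
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    for reply in run_demo():
        print(reply)
```

Each reply is prefixed with the name of the thread that handled it, which makes it easy to see that requests are not all served by the same thread.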

Yes. CPython does not support parallel threads, due to the global interpreter lock.
CPython implementation detail: In CPython, due to the Global Interpreter Lock, only one thread can execute Python code at once (even though certain performance-oriented libraries might overcome this limitation). If you want your application to make better use of the computational resources of multi-core machines, you are advised to use multiprocessing. However, threading is still an appropriate model if you want to run multiple I/O-bound tasks simultaneously.
From https://docs.python.org/2/library/threading.html
If you want truly parallel processing, you need to either use the multiprocessing Process class or use a different Python implementation (Jython supports parallel threads, and I believe IronPython does as well).
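The quoted note says threads remain useful for I/O-bound work. A small sketch of that point, assuming time.sleep as a stand-in for a blocking socket read (it releases the GIL while waiting); fake_io_task and run_concurrently are hypothetical names.

```python
import threading
import time


def fake_io_task(delay):
    # Stand-in for blocking I/O; the GIL is released while sleeping
    time.sleep(delay)


def run_concurrently(n_tasks, delay):
    # Start n_tasks threads and return the wall-clock time until all finish
    threads = [
        threading.Thread(target=fake_io_task, args=(delay,))
        for _ in range(n_tasks)
    ]
    start = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.monotonic() - start


if __name__ == "__main__":
    elapsed = run_concurrently(4, 0.2)
    # Run back to back, the four waits would add up to about 0.8 seconds
    print("elapsed: %.2f s" % elapsed)
```

If the waits overlap, the elapsed time stays close to a single delay rather than the sum of all four. A server whose clients are still handled strictly one after another is therefore more likely blocked somewhere in its own handler logic than by the GIL alone.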

Related

Receiving zmq messages on background thread fails on Windows

I'm trying to set up a hello world style example of asynchronous communication between two peers with zmq.PAIR by receiving messages on a background thread while using console input to send messages:
server.py:
import zmq
import threading

context = zmq.Context()
socket = context.socket(zmq.PAIR)
socket.bind('tcp://*:5556')

def print_incoming_messages():
    while True:
        msg = socket.recv_string()
        print(f'Message from client: {msg}')

recv_thread = threading.Thread(target=print_incoming_messages)
recv_thread.start()

while True:
    msg = input('Message to send: ')
    socket.send_string(msg)
client.py:
import zmq
import threading

context = zmq.Context()
socket = context.socket(zmq.PAIR)
socket.connect('tcp://127.0.0.1:5556')

def print_incoming_messages():
    while True:
        msg = socket.recv_string()
        print(f'Message from server: {msg}')

recv_thread = threading.Thread(target=print_incoming_messages)
recv_thread.start()

while True:
    msg = input('Message to send: ')
    socket.send_string(msg)
This works completely fine on a Linux machine but socket.send_string blocks in either process when run from the Windows 10 command prompt. What is the reason for this discrepancy?
The socket is set up properly, and flushing all outputs makes no difference. The reading itself also works as expected, as may be verified by navigating to 127.0.0.1:5556 in a browser. Looking at the loopback interface in Wireshark also reveals that the connection is set up properly, yet no messages are sent.
If I comment out recv_thread.start() in the client, however, messages are sent through as may be verified in Wireshark, which suggests that somehow socket.recv_string is blocking the socket from sending even though it isn't doing so on Linux.
I am also able to achieve the desired behavior by using two sets of PUSH/PULL (cf. this answer) but that doesn't quite help explain what's going on in the example at hand.
This is on Python 3.7.1, pyzmq 18.0.0, and libzmq 4.3.1 on both systems.
zmq sockets are not threadsafe, so running send and recv on the same socket in different threads should not be expected to work. Different threading behaviors on different platforms may be responsible for the difference in behavior you are seeing, but this code could also result in segfaults eventually due to the thread-unsafety of zmq sockets. Using a Lock might solve the problem.
As a side note, PAIR is a rarely-used socket type, and not often intended for use in production or inter-process communication. Most real-world instances of PAIR are as inproc sockets for inter-thread communication. PAIR can have weird behavior on reconnect, for example. Using PUSH-PULL for one-way or DEALER-DEALER for two-way communication is likely to behave in a more expected fashion.

Non-Blocking Server Apache Thrift Python

In a Python module A I am doing some stuff. In the middle of it I create a Thrift connection. The problem is that once that connection starts, the program gets stuck in the network logic (i.e., it blocks).
In module A I have:
stuff = "do some stuff"
network.ConnectionManager(host, port, ...)
stuff = "do more stuff" # not getting to this point
In network...
ConnectionManager.start_service_handler()

def start_service_handler(self):
    handler = ServiceHandler(self)
    processor = Service.Processor(handler)
    transport = TSocket.TServerSocket(port=self.port)
    tfactory = TTransport.TBufferedTransportFactory()
    pfactory = TBinaryProtocol.TBinaryProtocolFactory()
    # server = TServer.TThreadedServer(processor, transport, tfactory, pfactory)
    server = TNonblockingServer(processor, transport, tfactory, pfactory)
    logger().info('starting server...')
    server.serve()
I tried this, yet the code in module A does not continue once the connection code starts.
I thought TNonblockingServer would do the trick, but unfortunately it did not.
The code blocks at server.serve() which is by design, across all target languages supported by Thrift. The usual use case is to run a server like this (pseudo code):
init server
setup thrift protocol/transport stack
server.serve()
shutdown code
The "nonblocking" does not refer to the server.serve() call, rather to the code taking the actual client call. With a TSimpleServer, the server can only handle one call at a time. In contrast, the TNonblockingServer is designed to accept a number of connections in parallel.
Conclusion: If you want to run a Thrift server and also have some other work to do in parallel, or need to start and stop the server on the fly during program run, you will need another thread to achieve that.
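A minimal sketch of that conclusion, using only the standard library. FakeServer is a hypothetical stand-in for any server object whose serve() call blocks; a real Thrift server would take its place, and how to stop it depends on the server class in use.

```python
import threading


class FakeServer:
    """Hypothetical stand-in for a server whose serve() blocks until stopped."""

    def __init__(self):
        self._stop = threading.Event()

    def serve(self):
        # Blocks the calling thread, like server.serve() in the question
        self._stop.wait()

    def stop(self):
        self._stop.set()


def start_in_background(server):
    # Run the blocking serve() call on its own thread so the caller continues
    thread = threading.Thread(target=server.serve, daemon=True)
    thread.start()
    return thread


def run_demo():
    server = FakeServer()
    thread = start_in_background(server)
    stuff = "do more stuff"  # reached right away; serve() blocks only its own thread
    was_serving = thread.is_alive()
    server.stop()
    thread.join(timeout=1)
    return stuff, was_serving, thread.is_alive()
```

Marking the thread as a daemon means it will not keep the process alive once the main thread finishes; whether that is desirable depends on the application.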

Interprocess communication between docker and host system

I have a Python program that does some machine learning. It is supposed to be accessible over the network via HTTP. Since I want Apache to act as the server, I use a Python script that forwards the received data to my program using Python's multiprocessing.connection.
For example, the sending script is:
#!/usr/bin/python
from multiprocessing.connection import Client
import cgi
from job import *
form = cgi.FieldStorage()
address = ('localhost', 6000)
conn = Client(address, authkey='secretpass')
conn.send(form)
And the receiving script will be
from multiprocessing.connection import Listener
import threading
print "Starting listener"
address = ('localhost', 6000)
listener = Listener(address, authkey='secretpass')
while True:
    conn = listener.accept()
    msg = conn.recv()
    conn.close()
    # Do stuff with msg
listener.close()
Once I trigger the URL, Apache calls the first script, which sends the Python object to the other script. The other script receives it and does the processing.
Now I would like to put the ML part into a Docker container while Apache stays on the host system. In that case, how will the two communicate?
As part of the multiprocessing library you will find the Queue class. This structure exists to allow messages to be passed between processes. If you are working on Linux it is a matter of setting up a global variable and pushing messages. The pattern is usually: any process can post, and a single process reads. With two or more queues you can easily set up back-and-forth communication without worrying about collisions or lost messages.
This becomes harder in Windows and other more restrictive systems, as there are no globals shared between processes, and no way to pass a complex structure at creation of a process. In Windows it is far easier to simply stick to threads.
Details of the multi-processing/threads in python can be found here:
16.6. multiprocessing — Process-based “threading” interface

Client(1) to Server to Client(2) in Python 3.x (Sockets?)

I've been working on a project that involves sending information to a public server (to demonstrate how key-exchange schemes work) and then sending it on to a specific client. There are only two clients.
I'm hoping to get pushed in the right direction on how to get information from client(1) to the server, then have the server redirect that information to client(2). I've messed with the code somewhat, getting comfortable with how to send and receive information from the server, but I have no idea (~2 hours of research so far) how to differentiate clients and send information to specific clients.
My current server code (pretty much unchanged from the Python 3 docs):
import socketserver

class MyTCPHandler(socketserver.BaseRequestHandler):
    """
    The RequestHandler class for our server.
    It is instantiated once per connection to the server, and must
    override the handle() method to implement communication to the
    client.
    """
    def handle(self):
        # self.request is the TCP socket connected to the client
        self.data = self.request.recv(1024).strip()
        print("{} wrote:".format(self.client_address[0]))
        print(self.data)
        # just send back the same data, but upper-cased
        self.request.sendall(self.data.upper())

if __name__ == "__main__":
    HOST, PORT = "localhost", 9999
    # Create the server, binding to localhost on port 9999
    server = socketserver.TCPServer((HOST, PORT), MyTCPHandler)
    # Activate the server; this will keep running until you
    # interrupt the program with Ctrl-C
    server.serve_forever()
My client code (pretty much unchanged from the Python 3 docs):
import socket
import time

data = "matt is ok"

def contactserver(data):
    HOST, PORT = "localhost", 9999
    # Create a socket (SOCK_STREAM means a TCP socket)
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Connect to server and send data
    sock.connect((HOST, PORT))
    sock.sendall(bytes(data, "utf-8"))
    # Receive data from the server and shut down
    received = str(sock.recv(1024), "utf-8")
    print("Sent: {}".format(data))
    print("Received: {}".format(received))
    return format(received)

while True:
    k = contactserver('banana')
    time.sleep(1)
    print(k)
First, a base socketserver.TCPServer can't even talk to two clients at the same time. As the docs explain:
These four classes process requests synchronously; each request must be completed before the next request can be started.
As the same paragraph tells you, you can solve that problem by using a forking or threading mix-in. That's pretty easy.
But there's a bigger problem. A threaded socketserver server creates a separate, completely independent, object for each connected client, and has no means of communicating between them, or even letting them find out about each other. So, what can you do?
You can always build it yourself. Put some kind of shared data somewhere, and some kind of synchronization on it, and all of the threads can talk to each other the same way any threads can, socketserver or otherwise.
For your design, a queue has all the magic built in for everything we need: client 1 can put a message on the queue (whether client 2 has shown up yet or not), and client 2 can get a message off the same queue (automatically waiting around if the message isn't there yet), and it's all automatically synchronized.
The big question is: how does the server know who's client 1 and who's client 2? Unless you want to switch based on address and port, or add some kind of "login" mechanism, the only rule I can think of is that whoever connects first is client 1, whoever connects second is client 2, and anyone who connects after that, who cares, they don't belong here. So, we can use a simple shared flag with a Lock on it.
Putting it all together:
import queue
import socketserver
import threading

class MyTCPHandler(socketserver.BaseRequestHandler):
    # Class-level state shared by all handler instances (one per connection)
    q = queue.Queue()
    got_first = False
    got_first_lock = threading.Lock()

    def handle(self):
        # Decide under the lock whether this connection is client 1
        with MyTCPHandler.got_first_lock:
            if MyTCPHandler.got_first:
                first = False
            else:
                first = True
                MyTCPHandler.got_first = True
        if first:
            self.data = self.request.recv(1024).strip()
            print("{} wrote:".format(self.client_address[0]))
            print(self.data)
            # just send back the same data, but upper-cased
            self.request.sendall(self.data.upper())
            # and also queue it up for client 2
            MyTCPHandler.q.put(self.data)
        else:
            # get the message off the queue, waiting if necessary
            self.data = MyTCPHandler.q.get()
            self.request.sendall(self.data)

# The threading mix-in goes on the server class, not on the handler
class ThreadedTCPServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    pass
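To try the relay idea end to end, here is a self-contained sketch with a threaded server and two clients in one process. RelayHandler, ThreadedTCPServer, and run_relay_demo are illustrative names; the port is chosen by the OS, and "first to connect is client 1" is the same assumption the answer makes.

```python
import queue
import socket
import socketserver
import threading


class RelayHandler(socketserver.BaseRequestHandler):
    # Class-level state shared by all handler instances
    q = queue.Queue()
    got_first = False
    got_first_lock = threading.Lock()

    def handle(self):
        cls = RelayHandler
        with cls.got_first_lock:
            first = not cls.got_first
            cls.got_first = True
        if first:
            # Client 1: echo upper-cased, then queue the message for client 2
            data = self.request.recv(1024).strip()
            self.request.sendall(data.upper())
            cls.q.put(data)
        else:
            # Client 2: wait for client 1's message, then forward it
            self.request.sendall(cls.q.get())


class ThreadedTCPServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    daemon_threads = True


def run_relay_demo(message=b"shared secret"):
    # Reset shared state so the demo can be run more than once
    RelayHandler.got_first = False
    RelayHandler.q = queue.Queue()
    server = ThreadedTCPServer(("localhost", 0), RelayHandler)
    address = server.server_address
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        with socket.create_connection(address) as c1:
            c1.sendall(message)
            echoed = c1.recv(1024)
        with socket.create_connection(address) as c2:
            relayed = c2.recv(1024)
    finally:
        server.shutdown()
        server.server_close()
    return echoed, relayed
```

Calling run_relay_demo(b"hello") should hand client 1 the upper-cased echo and client 2 the original bytes. Any further connection would block waiting on the queue, which matches the "anyone after that doesn't belong here" rule in the answer.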
If you want to build a more complicated chat server, where everyone talks to everyone… well, that gets a bit more complicated, and you're stretching socketserver even farther beyond its intended limits.
I would suggest either (a) dropping to a lower level and writing a threaded or multiplexing server manually, or (b) going to a higher-level, more-powerful framework that can more easily handle interdependent clients.
The stdlib comes with a few alternatives for writing servers, but all of them suck except for asyncio—which is great, but unfortunately brand new (it requires 3.4, which is still in beta, or can be installed as a back-port for 3.3). If you don't want to skate on the bleeding edge, there are some great third-party choices like twisted or gevent. All of these options have a higher learning curve than socketserver, but that's only to be expected from something much more flexible and powerful.

Multiprocessing and Sockets

I am trying to use multiprocessing and sockets to allow multiple connections to the same socket. However, I am having a real hard time because I don't have much experience in this field.
The code I have isn't working
def server(port, listen=10):
    connected = []
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(('', port))
    s.listen(listen)
    while True:
        conn, address = s.accept()
        p = multiprocessing.Process(target=server, args=(port, listen))
        p.start()
        p.join()
        command = raw_input("Command: ")
        conn.send(command)
Thanks for the help
This is because you are trying to create multiple servers in a loop. A single server is sufficient for your task; there is no need to open many listening sockets. Each local port can be bound by at most one listening socket, which is why you see the "address in use" error.
Try the standard Python TCPServer class; it can be much more convenient than dealing with low-level sockets.
For a threading server, see this example.
At the OS socket level, this scheme needs only one listening socket, which produces a new socket each time it accepts a connection (that's the standard way sockets work). You then work with the new socket in a separate thread (keep in mind that access to data shared among threads needs care).
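The scheme described above can be sketched with plain sockets: one listening socket, and one thread per accepted connection. The function names, the upper-casing reply, and the loopback address are assumptions made for this example.

```python
import socket
import threading


def handle_connection(conn, address):
    # Each accepted connection gets its own socket and its own thread
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def serve(listener, stop_event):
    # The single listening socket only accepts; it never talks to clients
    while not stop_event.is_set():
        try:
            conn, address = listener.accept()
        except socket.timeout:
            continue  # wake up periodically to check stop_event
        threading.Thread(
            target=handle_connection, args=(conn, address), daemon=True
        ).start()


def run_demo(messages=(b"one", b"two", b"three")):
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(5)
    listener.settimeout(0.2)
    stop_event = threading.Event()
    accept_thread = threading.Thread(target=serve, args=(listener, stop_event))
    accept_thread.start()
    replies = []
    try:
        for message in messages:
            with socket.create_connection(listener.getsockname()) as client:
                client.sendall(message)
                replies.append(client.recv(1024))
    finally:
        stop_event.set()
        accept_thread.join()
        listener.close()
    return replies
```

The listening socket is bound exactly once, so no "address in use" error can arise from the accept loop itself.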
