Non-Blocking Server Apache Thrift Python


In a Python module A I am doing some stuff, and in the middle of it I create a Thrift connection. The problem is that once the connection starts, the program gets stuck in the network logic (i.e. it blocks).
In module A I have:
stuff = "do some stuff"
network.ConnectionManager(host, port, ...)
stuff = "do more stuff" # not getting to this point
In network...
ConnectionManager.start_service_handler()
def start_service_handler(self):
    handler = ServiceHandler(self)
    processor = Service.Processor(handler)
    transport = TSocket.TServerSocket(port=self.port)
    tfactory = TTransport.TBufferedTransportFactory()
    pfactory = TBinaryProtocol.TBinaryProtocolFactory()
    # server = TServer.TThreadedServer(processor, transport, tfactory, pfactory)
    server = TNonblockingServer(processor, transport, tfactory, pfactory)
    logger().info('starting server...')
    server.serve()
I tried this, but the code in module A still does not continue once the connection code starts.
I thought TNonblockingServer would do the trick, but unfortunately it did not.

The code blocks at server.serve() which is by design, across all target languages supported by Thrift. The usual use case is to run a server like this (pseudo code):
init server
setup thrift protocol/transport stack
server.serve()
shutdown code
The "nonblocking" does not refer to the server.serve() call, rather to the code taking the actual client call. With a TSimpleServer, the server can only handle one call at a time. In contrast, the TNonblockingServer is designed to accept a number of connections in parallel.
Conclusion: If you want to run a Thrift server and also have some other work to do in parallel, or need to start and stop the server on the fly during program run, you will need another thread to achieve that.
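As a minimal sketch of that conclusion, the blocking call can be handed to a background thread. The example below uses the standard library's socketserver as a stand-in, because its serve_forever() blocks in the same way as server.serve(); with Thrift the thread target would presumably be server.serve instead. EchoHandler and the echo round trip are illustrative assumptions, not part of the original code.

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.BaseRequestHandler):
    """Hypothetical handler: echoes one message back to the client."""

    def handle(self):
        self.request.sendall(self.request.recv(1024))


# Stand-in for the Thrift server: serve_forever() blocks like server.serve().
server = socketserver.TCPServer(("127.0.0.1", 0), EchoHandler)  # port 0: any free port
server_thread = threading.Thread(target=server.serve_forever, daemon=True)
server_thread.start()

# The main thread is not blocked and carries on with its own work.
stuff = "do more stuff"

# The server is answering requests in the background meanwhile.
with socket.create_connection(server.server_address) as client:
    client.sendall(b"ping")
    reply = client.recv(1024)

# Stop the server on the fly from the main thread.
server.shutdown()
server.server_close()
server_thread.join()
```

The daemon flag only makes sure a forgotten server thread does not keep the interpreter alive; an explicit shutdown as shown is the cleaner way to stop.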

Related

Interprocess communication between docker and host system

I have a Python program that does some machine learning. It is supposed to be accessible over the network using HTTP. Since I want Apache to act as the server, I use a Python script that forwards the received data to my program using Python's multiprocessing.connection.
For example, the sending script will be
#!/usr/bin/python
from multiprocessing.connection import Client
import cgi
from job import *
form = cgi.FieldStorage()
address = ('localhost', 6000)
conn = Client(address, authkey='secretpass')
conn.send(form)
And the receiving script will be
from multiprocessing.connection import Listener
import threading
print "Starting listener"
address = ('localhost', 6000)
listener = Listener(address, authkey='secretpass')
while True:
    conn = listener.accept()
    msg = conn.recv()
    conn.close()
    # Do stuff with msg
listener.close()
Once I trigger the URL, Apache will call the first script, which sends the Python object to the other script. The other script receives it and does the processing.
Now I would like to put the ML part into a docker container while Apache stays on the host system. In that case, how do the two communicate?

As part of the multiprocessing library you will find the process Queue. This structure exists to allow messages to be passed between processes. If you are working on Linux it is a matter of setting up a global variable and pushing messages. The pattern is usually: any process can post, and a single process reads. With two or more queues you can easily set up back-and-forth communication without worrying about collisions or lost messages.
This becomes harder in Windows and other more restrictive systems, as there are no globals shared between processes, and no way to pass a complex structure at creation of a process. In Windows it is far easier to simply stick to threads.
Details of the multi-processing/threads in python can be found here:
16.6. multiprocessing — Process-based “threading” interface
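For reference, here is a Python 3 sketch of the Listener/Client exchange from the question, run inside one process with a thread so that it is self-contained. Note that in Python 3 the authkey must be bytes. The names serve_once and demo, and the dictionary payload, are made up for illustration. For the container case the listener would presumably have to bind to an address reachable from the host (for example "0.0.0.0" inside the container with the port published) rather than 'localhost'; that is an assumption about the setup, not something tested here.

```python
import threading
from multiprocessing.connection import Client, Listener

AUTHKEY = b"secretpass"  # must be bytes in Python 3


def serve_once(listener, results):
    """Accept one connection, read one message and store it."""
    conn = listener.accept()
    try:
        results.append(conn.recv())
    finally:
        conn.close()


def demo():
    # Port 0 lets the OS pick a free port; the question used 6000.
    listener = Listener(("localhost", 0), authkey=AUTHKEY)
    results = []
    worker = threading.Thread(target=serve_once, args=(listener, results))
    worker.start()

    # The "CGI script" side: connect, send one picklable object, close.
    conn = Client(listener.address, authkey=AUTHKEY)
    conn.send({"text": "hello"})
    conn.close()

    worker.join()
    listener.close()
    return results


received = demo()
print(received)
```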

Client(1) to Server to Client(2) in Python 3.x (Sockets?)

I've been working on a project that involves sending information to a public server (to demonstrate how key-exchange schemes work) and then sending it on to a specific client. There are only two clients.
I'm hoping to get pushed in the right direction on how to get information from client(1) to the server, then have the server redirect that information to client(2). I've messed with the code somewhat, getting comfortable with how to send and receive information from the server, but I have no idea (~2 hours of research so far) how to differentiate clients and send information to specific clients.
My current server code (pretty much unchanged from the python3 docs):
import socketserver
class MyTCPHandler(socketserver.BaseRequestHandler):
    """
    The RequestHandler class for our server.
    It is instantiated once per connection to the server, and must
    override the handle() method to implement communication to the
    client.
    """
    def handle(self):
        # self.request is the TCP socket connected to the client
        self.data = self.request.recv(1024).strip()
        print("{} wrote:".format(self.client_address[0]))
        print(self.data)
        # just send back the same data, but upper-cased
        self.request.sendall(self.data.upper())
if __name__ == "__main__":
    HOST, PORT = "localhost", 9999
    # Create the server, binding to localhost on port 9999
    server = socketserver.TCPServer((HOST, PORT), MyTCPHandler)
    # Activate the server; this will keep running until you
    # interrupt the program with Ctrl-C
    server.serve_forever()
My client code (pretty much unchanged from the python3 docs):
import socket
import time
data = "matt is ok"
def contactserver(data):
    HOST, PORT = "localhost", 9999
    # Create a socket (SOCK_STREAM means a TCP socket)
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Connect to server and send data
    sock.connect((HOST, PORT))
    sock.sendall(bytes(data, "utf-8"))
    # Receive data from the server and shut down
    received = str(sock.recv(1024), "utf-8")
    print("Sent: {}".format(data))
    print("Received: {}".format(received))
    return format(received)
while True:
    k = contactserver('banana')
    time.sleep(1)
    print(k)

First, a base socketserver.TCPServer can't even talk to two clients at the same time. As the docs explain:
These four classes process requests synchronously; each request must be completed before the next request can be started.
As the same paragraph tells you, you can solve that problem by using a forking or threading mix-in. That's pretty easy.
But there's a bigger problem. A threaded socketserver server creates a separate, completely independent, object for each connected client, and has no means of communicating between them, or even letting them find out about each other. So, what can you do?
You can always build it yourself. Put some kind of shared data somewhere, and some kind of synchronization on it, and all of the threads can talk to each other the same way any threads can, socketserver or otherwise.
For your design, a queue has all the magic built in for everything we need: client 1 can put a message on the queue (whether client 2 has shown up yet or not), and client 2 can get a message off the same queue (automatically waiting around if the message isn't there yet), and it's all automatically synchronized.
The big question is: how does the server know who's client 1 and who's client 2? Unless you want to switch based on address and port, or add some kind of "login" mechanism, the only rule I can think of is that whoever connects first is client 1, whoever connects second is client 2, and anyone who connects after that, who cares, they don't belong here. So, we can use a simple shared flag with a Lock on it.
Putting it all together:
import queue
import socketserver
import threading
class MyTCPHandler(socketserver.BaseRequestHandler):
    # class attributes, shared by every handler instance (one per connection)
    q = queue.Queue()
    got_first = False
    got_first_lock = threading.Lock()
    def handle(self):
        with MyTCPHandler.got_first_lock:
            if MyTCPHandler.got_first:
                first = False
            else:
                first = True
                MyTCPHandler.got_first = True
        if first:
            self.data = self.request.recv(1024).strip()
            print("{} wrote:".format(self.client_address[0]))
            print(self.data)
            # just send back the same data, but upper-cased
            self.request.sendall(self.data.upper())
            # and also queue it up for client 2
            MyTCPHandler.q.put(self.data)
        else:
            # get the message off the queue, waiting if necessary
            self.data = MyTCPHandler.q.get()
            self.request.sendall(self.data)
# the threading mix-in goes on the server class, not on the handler
class ThreadedTCPServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    pass
If you want to build a more complicated chat server, where everyone talks to everyone… well, that gets a bit more complicated, and you're stretching socketserver even farther beyond its intended limits.
I would suggest either (a) dropping to a lower level and writing a threaded or multiplexing server manually, or (b) going to a higher-level, more-powerful framework that can more easily handle interdependent clients.
The stdlib comes with a few alternatives for writing servers, but all of them suck except for asyncio—which is great, but unfortunately brand new (it requires 3.4, which is still in beta, or can be installed as a back-port for 3.3). If you don't want to skate on the bleeding edge, there are some great third-party choices like twisted or gevent. All of these options have a higher learning curve than socketserver, but that's only to be expected from something much more flexible and powerful.
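For comparison, here is a rough sketch of the same "first client posts, second client receives" relay using asyncio. It assumes Python 3.7+ (for asyncio.run), and the names relay, state, handle and client are made up for the example. Because everything runs in one thread on the event loop, the shared flag needs no lock here; both clients are simulated inside the same program so the sketch runs on its own.

```python
import asyncio


async def main():
    relay = asyncio.Queue()        # messages from client 1 waiting for client 2
    state = {"got_first": False}   # shared flag; single-threaded, so no lock

    async def handle(reader, writer):
        first = not state["got_first"]
        state["got_first"] = True
        if first:
            data = (await reader.read(1024)).strip()
            writer.write(data.upper())   # reply to client 1, upper-cased
            await relay.put(data)        # and queue the message for client 2
        else:
            data = await relay.get()     # waits if client 1 has not sent yet
            writer.write(data)
        await writer.drain()
        writer.close()

    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    async def client(message=None):
        """Simulated client: optionally send a message, then read one reply."""
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        if message is not None:
            writer.write(message)
            await writer.drain()
        reply = await reader.read(1024)
        writer.close()
        return reply

    first_reply = await client(b"banana")   # client 1
    second_reply = await client()           # client 2
    server.close()
    return first_reply, second_reply


first_reply, second_reply = asyncio.run(main())
print(first_reply, second_reply)
```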

Is it OK to send asynchronous notifications from server to client via the same TCP connection?

As far as I understand the basics of the client-server model, generally only client may initiate requests; server responds to them. Now I've run into a system where the server sends asynchronous messages back to the client via the same persistent TCP connection whenever it wants. So, a couple of questions:
Is it a right thing to do at all? It seems to really overcomplicate implementation of a client.
Are there any nice patterns/methodologies I could use to implement a client for such a system in Python? Changing the server is not an option.
Obviously, the client has to watch both the local request queue (i.e. requests to be sent to the server), and the incoming messages from the server. Launching two threads (Rx and Tx) per connection does not feel right to me. Using select() is a major PITA here. Do I miss something?

When dealing with asynchronous IO in Python I typically use a library such as gevent or eventlet. The objective of these libraries is to allow applications written in a synchronous style to be multiplexed by a back-end reactor.
This basic example demonstrates the launching of two green threads/co-routines/fibers to handle either side of the TCP duplex. The send side of the duplex is listening on an asynchronous queue.
This is all performed within a single hardware thread. Both gevent and eventlet have more substantive examples in their documentation than what I have provided below.
If you run nc -l -p 8000 you will see "012" printed out. As soon as netcat exits, this code will terminate.
from eventlet import connect, sleep, GreenPool
from eventlet.queue import Queue
def handle_i(sock, queue):
    while True:
        data = sock.recv(8)
        if data:
            print(data)
        else:
            queue.put(None) #<- signal send side of duplex to exit
            break
def handle_o(sock, queue):
    while True:
        data = queue.get()
        if data:
            sock.send(data)
        else:
            break
queue = Queue()
sock = connect(('127.0.0.1', 8000))
gpool = GreenPool()
gpool.spawn(handle_i, sock, queue)
gpool.spawn(handle_o, sock, queue)
for i in range(0, 3):
    queue.put(str(i))
    sleep(1)
gpool.waitall() #<- waits until nc exits

I believe what you are trying to achieve is a bit similar to jsonp. While sending to the client, send through a callback method which you know of, that is existing in client.
For example, if you are sending "some data xyz", send it as server.send("callback('some data xyz')");. This suggestion is for JavaScript, because it executes the returned code as if it were called through that method, and I believe you can port this idea to Python with some difficulty. I am not sure, though.

Yes, this is very normal. A server can also send messages to the client once the connection is made. A telnet server, for example, sends a message for the capability exchange when you initiate a connection, and after that it asks for your username and password.
You could very well use select(), but if I were in your shoes I would spawn a separate thread to receive the asynchronous messages from the server and leave the main thread free to do further processing.
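A minimal sketch of that separate-receiver-thread idea, in Python 3: the receiver thread pushes everything the server sends onto a queue, and the main thread stays free to send requests and to pick up notifications when it wants. socket.socketpair() stands in for the real TCP connection here, and receive_loop and inbox are made-up names.

```python
import queue
import socket
import threading


def receive_loop(sock, inbox):
    """Put every chunk the peer sends onto inbox; None marks a disconnect."""
    while True:
        data = sock.recv(1024)
        if not data:
            inbox.put(None)
            break
        inbox.put(data)


# socketpair() stands in for the real client/server TCP connection.
client_sock, server_sock = socket.socketpair()
inbox = queue.Queue()
receiver = threading.Thread(target=receive_loop, args=(client_sock, inbox), daemon=True)
receiver.start()

# The main thread sends a request on the same connection...
client_sock.sendall(b"request")
request_seen = server_sock.recv(1024)

# ...while the "server" pushes an unsolicited notification at any time.
server_sock.sendall(b"notification")
first = inbox.get(timeout=5)

# When the server closes the connection, the receiver reports it and exits.
server_sock.close()
last = inbox.get(timeout=5)
receiver.join(timeout=5)
client_sock.close()
```

In a real client, messages taken off inbox would still have to be told apart (responses versus notifications) according to the server's protocol, which this sketch does not attempt.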

Trouble with a Python server

I have a few test clients that are encountering the same issue each time. The clients can connect, and they can send their first message, but after that the server stops responding to that client. I suspect that the problem is related to s.accept(), but I'm not sure exactly what is wrong or how to work around it.
def startServer():
    host = ''
    port = 13572
    backlog = 5
    size = 1024
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind((host,port))
    s.listen(backlog)
    print "Close the command prompt to stop Gamelink"
    while 1:
        try:
            client, address = s.accept()
            data = client.recv(size)
            if data:
                processData(data)
                client.send("OK")
            else:
                print "Disconnecting from client at client's request"
                client.close()
        except socket.error, (value, message):
            if s:
                print "Disconnecting from client, socket issue"
                s.close()
            print "Error opening socket: " + message
            break
        except:
            print "Gamelink encountered a problem"
            break
    print "End of loop"
    client.close()
    s.close()
The server is intended to be accessed across a local network, and it needs to be lightweight and very quick to respond, so if another implementation (such as a thread-based one) would be better for meeting those requirements, please let me know. The intended application is a remote gaming keyboard, hence the need for low resource use and high speed.

Writing a server using socket directly will be hard. As Keith says, you need to multiplex the connections somehow, like with select or poll or threads or fork. You might think you need only one connection, but what will you do when something hiccups and the connection is lost? Will your server be able to respond to reconnection attempts from the client if it hasn't yet realized the connection is lost?
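To give an idea of what multiplexing looks like, here is a small Python 3 sketch using the standard selectors module (a wrapper around select/poll). One thread watches the listening socket and every connected client, so a client can send several messages on the same connection and still get an answer each time. The serve function and its max_messages parameter are invented for the demo so that it terminates; a real server would loop forever and call its own processing function where the comment indicates.

```python
import selectors
import socket
import threading


def serve(listener, max_messages):
    """Answer b"OK" to each message; stop after max_messages (demo only)."""
    sel = selectors.DefaultSelector()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ)
    handled = 0
    while handled < max_messages:
        for key, _events in sel.select(timeout=1):
            sock = key.fileobj
            if sock is listener:
                # New client: start watching it alongside the others.
                client, _addr = listener.accept()
                sel.register(client, selectors.EVENT_READ)
            else:
                data = sock.recv(1024)
                if data:
                    # A real server would process the data here.
                    sock.sendall(b"OK")
                    handled += 1
                else:
                    # Empty read: the client closed its end.
                    sel.unregister(sock)
                    sock.close()
    for key in list(sel.get_map().values()):
        if key.fileobj is not listener:
            key.fileobj.close()
    sel.close()


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
listener.listen(5)
port = listener.getsockname()[1]
server_thread = threading.Thread(target=serve, args=(listener, 2))
server_thread.start()

# One client, two messages on the same connection.
client = socket.create_connection(("127.0.0.1", port))
replies = []
for message in (b"first", b"second"):
    client.sendall(message)
    replies.append(client.recv(1024))
client.close()
server_thread.join()
listener.close()
```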
If your networking needs are basic, you might be able to let something else handle all the listening and accepting and forking stuff for you. You don't specify a platform, but examples of such programs are launchd on Mac OS and xinetd on Linux. The details differ between these tools, but basically you configure them, in some configuration file, to listen for a connection on some port. When they get it, they take care of setting up the connection, then they exec() your program with stdin and stdout aimed at the socket, so you can simply use all the basic IO you probably already know like print and sys.stdin.read().
The trouble with solutions like xinetd and launchd is that for each new connection, they must fork() and exec() a new instance of your program. These are relatively heavy operations, so a large number of connections or a high rate of new connections might hit the limits of your server. But worse, since each connection is in a separate process, sharing data between them is hard. Also, most solutions you might find to communicate between processes involve a blocking API, and now you are back to the problem of multiplexing with select or threads or similar.
If that doesn't meet your needs, I think you are better off learning to use a higher-level networking framework which will handle all the problems you will inevitably encounter if you go down the path of socket. One such framework I'd suggest is Twisted. Beyond handling the mundane details of handling connections, and the more complex task of multiplexing IO between them, you will also have a huge library of tools that will make implementing your protocol much easier.

python xinetd client disconnection handling

This may or may not be a coding issue. It may also be an xinetd daemon issue; I do not know.
I have a Python script which is triggered from a Linux server running xinetd. xinetd has been set up to allow only one instance, as I only want one machine to be able to connect to the service, which is therefore also limited by IP.
Currently, when the client connects to xinetd, the service works correctly and the script begins sending its output to the client machine. However, when the client disconnects (e.g. due to a reboot), the process is still alive on the server, and this blocks the client from connecting again once it has finished rebooting.
Q: How can I detect in Python that the client has disconnected? Perhaps I can test whether stdout is no longer being read by the client (and then exit the script), or is there a much easier way in xinetd to have the child process killed when the client disconnects?
(I'm using python 2.4.3 on RHEL5 linux - solutions for 2.4 are needed, but 3.1 solutions would be useful to know also.)

Add a signal handler for SIGHUP. (x)inetd sends this upon the socket disconnecting.

Monitor the signals sent to your process. Maybe your script isn't responding to the SIGHUP sent by xinetd; handle that signal and let the script exit.

You don't seem to get a SIGHUP, but you do get a SIGPIPE, at least so long as you are attempting any IO on the connection. If the application spends long periods of time not doing any IO, then you could just start a thread reading stdin to ensure you get the SIGPIPE as soon as the disconnection occurs. This was good enough for my application but then I didn't use any pipes other than the ones xinetd gave me.
I've seen several places on the net where people talk about the SIGHUP getting sent on client disconnection, so I've written an inetd python script to test out a couple of servers (one inetd and another xinetd), so you could use that to check on the signals getting sent. It just logs what it finds to /var/log/test.log. Perhaps it will be useful.
#!/usr/bin/python
import os, signal, sys
skip = ["SIGKILL", "SIG_DFL", "SIGSTOP", "SIG_IGN", "SIGCLD", "SIGCHLD"]
name_map = {}
identifiers = [i for i in dir(signal) if i.startswith("SIG") and not i in skip]
for i in identifiers:
    name_map[getattr(signal, i)] = i
def handler(num, frame):
    signame = name_map[num]
    os.system("echo handled %s >> /var/log/test.log" % signame)
if __name__ == "__main__":
    for id, name in name_map.iteritems():
        signal.signal(id, handler)
    while True:
        print sys.stdin.readline()
        sys.stdout.flush()
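If the goal is only to react to the disconnect rather than log every signal, a handler for just the two signals discussed above might look like the following Python 3 sketch (Unix only; the question's Python 2.4 would need the older syntax used in the script above). The disconnected list is hypothetical bookkeeping for the demo; a real script would more likely clean up and call sys.exit() inside the handler. Note that CPython ignores SIGPIPE by default and reports a failed write as an exception instead, so installing a handler for it changes that behaviour.

```python
import os
import signal

disconnected = []  # hypothetical bookkeeping: names of the signals seen


def on_disconnect(signum, frame):
    # A real service would clean up and call sys.exit() here.
    disconnected.append(signal.Signals(signum).name)


# Handle both signals mentioned in the answers.
for sig in (signal.SIGHUP, signal.SIGPIPE):
    signal.signal(sig, on_disconnect)

# Simulate the disconnect for the demo by signalling our own process.
os.kill(os.getpid(), signal.SIGHUP)
print(disconnected)
```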
