Is there an easy way to shut down a Python gRPC server gracefully?


Here is a blog post explaining how to gracefully shut down a gRPC server in Kotlin.
Is this the only way to do it? Counting live calls and handling SIGTERM manually? This should have been the default behavior.
I couldn't find how to count live calls in Python. Can someone point me to docs that will help?

It turns out there is an easy way that avoids counting RPCs. Here is how I got it done:
server = grpc.server(futures.ThreadPoolExecutor(max_workers=100))
# {} is a placeholder for your service name (left unfilled here)
{} = {}Impl()
add_{}Servicer_to_server({}, server)
server.add_insecure_port('[::]:' + port)
server.start()
logger.info('Started server at ' + port)
done = threading.Event()
def on_done(signum, frame):
    logger.info('Got signal {}, {}'.format(signum, frame))
    done.set()
signal.signal(signal.SIGTERM, on_done)
done.wait()
logger.info('Stopping RPC server, waiting for RPCs to complete...')
server.stop(NUM_SECS_TO_WAIT).wait()
logger.info('Done stopping server')

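gRPC Python servers have a (newish) method for this: server.wait_for_termination(). It blocks the calling thread until the server stops, so something else (a signal handler, for example) still has to call server.stop(grace) to start the graceful shutdown.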
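The signal-then-stop pattern from the first answer can be wrapped in a small helper. This is only a sketch: stop_on_signal is a made-up name, and the helper is duck-typed so that it runs without gRPC installed. It assumes the server object has a stop(grace) method returning a threading.Event, which is how grpc.Server.stop behaves.

```python
import signal
import threading


def stop_on_signal(server, grace_seconds, signums=(signal.SIGTERM, signal.SIGINT)):
    """Block until one of `signums` arrives, then stop `server` gracefully.

    `server` only needs a stop(grace) method that returns a threading.Event.
    Call this from the main thread: signal.signal() only works there.
    """
    received = threading.Event()

    def on_signal(signum, frame):
        # Keep the handler tiny: just record that a signal arrived.
        received.set()

    previous = {s: signal.signal(s, on_signal) for s in signums}
    try:
        # Wake up periodically so pending signal handlers get a chance to run.
        while not received.wait(timeout=0.5):
            pass
        # stop() refuses new calls and gives in-flight ones up to
        # grace_seconds to finish; the returned event is set when it is done.
        server.stop(grace_seconds).wait()
    finally:
        # Put back whatever handlers were installed before.
        for s, handler in previous.items():
            if handler is not None:
                signal.signal(s, handler)
```

With a real server this would be called right after server.start(), for example stop_on_signal(server, NUM_SECS_TO_WAIT).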

Related

How do I forcibly disconnect all currently connected clients to my TCP or HTTP server during shutdown?

I have a fake HTTP server that I use as a fixture in my testing. At some point in the test, I want to stop the server regardless of any still open connections. Clients on these open connections should get a TCP FIN.
I am aware that production servers usually need to solve a different problem, that of quiescing, sometimes called graceful shutdown. This is the opposite of what I want.
With a standalone process, it is usually possible to simply get the process to quit and the OS will take care of the rest. (Forcibly killing processes is easy, while forcibly killing threads is not.) My fake server is, however, running in a thread of the test process itself, so I don't have this option (and I don't want to externalize it if there is another way).
I investigated this issue in Python, with the HTTPServer class, where I was not able to find any solution.
I also investigated this in Go, where I found the concept of Contexts, which is close to what I need, but it works the other way around: an HTTP server propagates a Context that can be used to cancel, e.g., a database lookup if a client has disconnected.
Edit: it looks like Go actually does what I need and has separate graceful and non-graceful shutdown methods, the non-graceful one being net/http#Server.Close.
server = http.server.HTTPServer(...)
thread = threading.Thread(target=server.serve_forever)
thread.start()
# a client has connected ....
server.shutdown()
# at this point I want to have the server stopped,
# without waiting for the request handling to complete

I've implemented the Go solution in Python. When a new client connects, I remember the client socket, and when I want to quit, I shut down all the remembered sockets.
It seems to work.
import socket
from http.server import HTTPServer
from typing import Any, List, Tuple
class MyHTTPServer(HTTPServer):
    """Adds a method to the HTTPServer to forcibly disconnect its clients"""
    def __init__(self, addr, handler_cls):
        super().__init__(addr, handler_cls)
        self._client_sockets: List[socket.socket] = []
        self.server_killed = False
    def get_request(self) -> Tuple[socket.socket, Any]:
        """Remember the client socket"""
        sock, addr = super().get_request()
        self._client_sockets.append(sock)
        return sock, addr
    def shutdown_request(self, request: socket.socket) -> None:
        """Forget the client socket"""
        self._client_sockets.remove(request)
        print(f"{self._client_sockets=}")
        super().shutdown_request(request)
    def force_disconnect_clients(self) -> None:
        """Shutdown the remembered sockets"""
        for client in self._client_sockets:
            client.shutdown(socket.SHUT_RDWR)
Usage
server = MyHTTPServer(server_addr, MyRequestHandler)
# in a new thread
while not server.server_killed:
    server.handle_request()
# ... use the server (keep in mind it can have at most one client at a time) ...
# in the main program
server.server_killed = True
server.force_disconnect_clients()
server.server_close()
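The same idea can be extended to several simultaneous clients. The following is a sketch under assumptions, not part of the answer above: KillableHTTPServer and kill are made-up names, it builds on the standard library's ThreadingHTTPServer (Python 3.7+), and it expects serve_forever() to be running in another thread when kill() is called.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class KillableHTTPServer(ThreadingHTTPServer):
    """ThreadingHTTPServer that can drop every open client connection."""

    def __init__(self, addr, handler_cls):
        super().__init__(addr, handler_cls)
        self._clients = set()
        # Handler threads and the killer thread both touch the set.
        self._clients_lock = threading.Lock()

    def get_request(self):
        sock, addr = super().get_request()
        with self._clients_lock:
            self._clients.add(sock)
        return sock, addr

    def shutdown_request(self, request):
        with self._clients_lock:
            self._clients.discard(request)
        super().shutdown_request(request)

    def kill(self):
        """Stop accepting, then shut down all connections still open."""
        self.shutdown()  # makes serve_forever() return
        with self._clients_lock:
            clients = list(self._clients)
        for sock in clients:
            try:
                sock.shutdown(socket.SHUT_RDWR)
            except OSError:
                pass  # the handler thread closed it first
        self.server_close()  # release the listening socket
```

A test fixture would start it with threading.Thread(target=server.serve_forever, daemon=True).start() and call server.kill() when the clients should be cut off.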

Django: communicate with TCP server (with twisted?)

I have a Django application that needs to talk to a remote TCP server. This server will send packages and, depending on what the package is, I need to add entries to the database and inform other parts of the application. I also need to actively send requests to the TCP server; for instance, when the user navigates to a certain page, I want to subscribe to a certain stream on the TCP server. So communication needs to work in both directions.
So far, I use the following solution:
I wrote a custom Django command, that I can start with
python manage.py listen
This command will start a twisted socket server with reactor.connectTCP(IP, PORT, factory) and since it is a django command, I will have access to the database and all the other parts of my application.
But since I also want to be able to send something to the TCP server, triggered by a certain django view, I have an additional socket server, that starts within my twisted application by reactor.listenTCP(PORT, server_factory).
To this server, I will then connect directly in my django application, within a new thread:
class MSocket:
    def __init__(self):
        self.stopped = False
        self.socket = None
        self.queue = []
        self.process = start_new_thread(self.__connect__, ())
        atexit.register(self.terminate)
    def terminate(self):
        self.stopped = True
        try:
            self.socket.close()
        except:
            pass
    def __connect__(self):
        if self.stopped:
            return
        attempts = 0
        self.socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        while not self.stopped:
            try:
                print("Connecting to Socket Server...")
                self.socket.connect(("127.0.0.1", settings.SOCKET_PORT))
                print("Connection Successful!")
                for msg in self.queue:
                    self.socket.send(msg)
                self.queue = []
                break
            except:
                pause = min(int(round(1.2**attempts)), 30)
                print("Connection Failed! Try again in " + str(pause) + " seconds.")
                sleep(pause)
                attempts += 1
        self.__loop__()
    def __loop__(self):
        if self.stopped:
            return
        while not self.stopped:
            try:
                data = self.socket.recv(1024)
            except:
                try:
                    self.socket.close()
                except:
                    pass
                break
            if not data:
                break
        self.__connect__()
    def send(self, msg):
        try:
            self.socket.send(msg)
            return True
        except:
            self.queue.append(msg)
            return False
m_socket = MSocket()
m_socket will then be imported by the main urls.py so that it starts with django.
So my setup looks kind of like this:
Sending to TCP Server:
Django (connect:8001) -------> (listen:8001) Twisted (connect:4444) ------> (listen:4444) TCP-Server
Receiving from TCP Server
TCP-Server (listen:4444) ------> (connect:4444) Twisted ---(direct access)---> Django
It all seems to work that way, but I fear that this is not a really good solution, since I have to open this extra TCP connection. So my question would be now, if the setup can be optimized (and I'm sure it can) and how it can be done.

This is not going to work unless you monkey patch Django (as mentioned by @pss).
I had a similar situation, so this is what I did.
Run a separate Twisted daemon.
To communicate from Django to Twisted, use Unix sockets. The local Twisted server can listen on a Unix socket (AF_UNIX) and Django can simply connect to that socket. This avoids going through the TCP stack.
To communicate from Twisted to Django, you have multiple options:
a) call a Django URL with the data
b) launch a script (Django management command)
c) use Celery to launch the above Django command
d) use a queue (ZeroMQ or RabbitMQ) and have your Django management command listen on the queue (preferred)
With the last option, you get much better throughput and durability, and it scales well.
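The Unix-socket half of this suggestion can be sketched with the standard library alone. Everything named here is illustrative: EchoHandler stands in for the Twisted daemon (a real one would use Twisted's own Unix-socket listener), send_command stands in for what a Django view would call, and the one-line-per-message protocol is an assumption. AF_UNIX sockets are assumed to be available, as on Linux.

```python
import os
import socket
import socketserver
import tempfile
import threading


class EchoHandler(socketserver.StreamRequestHandler):
    """Stand-in for the daemon side: acknowledge each line it receives."""

    def handle(self):
        for line in self.rfile:
            self.wfile.write(b"ack " + line)


def send_command(path, command):
    """Connect to the daemon's socket file, send one line, read one reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(path)
        client.sendall(command + b"\n")
        with client.makefile("rb") as reply:
            return reply.readline().rstrip(b"\n")


if __name__ == "__main__":
    # The socket lives at a filesystem path instead of a TCP port.
    sock_path = os.path.join(tempfile.mkdtemp(), "daemon.sock")
    server = socketserver.UnixStreamServer(sock_path, EchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    print(send_command(sock_path, b"subscribe stream-42"))
    server.shutdown()
    server.server_close()
```

Because the socket is a file, access to it can be restricted with ordinary file permissions, which a local TCP port does not offer.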

You may want to consider using Twisted within your Django application. Here's an excellent talk about that, and a simple example of deploying Django on Twisted, and a deployment tool for Django that uses Twisted.

As I understand it, the problem is how to integrate Twisted into a Django app. That does not seem to be a good idea, because Twisted's event loop blocks the process.
What you could try is to run Django in a non-blocking environment with gunicorn and use gevent to implement all the communication needed.
If that is not possible, there is an answer suggesting a standalone Django app as a way to use Twisted within Django (or rather Django bits inside Twisted).
Personally I would go with gevent. After using Twisted for about 2 years, it strikes me as a powerful but old, heavy, hard-to-learn and hard-to-debug tool.

How can I write a socket server in a different thread from my main program (using gevent)?

I'm developing a Flask/gevent WSGIserver webserver that needs to communicate (in the background) with a hardware device over two sockets using XML.
One socket is initiated by the client (my application) and I can send XML commands to the device. The device answers on a different port and sends back information that my application has to confirm. So my application has to listen to this second port.
Up until now I have issued a command, opened the second port as a server, waited for a response from the device and closed the second port.
The problem is that it's possible that the device sends multiple responses that I have to confirm. So my solution was to keep the port open and keep responding to incoming requests. However, in the end the device is done sending requests, and my application is still listening (I don't know when the device is done), thereby blocking everything else.
This seemed like a perfect use case for a thread, so that my application launches a listening server in a separate thread. Because I'm already using gevent as a WSGI server for Flask, I can use the greenlets.
The problem is, I have looked for a good example of such a thing, but all I can find is examples of multi-threading handlers for a single socket server. I don't need to handle a lot of connections on the socket server, but I need it launched in a separate thread so it can listen for and handle incoming messages while my main program can keep sending messages.
The second problem I'm running into is that in the server, I need to use some methods from my "main" class. Being relatively new to Python I'm unsure how to structure it in a way to make that possible.
class Device(object):
    def __init__(self, ...):
        self.clientsocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    def _connect_to_device(self):
        print("OPEN CONNECTION TO DEVICE")
        try:
            self.clientsocket.connect((self.ip, 5100))
        except socket.error as e:
            pass
    def _disconnect_from_device(self):
        print("CLOSE CONNECTION TO DEVICE")
        self.clientsocket.close()
    def deviceaction1(self, ...):
        # the data that is sent is an XML document that depends on the parameters of this method.
        self._connect_to_device()
        self._send_data(XMLdoc)
        self._wait_for_response()
        return True
    def _send_data(self, data):
        print("SEND:")
        print(data)
        self.clientsocket.send(data)
    def _wait_for_response(self):
        print("WAITING FOR REQUESTS FROM DEVICE (CHANNEL 1)")
        self.serversocket.bind(('10.0.0.16', 5102))
        self.serversocket.listen(5) # listen for answer, maximum 5 connections
        connection, address = self.serversocket.accept()
        # the data is of a specific length I can calculate
        if len(data) > 0:
            self._process_response(data)
        self.serversocket.close()
    def _process_response(self, data):
        print("RECEIVED:")
        print(data)
        # here is some code that processes the incoming data and
        # responds to the device
        # this may or may not result in more incoming data
if __name__ == '__main__':
    machine = Device(ip="10.0.0.240")
    machine.deviceaction1(...)
This is (globally, I left out sensitive information) what I'm doing now. As you can see everything is sequential.
If anyone can provide an example of a listening server in a separate thread (preferably using greenlets) and a way to communicate from the listening server back to the spawning thread, it would be of great help.
Thanks.
EDIT:
After trying several methods, I decided to use Python's standard select() call to solve this problem. This worked, so my question regarding the use of threads is no longer relevant. Thanks to everyone who provided input for their time and effort.
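For readers who still want the listener in the background, here is a minimal sketch that combines select() with a thread and a queue. It is not the asker's code: ResponseListener and its attributes are made-up names, it uses the standard threading module rather than greenlets, and it passes raw bytes along without any XML handling. socket.create_server needs Python 3.8 or later.

```python
import queue
import select
import socket
import threading


class ResponseListener(threading.Thread):
    """Accept connections on `address` and hand received bytes to a queue."""

    def __init__(self, address, poll_seconds=0.2):
        super().__init__(daemon=True)
        self.messages = queue.Queue()  # read by the spawning thread
        self._stop_requested = threading.Event()
        self._poll_seconds = poll_seconds
        self._server = socket.create_server(address)
        self.address = self._server.getsockname()

    def run(self):
        sockets = [self._server]
        try:
            while not self._stop_requested.is_set():
                # The timeout bounds how long stop() has to wait.
                readable, _, _ = select.select(sockets, [], [], self._poll_seconds)
                for sock in readable:
                    if sock is self._server:
                        conn, _ = sock.accept()
                        sockets.append(conn)
                        continue
                    data = sock.recv(4096)
                    if data:
                        self.messages.put(data)
                    else:
                        # An empty read means the peer closed the connection.
                        sockets.remove(sock)
                        sock.close()
        finally:
            for sock in sockets:
                sock.close()

    def stop(self):
        """Ask the loop to finish and wait for it (call after start())."""
        self._stop_requested.set()
        self.join()
```

The main program keeps sending commands as before and picks up the device's responses with listener.messages.get(timeout=...), so nothing blocks indefinitely once the device goes quiet.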

I hope this helps. In the Example class below, calling tenMessageSender starts a background thread (if it is not already running) without blocking the main loop. That thread runs _zmqBasedListener, which listens on a separate port for as long as the thread is alive. The messages sent by tenMessageSender are received by the client, which responds back to _zmqBasedListener.
Server Side
import threading
import zmq
import sys
class Example:
    def __init__(self):
        self.context = zmq.Context()
        self.publisher = self.context.socket(zmq.PUB)
        self.publisher.bind('tcp://127.0.0.1:9997')
        self.subscriber = self.context.socket(zmq.SUB)
        self.thread = threading.Thread(target=self._zmqBasedListener)
    def _zmqBasedListener(self):
        self.subscriber.connect('tcp://127.0.0.1:9998')
        self.subscriber.setsockopt(zmq.SUBSCRIBE, "some_key")
        while True:
            message = self.subscriber.recv()
            print(message)
        sys.exit()
    def tenMessageSender(self):
        self._decideListener()
        for message in range(10):
            self.publisher.send("testid : %d: I am a task" % message)
    def _decideListener(self):
        if not self.thread.is_alive():
            print("STARTING THREAD")
            self.thread.start()
Client
import zmq
context = zmq.Context()
subscriber = context.socket(zmq.SUB)
subscriber.connect('tcp://127.0.0.1:9997')
publisher = context.socket(zmq.PUB)
publisher.bind('tcp://127.0.0.1:9998')
subscriber.setsockopt(zmq.SUBSCRIBE, "testid")
count = 0
print("Listener")
while True:
    message = subscriber.recv()
    print(message)
    publisher.send('some_key : Message received %d' % count)
    count += 1
Instead of a thread, you can use a greenlet, etc.

A good heartbeat interval for pika-rabbitmq in Amazon ec2

I am using the latest pika library (0.9.9+) for RabbitMQ. My usage of RabbitMQ and pika is as follows:
I have long-running tasks (about 5 minutes) as workers. These tasks take their requests from RabbitMQ. The requests come very infrequently, i.e. there is a long idle time between requests.
The problem I was facing previously is related to idle connections (connections closed because they were idle). So I have enabled heartbeats in pika.
Now choosing the heartbeat interval is a problem. Pika seems to be a single-threaded library in which heartbeats are received and acknowledged only in the time between requests.
So, if the heartbeat interval is set to less than the time the callback function takes for its long-running computation, the server does not receive any heartbeat acknowledgements and closes the connection.
So I assume the minimum heartbeat interval should be the maximum computation time of the callback function in a blocking connection.
What would be a good heartbeat value on Amazon EC2 to prevent it from closing idle connections?
Also, some suggest using RabbitMQ keepalive (or libkeepalive) to maintain TCP connections. I think managing heartbeats at the TCP layer is much better because the application need not manage them. Is this true? Is keepalive a good method compared to RabbitMQ heartbeats?
I have seen some suggest using multiple threads and a queue for long-running tasks. But is this the only option for long-running tasks? It is quite disappointing that another queue must be used for this scenario.
Thank you in advance. I think I have detailed the problem. Let me know if I can provide more details.
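For the TCP keepalive option raised above, this is roughly what enabling it on a plain Python socket looks like. It is a sketch under assumptions: enable_keepalive and its default timer values are made up for illustration, the per-socket timer options are platform-specific (they exist on Linux), and how to reach the underlying socket of a pika connection is not shown because it depends on the library version.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, probes=5):
    """Turn on TCP keepalive for `sock`; tune the timers where the OS allows."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Only set the timer options that this platform's socket module exposes.
    for name, value in (("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
                        ("TCP_KEEPINTVL", interval),  # seconds between probes
                        ("TCP_KEEPCNT", probes)):     # unanswered probes before giving up
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock
```

Keepalive probes are sent by the kernel, so they keep flowing while the callback is busy. They only show that the TCP peer is reachable, though, not that the broker is still processing AMQP traffic.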

If you're not tied to using pika, this thread helped me achieve what you're trying to do using kombu:
#!/usr/bin/env python
import time, logging, weakref, eventlet
from kombu import Connection, Exchange, Queue
from kombu.utils.debug import setup_logging
from kombu.common import eventloop
from eventlet import spawn_after
eventlet.monkey_patch()
log_format = ('%(levelname) -10s %(asctime)s %(name) -30s %(funcName) '
              '-35s %(lineno) -5d: %(message)s')
logging.basicConfig(level=logging.INFO, format=log_format)
logger = logging.getLogger('job_worker')
logger.setLevel(logging.INFO)
def long_running_function(body):
    time.sleep(300)
def job_worker(body, message):
    long_running_function(body)
    message.ack()
def monitor_heartbeats(connection, rate=2):
    """Function to send heartbeat checks to RabbitMQ. This keeps the
    connection alive over long-running processes."""
    if not connection.heartbeat:
        logger.info("No heartbeat set for connection: %s" % connection.heartbeat)
        return
    interval = connection.heartbeat
    cref = weakref.ref(connection)
    logger.info("Starting heartbeat monitor.")
    def heartbeat_check():
        conn = cref()
        if conn is not None and conn.connected:
            conn.heartbeat_check(rate=rate)
            logger.info("Ran heartbeat check.")
            # reschedule the next check
            spawn_after(interval, heartbeat_check)
    return spawn_after(interval, heartbeat_check)
def main():
    setup_logging(loglevel='INFO')
    # process for heartbeat monitor
    p = None
    try:
        with Connection('amqp://guest:guest@localhost:5672//', heartbeat=300) as conn:
            conn.ensure_connection()
            monitor_heartbeats(conn)
            queue = Queue('job_queue',
                          Exchange('job_queue', type='direct'),
                          routing_key='job_queue')
            logger.info("Starting worker.")
            with conn.Consumer(queue, callbacks=[job_worker]) as consumer:
                consumer.qos(prefetch_count=1)
                for _ in eventloop(conn, timeout=1, ignore_timeouts=True):
                    pass
    except KeyboardInterrupt:
        logger.info("Worker was shut down.")
if __name__ == "__main__":
    main()
I stripped out my domain specific code but essentially this is the framework I use.

How can I restart a BaseHTTPServer instance?

This is what I have:
http.py:
class HTTPServer():
    def __init__(self, port):
        self.port = port
        self.thread = None
        self.run = True
    def serve(self):
        self.thread = threading.Thread(target=self._serve)
        self.thread.start()
    def _serve(self):
        serverAddress = ("", self.port)
        self.server = MyBaseHTTPServer(serverAddress, MyRequestHandler)
        logging.log(logging.INFO, "HTTP server started on port %s" % self.port)
        while self.run:
            self.server.handle_request()
    def stop(self):
        self.run = False
        self.server.server_close()
Then in another file, to restart it:
def restartHTTP(self):
    try:
        self.httpserver.stop()
        reload(http)
        self.httpserver = http.HTTPServer(80)
        self.httpserver.serve()
    except:
        traceback.print_exc()
This gives me an address already in use error, so it seems the HTTP server isn't stopping properly. What else do I need to do to stop it?
EDIT:
Where I call restartHTTP:
def commandHTTPReload(self, parts, byuser, overriderank):
    self.client.factory.restartHTTP()
    self.client.sendServerMessage("HTTP server reloaded.")
I do know the command is executing because I get the message it's supposed to send.

You just need to let the OS know that you really do want to reuse the port immediately after closing it. Normally it's held for a while (the TIME_WAIT state), in case any extra packets show up. You do this with SO_REUSEADDR:
mysocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
...after creating mysocket but before binding it. A good place to do this with HTTPServer could be in an overridden server_bind method:
def server_bind(self):
    # the option has to be set before bind() is called
    self.socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    HTTPServer.server_bind(self)
Edit: Having looked more closely at your code, I see that your threading model is also likely causing problems here. You're closing the socket in the main(?) thread while the other thread is waiting on a connection on that same socket (in accept()). This arrangement does not have well-defined semantics, and I believe it does different things on different OSes. In any case, it is something you ought to avoid in order to minimize confusion (already lots of that to go around in a multithreaded program). Your old thread will not actually go away until after it gets a connection and handles its request (because it won't re-check self.run until then), and so the port may not be re-bindable until after that.
There isn't really a simple solution to this. You could add a communication pipe between the threads, and then use select()/poll() in the server thread to wait for activity on either of them, or you could time out the accept() calls after a short amount of time so that self.run gets checked more frequently. Or you could have the main thread connect to the listening socket itself. But whatever you do, you're probably approaching the level of complexity where you ought to look at using a "real" httpd or network framework instead of rolling your own: Apache, lighttpd, Tornado, Twisted, etc.
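The accept-timeout option can be sketched as follows. This is an illustration in Python 3 (http.server rather than BaseHTTPServer), with the made-up name RestartableHTTPServer and an arbitrary 0.2 second poll interval. It relies on the documented behaviour that handle_request() returns once the server's timeout attribute elapses without a request.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class RestartableHTTPServer:
    """Runs an HTTPServer in a thread that re-checks a stop flag regularly."""

    def __init__(self, address, handler_cls, poll_seconds=0.2):
        self._stopping = threading.Event()
        self._server = HTTPServer(address, handler_cls)
        # handle_request() gives up waiting after this many seconds, so the
        # loop below gets a chance to notice the stop flag.
        self._server.timeout = poll_seconds
        self.address = self._server.server_address
        self._thread = threading.Thread(target=self._serve, daemon=True)

    def start(self):
        self._thread.start()

    def _serve(self):
        try:
            while not self._stopping.is_set():
                self._server.handle_request()
        finally:
            # Close the listening socket in the thread that was using it.
            self._server.server_close()

    def stop(self):
        self._stopping.set()
        self._thread.join()  # returns once the port has been released
```

Because stop() only returns after the serving thread has closed the socket, a new instance can be created on the same port straight afterwards.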

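To stop an HTTPServer gracefully and close its socket, use:
# Start server (serve_forever() blocks, so run it in its own thread)
httpd = HTTPServer(...)
threading.Thread(target=httpd.serve_forever).start()
# Stop server (call shutdown() from a different thread than the one running serve_forever())
httpd.shutdown()
httpd.server_close()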
