What is the lifecycle of an aiohttp application when used together with Gunicorn?

A project I'm working on uses Gunicorn and Aiohttp to implement a web server. It all starts with something like this:
# main.py
class GunicornApp(gunicorn.app.base.Application):
    def __init__(self, ...):
    def load_config(self):
    def load(self):
        return create_aiohttp_app(...)
if __name__ == "__main__":
where create_aiohttp_app is defined as something like this:
def create_aiohttp_app(...) -> web.Application:
    app = web.Application(...)
    return app
start_app performs certain initialisation actions and then launches an async task which is supposed to execute indefinitely, thus becoming the server's main payload:
async def start_app(app: web.Application) -> None:
    app["payload_obj"] = PayloadClass(...)
    app["payload_task"] = create_task(app["payload_obj"].run()) # infinite loop inside
stop_app just does some cleanup:
async def stop_app(app: web.Application) -> None:
With all of the above, there are a few things that I would like to understand:
How many times is GunicornApp.load() supposed to be called? Is this called once per Gunicorn worker, or is it called once during the whole lifetime of the Gunicorn application? In other words, how many web.Application are expected to be created?
What's the expected lifetime of a web.Application instance returned by create_aiohttp_app? When is it disposed of? Does it live only as long as the Gunicorn worker executing it stays alive, or can it outlive that worker?
How many start_app/stop_app cycles can there be for a web.Application instance? Are these methods only called once each or many times?
What exactly is the relationship between Gunicorn workers and web.Application instances? Does web.Application maintain an infinite event loop inside (thus ensuring that it runs forever and app["payload_task"] doesn't go out of scope), or is there something more complex here?
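For reference, the start/stop pattern described above can be sketched with the standard library alone. This is only a sketch under assumptions: a plain dict stands in for web.Application, PayloadClass here is a hypothetical stand-in, and cancelling the task in stop_app is a guess, since the real cleanup code is not shown. It says nothing about how often Gunicorn or aiohttp invoke these hooks.

```python
import asyncio


class PayloadClass:
    """Hypothetical payload with an endless loop, as in the question."""

    def __init__(self):
        self.ticks = 0

    async def run(self):
        while True:  # infinite loop inside
            self.ticks += 1
            await asyncio.sleep(0.01)


async def start_app(app):
    # Keep references on the app object so the task is not garbage collected.
    app["payload_obj"] = PayloadClass()
    app["payload_task"] = asyncio.create_task(app["payload_obj"].run())


async def stop_app(app):
    # Assumed cleanup: cancel the background task and wait for it to finish.
    task = app["payload_task"]
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass


async def main():
    app = {}  # stand-in for web.Application
    await start_app(app)
    await asyncio.sleep(0.05)  # the "server" runs for a moment
    await stop_app(app)
    return app


app = asyncio.run(main())
print(app["payload_obj"].ticks, app["payload_task"].cancelled())
```

In a real aiohttp application the two coroutines would typically be attached with app.on_startup.append(start_app) and app.on_cleanup.append(stop_app).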


Slow EC2 Performance with Python Threading?

I'm using Python threading in a REST endpoint so that the endpoint can launch a thread, and then immediately return a 200 OK to the client while the thread runs. (The client then polls server state to track the progress of the thread).
The code runs in 7 seconds on my local dev system, but takes 6 minutes on an AWS EC2 m5.large.
Here's what the code looks like:
import threading
# https://stackoverflow.com/a/1239108/364966
thr = threading.Thread(target=score, args=(myArgs1, myArgs2), kwargs={})
thr.start() # Runs score(myArgs1, myArgs2) in the background thread
thr.is_alive() # Returns whether the thread is still running
data = {'now creating test scores'}
return Response(data, status=status.HTTP_200_OK)
I turned off threading to test if that was the cause of the slowdown, like this:
# https://stackoverflow.com/a/1239108/364966
# thr = threading.Thread(target=score, args=(myArgs1, myArgs2), kwargs={})
# thr.start() # Runs score(myArgs1, myArgs2) in the background thread
# thr.is_alive() # Returns whether the thread is still running
score(myArgs1, myArgs2)
data = {'now creating test scores'}
return Response(data, status=status.HTTP_200_OK)
...and it ran in 5 seconds on EC2. This suggests that something about how I'm handling threads on EC2 is the cause of the slowdown.
Is there something I need to configure on EC2 to better support Python threads?
An AWS-certified consultant has advised me that EC2 is known to be slow in execution of Python threads, and to use AWS Lambda functions instead.
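To narrow this down, the same function can be timed both ways in isolation, outside the web framework. The following is a minimal sketch, assuming a CPU-bound workload; score here is a hypothetical stand-in for the real scoring function, and timed and run_in_thread are helper names made up for this example.

```python
import threading
import time


def score(n):
    """Hypothetical stand-in for the real scoring function (CPU-bound loop)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed(fn, *args):
    """Call fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def run_in_thread(fn, *args):
    """Run fn(*args) in a worker thread and wait for its result."""
    box = {}

    def target():
        box["result"] = fn(*args)

    thr = threading.Thread(target=target)
    thr.start()
    thr.join()
    return box["result"]


direct_result, direct_secs = timed(score, 200_000)
thread_result, thread_secs = timed(run_in_thread, score, 200_000)
print(f"direct: {direct_secs:.3f}s, threaded: {thread_secs:.3f}s")
```

If the two timings are close when run like this on the EC2 instance, the slowdown is more likely to come from how the thread interacts with the web server process than from threading itself.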

How can I provide shared state to my Flask app with multiple workers without depending on additional software?

I want to provide shared state for a Flask app which runs with multiple workers, i. e. multiple processes.
To quote this answer from a similar question on this topic:
You can't use global variables to hold this sort of data. [...] Use a data source outside of Flask to hold global data. A database, memcached, or redis are all appropriate separate storage areas, depending on your needs.
(Source: Are global variables thread safe in flask? How do I share data between requests?)
My question is on that last part regarding suggestions on how to provide the data "outside" of Flask. Currently, my web app is really small and I'd like to avoid requirements or dependencies on other programs. What options do I have if I don't want to run Redis or anything else in the background but provide everything with the Python code of the web app?
If your webserver's worker type is compatible with the multiprocessing module, you can use multiprocessing.managers.BaseManager to provide a shared state for Python objects. A simple wrapper could look like this:
from multiprocessing import Lock
from multiprocessing.managers import AcquirerProxy, BaseManager, DictProxy

def get_shared_state(host, port, key):
    shared_dict = {}
    shared_lock = Lock()
    manager = BaseManager((host, port), key)
    manager.register("get_dict", lambda: shared_dict, DictProxy)
    manager.register("get_lock", lambda: shared_lock, AcquirerProxy)
    try:
        manager.get_server()  # raises OSError if the address is taken
        manager.start()  # first process: start the manager server
    except OSError: # Address already in use
        manager.connect()  # later processes: connect to the running server
    return manager.get_dict(), manager.get_lock()
You can assign your data to the shared_dict to make it accessible across processes:
import numpy
HOST = ""
PORT = 35791
KEY = b"secret"
shared_dict, shared_lock = get_shared_state(HOST, PORT, KEY)
shared_dict["number"] = 0
shared_dict["text"] = "Hello World"
shared_dict["array"] = numpy.array([1, 2, 3])
However, you should be aware of the following circumstances:
Use shared_lock to protect against race conditions when overwriting values in shared_dict. (See Flask example below.)
There is no data persistence. If you restart the app, or if the main (the first) BaseManager process dies, the shared state is gone.
With this simple implementation of BaseManager, you cannot directly edit nested values in shared_dict. For example, shared_dict["array"][1] = 0 has no effect. You will have to edit a copy and then reassign it to the dictionary key.
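The nested-value caveat can be reproduced in a few lines. This is a sketch under assumptions: DemoManager is a hypothetical subclass, a plain list replaces the numpy array, and the manager's server is hosted in a background thread of the same process purely to keep the example self-contained (the answer's setup uses separate worker processes).

```python
import threading
from multiprocessing.managers import BaseManager, DictProxy


class DemoManager(BaseManager):
    """Hypothetical manager subclass used only for this sketch."""


_store = {}  # the real dictionary, owned by the server side
DemoManager.register("get_dict", callable=lambda: _store, proxytype=DictProxy)

# Host the manager's server in a daemon thread; port 0 picks any free port.
server = DemoManager(address=("127.0.0.1", 0), authkey=b"secret").get_server()
threading.Thread(target=server.serve_forever, daemon=True).start()

# Connect as a client, the way a second worker would.
client = DemoManager(address=server.address, authkey=b"secret")
client.connect()
shared_dict = client.get_dict()

shared_dict["array"] = [1, 2, 3]

# The lookup returns a copy, so this item assignment never reaches the server.
shared_dict["array"][1] = 0
unchanged = shared_dict["array"]

# Edit a copy and reassign it to the key instead.
copy = shared_dict["array"]
copy[1] = 0
shared_dict["array"] = copy
updated = shared_dict["array"]

print(unchanged, updated)
```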
Flask example:
The following Flask app uses a global variable to store a counter number:
from flask import Flask
app = Flask(__name__)
number = 0

@app.route("/")
def counter():
    global number
    number += 1
    return str(number)
This works when using only one worker (gunicorn -w 1 server:app). With multiple workers (gunicorn -w 4 server:app), it becomes apparent that number is not shared state but a separate value in each worker process.
Instead, with shared_dict, the app looks like this:
from flask import Flask
app = Flask(__name__)
HOST = ""
PORT = 35791
KEY = b"secret"
shared_dict, shared_lock = get_shared_state(HOST, PORT, KEY)
shared_dict["number"] = 0

@app.route("/")
def counter():
    with shared_lock:
        shared_dict["number"] += 1
        return str(shared_dict["number"])
This works with any number of workers, like gunicorn -w 4 server:app.
your example is a bit magic for me! I'd suggest reusing the magic already in the multiprocessing codebase in the form of a Namespace. I've attempted to make the following code compatible with spawn servers (i.e. MS Windows) but I only have access to Linux machines, so can't test there
start by pulling in dependencies and defining our custom Manager and registering a method to get out a Namespace singleton:
from multiprocessing.managers import BaseManager, Namespace, NamespaceProxy

class SharedState(BaseManager):
    _shared_state = Namespace(number=0)

    @classmethod
    def _get_shared_state(cls):
        return cls._shared_state

SharedState.register('state', SharedState._get_shared_state, NamespaceProxy)
this might need to be more complicated if creating the initial state is expensive and hence should only be done when it's needed. note that the OPs version of initialising state during process startup will cause everything to reset if gunicorn starts a new worker process later, e.g. after killing one due to a timeout
next I define a function to get access to this shared state, similar to how the OP does it:
def shared_state(address, authkey):
    manager = SharedState(address, authkey)
    try:
        manager.get_server() # raises if another server started
        manager.start()
    except OSError:
        manager.connect()
    return manager.state()
though I'm not sure if I'd recommend doing things like this. when gunicorn starts it spawns lots of processes that all race to run this code and it wouldn't surprise me if this could go wrong sometimes. also if it happens to kill off the server process (because of e.g. a timeout) every other process will start to fail
that said, if we wanted to use this we would do something like:
ss = shared_state('server.sock', b'noauth')
ss.number += 1
this uses Unix domain sockets (passing a string rather than a tuple as an address) to lock this down a bit more.
also note this has the same race conditions as the OP's code: incrementing a number will cause the value to be transferred to the worker's process, which is then incremented, and sent back to the server. I'm not sure what the _lock is supposed to be protecting, but I don't think it'll do much

Tornado 4.x solution of running game on ThreadPoolExecutor not working anymore. Need help refactoring it

My ThreadPoolExecutor/gen.coroutine solution (Tornado v4.x) for keeping the web server from blocking no longer works with Tornado 6.x.
A while back I started to develop an online browser game using a Tornado web server (v4.x) and websockets. Whenever user input is expected, the game sends the question to the client and waits for the response. Back then I used gen.coroutine and a ThreadPoolExecutor to make this task non-blocking. Now that I have started refactoring the game, it does not work with Tornado v6.x and the task blocks the server again. I searched for possible solutions, but so far I have been unable to get it working again. It is not clear to me how to change my existing code to be non-blocking again.
class PlayerWebSocket(tornado.websocket.WebSocketHandler):
executor = ThreadPoolExecutor(max_workers=15)
def on_message(self,message):
params = message.split(':')
if __name__ == '__main__':
application = Application()
def send(self, message):
def create_choice(self, id, choices):
d = {}
d['id'] = id
while not d['id'] in self.callbacks:
del self.choice[d['id']]
return self.callbacks[d['id']]
Whenever a choice is to be made, the create_choice function creates a dict with a list (choices) and an id and stores it in the player's self.callbacks. After that it stays in the while loop until the websocket's on_message function puts the received answer (which looks like id:Choice_id, for example 1:12838732) into the callbacks dict.
The WebSocketHandler.write_message method is not thread-safe, so it can only be called from the IOLoop's thread, and not from a ThreadPoolExecutor (This has always been true, but sometimes it might have seemed to work anyway).
The simplest way to fix this code is to save IOLoop.current() in a global variable from the main thread (the current() function accesses a thread-local variable, so you can't call it from the thread pool) and use ioloop.add_callback(self.socket.write_message, message). Also remove @gen.coroutine from send: it doesn't do any good to make functions coroutines if they contain no yield expressions.
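The same hand-off can be sketched with the standard library only. This is an analogue, not Tornado code: Tornado 5 and later run on top of the asyncio event loop, and loop.call_soon_threadsafe plays the role here that IOLoop.add_callback plays in the answer. write_message and game_logic are hypothetical stand-ins for the handler method and the game code.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor

sent = []  # (message, thread id) pairs recorded by the stand-in writer


def write_message(message):
    """Hypothetical stand-in for WebSocketHandler.write_message."""
    sent.append((message, threading.get_ident()))


def game_logic(loop, message):
    """Runs on a pool thread; hands the write back to the loop's thread."""
    loop.call_soon_threadsafe(write_message, message)
    return threading.get_ident()


async def main():
    loop = asyncio.get_running_loop()  # captured on the loop's own thread
    loop_thread_id = threading.get_ident()
    with ThreadPoolExecutor(max_workers=2) as executor:
        worker_thread_id = await loop.run_in_executor(
            executor, game_logic, loop, "1:choose"
        )
    await asyncio.sleep(0)  # let any pending callback run
    return loop_thread_id, worker_thread_id


loop_thread_id, worker_thread_id = asyncio.run(main())
print(sent, loop_thread_id != worker_thread_id)
```

The write happens on the event loop's thread even though it was requested from a pool thread, which is the property the answer relies on.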

Starting an independent ApplicationSession and event loop from an existing one

I'd like to be able to spawn an independent event loop/reactor from an existing one. Let's say I have an application in a module standaloneapps:
#in standaloneapps.py
class StandaloneApp(ApplicationSession):
    def runner(self, message):
    def start_app(self):
        yield self.subscribe(self.runner, 'com.example.some_topic')
I'd like to be able to start this application from a different one. Example:
from standaloneapps import StandaloneApp
class ApplicationStarter(ApplicationSession):
    def onJoin(self, details):
        yield self.subscribe(self.start_app, 'com.example.startapp')
    def start_app(self, message):
        print('subscribing to app')
        new_runner = ApplicationRunner(url="ws://",
I can start ApplicationStarter, but as soon as I publish the event 'com.example.startapp' crossbar crashes with exception builtins.Exception: not joined.
Perhaps this seems like an overly complicated setup, but I'm trying to have one application subscribe to an 'app dispatcher' which dynamically starts new applications which may-or-may-not be known ahead of time. I'd like for the new applications to be running on different event loops so as to stay isolated.
Edit: All that was necessary was to add:
runner.run(StandaloneApp, start_reactor=False)
to the call to start the second app. This forces it to run in the same loop.

How can I keep multiple gevent servers serving forever?

Currently, I have an application that has two servers: the first processes orders and responds individually, the second broadcasts results to other interested subscribers. They need to be served from different ports. I can start() both of them, but I can only get one or the other to serve_forever(), which, as I read, is a blocking function. I am looking for ideas on how to keep both servers from exiting. Abbreviated code below:
def main():
    stacklist = []
    subslist = []
    bcastserver = BroadcastServer(subslist) # creates a new server
    tradeserver = TradeServer(stacklist) # creates a new server
    bcastserver.start() # start accepting new connections
    tradeserver.start() # start accepting new connections
    #bcastserver.serve_forever() #if I do it here, the first one...
    #tradeserver.serve_forever() #blocks the second one

class TradeServer(StreamServer):
    def __init__(self, stacklist):
        self.stacklist = stacklist
        StreamServer.__init__(self, ('localhost', 12345), self.handle)
        #self.serve_forever() #If I put it here in both, neither works
    def handle(self, socket, address):
        #handler here

class BroadcastServer(StreamServer):
    def __init__(self, subslist):
        StreamServer.__init__(self, ('localhost', 8000), self.handle)
        self.subslist = subslist
        #self.serve_forever() #If I put it here in both, neither works
    def handle(self, socket, address):
        #handler here
Perhaps I just need a way to keep the two from exiting, but I'm not sure how. In the end, I want both servers to listen forever for incoming connections and handle them.
I know this question has an accepted answer, but there is a better one. I'm adding it for people like me who find this post later.
As described in the gevent documentation about servers:
The BaseServer.serve_forever() method calls BaseServer.start() and then waits until interrupted or until the server is stopped.
So you can just do:
def main():
    stacklist = []
    subslist = []
    bcastserver = BroadcastServer(subslist) # creates a new server
    tradeserver = TradeServer(stacklist) # creates a new server
    bcastserver.start() # starts accepting bcast connections and returns
    tradeserver.serve_forever() # starts accepting trade connections and blocks until tradeserver stops
    bcastserver.stop() # stops also the bcast server
The gevent introduction documentation explains why this works:
Unlike other network libraries, though in a similar fashion as
eventlet, gevent starts the event loop implicitly in a dedicated
greenlet. There’s no reactor that you must call a run() or dispatch()
function on. When a function from gevent’s API wants to block, it
obtains the gevent.hub.Hub instance — a special greenlet that runs the
event loop — and switches to it (it is said that the greenlet yielded
control to the Hub).
When serve_forever() blocks, it does not prevent either server from continuing communication.
Note: In the above code the trader server is the one that decides when the whole application stops. If you want the broadcast server to decide this, you should swap them in the start() and serve_forever() calls.
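The underlying idea, one event loop serving two listeners on different ports, can be tried without gevent installed. The sketch below swaps in asyncio from the standard library, so it is an analogue rather than gevent code; the handler names, the ask helper, and the use of port 0 (any free port) are assumptions made for the example.

```python
import asyncio


async def handle_trade(reader, writer):
    """Hypothetical trade handler: answer once and close."""
    writer.write(b"trade\n")
    await writer.drain()
    writer.close()
    await writer.wait_closed()


async def handle_bcast(reader, writer):
    """Hypothetical broadcast handler: answer once and close."""
    writer.write(b"bcast\n")
    await writer.drain()
    writer.close()
    await writer.wait_closed()


async def ask(port):
    """Connect to a local port and read one line."""
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    line = await reader.readline()
    writer.close()
    await writer.wait_closed()
    return line


async def main():
    # Both servers start listening immediately and share the same loop.
    trade = await asyncio.start_server(handle_trade, "127.0.0.1", 0)
    bcast = await asyncio.start_server(handle_bcast, "127.0.0.1", 0)
    trade_port = trade.sockets[0].getsockname()[1]
    bcast_port = bcast.sockets[0].getsockname()[1]

    replies = await asyncio.gather(ask(trade_port), ask(bcast_port))

    trade.close()
    bcast.close()
    await trade.wait_closed()
    await bcast.wait_closed()
    return replies


replies = asyncio.run(main())
print(replies)
```

As with the gevent version, neither server needs its own blocking call to stay responsive; whatever keeps the loop running keeps both listeners alive.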
ok, I was able to do this using threading and with gevent's monkeypatch library:
from gevent import monkey
def main():
    # etc, etc
    t = threading.Thread(target=bcastserver.serve_forever)
Start each server loop in its own instance of Python (one console per gevent). I've never understood trying to run multiple servers from one program. You can run the same server many times and use a reverse proxy like nginx to load balance and route accordingly.

