I want to kill a gunicorn worker after each request to a specific endpoint:
User sends GET /endpoint
A particular Gunicorn worker process X accepts the request and responds with 200 OK, using Flask underneath.
X gets killed, replaced by a new Gunicorn worker process Y automatically.
I could only find a way to restart after every request (Gunicorn's max_requests setting), but that's not what I want. Basically, I need max_requests=1 applied to one specific endpoint only, not to all endpoints.
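One possible approach, sketched below: gunicorn's post_request server hook runs in the worker after each response is sent, so the hook can check the path and tell the worker to exit. Note that worker.alive is an internal flag of the sync worker's accept loop, not a public API, so treat this as an assumption about gunicorn internals rather than a supported feature.
# gunicorn.conf.py
def post_request(worker, req, environ, resp):
    # clearing the internal alive flag makes the sync worker exit after this
    # request; the arbiter then forks a replacement worker automatically
    if environ.get("PATH_INFO") == "/endpoint":
        worker.alive = False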
My Flask project takes in orders as POST requests from multiple online stores, saves those orders to a database, and forwards the purchase information to a service which delivers the product. Sometimes, the product is not set up in the final service and the request sits in my service's database in an "unresolved" state.
When the product is set up in the final service, I want to kick off a long-running (maybe a minute) process to send all "unresolved" orders to the final service. During this process, will Flask still be able to receive orders from the stores and continue processing as normal? If not, do I need to offload this to a task runner like rq?
I'm not worried about speed as much as I am about consistency. The items being purchased are tickets to a live event so as long as the order information is passed along before the event begins, it should make no difference to the customer.
There are a few different answers, all valid in different situations. The quick answer is that a job queue like RQ is usually the right solution, especially in the long run as your project grows.
As long as the WSGI server has workers available, another request can be handled. Each worker handles one request at a time. The development server uses threads, so an unlimited number of workers are available (with the performance constraints of threads in Python). Production servers like Gunicorn can use multiple workers, and different types of workers such as threads, processes, or coroutines (eventlet/gevent). If you want to run a task in response to an HTTP request and wait until the task is finished to send a response, you'll need enough workers to block on those tasks along with handling regular requests.
@app.route("/admin/send-purchases")
def send_purchases():
    ...  # do stuff, wait for it to finish
    return "success"
However, the task you're describing seems like a cleanup task that should be run regardless of HTTP requests from a user. In that case, you should write a Flask CLI command and call it using cron or another scheduling system.
@app.cli.command()
def send_purchases():
    ...
    click.echo("done")
# crontab hourly job
0 * * * * env FLASK_APP=myapp /path/to/venv/bin/flask send-purchases
If you do want a user to initiate the task, but don't want to block a worker waiting for it to finish, then you want a task queue such as RQ or Celery. You could make a CLI command that submits the job too, to be able to trigger it on request and on a schedule.
@rq.job
def send_purchases():
    ...

@app.route("/admin/send-purchases", endpoint="send_purchases")
def send_purchases_view():
    send_purchases.queue()
    return "started"

@app.cli.command("send-purchases")
def send_purchases_command():
    send_purchases.queue()
    click.echo("started")
Flask's development server will spawn a new thread for each request. Similarly, production servers can be started with multiple workers.
You can run your app with gunicorn or a similar server with multiple processes. For example, with four process workers:
gunicorn -w 4 app:app
For example with eventlet workers:
gunicorn -k eventlet app:app
See the docs on deploying in production as well: https://flask.palletsprojects.com/en/1.1.x/deploying/
Following this guide (the "Complex Example: Showing Status Updates and Results" section), I have two Flask endpoints: one for starting a bound task (POST request) and one for retrieving a task's result by its id (GET request to /status/<task_id>).
Running the Flask app with flask run in one shell and celery worker -A app.celery -l info in another, it is possible to run a task and then get its result with a GET request to the /status/ endpoint.
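The status endpoint follows the guide's pattern, roughly like this (a sketch; celery here is the Flask app's Celery instance):
from flask import jsonify

@app.route("/status/<task_id>")
def task_status(task_id):
    # look up the task's state in whatever result backend is configured
    task = celery.AsyncResult(task_id)
    return jsonify(state=task.state, info=str(task.info))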
After adding gunicorn and setting the number of workers to 3, POST requests run normally, but getting the status of a specific running task is a problem: the endpoint can't get the task (task.info is None). There is a chance of getting the task result from this status endpoint, but if I correctly understand the problem, it depends on which Flask instance gunicorn routes the request to.
I don't set any specific Celery settings, only broker and result_backend (using RabbitMQ).
How do I correctly configure gunicorn + Flask + Celery for this sort of task?
Fixed by using Redis as the result backend (or, I believe, any backend other than RPC).
According to the documentation:
The RPC result backend (rpc://) is special as it doesn’t actually store the states, but rather sends them as messages. This is an important difference as it means that a result can only be retrieved once, and only by the client that initiated the task. Two different processes can’t wait for the same result.
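In code, the fix is just pointing the result backend at a shared store; a minimal sketch (the URLs are placeholders):
from celery import Celery

celery = Celery("app", broker="amqp://localhost//")  # RabbitMQ stays as the broker
# Redis stores task state centrally, so any gunicorn worker's Flask instance
# can read a result, unlike the one-shot rpc:// backend
celery.conf.result_backend = "redis://localhost:6379/0"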
I have configured gunicorn for 5 workers. However, only 1 is started; the others seem to be in a sleeping state. Only when I try to log in, i.e. when a request is sent to them, do they fork/start for the first time. Below is the config.
$VIRT_ENV/gunicorn -c config.py utrade.wsgi:application \
--preload \
--log-level=debug \
--timeout=30 \
--access-logfile=- \
--access-logformat="%(r)s %(s)s" \
--log-file=-
Inside the Django views file there is Tornado worker initialization code; there is some sort of dependency where I want all the Tornado processes to be up before a user logs in. To make it clear: if I put print('Hello') in my Django views file, it's not printed by a worker until a request is served.
How can I make gunicorn start all the workers instead of waiting for requests? I tried the preload flag but it didn't help.
config.py
bind = 'unix:/code/internal.utradesolutions.com/tanmay.garg/web/web/utrade/run/gunicorn.sock'
workers = 5
daemon = True
You need to use a process manager (like supervisor) to manage the tornado process separately from gunicorn which is running your wsgi process.
As for preload, it's an optimization that loads your code before the workers start (from the docs):
Load application code before the worker processes are forked.
By preloading an application you can save some RAM resources as well
as speed up server boot times. Although, if you defer application
loading to each worker process, you can reload your application code
easily by restarting workers.
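If the goal is per-worker initialization at fork time instead of on the first request, gunicorn's post_fork server hook in the config file is one option. A sketch (init_tornado_bridge is a hypothetical stand-in for whatever setup your views module currently does):
# config.py
workers = 5
preload_app = True

def post_fork(server, worker):
    # runs in each worker right after it is forked, before any request arrives
    from myapp.startup import init_tornado_bridge  # hypothetical
    init_tornado_bridge()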
I just want to know: is it possible to run and handle multiple processes in Django when I am using the gunicorn server?
If one client requests data and at the same time another client requests the same, both requests should be processed simultaneously instead of queued.
Is there any way to make this happen?
You can start multiple worker processes:
gunicorn -w 4 ...
That would create 4 processes, each of which can handle one request at a time.
You could also use a different worker type, like gevent or meinheld, to make gunicorn handle requests asynchronously:
gunicorn --worker-class=gevent ...
gunicorn --worker-class="egg:meinheld#gunicorn_worker" ...
For those last two, you need to either install gevent (one of the rc versions) or meinheld.
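For example (--pre lets pip pick up the release-candidate builds):
pip install --pre gevent
pip install meinheld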
We are running a django/gunicorn server on Heroku. Most of our users are in a country where the mobile network is not that great, so they frequently have flaky connections.
Most of our requests are "raw posts" from mobile devices, and it seems that even when a POST request is not yet fully transmitted, it is already handed off to a gunicorn worker. When the worker tries to process the request and read the data, it simply hangs waiting for the remaining data. While this behaviour makes sense for reading file/image data in "streaming" mode, it makes no sense in our case, because all our posts are relatively small and could easily be read by the web server as a whole and only then forwarded to our gunicorn worker.
This early handoff causes trouble when we have many such requests in parallel, because all the workers might get blocked. Currently we solve the problem by increasing the number of workers/dynos, but that is pretty costly. I could not find any way to force either the web server or gunicorn to wait and forward the request to a worker only once it is fully transmitted.
Is there a way to make heroku's web server/gunicorn only transfer the request to a gunicorn worker when it has been fully transmitted from the client side (fully received by the server)?
Some example code (we've added newrelic 'per-instruction' tracing to make sure that this is the exact line that causes the problem):
def syncGameState(request):
    transaction = agent.current_transaction()
    with agent.FunctionTrace(transaction, "syncGameState_raw_post_data", 'Python/EndPoint'):
        data = request.raw_post_data
    with agent.FunctionTrace(transaction, "syncGameState_gameStateSyncRequest", 'Python/EndPoint'):
        sync_request = sync_pb2.gameStateSyncRequest()
    with agent.FunctionTrace(transaction, "syncGameState_ParseFromString", 'Python/EndPoint'):
        sync_request.ParseFromString(data)
Here are the New Relic measurements for this example slow request (it was a POST with 7K of data). Reading the POST takes 99% of the method time....
It seems to me that the real issue here is that gunicorn is blocking. This is because gunicorn (by default) uses synchronous workers to run your tasks. This means that when a web request hits gunicorn, it will block until the app has returned a response, which in your case is a long time.
To get around this issue, you can use gevent with gunicorn to do non-blocking IO. Since most of your time is spent doing IO stuff, this will ensure gunicorn can handle many more web requests in parallel.
To use gevent with gunicorn, install gevent (pip install -U gevent) and add -k gevent to your gunicorn startup command (this tells gunicorn to use the gevent worker class).
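Putting it together, the startup command looks something like this (the module path is a placeholder):
gunicorn -k gevent -w 4 myproject.wsgi:application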
You might want to give this article a read and investigate a request buffering HTTP server such as Waitress.
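For illustration, running the same WSGI app under Waitress is a one-liner; Waitress buffers the full request body before handing it to an application thread, so a slow client can't tie up a worker (the module path is a placeholder):
from waitress import serve
from myproject.wsgi import application  # placeholder

serve(application, host="0.0.0.0", port=8000)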