Gunicorn with multiple Flask workers and bound Celery tasks - python

Following this guide (the "Complex Example: Showing Status Updates and Results" section), I have two Flask endpoints: one for starting a bound task (POST request) and one for retrieving a task's result by its id (GET request to /status/<task_id>).
Running the Flask app with flask run in one shell and celery worker -A app.celery -l info in another, it is possible to start a task and then fetch its result with a GET request to the /status/ endpoint.
After adding Gunicorn and setting the number of workers to 3, POST requests run normally, but getting the status of a specific running task is a problem: the endpoint can't see the task (task.info is None). A GET to the status endpoint sometimes does return the result, but if I understand the problem correctly, that depends on which Flask instance Gunicorn routes the request to.
I don't set any specific Celery options, only broker and result_backend (using RabbitMQ).
How do I correctly configure Gunicorn + Flask + Celery for this sort of task?

Fixed by using Redis as the result backend (or, I believe, any backend other than RPC).
According to the documentation:
The RPC result backend (rpc://) is special as it doesn’t actually store the states, but rather sends them as messages. This is an important difference as it means that a result can only be retrieved once, and only by the client that initiated the task. Two different processes can’t wait for the same result.
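For reference, a minimal sketch of the fixed setup (the broker and Redis URLs are illustrative):
from celery import Celery

# RabbitMQ stays as the broker; any shared result backend (Redis here)
# lets every Gunicorn worker read task state, whereas rpc:// delivers a
# result only once, to the client that started the task.
celery = Celery("app", broker="amqp://guest@localhost//")
celery.conf.result_backend = "redis://localhost:6379/0"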

Related

During a long-running process, will Flask be unresponsive to new requests?

My Flask project takes in orders as POST requests from multiple online stores, saves those orders to a database, and forwards the purchase information to a service which delivers the product. Sometimes, the product is not set up in the final service and the request sits in my service's database in an "unresolved" state.
When the product is set up in the final service, I want to kick off a long-running (maybe a minute) process to send all "unresolved" orders to the final service. During this process, will Flask still be able to receive orders from the stores and continue processing as normal? If not, do I need to offload this to a task runner like rq?
I'm not worried about speed as much as I am about consistency. The items being purchased are tickets to a live event so as long as the order information is passed along before the event begins, it should make no difference to the customer.
There are a few different answers that are all valid in different situations. The quick answer is that a job queue like RQ is usually the right solution, especially in the long run as your project grows.
As long as the WSGI server has workers available, another request can be handled. Each worker handles one request at a time. The development server uses threads, so an unlimited number of workers are available (with the performance constraints of threads in Python). Production servers like Gunicorn can use multiple workers, and different types of workers such as threads, processes, or eventlet green threads. If you want to run a task in response to an HTTP request and wait until the task is finished to send a response, you'll need enough workers to block on those tasks along with handling regular requests.
@app.route("/admin/send-purchases")
def send_purchases():
    ...  # do stuff, wait for it to finish
    return "success"
However, the task you're describing seems like a cleanup task that should be run regardless of HTTP requests from a user. In that case, you should write a Flask CLI command and call it using cron or another scheduling system.
@app.cli.command()
def send_purchases():
    ...
    click.echo("done")

# crontab hourly job
0 * * * * env FLASK_APP=myapp /path/to/venv/bin/flask send-purchases
If you do want a user to initiate the task, but don't want to block a worker waiting for it to finish, then you want a task queue such as RQ or Celery. You could make a CLI command that submits the job too, to be able to trigger it on request and on a schedule.
@rq.job
def send_purchases():
    ...

@app.route("/admin/send-purchases", endpoint="send_purchases")
def send_purchases_view():
    send_purchases.queue()
    return "started"

@app.cli.command("send-purchases")
def send_purchases_command():
    send_purchases.queue()
    click.echo("started")
Flask's development server will spawn a new thread for each request. Similarly, production servers can be started with multiple workers.
You can run your app with gunicorn or similar with multiple processes. For example, with four process workers:
gunicorn -w 4 app:app
Or with eventlet workers:
gunicorn -k eventlet app:app
See the docs on deploying in production as well: https://flask.palletsprojects.com/en/1.1.x/deploying/

Best way to handle webhook response timeouts in Flask?

I have a Flask application running on a Google Cloud Function that receives a webhook from Shopify when an order is created. The problem is that I'm timing out very often; here's what I mean:
@app.route('/', methods=['POST'])
def connectToSheets(request):
    print('Webhook received...')

    # Verify request is coming from Shopify
    data = request.data
    hmac_header = request.headers.get('X-Shopify-Hmac-SHA256')
    verify_webhook(data, hmac_header)
    print('Request validated...')

    # Do some stuff...
Shopify's docs state that there is a 5 second timeout and a retry policy for webhook subscriptions. After I validate the request there is quite a lot of code, so I'm timing out almost every time.
Is there a way I can send a 200 status code to Shopify after I validate the webhook and before I start processing it? Or is there a workaround for this?
One way to do this entirely within Cloud Functions is to set up two functions:
one that handles the initial request
a second one that does the processing and then follows up with the response
In addition to handling the initial request, the first function also invokes the second function via Cloud Pub/Sub.
See https://dev.to/googlecloud/getting-around-api-timeouts-with-cloud-functions-and-cloud-pub-sub-47o3 for a complete example (this uses Slack's webhook, but the behavior should be similar).
I used to face the same issue. We moved the processing code from being executed inline to a background task using Celery and RabbitMQ; RabbitMQ handled queue management (you can also use Redis for that).
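A minimal sketch of that pattern, assuming a Celery app wired to a RabbitMQ broker (the broker URL and the process_order task are illustrative; verify_webhook is the helper from the question):

from celery import Celery
from flask import Flask, request

app = Flask(__name__)
celery = Celery(__name__, broker="amqp://guest@localhost//")

@celery.task
def process_order(payload):
    # The slow work runs here, in a Celery worker process.
    ...

@app.route('/', methods=['POST'])
def webhook():
    data = request.data
    hmac_header = request.headers.get('X-Shopify-Hmac-SHA256')
    verify_webhook(data, hmac_header)         # validate first, as before
    process_order.delay(request.get_json())   # enqueue and return immediately
    return '', 200                            # well within Shopify's 5 second limit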
Celery - https://docs.celeryproject.org/en/stable/getting-started/index.html
RabbitMq - https://www.rabbitmq.com/documentation.html
Asynchronous Tasks Using Flask, Redis, and Celery - https://stackabuse.com/asynchronous-tasks-using-flask-redis-and-celery/
How to Set Up a Task Queue with Celery and RabbitMQ - https://www.linode.com/docs/development/python/task-queue-celery-rabbitmq/

Restart Gunicorn after specific request completes

I want to kill a gunicorn worker after each request to a specific endpoint:
User sends GET /endpoint
A particular Gunicorn worker process X accepts the request and responds with 200 OK, using Flask underneath.
X gets killed, replaced by a new Gunicorn worker process Y automatically.
I could only find a way to restart after any request (Gunicorn's max_requests setting), but that's not what I want. Basically, I need max_requests=1 applied to one specific endpoint only, not to all endpoints.
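There is no built-in per-endpoint max_requests, but here is one possible sketch, assuming Gunicorn's default sync workers (one process per worker, with the master respawning workers that exit). Werkzeug's response.call_on_close callback fires after the response body has been sent, so the worker can terminate itself without dropping the 200:

import os
import signal
from flask import Flask

app = Flask(__name__)

@app.route("/endpoint")
def endpoint():
    response = app.make_response(("OK", 200))
    # Runs after the response has been delivered; SIGTERM makes this worker
    # exit gracefully, and the Gunicorn master replaces it with a fresh one.
    response.call_on_close(lambda: os.kill(os.getpid(), signal.SIGTERM))
    return response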

Celery REST API

Is there a way to use Celery for:
Queueing an HTTP call to an external URL with form parameters (an HTTP POST to the URL)?
Handling the external URL's HTTP response (200, 404, 400, etc.): if the response is a non-200 error, retry a certain number of times and eventually give up.
Adding a task/job into Celery via a REST API, passing the URL to call and the form parameters?
For that you need to create a task in your Celery application that performs the request for you and returns the result.
Handling errors and retries can be done within the code of your task, or can alternatively be taken care of by Celery if you schedule the task with the right arguments: see the arguments of .apply_async().
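A hedged sketch of such a task using the requests library (the broker/backend URLs and task name are illustrative):

import requests
from celery import Celery

celery_app = Celery("tasks",
                    broker="amqp://guest@localhost//",
                    backend="redis://localhost:6379/0")

@celery_app.task(bind=True, max_retries=5, default_retry_delay=30)
def post_to_url(self, url, form_params):
    try:
        response = requests.post(url, data=form_params,
                                 timeout=(5.0, 30.0))  # (connect, read)
        response.raise_for_status()  # a non-2xx status raises HTTPError
    except requests.RequestException as exc:
        # Retried up to max_retries times; after that Celery raises
        # MaxRetriesExceededError and the task gives up.
        raise self.retry(exc=exc)
    return response.status_code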
You can schedule new tasks via a REST API if you run Celery Flower. It has a REST API (see the documentation), in particular a POST endpoint to schedule a task.
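For example, Flower exposes POST /api/task/async-apply/<task-name>; the host, port, and task name below are illustrative:

curl -X POST -H "Content-Type: application/json" \
     -d '{"args": ["https://example.com/hook", {"field": "value"}]}' \
     http://localhost:5555/api/task/async-apply/tasks.post_to_url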
Yes: create an I/O class that handles your HTTP requests and processing.
Read about Celery tasks, and remember to set timeouts on your I/O operations (e.g. connect_timeout=5.0, read_timeout=30.0) so they don't block your workers.
There are precise examples of using requests inside Celery worker tasks.
You can use the Flower REST API to do the same. Flower is primarily a monitoring tool for Celery, but it also comes with a REST API for adding tasks:
https://flower.readthedocs.io/en/latest/index.html

Workers in Django running on Heroku hang on POST when the client has a flaky connection

We are running a Django/Gunicorn server on Heroku. Most of our users are in a country where the mobile network is not that great, so they frequently have flaky connections.
Most of our requests are "raw posts" from mobile devices, and it seems that even when the POST request is not fully transmitted, the request is already handed off to a gunicorn worker. When the worker tries to process the request and read the data, it simply hangs waiting for the remaining data. While this behaviour makes sense for reading file/image data in "streaming" mode, it makes no sense in our case: all our posts are relatively small and could easily be read by the web server as a whole and only then forwarded to our gunicorn worker.
This early handoff causes trouble when we have many such requests in parallel, because all the workers might get blocked. Currently we solve the problem by increasing the number of workers/dynos, but that is pretty costly. I could not find any way to force either the web server or gunicorn to wait and only forward the request to a worker once it has been fully transmitted.
Is there a way to make heroku's web server/gunicorn only transfer the request to a gunicorn worker when it has been fully transmitted from the client side (fully received by the server)?
Some example code (we've added newrelic 'per-instruction' tracing to make sure that this is the exact line that causes the problem):
# Assumed imports: newrelic.agent for tracing, sync_pb2 for the generated protobuf module.
import newrelic.agent as agent
import sync_pb2

def syncGameState(request):
    transaction = agent.current_transaction()
    with agent.FunctionTrace(transaction, "syncGameState_raw_post_data", 'Python/EndPoint'):
        data = request.raw_post_data
    with agent.FunctionTrace(transaction, "syncGameState_gameStateSyncRequest", 'Python/EndPoint'):
        sync_request = sync_pb2.gameStateSyncRequest()
    with agent.FunctionTrace(transaction, "syncGameState_ParseFromString", 'Python/EndPoint'):
        sync_request.ParseFromString(data)
Here are the New Relic measurements for this example slow request (it was a POST with 7K of data). Reading the POST takes 99% of the method time....
It seems to me that the real issue here is that gunicorn is blocking. By default, gunicorn uses synchronous workers to run your tasks, which means that when a web request hits gunicorn, a worker is tied up until a response has been returned; in your case, that is a long time.
To get around this issue, you can use gevent with gunicorn to do non-blocking I/O. Since most of your time is spent on I/O, this will let gunicorn handle many more web requests in parallel.
To use gevent with gunicorn, install gevent (pip install -U gevent) and add -k gevent to your gunicorn startup command (this tells gunicorn to use the gevent worker class).
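For example (the WSGI module path is illustrative):
gunicorn -k gevent -w 4 myproject.wsgi:application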
You might want to give this article a read and investigate a request-buffering HTTP server such as Waitress.
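Waitress buffers the full request body before handing it to the application, so a slow client cannot tie up an application thread. A minimal sketch of swapping it in (the module path is illustrative):
pip install waitress
waitress-serve --port=8000 myproject.wsgi:application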
