Is there a way to use Celery to do the following?
Queue an HTTP call to an external URL with form parameters (an HTTP POST to the URL).
Handle the HTTP response from the external URL (200, 404, 400, etc.): if the response is an error (non-2xx), retry a certain number of times and then give up.
Add a task / job / work item to the Celery queue through a REST API, passing the URL to call and the form parameters.
For that, you need to create a task in your Celery application that performs the request for you and returns the result.
Handling errors and retries can be done within the code of your task, or can alternatively be taken care of by Celery if you schedule the task with the right arguments: see the arguments of .apply_async().
You can schedule new tasks via a REST API if you run Celery Flower. It has a REST API (see documentation), in particular a POST endpoint to schedule a task.
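As a rough sketch of the "retries within the code of your task" option, the body of such a task could look like the following. It uses only the standard library; the function names post_form and post_with_retries, the default retry count, and the delay are assumptions made for illustration, and the Celery task decorator that would wrap this is left out.

```python
import time
import urllib.error
import urllib.parse
import urllib.request


def post_form(url, params, timeout=30.0):
    """POST form-encoded params and return the HTTP status code."""
    body = urllib.parse.urlencode(params).encode("ascii")
    request = urllib.request.Request(url, data=body, method="POST")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # urllib raises for 4xx/5xx responses; the status code is still available.
        # Connection failures (URLError) are not handled here and propagate.
        return exc.code


def post_with_retries(url, params, max_retries=3, delay=1.0, send=post_form):
    """Call send() until it returns a 2xx status or the retries run out."""
    status = None
    for attempt in range(max_retries + 1):
        status = send(url, params)
        if 200 <= status < 300:
            return status
        if attempt < max_retries:
            time.sleep(delay)  # fixed pause between attempts (an assumption)
    raise RuntimeError(
        f"giving up after {max_retries} retries, last status {status}"
    )
```

The send parameter exists only so the retry loop can be exercised without a network; inside a real task you would call post_with_retries(url, params) and let the final exception mark the task as failed.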
Yes, create an I/O class that handles your HTTP requests and processes the responses.
Read about Celery tasks and remember to set timeouts on your I/O operations (for example connect_timeout=5.0 and read_timeout=30.0; with requests this is written timeout=(5.0, 30.0)) so that they do not block your workers.
There is a precise example of using requests in Celery worker tasks.
You can use the Flower REST API to do the same. Flower is a monitoring tool for Celery, but it also comes with a REST API for adding tasks and more.
https://flower.readthedocs.io/en/latest/index.html
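For illustration, a request to Flower's task-scheduling endpoint might be built as below. This is a sketch under assumptions: the Flower address http://localhost:5555, the task name tasks.post_form, and its keyword arguments are all hypothetical, and a real deployment may additionally require authentication. Check the endpoint path and payload against the Flower documentation for your version.

```python
import json
import urllib.request


def build_flower_request(flower_url, task_name, args=None, kwargs=None):
    """Build (but do not send) a POST to Flower's async-apply endpoint."""
    payload = {"args": args or [], "kwargs": kwargs or {}}
    return urllib.request.Request(
        f"{flower_url}/api/task/async-apply/{task_name}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical task name and arguments, for illustration only.
request = build_flower_request(
    "http://localhost:5555",
    "tasks.post_form",
    kwargs={"url": "https://example.com/hook", "params": {"a": "1"}},
)
# urllib.request.urlopen(request) would submit it to a running Flower instance.
```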
How can I queue multiple API requests in FastAPI?
For example, if there is an API that takes 10 seconds to complete a request and 10 requests arrive at the same time, how can we handle that?
If anyone can help with this, it would be a great help.
Thanks
I tried using Selenium with FastAPI and Gunicorn. It works fine with one request at a time, but when I send multiple requests, Selenium generates an error.
So I'm trying to find a solution like queue processing in FastAPI.
Queueing is done by the server (Gunicorn in your case), not by FastAPI.
Gunicorn has a backlog where requests are queued. You can control it with the --backlog=N switch. The default backlog size is 2048, so the error is probably due to a timeout on the client. Try increasing that timeout.
You can spawn multiple workers in Gunicorn with the --workers=N switch. If your worker is CPU-bound, you should set N to the number of available cores. It's a better solution.
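The same two settings can also live in a Gunicorn configuration file, which is a plain Python file. A minimal sketch, assuming a file named gunicorn.conf.py and a CPU-bound application:

```python
# gunicorn.conf.py (the file name is an assumption)
import multiprocessing

# One worker per available core, as suggested for CPU-bound work.
workers = multiprocessing.cpu_count()

# Maximum number of pending connections; 2048 is Gunicorn's default.
backlog = 2048
```

The file would be passed to Gunicorn with the -c switch, for example gunicorn -c gunicorn.conf.py followed by your application path.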
And if your worker is I/O-bound, use an asynchronous model to process more requests at once. This is the best solution.
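To illustrate the asynchronous model, here is a small standalone sketch using asyncio. It is not FastAPI code: handle_request is a hypothetical stand-in for a slow I/O call, and the limit of 5 concurrent requests is an assumption. The semaphore shows one way to cap how many requests touch a limited resource (such as a browser session) at once, while the rest wait their turn.

```python
import asyncio
import time


async def handle_request(request_id, limit, delay=0.1):
    # The semaphore caps how many requests do slow work at the same time.
    async with limit:
        await asyncio.sleep(delay)  # stand-in for a slow I/O call
        return request_id


async def main(n_requests=10, max_concurrent=5):
    limit = asyncio.Semaphore(max_concurrent)
    started = time.perf_counter()
    # gather() runs all coroutines concurrently and keeps the input order.
    results = await asyncio.gather(
        *(handle_request(i, limit) for i in range(n_requests))
    )
    return results, time.perf_counter() - started


results, elapsed = asyncio.run(main())
```

With these numbers the ten requests are served in two batches of five, so the total time is close to two delays rather than ten.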
My Flask project takes in orders as POST requests from multiple online stores, saves those orders to a database, and forwards the purchase information to a service which delivers the product. Sometimes, the product is not set up in the final service and the request sits in my service's database in an "unresolved" state.
When the product is set up in the final service, I want to kick off a long-running (maybe a minute) process to send all "unresolved" orders to the final service. During this process, will Flask still be able to receive orders from the stores and continue processing as normal? If not, do I need to offload this to a task runner like rq?
I'm not worried about speed as much as I am about consistency. The items being purchased are tickets to a live event so as long as the order information is passed along before the event begins, it should make no difference to the customer.
There are a few different answers that are all valid in different situations. The quick answer is that a job queue like RQ is usually the right solution, especially in the long run as your project grows.
As long as the WSGI server has workers available, another request can be handled. Each worker handles one request at a time. The development server uses threads, so an unlimited number of workers are available (with the performance constraints of threads in Python). Production servers like Gunicorn can use multiple workers, and different types of workers such as threads, processes, or eventlets. If you want to run a task in response to an HTTP request and wait until the task is finished to send a response, you'll need enough workers to block on those tasks along with handling regular requests.
@app.route("/admin/send-purchases")
def send_purchases():
    ...  # do stuff, wait for it to finish
    return "success"
However, the task you're describing seems like a cleanup task that should be run regardless of HTTP requests from a user. In that case, you should write a Flask CLI command and call it using cron or another scheduling system.
import click

@app.cli.command()
def send_purchases():
    ...
    click.echo("done")
# crontab hourly job
0 * * * * env FLASK_APP=myapp /path/to/venv/bin/flask send-purchases
If you do want a user to initiate the task, but don't want to block a worker waiting for it to finish, then you want a task queue such as RQ or Celery. You could make a CLI command that submits the job too, to be able to trigger it on request and on a schedule.
@rq.job
def send_purchases():
    ...

@app.route("/admin/send-purchases", endpoint="send_purchases")
def send_purchases_view():
    send_purchases.queue()  # enqueue the job and return immediately
    return "started"

@app.cli.command("send-purchases")
def send_purchases_command():
    send_purchases.queue()
    click.echo("started")
Flask's development server will spawn a new thread for each request. Similarly, production servers can be started with multiple workers.
You can run your app with gunicorn or similar with multiple processes. For example with four process workers:
gunicorn -w 4 app:app
For example with eventlet workers:
gunicorn -k eventlet app:app
See the docs on deploying in production as well: https://flask.palletsprojects.com/en/1.1.x/deploying/
Following this guide (the "Complex Example: Showing Status Updates and Results" section), I have two Flask endpoints: one for starting a bound task (POST request) and one for retrieving a task's result by its ID (GET request to /status/<task_id>).
Running the Flask app with flask run in one shell and celery worker -A app.celery -l info in another, I can run a task and then get its result with a GET request to the /status/ endpoint.
After adding Gunicorn and setting the number of workers to 3, POST requests run normally, but getting the status of a specific running task is a problem, because the task cannot be found (task.info is None). There is a chance of getting the task result from the status endpoint, but if I understand the problem correctly, it depends on which Flask instance Gunicorn routes the request to.
I don't set any specific Celery settings, only broker and result_backend (using RabbitMQ).
How do I correctly configure Gunicorn + Flask + Celery for this sort of task?
Fixed by using Redis as the result backend (or any backend other than RPC, I believe).
According to the documentation:
The RPC result backend (rpc://) is special as it doesn’t actually store the states, but rather sends them as messages. This is an important difference as it means that a result can only be retrieved once, and only by the client that initiated the task. Two different processes can’t wait for the same result.
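As a sketch of that fix, a Celery configuration module could keep RabbitMQ as the broker and switch only the result backend to Redis. The module name celeryconfig.py, the hosts, ports, and credentials below are assumptions for a local setup.

```python
# celeryconfig.py (module name and URLs are assumptions for a local setup)

# RabbitMQ stays the message broker.
broker_url = "amqp://guest:guest@localhost:5672//"

# Redis stores task states, so any Gunicorn worker process can read them.
result_backend = "redis://localhost:6379/0"
```

The module would be loaded with app.config_from_object("celeryconfig"); because the states are stored in Redis rather than sent back as messages, every Flask process sees the same result for a given task ID.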
I have a Flask application running on a Google Cloud Function that receives a Webhook from Shopify when an order is created. The problem is I'm timing out very often, here's what I mean by that:
@app.route('/', methods=['POST'])
def connectToSheets(request):
    print('Webhook received...')
    # Verify request is coming from Shopify
    data = request.data
    hmac_header = request.headers.get('X-Shopify-Hmac-SHA256')
    verify_webhook(data, hmac_header)
    print('Request validated...')
    # Do some stuff...
Shopify's docs state that there is a 5-second timeout period and a retry period for subscriptions. After I validate the request, there is quite a lot of code, so I'm timing out almost every time.
Is there a way I can send a 200 status code to Shopify after I validate the Webhook and before I start processing the Webhook? Or is there a work-around to this?
One way to do this entirely within Cloud Functions is to set up two functions:
one that handles the initial request
a second one that does the processing and then follows up with the response
In addition to handling the initial request, the first function also invokes the second function via Cloud Pub/Sub.
See https://dev.to/googlecloud/getting-around-api-timeouts-with-cloud-functions-and-cloud-pub-sub-47o3 for a complete example (this uses Slack's webhook, but the behavior should be similar).
I used to face the same issue. We moved the processing code from being executed inline to being executed in a background task using Celery and RabbitMQ. RabbitMQ was used for queue management. You can also use Redis for queue management.
Celery - https://docs.celeryproject.org/en/stable/getting-started/index.html
RabbitMQ - https://www.rabbitmq.com/documentation.html
Asynchronous Tasks Using Flask, Redis, and Celery - https://stackabuse.com/asynchronous-tasks-using-flask-redis-and-celery/
How to Set Up a Task Queue with Celery and RabbitMQ - https://www.linode.com/docs/development/python/task-queue-celery-rabbitmq/
When I get a GET request from a user, I send them the response and then spend maybe a second logging stuff about that request. Is there a way to close the connection when I have the response ready, but continue doing that logging part, so that the user wouldn't have to wait for it to complete?
From the Google App Engine docs for the Response object:
App Engine does not support sending data to the user's browser before exiting the handler. Some web servers use this technique to "stream" data to the user's browser over a period of time in response to a single request. App Engine does not support this streaming technique.
So there's no easy way. If you have a bundle of data that you can pass to a longer-running "process and log" method, try using the deferred library. Note that this will require bundling your data up and sending it to the task queue for processing and logging, so
you may not save much time, and
the results may not look much like you'd want - for example, you'd be logging from a different request, so might need to radically alter the logging
Still, you could try.
You have two options:
Use the Task Queue API. Enqueueing a task should be fast, so long as you have less than 10k of data (which is the limit on a Task Queue payload).
Use the 'sneaky' trick described by Rafe in this video to do processing after the response completes.