I currently have a route in a Flask app that pulls data from an external server and then pushes the results to the front end. The external server is occasionally slow or unresponsive. What's the best way to place a timeout on the route call, so that the front end doesn't hang if the external server is lagging? Or is there a more appropriate way to handle this situation in Flask (not Apache, nginx, etc)?
UPDATE: My goal is to timeout a route call, not keep an arbitrary long process alive like this SO question: Time out issues with chrome and flask. Options like websockets run background processes/threads until they finish; however, I want to stop a slow route call after some fixed amount of time has elapsed. Like Timeout on a function call and Python Timeout but within a Flask context. Celery's task decorator (Concurrent asynchronous processes with Python, Flask and Celery) seems like a great solution, but I don't want to require a large dependency to only use a small amount of its functionality.
I reopened this question here: Place a timeout on calls to an unresponsive Flask route (updated).
Related
I deployed a Django app on Heroku. I have a function (inside views) in my app that take some time (3m-5m) before it returns.
The problem is that function doesn't return when the app is deployed to Heroku. On my PC it works fine.
Heroku is not giving me useful feedback. There is no 'timeout' or anything in the logs.
Three to five minutes is way too long for a request to take. Heroku will kill such requests:
Best practice is to get the response time of your web application to be under 500ms, this will free up the application for more requests and deliver a high quality user experience to your visitors. Occasionally a web request may hang or take an excessive amount of time to process by your application. When this happens the router will terminate the request if it takes longer than 30 seconds to complete.
I'm not sure why you aren't seeing timeouts in the logs, but if you truly need that much time to compute something you'll need to do it asynchronously.
There are lots of ways to do that, e.g. you could queue the work and then respond immediately with a "loading" state, then poll the back-end and update the view when the result is ready.
Start by reading Worker Dynos, Background Jobs and Queueing and then decide how you wish to proceed. We can't tell you the "right" way of doing this; it's something you need to decide about your application.
I have just started with Python, although I have been programming in other languages over the past 30 years. I wanted to keep my first application simple, so I started out with a little home automation project hosted on a Raspberry Pi.
I got my code to work fine (controlling a valve, reading a flow sensor and showing some data on a display), but when I wanted to add some web interactivity it came to a sudden halt.
Most articles I have found on the subject suggest to use the Flask framework to compose dynamic web pages. I have tried, and understood, the basics of Flask, but I just can't get around the issue that Flask is blocking once I call the "app.run" function. The rest of my python code waits for Flask to return, which never happens. I.e. no more water flow measurement, valve motor steering or display updating.
So, my basic question would be: What tool should I use in order to serve a simple dynamic web page (with very low load, like 1 request / week), in parallel to my applications main tasks (GPIO/Pulse counting)? All this in the resource constrained environment of a Raspberry Pi (3).
If you still suggest Flask (because it seems very close to target), how should I arrange my code to keep handling the real-world events, such as mentioned above?
(This last part might be tough answering without seeing the actual code, but maybe it's possible answering it in a "generic" way? Or pointing to existing examples that I might have missed while searching.)
You're on the right track with multithreading. If your monitoring code runs in a loop, you could define a function like
def monitoring_loop():
while True:
# do the monitoring
Then, before you call app.run(), start a thread that runs that function:
import threading
from wherever import monitoring_loop
monitoring_thread = threading.Thread(target = monitoring_loop)
monitoring_thread.start()
# app.run() and whatever else you want to do
Don't join the thread - you want it to keep running in parallel to your Flask app. If you joined it, it would block the main execution thread until it finished, which would be never, since it's running a while True loop.
To communicate between the monitoring thread and the rest of the program, you could use a queue to pass messages in a thread-safe way between them.
The way I would probably handle this is to split your program into two distinct separately running programs.
One program handles the GPIO monitoring and communication, and the other program is your small Flask server. Since they run as separate processes, they won't block each other.
You can have the two processes communicate through a small database. The GIPO interface can periodically record flow measurements or other relevant data to a table in the database. It can also monitor another table in the database that might serve as a queue for requests.
Your Flask instance can query that same database to get the current statistics to return to the user, and can submit entries to the requests queue based on user input. (If the GIPO process updates that requests queue with the current status, the Flask process can report that back out.)
And as far as what kind of database to use on a little Raspberry Pi, consider sqlite3 which is a very small, lightweight file-based database well supported as a standard library in Python. (It doesn't require running a full "database server" process.)
Good luck with your project, it sounds like fun!
Hi i was trying the connection with dronekit_sitl and i got the same issue , after 30 seconds the connection was closed.To get rid of that , there are 2 solutions:
You use the decorator before_request:in this one you define a method that will handle the connection before each request
You use the decorator before_first_request : in this case the connection will be made once the first request will be called and the you can handle the object in the other route using a global variable
For more information https://pythonise.com/series/learning-flask/python-before-after-request
My Flask application will receive a request, do some processing, and then make a request to a slow external endpoint that takes 5 seconds to respond. It looks like running Gunicorn with Gevent will allow it to handle many of these slow requests at the same time. How can I modify the example below so that the view is non-blocking?
import requests
#app.route('/do', methods = ['POST'])
def do():
result = requests.get('slow api')
return result.content
gunicorn server:app -k gevent -w 4
If you're deploying your Flask application with gunicorn, it is already non-blocking. If a client is waiting on a response from one of your views, another client can make a request to the same view without a problem. There will be multiple workers to process multiple requests concurrently. No need to change your code for this to work. This also goes for pretty much every Flask deployment option.
First a bit of background, A blocking socket is the default kind of socket, once you start reading your app or thread does not regain control until data is actually read, or you are disconnected. This is how python-requests, operates by default. There is a spin off called grequests which provides non blocking reads.
The major mechanical difference is that send, recv, connect and accept
can return without having done anything. You have (of course) a number
of choices. You can check return code and error codes and generally
drive yourself crazy. If you don’t believe me, try it sometime
Source: https://docs.python.org/2/howto/sockets.html
It also goes on to say:
There’s no question that the fastest sockets code uses non-blocking
sockets and select to multiplex them. You can put together something
that will saturate a LAN connection without putting any strain on the
CPU. The trouble is that an app written this way can’t do much of
anything else - it needs to be ready to shuffle bytes around at all
times.
Assuming that your app is actually supposed to do something more than
that, threading is the optimal solution
But do you want to add a whole lot of complexity to your view by having it spawn it's own threads. Particularly when gunicorn as async workers?
The asynchronous workers available are based on Greenlets (via
Eventlet and Gevent). Greenlets are an implementation of cooperative
multi-threading for Python. In general, an application should be able
to make use of these worker classes with no changes.
and
Some examples of behavior requiring asynchronous workers: Applications
making long blocking calls (Ie, external web services)
So to cut a long story short, don't change anything! Just let it be. If you are making any changes at all, let it be to introduce caching. Consider using Cache-control an extension recommended by python-requests developers.
You can use grequests. It allows other greenlets to run while the request is made. It is compatible with the requests library and returns a requests.Response object. The usage is as follows:
import grequests
#app.route('/do', methods = ['POST'])
def do():
result = grequests.map([grequests.get('slow api')])
return result[0].content
Edit: I've added a test and saw that the time didn't improve with grequests since gunicorn's gevent worker already performs monkey-patching when it is initialized: https://github.com/benoitc/gunicorn/blob/master/gunicorn/workers/ggevent.py#L65
Am running an app with Flask , UWSGI and Nginx. My UWSGI is set to spawn out 4 parallel processes to handle multiple requests at the same time. Now I have one request that takes lot of time and that changes important data concerning the application. So, when one UWSGI process is processing that request and say all others are also busy, the fifth request would have to wait. The problem here is I cannot change this request to run in an offline mode as it changes important data and the user cannot simply remain unknown about it. What is the best way to handle this situation ?
As an option you can do the following:
Separate the heavy logic from the function which is being called
upon #route and move it into a separate place (a file, another
function, etc)
Introduce Celery to run that pieces of heavy logic
(it will be processed in a separate thread from the #route-decorated functions).
A quick way of doing this is using Redis as a message broker.
Schedule the time-consuming functions from your #route-decorated
functions in Celery (it is possible to pass parameters as well)
This way the HTTP requests won't be blocked for the complete function execution time.
I am working on a django web app that has functions (say for e.g. sync_files()) that take a long time to return. When I use gevent, my app does not block when sync_file() runs and other clients can connect and interact with the webapp just fine.
My goal is to have the webapp responsive to other clients and not block. I do not expect a zillion users to connect to my webapp (perhaps max 20 connections), and I do not want to set this up to become the next twitter. My app is running on a vps, so I need something light weight.
So in my case listed above, is it redundant to use celery when I am using gevent? Is there a specific advantage to using celery? I prefer not to use celery since it is yet another service that will be running on my machine.
edit: found out that celery can run the worker pool on gevent. I think I am a litle more unsure about the relationship between gevent & celery.
In short you do need a celery.
Even if you use gevent and have concurrency, the problem becomes request timeout. Lets say your task takes 10 minutes to run however the typical request timeout is about up to a minute. So what will happen if you trigger the task directly within a view is that the server will start processing it however after a minute a client (browser) will probably disconnect the connection since it will think the server is offline. As a result, your data can become corrupt since you cannot be guaranteed what will happen when connection will close. Celery solves this because it will trigger a background process which will process the task independent of the view. So the user will get the view response right away and at the same time the server will start processing the task. That is a correct pattern to handle any scenarios which require lots of processing.