I have a very basic question; maybe I just don't get it, or I simply need confirmation.
Let's say I set up a flask app and let it run using gunicorn.
I use --workers=2 and --threads=2, so I can serve 4 requests in parallel.
Now let's say a client makes 4 parallel requests, each of which triggers a requests.get in the Flask app that needs 5 seconds to get a response (in theory). A fifth client call will have to wait for one of the other 4 to be finished before it is even started in the backend (and will then take another 5 seconds to execute).
Now my question: when I switch to --worker-class gevent, will it help me handle more parallel requests without adapting the code? If I understand correctly, I need to properly use async library calls to take advantage of gevent and reach a maximum of, say, 1000 parallel requests, right? Am I right in saying: if the code keeps simply doing requests.get (or a sleep or whatever) without async client libs, the fifth request will still be blocked?
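For reference, the two setups I'm comparing would look roughly like this (the app:app module name is just an example):

    # 2 workers x 2 threads = 4 requests served in parallel
    gunicorn --workers=2 --threads=2 app:app

    # gevent worker; concurrency is capped by --worker-connections (default 1000)
    gunicorn --workers=2 --worker-class gevent --worker-connections 1000 app:app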
Thank you!
(I've never worked with asyncio and coroutines, so I'm sorry.)
Related
I've been into Python/Django and things are going well, but I'm very curious about how Django handles asynchronous tasks. For example, what if there is a database query that takes 10 seconds to execute and another request arrives at the server during those 10 seconds?
def my_api(id):
    do_something()
    user = User.objects.get(id=id)  # takes 10 seconds to execute
    do_something_with_user()
I've used a queryset method that fetches something from the database, and let's say it takes 10 seconds to complete. What happens when another request arrives at the Django server? How does Django respond to it? Will the request be pending until the first query responds, or will it be handled in parallel? What is the deeper concept here?
How does Python handle these kinds of things under the hood?
Handling requests is the main job of the web server you are using. Django deployments by default sit behind a WSGI/ASGI server.
Every request coming into those web servers gets a unique session and is handled in a separate thread. So there are no conflicts, but there can be race conditions (on OS resources, queues, stacks, ...).
So consider a function that has a 2-second sleep inside.
import time

def function1():
    time.sleep(2)
User 1 and user 2 both request this function through the API (e.g. GET test/API).
Two threads start, one for each request, and if both start at roughly 0:00:00, then both finish at roughly 0:00:02.
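A quick way to see that timing outside a web server is a sketch with plain threads:

    import threading
    import time

    def function1():
        time.sleep(2)

    start = time.monotonic()
    threads = [threading.Thread(target=function1) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Both threads sleep simultaneously, so this prints roughly 2.0, not 4.0
    print(f"elapsed: {time.monotonic() - start:.1f}s")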
So how about async? Async requests work concurrently with each other (not in parallel).
Assume another API endpoint has to call that sleepy function twice (e.g. GET test/API2).
The first call would take 2 seconds (an async function with await) and the second call 2 seconds too, but awaited concurrently. If we called this at 0:00:00, it would finish around 0:00:02 again. Done synchronously, it would take until 0:00:04.
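Roughly like this with asyncio (a sketch of the timing only, not actual view code):

    import asyncio

    async def sleepy():
        await asyncio.sleep(2)  # yields control while waiting instead of blocking

    async def main():
        # Both calls are awaited concurrently: total ~2 seconds, not ~4
        await asyncio.gather(sleepy(), sleepy())

    asyncio.run(main())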
So, finally, how about the database?
There are many ways to handle that, and the popular one is database connection pooling. Generally, without a pool, each query request makes a new connection (one per thread) to the database and runs its work over it.
A connection pool reduces runtime because it keeps some ready-to-use connections to the database: whenever there is a job to do, it borrows a connection, runs the query, and returns the connection to the pool. Without a pool, a connection lives only as long as a single job and is closed as soon as that job is done.
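A minimal pooling sketch with psycopg2 (the connection parameters are placeholders):

    from psycopg2 import pool

    # Keep a few ready-to-use connections instead of opening one per query
    db_pool = pool.SimpleConnectionPool(
        minconn=1, maxconn=5,
        dbname="mydb", user="me", password="secret", host="localhost",
    )

    conn = db_pool.getconn()  # borrow an already-open connection
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT 1")
    finally:
        db_pool.putconn(conn)  # return it to the pool; it stays open for reuse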
My Flask application will receive a request, do some processing, and then make a request to a slow external endpoint that takes 5 seconds to respond. It looks like running Gunicorn with Gevent will allow it to handle many of these slow requests at the same time. How can I modify the example below so that the view is non-blocking?
import requests

@app.route('/do', methods=['POST'])
def do():
    result = requests.get('slow api')
    return result.content
gunicorn server:app -k gevent -w 4
If you're deploying your Flask application with gunicorn, it is already non-blocking. If a client is waiting on a response from one of your views, another client can make a request to the same view without a problem. There will be multiple workers to process multiple requests concurrently. No need to change your code for this to work. This also goes for pretty much every Flask deployment option.
First, a bit of background: a blocking socket is the default kind of socket; once you start reading, your app or thread does not regain control until data is actually read or you are disconnected. This is how python-requests operates by default. There is a spin-off called grequests which provides non-blocking reads.
The major mechanical difference is that send, recv, connect and accept can return without having done anything. You have (of course) a number of choices. You can check return code and error codes and generally drive yourself crazy. If you don’t believe me, try it sometime.
Source: https://docs.python.org/2/howto/sockets.html
It also goes on to say:
There’s no question that the fastest sockets code uses non-blocking sockets and select to multiplex them. You can put together something that will saturate a LAN connection without putting any strain on the CPU. The trouble is that an app written this way can’t do much of anything else - it needs to be ready to shuffle bytes around at all times. Assuming that your app is actually supposed to do something more than that, threading is the optimal solution.
But do you want to add a whole lot of complexity to your view by having it spawn its own threads? Particularly when gunicorn has async workers?
The asynchronous workers available are based on Greenlets (via Eventlet and Gevent). Greenlets are an implementation of cooperative multi-threading for Python. In general, an application should be able to make use of these worker classes with no changes.
and
Some examples of behavior requiring asynchronous workers: applications making long blocking calls (i.e., external web services).
So, to cut a long story short: don't change anything! Just let it be. If you make any changes at all, let it be to introduce caching. Consider using CacheControl, an extension recommended by the python-requests developers.
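A sketch of that caching approach (the 'slow api' URL is the placeholder from the question):

    import requests
    from cachecontrol import CacheControl  # pip install CacheControl

    sess = CacheControl(requests.Session())
    # Responses that carry caching headers are served from a local cache
    # instead of hitting the slow endpoint again
    result = sess.get('slow api')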
You can use grequests. It allows other greenlets to run while the request is made. It is compatible with the requests library and returns a requests.Response object. The usage is as follows:
import grequests

@app.route('/do', methods=['POST'])
def do():
    result = grequests.map([grequests.get('slow api')])
    return result[0].content
Edit: I've added a test and saw that the time didn't improve with grequests since gunicorn's gevent worker already performs monkey-patching when it is initialized: https://github.com/benoitc/gunicorn/blob/master/gunicorn/workers/ggevent.py#L65
I am running an app with Flask, uWSGI, and Nginx. My uWSGI is set to spawn 4 parallel processes to handle multiple requests at the same time. Now I have one request that takes a lot of time and changes important data concerning the application. So when one uWSGI process is handling that request, and say all the others are also busy, a fifth request has to wait. The problem is that I cannot push this request into an offline mode, since it changes important data and the user cannot simply be left unaware of it. What is the best way to handle this situation?
As an option, you can do the following:
1. Separate the heavy logic from the function that is called upon @route and move it into a separate place (a file, another function, etc.).
2. Introduce Celery to run those pieces of heavy logic (they will be processed in a separate worker, outside the @route-decorated functions). A quick way of doing this is using Redis as a message broker.
3. Schedule the time-consuming functions from your @route-decorated functions in Celery (it is possible to pass parameters as well), as sketched below.
This way the HTTP requests won't be blocked for the complete function execution time.
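A minimal sketch of that layout (the task name, function names, and Redis broker URL are all assumptions):

    # tasks.py
    from celery import Celery

    celery_app = Celery('tasks', broker='redis://localhost:6379/0')

    @celery_app.task
    def heavy_update(record_id):
        ...  # the heavy logic, moved out of the view

    # In the Flask view, only schedule the task and return right away:
    # @app.route('/update/<int:record_id>', methods=['POST'])
    # def update(record_id):
    #     heavy_update.delay(record_id)  # returns immediately
    #     return 'Accepted', 202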
I have a REST API, and now I want to create a web site that will use this API as its only and primary data source. The system is distributed: the REST API is on one group of machines and the site will be on another (or others).
I'm expecting quite a lot of load, so I'd like to make requests as efficient as possible.
Do I need some async HTTP request library, or will any HTTP client library work?
The API is built with Flask; the web site will also be built with Flask, with Jinja as the template engine.
You could use gevent with Flask to get asynchronous I/O from normally synchronous libraries. See this question for an example of someone getting help with doing that.
You could also run Flask behind gunicorn, which has support for spawning multiple workers (threads, processes, or greenlets) for handling concurrent requests. If you were to take that approach, Flask would remain completely synchronous, and gunicorn would handle creating multiple Flask instances to handle concurrent requests.
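For example, something along these lines (assuming your Flask module is called app):

    gunicorn -k gevent -w 4 app:app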
Start simple and use whatever seems easiest for you. Consider optimizing later, and only if needed.
Async libraries would start to help once you have thousands of requests a second. Much sooner, you are likely to hit performance problems related to the database (if you use one), and those are not resolved by async magic.
Actually, your API is on a separate machine. Even if you make your client calls asynchronous, it will not have any impact on the server; it only means the thread in your client will not wait for the response. Your server reacts the same whether the call is sync or async.
And if you want to make your calls async, check http://stackandqueue.com/?p=57 . It uses unirest to make both GET and POST calls asynchronously.
Using Django (hosted by Webfaction), I have the following code
import time

from django.http import HttpResponse

def my_function(request):
    time.sleep(10)
    return HttpResponse("Done")
This is executed via Django when I go to my url, www.mysite.com
I enter the URL twice, in quick succession. The way I see it, both of these should finish after 10 seconds. However, the second call waits for the first one and finishes after 20 seconds.
If, however, I enter some dummy GET parameter, www.mysite.com?dummy=1 and www.mysite.com?dummy=2 then they both finish after 10 seconds. So it is possible for both of them to run simultaneously.
It's as though the scope of sleep() were somehow global. Maybe entering a parameter makes them run as different processes instead of the same one?
It is hosted by Webfaction. httpd.conf has:
KeepAlive Off
Listen 30961
MaxSpareThreads 3
MinSpareThreads 1
ServerLimit 1
SetEnvIf X-Forwarded-SSL on HTTPS=1
ThreadsPerChild 5
I do need to be able to use sleep() and trust that it isn't stopping everything. So, what's going on, and how do I fix it?
Edit: Webfaction runs this using Apache.
As Gjordis pointed out, sleep will pause the current thread. I have looked at Webfaction, and it looks like they are using WSGI to run the serving instance of Django. This means that every time a request comes in, Apache looks at how many worker processes (processes that each run an instance of Django) are currently running. If there are none, or too few, it spawns additional workers and hands the requests to them.
Here is what I think is happening in your situation:
- The first GET request for resource A comes in. Apache uses a running worker (or starts a new one).
- The worker sleeps 10 seconds.
- During this, a new request for resource A comes in. Apache sees it is requesting the same resource and sends it to the same worker as the first request. I guess the assumption here is that a worker that recently processed a request for a specific resource is likely to have some information cached/preprocessed/whatever, so it can handle the request faster.
- This results in a 20-second block, since there is only one worker that waits two times 10 seconds.
This behavior makes complete sense 99% of the time so it's logical to do this by default.
However, if you change the requested resource for the second request (by adding a GET parameter), Apache will assume it is a different resource and start another worker (since the first one is already "busy"; Apache cannot know that you are not doing any hard work). Since there are now two workers, both waiting 10 seconds, the total time goes down to 10 seconds.
Additionally, I assume that something is **wrong** with your design. There are almost no cases I can think of where it would be sensible not to respond to an HTTP request as fast as you can. After all, you want to serve as many requests as possible in the shortest amount of time, so sleeping for 10 seconds is the most counterproductive thing you can do. I would recommend that you create a new question and state what actual goal you are trying to achieve. I'm pretty sure there is a more sensible solution to this!
Assuming you run your Django server just with run(), by default this makes a single-threaded server. If you use sleep in a single-threaded process, the whole application freezes for that sleep time.
It may simply be that your browser is queuing the second request to be performed only after the first one completes. If you are opening your URLs in the same browser, try using two different ones (e.g. Firefox and Chrome), or try performing requests from the command line using wget or curl instead.
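For example, firing both requests in parallel from a shell sidesteps the browser's queuing entirely (the URL is the one from the question):

    curl "http://www.mysite.com/" &
    curl "http://www.mysite.com/" &
    wait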