I'm creating a REST API for an application using Falcon. When I send two or more requests to the API on different endpoints, they are not executed concurrently (one request has to finish before the next one is executed).
The problem comes from a POST endpoint that runs a complex machine learning process (it takes dozens of seconds to finish). The whole API is blocked while the process runs, because the endpoint waits for the process to complete before returning any results.
I'm using wsgiref simple_server to serve the requests:
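from wsgiref import simple_server

if __name__ == '__main__':
    # `app` is the Falcon application object defined earlier in this module
    httpd = simple_server.make_server('127.0.0.1', 8000, app)
    httpd.serve_forever()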
Is there any way to run requests in parallel so that multiple requests are served at the same time?
Probably the server is not running in multiprocess or multithreaded mode.
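To check that diagnosis locally: the default wsgiref WSGIServer handles one request at a time. Here is a minimal sketch of a threaded variant, assuming the standard-library socketserver.ThreadingMixIn is acceptable for development use. The names ThreadingWSGIServer and make_threaded_server are illustrative and not part of Falcon or wsgiref.

```python
import socketserver
from wsgiref import simple_server


class ThreadingWSGIServer(socketserver.ThreadingMixIn, simple_server.WSGIServer):
    """WSGIServer that handles each request in its own thread (illustrative name)."""
    daemon_threads = True  # do not block interpreter exit on in-flight requests


def make_threaded_server(app, host='127.0.0.1', port=8000):
    """Build a wsgiref server that serves requests concurrently (hypothetical helper)."""
    return simple_server.make_server(host, port, app, server_class=ThreadingWSGIServer)


# Usage, with `app` being the Falcon application object:
#     httpd = make_threaded_server(app)
#     httpd.serve_forever()
```

Threads only help while a request is waiting; CPU-heavy Python code in one request can still slow the others down.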
But even if it were, it is not a good idea to tie up the web server with long-running tasks. Long-running tasks should be run by separate worker processes.
Take a look at Celery.
zaher, ideally you should use Celery, as giorgosp mentions, but if the API request has to return the result itself, you can use Gunicorn:
gunicorn --workers 3 -b localhost:8000 main:app --reload
In the command above I have specified 3 workers, so up to 3 requests can be processed at a time.
A common rule of thumb for the number of workers is
cpu_count * 2 + 1
You can use any port number you like, but make sure that it is above 1024 and it's not used by any other program.
The main:app option tells Gunicorn to invoke the application object app available in the file main.py.
Gunicorn provides an optional --reload switch that tells Gunicorn to detect any code changes on the fly. This way you can change your code without having to restart Gunicorn.
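The same options can also be kept in a Gunicorn configuration file, which is plain Python. This is a minimal sketch, assuming the file is saved as gunicorn.conf.py (a name chosen here) and passed with -c. bind, workers and reload are Gunicorn setting names; the values mirror the command above, except that the worker count comes from the formula.

```python
# gunicorn.conf.py (assumed file name); run with: gunicorn -c gunicorn.conf.py main:app
import multiprocessing

bind = 'localhost:8000'                        # same address as the -b option
workers = multiprocessing.cpu_count() * 2 + 1  # rule of thumb quoted above
reload = True                                  # development only: restart workers on code changes
```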
And if this approach is not suitable for your needs, then I think you should use Tornado instead of Falcon.
Let me know if any further clarification is needed.
This can be easily achieved by running Falcon under Gunicorn. With Gunicorn, multi-threading/multi-processing is relatively easy and does not require Celery (although nothing is stopping you from using it. Celery is awesome!)
gunicorn -b localhost:8000 main:app --threads 3 --workers 3 --reload
The above command will spin up 3 workers, each with 3 threads. You as a developer can tweak the number of workers and threads. I would strongly advise understanding the difference between multithreading and multiprocessing before tweaking these settings.
Related
I have deployed my Flask app on AWS using the Gunicorn server.
This is my Gunicorn configuration in the Dockerfile:
CMD gunicorn api:app -w 1 --threads 2 -b 0.0.0.0:8000
It's clear that I have one worker process and that the worker has 2 threads. The problem I was facing is that the server sometimes got stuck, meaning it stopped processing requests; when I redeployed the app, it started processing requests again.
I can increase the number of threads or the number of workers to work around this issue. But my question is how to get information about the threads running in Gunicorn, that is, which thread is processing which request.
Thanks in advance!
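One way to see what each thread is doing, sketched with the standard library only (this is not a Gunicorn feature): dump the current stack of every Python thread in the worker process. The function name dump_thread_stacks is hypothetical, and sys._current_frames() is a CPython-specific call. The function could be called from a debug endpoint or a signal handler inside the worker.

```python
import sys
import threading
import traceback


def dump_thread_stacks():
    """Return a text report with the current stack of every live Python thread."""
    names = {t.ident: t.name for t in threading.enumerate()}
    report = []
    for ident, frame in sys._current_frames().items():
        report.append(f'Thread {names.get(ident, "?")} (id {ident}):')
        report.extend(line.rstrip() for line in traceback.format_stack(frame))
    return '\n'.join(report)
```

For ordinary request logs, the logging module's %(threadName)s format attribute tags each log line with the thread that produced it.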
I am using Django with Nginx and want to serve multiple requests in parallel.
We have Docker configuration and one pod has 10 cores. I am trying to create multiple workers in uWSGI like (uwsgi --socket /tmp/main.sock --module main.wsgi --enable-threads --master --processes=10 --threads=1 --chmod-socket=666)
A request first lands in the view, which calls a service file that does the heavy work.
Specifically, I am using the OpenCV library in the service file, which loops over all pixels to remove the colored ones (pretty time-consuming).
I also tried using 1 process with multiple threads:
(uwsgi --socket /tmp/main.sock --module main.wsgi --enable-threads --master --processes=1 --threads=10 --chmod-socket=666).
But performance still did not improve. I think it is due to the GIL, which is acquired while doing heavy I/O operations, but I'm not sure how to work around it. Or is there some other efficient way to use all cores? TIA!
I'm hoping to multithread Pyramid 1.10.4 requests ... but it appears that pserve is already multithreaded. The Pyramid docs seem to say pserve is single-threaded, but when I put
sleep(10)
in my view, and issue
for ii in $(seq 20); do
time wget -O tempa$ii http://localhost:6543 &> outa$ii &
done
I find that 4 of the requests complete in 10 seconds, the next 4 in 20 seconds, the next 4 in 30 seconds, etc.
Apparently somebody (pserve?) is already running 4 threads.
But nowhere do I find this documented. There is no mention of threading in either development.ini or production.ini.
How can I control the number of available threads for pserve?
If pserve is the wrong way to do threading, what is the right way?
pserve is just a thin CLI runner and is not a server. You likely have the server section of your ini configured to tell pserve to use waitress. Waitress is a WSGI server that utilizes a threadpool to serve requests and you’ll want to read its docs. To change the size of the thread pool you can set threads = 10 in the server section.
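The 4-at-a-time pattern from the question is what a fixed-size thread pool produces. Here is a minimal sketch that reproduces it with the standard library, assuming a pool of 4 threads (chosen to match the observed batches) and a shortened sleep. run_batch is a hypothetical helper, not Waitress code.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_batch(n_requests, n_threads, delay):
    """Run n_requests sleeping 'views' on a pool; return their sorted finish times."""
    start = time.monotonic()

    def view(_):
        time.sleep(delay)  # stands in for sleep(10) in the view
        return time.monotonic() - start

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return sorted(pool.map(view, range(n_requests)))
```

Calling run_batch(20, 4, 0.1) gives finish times clustered in five waves roughly 0.1 seconds apart; with n_threads=10 the same 20 requests finish in two waves, which is the effect of setting threads = 10.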
So what's the trick? Nginx is facing the client. Normally the requests are forwarded to gunicorn A at port 80.
You can't update the code in place, since something might go wrong. So you do a fresh code checkout and launch a separate gunicorn B on some port, say 5678.
Once you test the new code on a development/testing database, you:
1. Adjust gunicorn B to point to the database, but do not send any requests.
2. Stop gunicorn A. Nginx now, ever so briefly, responds with an error.
3. Set nginx to point to gunicorn B, still at port 5678.
4. Restart nginx.
Is this about right? Do you just write a script to run the four actions faster and minimize the duration (between steps 2 and 4) the server responds with an error?
Nginx supports configuration reloading. Using this feature, updating your application can work like this:
Start a new instance Gunicorn B.
Adjust the nginx configuration to forward traffic to Gunicorn B.
Reload the nginx configuration with nginx -s reload. After this, Gunicorn B will serve new requests, while Gunicorn A will still finish serving old requests.
Wait for the old nginx worker process to exit (which means all requests initiated before the reload are now done) and then stop Gunicorn A.
Assuming your application works correctly with two concurrent instances, this gives you a zero-downtime update.
The relevant excerpt from the nginx documentation:
Once the master process receives the signal to reload configuration, it checks the syntax validity of the new configuration file and tries to apply the configuration provided in it. If this is a success, the master process starts new worker processes and sends messages to old worker processes, requesting them to shut down. Otherwise, the master process rolls back the changes and continues to work with the old configuration. Old worker processes, receiving a command to shut down, stop accepting new connections and continue to service current requests until all such requests are serviced. After that, the old worker processes exit.
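The switch-over can be scripted. This is a rough sketch, assuming the upstream appears in the nginx config as proxy_pass http://127.0.0.1:<port>; the function names, that config layout and the injectable run parameter are all hypothetical. nginx -t (syntax check) and nginx -s reload are the actual nginx commands.

```python
import re
import subprocess


def point_to_port(conf_text, old_port, new_port):
    """Rewrite proxy_pass targets from old_port to new_port (assumed config layout)."""
    pattern = rf'(proxy_pass\s+http://127\.0\.0\.1:){old_port}\b'
    return re.sub(pattern, rf'\g<1>{new_port}', conf_text)


def reload_nginx(run=subprocess.run):
    """Validate the config, then ask the nginx master process to reload it."""
    run(['nginx', '-t'], check=True)            # abort if the new config is invalid
    run(['nginx', '-s', 'reload'], check=True)  # old workers finish in-flight requests
```

Gunicorn A should only be stopped after the old nginx workers have exited, as described in the steps above.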
I would like to run APScheduler as part of a WSGI webapp (via Apache's mod_wsgi with 3 workers). I am new to the WSGI world, so I would appreciate it if you could resolve my doubts:
If APScheduler is part of the webapp, does it come alive only after the first request (the first one after starting/resetting Apache) has been handled by at least one worker? Starting/resetting Apache won't start it; at least one request is needed.
What about concurrent requests: would every worker run the same set of APScheduler tasks, or would there be only one set shared between all workers?
Would a running process (the webapp run by a worker) stay alive (so that APScheduler's tasks keep executing), or could it terminate after some idle time (so that APScheduler's tasks would not execute)?
Thank you!
You're right -- the scheduler won't start until the first request comes in.
Therefore running a scheduler in a WSGI worker is not a good idea. A better idea would be to run the scheduler in a separate process and connect to the scheduler when necessary via some RPC mechanism like RPyC or Execnet.
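The shape of that setup can be sketched with the standard library alone. Here xmlrpc stands in for RPyC or Execnet, and a toy timer-based class stands in for APScheduler, so every name below (SchedulerService, make_scheduler_server, schedule_from_webapp, port 8765) is hypothetical. Only the structure matters: one long-lived scheduler process, with WSGI workers acting as thin RPC clients.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


class SchedulerService:
    """Toy scheduler that lives in its own process; stands in for APScheduler."""

    def __init__(self):
        self._lock = threading.Lock()
        self._done = []

    def add_job(self, name, delay_seconds):
        """Run a named job once after delay_seconds; callable over RPC."""
        timer = threading.Timer(delay_seconds, self._run, args=(name,))
        timer.daemon = True
        timer.start()
        return True

    def finished_jobs(self):
        """Names of jobs that have already run; callable over RPC."""
        with self._lock:
            return list(self._done)

    def _run(self, name):
        with self._lock:
            self._done.append(name)  # a real job would do actual work here


def make_scheduler_server(host='127.0.0.1', port=8765):
    """Build the RPC server that the scheduler process would run forever."""
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_instance(SchedulerService())
    return server


def schedule_from_webapp(url, name, delay_seconds):
    """What a WSGI worker would call: ask the scheduler process to add a job."""
    return ServerProxy(url).add_job(name, delay_seconds)


# In the scheduler process: make_scheduler_server().serve_forever()
```

Because the scheduler runs once, outside Apache, it starts without waiting for a request and its jobs are not duplicated across the 3 workers.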