I'm trying to publish a Django application on a production server using Nginx + Gunicorn. When I do a simple stress test on the server (holding the F5 key for a minute), the server returns a 504 Gateway Time-out error. Why does this happen? Does this error appear only for the user making the many concurrent requests, or is the system fully unavailable to everyone?
When you hold down F5:
You've started hundreds of requests.
Those requests have filled your gunicorn request queue.
The request handlers are not culled when the connection drops.
Your latest requests are stuck in the queue behind all the previous requests.
Nginx times out.
For everyone.
Solutions:
Set up rate-limiting buckets in Nginx, keyed on IP, such that one malicious user can't spam you with requests and DoS your site.
Set up a global rate-limiting bucket in Nginx so that you don't overfill your request queue.
Make Nginx serve a nice "Reddit is under heavy load" style page, so users know that this is a purposeful event.
Or:
Replace gunicorn with uwsgi. It's faster, more memory efficient, integrates smoothly with nginx, and most importantly: It will kill the request handler immediately if the connection drops, such that F5 spam can't kill your server.
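The two rate-limiting buckets described above can be sketched in nginx like this (the zone names, rates, paths, and the upstream port are assumptions; tune them for your traffic):

```nginx
http {
    # Per-IP bucket: one client holding F5 can't flood the backend.
    limit_req_zone $binary_remote_addr zone=perip:10m rate=5r/s;
    # Global bucket: caps total traffic so the gunicorn queue can't overfill.
    limit_req_zone $server_name zone=perserver:10m rate=100r/s;

    server {
        location / {
            limit_req zone=perip burst=10 nodelay;
            limit_req zone=perserver burst=50;
            # Serve a friendly "under heavy load" page instead of a bare 503.
            limit_req_status 503;
            error_page 503 /overloaded.html;
            proxy_pass http://127.0.0.1:8000;  # gunicorn upstream (assumed port)
        }
        location = /overloaded.html {
            root /var/www/static;  # assumed location of the static page
            internal;
        }
    }
}
```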
https://medium.com/@paragsharma.py/504-gateway-timeout-django-gunicorn-nginx-4570deaf0922
A 504 can also be caused by the Gunicorn worker timeout; you need to start it with the --timeout argument, like:
gunicorn --access-logfile - --workers 3 --timeout 300 --bind unix:/home/ubuntu/myproject/myproject.sock myproject.wsgi:application
Related
I have deployed my Flask app on AWS using the Gunicorn server.
This is my gunicorn configuration in Dockerfile,
CMD gunicorn api:app -w 1 --threads 2 -b 0.0.0.0:8000
It's clear that I have one master worker, and that worker has 2 threads. The problem I was facing was that the server sometimes got stuck, meaning it was not processing any requests; when I redeployed the app, it started processing requests again.
I can increase the number of threads or the number of workers to resolve this issue. But one question I have is how to get information about the threads running in Gunicorn, i.e. which thread is processing which request.
Thanks in advance!
So what's the trick? Nginx is facing the client on port 80. Normally the requests are forwarded to gunicorn A.
You can't run code update in-place, since something might be wrong. So you do a fresh code checkout and launch a separate gunicorn B on some port 5678.
Once you test the new code on a development/testing database, you:
Adjust gunicorn B to point to the database, but do not send any requests.
Stop gunicorn A. Nginx now, ever so briefly, responds with an error.
Set nginx to point to gunicorn B, still at port 5678.
Restart nginx.
Is this about right? Do you just write a script to run the four actions faster and minimize the duration (between steps 2 and 4) during which the server responds with an error?
Nginx supports configuration reloading. Using this feature, updating your application can work like this:
Start a new instance, Gunicorn B.
Adjust the nginx configuration to forward traffic to Gunicorn B.
Reload the nginx configuration with nginx -s reload. After this, Gunicorn B will serve new requests, while Gunicorn A will still finish serving old requests.
Wait for the old nginx worker process to exit (which means all requests initiated before the reload are now done) and then stop Gunicorn A.
Assuming your application works correctly with two concurrent instances, this gives you a zero-downtime update.
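The switch in steps 2 and 3 can be as small as one line in the nginx config. A sketch, assuming an upstream named `app` and the ports from the question:

```nginx
upstream app {
    # Before the update: Gunicorn A.
    # server 127.0.0.1:8000;
    # After the update: Gunicorn B, then run `nginx -s reload`.
    server 127.0.0.1:5678;
}

server {
    listen 80;
    location / {
        proxy_pass http://app;
    }
}
```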
The relevant excerpt from the nginx documentation:
Once the master process receives the signal to reload configuration, it checks the syntax validity of the new configuration file and tries to apply the configuration provided in it. If this is a success, the master process starts new worker processes and sends messages to old worker processes, requesting them to shut down. Otherwise, the master process rolls back the changes and continues to work with the old configuration. Old worker processes, receiving a command to shut down, stop accepting new connections and continue to service current requests until all such requests are serviced. After that, the old worker processes exit.
I am working on a Flask server that listens for a node that may contact the server at any point in time; the Flask server then processes the request. I would like to ask whether there is any method to listen for the connection attempted by the remote node.
Flask has a development server, documented here:
FLASK_APP=hello.py flask run
Note that this server is not intended for production. If you're looking for a production server, you can read this page. There are several options, such as Gunicorn, uWSGI, and more.
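For reference, the hello.py that command expects can be as small as this (the standard Flask quickstart pattern):

```python
# hello.py -- minimal Flask app; run with: FLASK_APP=hello.py flask run
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    # Every incoming connection to / is dispatched here by the WSGI server.
    return "Hello, World!"
```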
I am using Daphne for both socket and http connections. I am running 4 worker containers and running everything locally right now in a docker container.
My Daphne server fails if I try to upload a file that is 400 MB. It works fine for small files up to 15 MB.
My Docker container quits with error code 137. I don't get any error in the Daphne logs; the Daphne container just dies, but the worker containers keep running.
Does anyone know if there is a way to increase upload limits on Daphne, or am I missing something else?
I start the daphne server by
daphne -b 0.0.0.0 -p 8001 project.asgi:channel_layer --access-log=${LOGS}/daphne.access.log
This is because Daphne loads the entire HTTP POST request body, completely and up front, before transferring control to Django with Channels.
All 400 MB are loaded into RAM at that point. Your Docker container died because it ran out of memory (exit code 137 is an OOM kill).
This happens even before Django checks the size of the request body. See here
There is an open ticket here
If you want to work around it right now, use Uvicorn instead of Daphne. Uvicorn passes control to Django in chunks, and depending on the FILE_UPLOAD_MAX_MEMORY_SIZE Django setting you will receive a temporary file on your hard disk (not in RAM). But you need to write your own AsyncHttpConsumer or AsgiHandler, because AsgiHandler and AsgiRequest from Channels do not support a chunked body either. That will be possible after the PR.
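For illustration, the Django setting mentioned above sits alongside a related one; 2621440 bytes (2.5 MB) is Django's documented default for both:

```python
# settings.py fragment -- uploads larger than FILE_UPLOAD_MAX_MEMORY_SIZE
# are streamed to a temporary file on disk instead of being kept in RAM.
FILE_UPLOAD_MAX_MEMORY_SIZE = 2621440  # 2.5 MB (Django default)

# Separate cap on non-file request body data held in memory.
DATA_UPLOAD_MAX_MEMORY_SIZE = 2621440  # 2.5 MB (Django default)
```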
While testing an nginx server with uwsgi and Django, I am having a problem with the uwsgi processes. I send two POST requests which take a long time to process. While the server is processing them, I send a GET request from the web browser, and I must wait until those two POSTs finish. I am starting uwsgi with this command:
cd /home/pi/cukierek && uwsgi \
    --max-requests=5000 \
    --socket /tmp/cukierek.sock \
    --module config.wsgi \
    --master-fifo /tmp/cukierek.fifo \
    --chmod-socket=777 \
    --processes 2 \
    --daemonize /home/pi/cukierek/wsgi.log \
    --enable-threads
Is it possible to get a response in the browser while these two POSTs are in progress? I am using default nginx settings.
You have a uwsgi server configured to spawn 2 processes. Then you run 2 long requests. Those 2 processes are busy with the long requests, so new requests must wait until the long requests finish.
If you want the server to accept new requests while the long requests run, increase the number of processes to more than 2 (i.e. --processes 4).
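Assuming the invocation from the question, the adjusted command might look like this (only --processes changes):

```shell
cd /home/pi/cukierek && uwsgi \
    --max-requests=5000 \
    --socket /tmp/cukierek.sock \
    --module config.wsgi \
    --master-fifo /tmp/cukierek.fifo \
    --chmod-socket=777 \
    --processes 4 \
    --daemonize /home/pi/cukierek/wsgi.log \
    --enable-threads
```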