AWS Elastic Beanstalk sqsd error while processing a message - python

I have an Elastic Beanstalk Python worker environment. The average job running time is about 20 seconds. Sometimes the following scenario happens:
sqsd picks a message from the SQS queue and sends it to the worker.
The worker starts processing the message.
Within a few seconds (anywhere from 1 to 30 seconds), sqsd gets the following error and parks the message in the dead-letter queue, since I have configured the retries to 1.
127.0.0.1 (-) - - [23/Nov/2017:19:48:17 +0000] "POST / HTTP/1.1" 500 527 "-" "aws-sqsd/2.3"
The worker continues to process the message and finishes successfully; I have logs to trace that.
This makes the environment unhealthy overall.
I have connection timeout = 60 seconds, inactivity timeout = 600, visibility timeout = 600, and HTTP connections = 2.
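(For reference, these sqsd settings map to something like the following in .ebextensions - a sketch, with option names assumed from the standard aws:elasticbeanstalk:sqsd worker namespace.)
option_settings:
  aws:elasticbeanstalk:sqsd:
    ConnectTimeout: 60
    InactivityTimeout: 600
    VisibilityTimeout: 600
    HttpConnections: 2
    MaxRetries: 1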
I have the following in the configs as well
option_settings:
  aws:elasticbeanstalk:container:python:
    NumProcesses: 3
    NumThreads: 10

files:
  "/etc/httpd/conf.d/wsgi_custom.conf":
    mode: "000644"
    owner: root
    group: root
    content: |
      WSGIApplicationGroup %{GLOBAL}
Is this because of some memory limit that WSGI puts on every request? That is the only thing I can think of.

Related

HTTP server Nginx - Uwsgi - Flask throttling at high load

I am load testing my application. I have an EC2 server running Flask + uWSGI + Nginx (configured as per https://www.digitalocean.com/community/tutorials/how-to-serve-flask-applications-with-uwsgi-and-nginx-on-ubuntu-20-04).
I tested with 4K records in 4 seconds, and I see a lot of errors like the one below.
2022/04/17 15:16:37 [error] 19929#19929: *7769 connect() to unix:/home/ubuntu/wip/iotlistener/iotlistener.sock failed (11: Resource temporarily unavailable) while connecting to upstream, client: XX.XX.XX.XX, server: , request: "POST / HTTP/1.1", upstream: "uwsgi://unix:/home/ubuntu/wip/iotlistener/iotlistener.sock:
The EC2 server itself is quite stable, and the CPU load does not go beyond 50%. The network usage is high, of course, but there are no red lines. The service itself is very light - it just dumps data into DynamoDB - and the DB metrics are quite stable.
So I feel this is due to some default configuration that restricts the load. Can you please help me identify it?
iotlistener.ini
[uwsgi]
module = wsgi:app
master = true
processes = 25
socket = iotlistener.sock
chmod-socket = 660
vacuum = true
die-on-term = true
The process count was 5. I changed it to 25 - with no change in behaviour.
And this is the nginx configuration:
server {
    listen 80;

    location / {
        include uwsgi_params;
        uwsgi_pass unix:/home/ubuntu/wip/iotlistener/iotlistener.sock;
    }
}
I am expecting a production load well beyond 1K records per second. So please help me with this!
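Since nginx reports EAGAIN ("Resource temporarily unavailable") when connecting to the upstream socket, one suspect is the socket listen backlog rather than the worker count. A hedged sketch of raising it, assuming the stock uWSGI listen option applies here and that the kernel cap (net.core.somaxconn) also needs lifting:
[uwsgi]
module = wsgi:app
master = true
processes = 25
socket = iotlistener.sock
chmod-socket = 660
; raise the socket backlog (uWSGI's default is 100) so request bursts queue instead of being refused
listen = 1024
vacuum = true
die-on-term = true

# the kernel caps the backlog; raise the cap to match (run as root)
sysctl -w net.core.somaxconn=1024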

Server Request Timeout Error for Flask Application in Heroku using gunicorn

I have tried increasing the timeout in the Procfile, but I am still facing a request timeout error when I process some data on the server: 30 seconds after the request is sent, the server times out. Is there any way to increase this request timeout?
I get a category and page number from the user and then scrape data from a website; while the app is busy scraping, the server times out after 30 seconds, even though the request is still being processed in the background.
I am using Heroku with gunicorn and my Procfile settings are:
web: gunicorn main:app --timeout 60 --workers=3 --threads=3 --worker-connections=1000
It is not possible to increase the Heroku HTTP timeout beyond 30 seconds.
It looks like you need a different approach, one where users do not hang waiting for the response to be processed and transmitted. You could consider:
a "work in progress" page that reloads every 30 seconds and checks whether a background process has completed
notifying the users (email, browser notification) once the request has been processed by the backend
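A minimal sketch of the first option, assuming a Flask app: start the work in a background thread and let the client poll a status endpoint. The route names and the scrape_data helper are invented for illustration, and the in-memory job store only works for a single process; a real deployment would use a task queue such as RQ or Celery.
import threading
import uuid

from flask import Flask, jsonify

app = Flask(__name__)
jobs = {}  # job_id -> {"status": ..., "result": ...}; in-memory, single-process only


def scrape_data(category, page):
    # placeholder for the long-running scraping work
    return {"category": category, "page": page}


def run_job(job_id, category, page):
    jobs[job_id]["result"] = scrape_data(category, page)
    jobs[job_id]["status"] = "done"


@app.route("/scrape/<category>/<int:page>", methods=["POST"])
def start_scrape(category, page):
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"status": "running", "result": None}
    threading.Thread(target=run_job, args=(job_id, category, page), daemon=True).start()
    # respond immediately, well under Heroku's 30-second router timeout
    return jsonify({"job_id": job_id}), 202


@app.route("/scrape/status/<job_id>")
def scrape_status(job_id):
    job = jobs.get(job_id)
    if job is None:
        return jsonify({"error": "unknown job"}), 404
    return jsonify(job)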

502 Bad Gateway using Python, Nginx, and Flask on a Raspberry Pi

I am trying to get my Python application to run on port 80 so I can host my page on the Internet and check my temperature and everything remotely.
I get a 502 Bad Gateway error and I can't figure out why. It seems uWSGI is having trouble writing its .sock file to the temp directory.
I am following this tutorial:
https://iotbytes.wordpress.com/python-flask-web-application-on-raspberry-pi-with-nginx-and-uwsgi/
/home/pi/sampleApp/sampleApp.py
from flask import Flask

first_app = Flask(__name__)

@first_app.route("/")
def first_function():
    return "<html><body><h1 style='color:red'>I am hosted on Raspberry Pi !!!</h1></body></html>"

if __name__ == "__main__":
    first_app.run(host='0.0.0.0')
/home/pi/sampleApp/uwsgi_config.ini
[uwsgi]
chdir = /home/pi/sampleApp
module = sample_app:first_app
master = true
processes = 1
threads = 2
uid = www-data
gid = www-data
socket = /tmp/sample_app.sock
chmod-socket = 664
vacuum = true
die-on-term = true
/etc/rc.local just before exit 0
/usr/local/bin/uwsgi --ini /home/pi/sampleApp/uwsgi_config.ini --uid www-data --gid www-data --daemonize /var/log/uwsgi.log
/etc/nginx/sites-available/sample_app_proxy and I verified this moved to sites-enabled after I linked it.
server {
    listen 80;
    server_name localhost;

    location / { try_files $uri @app; }

    location @app {
        include uwsgi_params;
        uwsgi_pass unix:/tmp/sample_app.sock;
    }
}
I got all the way to the final step with 100 percent success. After linking the sample_app_proxy file so it is picked up from /etc/nginx/sites-enabled/, I did a service nginx restart. When I open 'localhost' in my browser, I get a 502 Bad Gateway.
At the bottom of the nginx logs I noticed this error.
2017/01/29 14:49:08 [crit] 1883#0: *8 connect() to unix:///tmp/sample_app.sock failed (2: No such file or directory) while connecting to upstream, client: 127.0.0.1, server: localhost, request: "GET /favicon.ico HTTP/1.1", upstream: "uwsgi://unix:///tmp/sample_app.sock:", host: "localhost", referrer: "http://localhost/"
My source code is exactly as you see it in the tutorial, I checked it over many times.
I looked at /var/log/uwsgi.log and found this message at the bottom.
*** WARNING: you are running uWSGI without its master process manager ***
your processes number limit is 7336
your memory page size is 4096 bytes
detected max file descriptor number: 1024
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
The -s/--socket option is missing and stdin is not a socket.
I am not sure what is going on or why uWSGI doesn't seem to write the .sock file to the /tmp/ directory. The test I did earlier in the tutorial worked fine and the sample_app.sock file showed up in /tmp/, but when I run the application it doesn't seem to work.
I did a lot of searching, and I saw many posts saying to use "///" instead of "/" in the /etc/nginx/sites-available/sample_app_proxy file, but whether I use one or three, I still get the 502 error.
uwsgi_pass unix:///tmp/sample_app.sock;
Any help would be greatly appreciated as this is the last step I need to accomplish so I can do remote stuff to my home. Thanks!
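The uWSGI log line about the missing -s/--socket option suggests the daemonized instance did not actually pick up the ini file, so before touching nginx it is worth confirming that uWSGI is running with the expected configuration and created the socket. A quick check, assuming the paths from the question:
ps aux | grep '[u]wsgi'           # is uWSGI running, and with which arguments?
ls -l /tmp/sample_app.sock        # did it create the socket nginx points at?
tail -n 50 /var/log/uwsgi.log     # any startup errors from the daemonized instance?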

Gunicorn worker killed

I am running a Flask application (REST API) with gunicorn, and almost every 30 seconds I see a batch of [CRITICAL] WORKER TIMEOUT (pid:14727) messages.
My settings are the following:
gunicorn --worker-class gevent \
    --timeout 30 --graceful-timeout 20 \
    --max-requests-jitter 2000 --max-requests 1500 \
    -w 50 \
    --log-level DEBUG --capture-output \
    --bind 0.0.0.0:5000 run:app
I saw a previous post that said to throw more RAM at this, but from the looks of it:
$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 513926
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 131071
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 1550298
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
The heap is unlimited and the stack size is 8 MB.
Log sample
+0000] [26657] [DEBUG] GET /timer
[2017-01-21 14:07:30 +0000] [26657] [DEBUG] GET /timer
[2017-01-21 14:07:33 +0000] [26657] [DEBUG] GET /timer
[2017-01-21 14:07:33 +0000] [26652] [DEBUG] GET /timer
10.193.80.149 - - [21/Jan/2017:14:07:34 +0000] "GET /timer?id=699ec59eccd3fb929b3dd7707e542ed15acd4181:6f136b54-2cb5-42ef-9def-f69caaba57ef HTTP/1.1" 200 - "-" "-"
10.193.80.147 - - [21/Jan/2017:14:07:35 +0000] "GET /timer?id=e7963c53603ed9249b0aa557d8a64cea89fb0bf4:6f136b54-2cb5-42ef-9def-f69caaba57ef HTTP/1.1" 200 - "-" "-"
10.193.80.150 - - [21/Jan/2017:14:07:35 +0000] "GET /timer?id=4b750805193fb4d00c3ce1465c266ed932a24e55:6f136b54-2cb5-42ef-9def-f69caaba57ef HTTP/1.1" 200 - "-" "-"
[2017-01-21 14:07:37 +0000] [26657] [DEBUG] GET /timer
[2017-01-21 14:07:37 +0000] [26657] [DEBUG] GET /timer
[2017-01-21 14:07:37 +0000] [26635] [CRITICAL] WORKER TIMEOUT (pid:27202)
[2017-01-21 14:07:37 +0000] [26635] [CRITICAL] WORKER TIMEOUT (pid:27205)
What I noticed is that only a handful of workers (26657, 26652, 26651) ever do the work; everything else just seems to give me the worker timeout.
You have some requests which take longer than 30 seconds to finish; that's why they are killed. Either:
tune your code so each request finishes in under 30 seconds (slowness may also come from a slow database or other dependencies) - see the timing sketch after this list for a way to find the slow endpoints
check whether your host is short on resources; this could be CPU or RAM. Putting more RAM into the machine only helps if each of your gunicorn processes eats a lot of RAM and the machine starts swapping. Try e.g. top to check whether CPU or RAM is saturated.
increase the timeout by changing --timeout 30 to a higher number. That's really the worst option, as it doesn't solve the underlying problem of your Flask app responding slowly to incoming requests. Also, killing long-running requests often keeps the other Flask threads from running into resource problems as well.
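A minimal sketch of the first suggestion, assuming a Flask app: time every request and log the slow ones, so you know which endpoints to fix before touching the gunicorn timeout (the 1-second threshold is arbitrary):
import time

from flask import Flask, g, request

app = Flask(__name__)

@app.before_request
def start_timer():
    # record the start time on Flask's per-request context
    g.start = time.monotonic()

@app.after_request
def log_slow_requests(response):
    elapsed = time.monotonic() - g.start
    if elapsed > 1.0:  # arbitrary threshold; tune to your own latency budget
        app.logger.warning("slow request: %s %s took %.2fs",
                           request.method, request.path, elapsed)
    return response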

How to find out why uWSGI kill workers?

I have an app on Pyramid. I run it in uWSGI with this config:
[uwsgi]
socket = mysite:8055
master = true
processes = 4
vacuum = true
lazy-apps = true
gevent = 100
And nginx config:
server {
    listen 8050;
    include uwsgi_params;

    location / {
        uwsgi_pass mysite:8055;
    }
}
Usually everything is fine, but sometimes uWSGI kills workers, and I have no idea why.
I see this in the uWSGI logs:
DAMN ! worker 2 (pid: 4247) died, killed by signal 9 :( trying respawn ...
Respawned uWSGI worker 2 (new pid: 4457)
but there are no Python exceptions in the logs.
Sometimes I also see this in the uWSGI logs:
invalid request block size: 11484 (max 4096)...skip
[uwsgi-http key: my site:8050 client_addr: 127.0.0.1 client_port: 63367] hr_instance_read(): Connection reset by peer [plugins/http/http.c line 614]
And in the nginx error.log:
*13388 upstream prematurely closed connection while reading response header from upstream, client: 127.0.0.1,
*13955 recv() failed (104: Connection reset by peer) while reading response header from upstream, client:
I think the "invalid request block size" error can be solved by adding buffer-size=32768, but it is unlikely that this is why uWSGI kills workers.
Why does uWSGI kill workers? And how can I find out the reason?
The line "DAMN ! worker 2 (pid: 4247) died, ..." tells me nothing.
Signal 9 means the worker received a SIGKILL, so something sent a kill to it. It is relatively likely that the out-of-memory killer decided to kill your app because it was using too much memory. Try watching the workers with a process monitor and see whether they use a lot of memory.
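One way to check the OOM-killer theory (assuming a Linux host where you can read the kernel log) is to look for its traces right after a worker dies:
# search the kernel log for OOM-killer activity
dmesg -T | grep -iE 'out of memory|killed process'
# on systemd hosts, the same information is available via the journal
journalctl -k | grep -i 'killed process'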
Try adding the harakiri-verbose = true option to the uWSGI config.
I had the same problem. For me, editing the uwsgi.ini file and changing the reload-on-rss setting from 2048 to 4048, and harakiri to 600, solved the problem.
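Roughly what that amounts to in uwsgi.ini (a sketch using the values from the answer above; 4048 MB and 600 seconds are that poster's numbers, not general recommendations):
[uwsgi]
; recycle a worker once its resident memory exceeds this many MB
reload-on-rss = 4048
; kill a worker stuck on a single request for longer than this many seconds
harakiri = 600
; log extra detail whenever harakiri fires
harakiri-verbose = true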
For me, it was that I hadn't set app.config["SERVER_NAME"] = "x".
