airflow webserver Workers terminating due to signal 11


I'm trying to start the airflow webserver with an existing application and the latest version (2.2.2). The executor is set as LocalExecutor, the datastore is Postgres, Python v3.9. Upon start up, it launches 4 workers which promptly die. It then spins its wheels continually trying to restart them.
Here's an example of the messages showing a worker starting and dying with signal 11 (segmentation violation). This is all within a second of starting.
Using worker: sync
[2021-11-30 17:29:31 -0500] [12529] [INFO] Booting worker with pid: 12529
[2021-11-30 17:29:31 -0500] [12530] [INFO] Booting worker with pid: 12530
[2021-11-30 17:29:31 -0500] [12531] [INFO] Booting worker with pid: 12531
[2021-11-30 17:29:31 -0500] [12532] [INFO] Booting worker with pid: 12532
Running the Gunicorn Server with:
Workers: 4 sync
Host: 0.0.0.0:8080
Timeout: 120
settings.prepare_engine_args(): Using pool settings. pool_size=5, max_overflow=10, pool_recycle=1800, pid=12529
[WARNING] Worker with pid 12529 was terminated due to signal 11
Any suggestions on how to debug these workers?

I gave up on trying to run Airflow directly on my M1 Mac. I suspect it has something to do with the emulator for the M1 chip, but don't know for sure. I fell back to running Airflow in Docker. It takes a very long time to start, but does run OK that way.
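On the debugging question itself: one low-effort step, sketched here under assumptions and not verified against this setup, is to check which CPU architecture the Python interpreter reports and to turn on the standard library's faulthandler so that a segfault leaves a Python-level traceback behind. The helper name runtime_summary and the temporary log file are illustrative only.

```python
import faulthandler
import platform
import tempfile


def runtime_summary():
    """Return basic facts about the interpreter that is actually running."""
    return {
        # 'arm64' for a native Apple Silicon build; 'x86_64' suggests an
        # Intel build of Python (possibly running under emulation).
        "machine": platform.machine(),
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
    }


# Keep the file object referenced: faulthandler writes to its file
# descriptor at crash time, so the file must stay open.
crash_log = tempfile.NamedTemporaryFile(mode="w", suffix=".log", delete=False)
faulthandler.enable(file=crash_log, all_threads=True)

print(runtime_summary())
print("crash tracebacks go to", crash_log.name)
```

Setting the environment variable PYTHONFAULTHANDLER=1 before running airflow webserver enables the same handler without code changes. Whether the resulting traceback pinpoints the faulting extension module in this case is untested.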

Related

Python + FastAPI + OracleCloud: How to expose my python fastapi endpoint on Oracle Cloud to Internet

I have a Python + FastAPI restful API project running the free tier of Oracle Cloud VM instance.
I use Gunicorn to serve the api and also installed Nginx just in case it's needed.
I have tested my running project with
curl http://localhost:8000
and I can see my API response.
Now my question is : how can I expose this api endpoint outside on the Internet?
Update 1
I started my Python API project with this command:
gunicorn -w 4 -k uvicorn.workers.UvicornWorker main:app --timeout 1200 -b 0.0.0.0
I saw the messages below:
[2021-05-23 00:40:28 +0000] [3850] [INFO] Starting gunicorn 20.0.2
[2021-05-23 00:40:28 +0000] [3850] [INFO] Listening at: http://0.0.0.0:8000 (3850)
[2021-05-23 00:40:28 +0000] [3850] [INFO] Using worker: uvicorn.workers.UvicornWorker
[2021-05-23 00:40:28 +0000] [3853] [INFO] Booting worker with pid: 3853
[2021-05-23 00:40:28 +0000] [3854] [INFO] Booting worker with pid: 3854
[2021-05-23 00:40:28 +0000] [3857] [INFO] Booting worker with pid: 3857
[2021-05-23 00:40:28 +0000] [3858] [INFO] Booting worker with pid: 3858
[2021-05-23 00:42:04 +0000] [3853] [INFO] Started server process [3853]
[2021-05-23 00:42:04 +0000] [3857] [INFO] Started server process [3857]
[2021-05-23 00:42:04 +0000] [3857] [INFO] Waiting for application startup.
[2021-05-23 00:42:04 +0000] [3858] [INFO] Started server process [3858]
[2021-05-23 00:42:04 +0000] [3858] [INFO] Waiting for application startup.
[2021-05-23 00:42:04 +0000] [3858] [INFO] Application startup complete.
[2021-05-23 00:42:04 +0000] [3853] [INFO] Waiting for application startup.
[2021-05-23 00:42:04 +0000] [3853] [INFO] Application startup complete.
[2021-05-23 00:42:04 +0000] [3857] [INFO] Application startup complete.
[2021-05-23 00:42:04 +0000] [3854] [INFO] Started server process [3854]
[2021-05-23 00:42:04 +0000] [3854] [INFO] Waiting for application startup.
[2021-05-23 00:42:04 +0000] [3854] [INFO] Application startup complete.
Then I copied the IP address from the Compute >> Instances >> Instance Details panel and opened it in Chrome. Straight away, it showed me
Unable to connect
I also read through several articles about using Nginx and tried their suggestions without any luck.
Update 2
Using curl to access the website from my local machine
$ curl http://168.138.12.192:8000/
curl: (7) Failed to connect to 168.138.12.192 port 8000: No route to host
However, when I access the IP directly using curl (without the port), I get the default Nginx website.
$ curl http://168.138.12.192

Finally, I found out what I missed:
sudo iptables -I INPUT -p tcp -s 0.0.0.0/0 --dport 8000 -j ACCEPT
I had to run this command to open port 8000 (yes, my website is using port 8000).
I thought adding an Ingress Rule to accept TCP on port 8000 was enough, but it turns out that I still needed to run the aforementioned command.
I do not quite understand why I need to do it, but it solves the problem.
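A possible explanation, offered as an assumption and not confirmed anywhere in this thread: the Ingress Rule only covers the cloud network's security list, while the instance's own OS firewall (iptables) filters traffic separately, so both layers have to allow port 8000. The "No route to host" error from curl is consistent with a host firewall rejecting the connection. A small probe like the one below, run from a machine outside the instance, can confirm whether the port is reachable before and after changing either layer. The function name port_is_open is hypothetical.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and "No route to host".
        return False


if __name__ == "__main__":
    # Address taken from the question; replace with your own instance IP.
    for port in (80, 8000):
        print(port, port_is_open("168.138.12.192", port))
```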

Did you change the default HTML page under the /var/www/html directory? If not, try customizing it as required and see if that works; otherwise the browser will just show the default Nginx page when you access the public IP.
In addition, check that port 8000 is allowed in both the security list and the OS firewall. The default port for HTTP requests is 80, so to serve the site on port 8000 you need to change the port from 80 to 8000 in the config file. This page might be useful: How to Change Apache HTTP Port in Linux.

Gunicorn is timeout and no logs in error files

I am running a Django application served by nginx and gunicorn under supervisor. I am getting gunicorn timeout errors in the gunicorn error logs, but I don't know what is causing them.
2018-10-01 20:20:19 [20529] [CRITICAL] WORKER TIMEOUT (pid:20646)
2018-10-01 20:20:19 [23948] [INFO] Booting worker with pid: 23948
Is there a way to configure gunicorn to write a log entry before the worker process times out and is killed?

You can try increasing the timeout to a value larger than the default of 30 seconds.
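For the original question about getting a log entry before the worker is killed, Gunicorn's worker_abort server hook may help: according to the Gunicorn documentation, it is called when a worker receives SIGABRT, which generally happens on timeout. A minimal sketch of a gunicorn.conf.py follows; the timeout value of 120 is only an example, and how useful the captured stack is will depend on the worker type.

```python
# gunicorn.conf.py (Gunicorn config files are plain Python)
import traceback

# Seconds a worker may stay silent before the arbiter aborts it.
timeout = 120

# More verbose arbiter and worker logging.
loglevel = "debug"


def worker_abort(worker):
    """Server hook: log where the worker was when it got SIGABRT."""
    # The hook runs inside the signal handler, so the current stack
    # includes the frames that were executing when the abort arrived.
    stack = "".join(traceback.format_stack())
    worker.log.error(
        "Worker %s aborted (probably a timeout). Stack at abort:\n%s",
        worker.pid,
        stack,
    )
```

Start Gunicorn with -c gunicorn.conf.py followed by your application module to pick the file up.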

Which logger is outputting this in gunicorn?

When I start gunicorn it prints this:
[2018-11-09 16:30:20 +0000] [16] [INFO] Starting gunicorn 19.9.0
[2018-11-09 16:30:20 +0000] [16] [INFO] Listening at: http://0.0.0.0:8000 (16)
[2018-11-09 16:30:20 +0000] [16] [INFO] Using worker: sync
[2018-11-09 16:30:20 +0000] [19] [INFO] Booting worker with pid: 19
Setting up a logger for the gunicorn package (propagating) doesn't seem to affect it. What module is the one I should configure to modify these messages?

Those messages are output by the Arbiter class in gunicorn/arbiter.py. However, any configuration you attempt may be overridden by gunicorn's own machinery, or may simply not apply: for example, setting up logging in a worker won't affect what the arbiter does, because they are separate processes. So to affect its logging you may need to invoke the arbiter in a special way (i.e. not just by running a canned gunicorn script), or amend the gunicorn configuration used for the arbiter.
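To the best of my knowledge, these startup lines go through the logger named gunicorn.error, and Gunicorn turns propagation off on it, which would explain why a handler on a parent logger sees nothing. The sketch below is a gunicorn.conf.py using the logconfig_dict setting (available from Gunicorn 19.8, as far as I know). The format string and handler name are just examples.

```python
# gunicorn.conf.py
# Standard logging dictConfig schema, handed to Gunicorn via logconfig_dict.
logconfig_dict = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "plain": {
            "format": "%(asctime)s %(name)s [%(process)d] %(levelname)s %(message)s",
        },
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "formatter": "plain",
            "stream": "ext://sys.stderr",
        },
    },
    "loggers": {
        # Assumed name of the logger used for arbiter and worker messages.
        "gunicorn.error": {
            "level": "INFO",
            "handlers": ["console"],
            "propagate": False,
        },
    },
}
```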

Issues with Python Threading, Flask and Gunicorn

I've been working on a garage door automation application for the Raspberry Pi that will allow you to remotely open/close the garage door as well as check the status of the door (whether open or closed). I've posted the code on GitHub here: Link to GitHub.
From time to time, it seems that the application is simply unreachable over my network as if Flask has simply stopped responding to web requests. I can still, however, SSH into my pi just fine when this happens. A reboot brings the web interface back, with no issues.
After doing some reading, as I understand it, the built-in webserver in Flask is really not robust and shouldn't be used in a production environment, so I decided that I would try to setup Gunicorn and nginx to handle the job instead. I modified the code and tried running Gunicorn. Unfortunately, Gunicorn seems to report errors with the Threading library (which I'm using to check the status of the door in the background). Here is some of the output from Gunicorn when I try to run it:
pi@raspi-4:~/garage-pi $ sudo gunicorn app:app
[2016-07-17 11:06:08 +0000] [745] [INFO] Starting gunicorn 19.6.0
[2016-07-17 11:06:08 +0000] [745] [INFO] Listening at: http://127.0.0.1:8000 (745)
[2016-07-17 11:06:08 +0000] [745] [INFO] Using worker: sync
[2016-07-17 11:06:08 +0000] [750] [INFO] Booting worker with pid: 750
[2016-07-17 11:06:08 +0000] [750] [INFO] Worker exiting (pid: 750)
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/pi/garage-pi/app.py", line 21, in checkdoorstate
    if (RPi.GPIO.input(18) == True and door_state != True):
AttributeError: 'NoneType' object has no attribute 'GPIO'
With all of that said, I'm out of my depth here, and wanted to know if this setup is at all possible, or will I continue to have issues with using the Threading Python library with Gunicorn? Any thoughts on how to resolve this issue?
Many thanks in advance.

You can start your app with multiple workers or async workers with Gunicorn.
Gunicorn with the gevent async worker:
gunicorn server:app -k gevent --worker-connections 1000
Gunicorn with 1 worker and 12 threads:
gunicorn server:app -w 1 --threads 12
Gunicorn with 4 workers (multiprocessing):
gunicorn server:app -w 4
More information on Flask concurrency in this post: How many concurrent requests does a single Flask process receive?.
Source
How to run Flask with Gunicorn in multithreaded mode
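On the traceback in the question: the AttributeError appears right after "Worker exiting", which is consistent with the background thread still running while the Python 2.7 interpreter tears down module globals (so RPi has already become None). That reading is an assumption, as is everything in the sketch below: DoorMonitor and read_sensor are hypothetical names, the lambda stands in for the real GPIO read, and the code targets Python 3. The idea is a polling thread that can be stopped cleanly instead of being cut off mid-loop.

```python
import threading
import time


class DoorMonitor:
    """Poll a sensor in a background thread and remember the last reading."""

    def __init__(self, read_sensor, interval=1.0):
        self._read_sensor = read_sensor  # callable returning a truthy value when open
        self._interval = interval
        self._stop = threading.Event()
        self._lock = threading.Lock()
        self._is_open = None  # unknown until the first reading
        # daemon=True so a forgotten stop() cannot block interpreter exit
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        """Ask the loop to finish and wait for the thread to end."""
        self._stop.set()
        self._thread.join()

    @property
    def is_open(self):
        with self._lock:
            return self._is_open

    def _run(self):
        while not self._stop.is_set():
            value = bool(self._read_sensor())
            with self._lock:
                self._is_open = value
            # wait() returns early when stop() is called, unlike time.sleep()
            self._stop.wait(self._interval)


if __name__ == "__main__":
    monitor = DoorMonitor(lambda: True, interval=0.1)  # stand-in for the GPIO read
    monitor.start()
    time.sleep(0.3)
    print("door open:", monitor.is_open)
    monitor.stop()
```

With several Gunicorn workers, each process would start its own copy of the thread, so a single worker with threads may be the simpler fit for this kind of application.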

django app in heroku getting worker timeout error

I have deployed a Django app to Heroku. It takes Facebook account IDs as input through a CSV file and parses the information. It works fine on the local server, but I get the error below when I try to upload a larger CSV file.
14:12:16 web.1 | 2014-07-17 14:12:16 [30747] [INFO] Using worker: sync
14:12:16 web.1 | 2014-07-17 14:12:16 [30752] [INFO] Booting worker with pid: 30752
14:13:21 web.1 | 2014-07-17 14:13:21 [30747] [CRITICAL] WORKER TIMEOUT (pid:30752)
14:13:21 web.1 | 2014-07-17 03:43:21 [30752] [INFO] Worker exiting (pid: 30752)
14:13:21 web.1 | 2014-07-17 14:13:21 [30841] [INFO] Booting worker with pid: 30

Heroku requests are limited to 30 seconds; if a request takes longer than this, the router terminates it.
You can raise Gunicorn's log level to see whether there is an error in your process.

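Example of a Procfile that sets the timeout explicitly (15 seconds here, which is below Gunicorn's 30-second default) and raises the log level to debug: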
web: gunicorn myproject.wsgi --timeout 15 --keep-alive 5 --log-level debug
