FastAPI server running on AWS App Runner fails after 24 hours

FastAPI server running on AWS App Runner fails after 24 hours - python

I have a FastAPI server configured with Gunicorn, deployed on AWS App Runner. When I try to access the endpoint, it works perfectly, however, after 24 hours, when I try to access the same endpoint, I get a 502 bad gateway error, and nothing is logged on cloudWatch after this point, until I redeploy the application, then it starts working fine again.
I suspect this has to do with my Gunicorn configuration itself which was somehow shutting down my API after some time, and not AWS App Runner, but I have not found any solution. I have also shown my Gunicorn setup below. Any hep will be appreciated.
from fastapi import FastAPI
import uvicorn
from fastapi.middleware.cors import CORSMiddleware
from gunicorn.app.base import BaseApplication
import os
import multiprocessing
api = FastAPI()
def number_of_workers():
print((multiprocessing.cpu_count() * 2) + 1)
return (multiprocessing.cpu_count() * 2) + 1
class StandaloneApplication(BaseApplication):
def __init__(self, app, options=None):
self.options = options or {}
self.application = app
super().__init__()
def load_config(self):
config = {
key: value for key, value in self.options.items()
if key in self.cfg.settings and value is not None
}
for key, value in config.items():
self.cfg.set(key.lower(), value)
def load(self):
return self.application
#api.get("/test")
async def root():
return 'Success'
if __name__ == "__main__":
if os.environ.get('APP_ENV') == "development":
uvicorn.run("api:api", host="0.0.0.0", port=2304, reload=True)
else:
options = {
"bind": "0.0.0.0:2304",
"workers": number_of_workers(),
"accesslog": "-",
"errorlog": "-",
"worker_class": "uvicorn.workers.UvicornWorker",
"timeout": "0"
}
StandaloneApplication(api, options).run()

I had the same problem. After a lot of trial and error, two changes seemed to resolve this for me.
Set uvicorn --timeout-keep-alive to 65. For gunicorn this param is --keep-alive. I'm assuming the Application Load Balancer throws 502 if uvicorn closes the tcp socket before ALB does.
Change the App Runner health check to use HTTP rather than TCP ping to manage container recycling. Currently the AWS UI doesn't allow you to make this change. You will have to do this using aws cli. Use any active URL path for ping check - in your case /test
aws apprunner update-service --service-arn <arn> --health-check-configuration Protocol=HTTP,Path=/test
#2 might just be enough to resolve the issue.

Related

Mongoengine connecting to atlas No replica set members found yet

after a working production app was connected to local mongo, we've decided to move to Mongo Atlas.
This caused production errors.
Our stack is docker -> alping 3.6 -> python 2.7.13 -> Flask -> uwsgi (2.0.17) -> nginx running on aws
flask-mongoengine-0.9.3 mongoengine-0.14.3 pymongo-3.5.1
when starting the app in staging/production, The uwsgi is sending No replica set members found yet.
We don't know why.
We've tried different connection settings connect: False which is lazy connecting, meaning not on initializing, but on first query.
It caused the nginx to fail for resource temporarily unavailable error on some of our apps. We had to restart multiple times for the app to finally start serving requests.
I think the issue is with pymongo and the fact that it's not fork-safe
http://api.mongodb.com/python/current/faq.html?highlight=thread#id3
and uwsgi is using forks
I suspect it might be related to the way my app is being initalized.
might by aginst Using PyMongo with Multiprocessing
here is the app init code:
from app import FlaskApp
from flask import current_app
app = None
from flask_mongoengine import MongoEngine
import logging
application, app = init_flask_app(app_instance, module_name='my_module')
def init_flask_app(app_instance, **kwargs):
app_instance.init_instance(environment_config)
application = app_instance.get_instance()
app = application.app
return application, app
# app_instance.py
import FlaskApp
def init_instance(env):
global app
app = FlaskApp(env)
return app
def get_instance():
if globals().get('app') is None:
app = current_app.flask_app_object
else:
app = globals().get('app')
assert app is not None
return app
class FlaskApp(object):
def __init__(self, env):
.....
# Initialize the DB
self.db = Database(self.app)
....
# used in app_instance.py to get the flask app object in case it's None
self.app.flask_app_object = self
def run_server(self):
self.app.run(host=self.app.config['HOST'], port=self.app.config['PORT'], debug=self.app.config['DEBUG'])
class Database(object):
def __init__(self, app):
self.db = MongoEngine(app)
def drop_all(self, database_name):
logging.warn("Dropping database %s" % database_name)
self.db.connection.drop_database(database_name)
if __name__ == '__main__':
application.run_server()
help in debugging this will be appreciated!

nginx buffering flask event-stream in Docker image

I have a REST API backend with python/flask and want to stream the response in an event stream. Everything is running inside a docker container with nginx/uwsgi (https://hub.docker.com/r/tiangolo/uwsgi-nginx-flask/).
The API works fine until it comes to the event-stream. It seems like something (probably nginx) is buffering the "yields" because nothing is received by any kind of client until the server finished the calculation and everything is sent together.
I tried to adapt the nginx settings (according to the docker image instructions) with an additional config (nginx_streaming.conf) file saying:
server {
location / {
include uwsgi_params;
uwsgi_request_buffering off;
}
}
dockerfile:
FROM tiangolo/uwsgi-nginx-flask:python3.6
COPY ./app /app
COPY ./nginx_streaming.conf /etc/nginx/conf.d/nginx_streaming.conf
But I am not really familiar with nginx settings and sure what I am doing here^^ This at least does not work.. any suggestions?
My server side implementation:
from flask import Flask
from flask import stream_with_context, request, Response
from werkzeug.contrib.cache import SimpleCache
cache = SimpleCache()
app = Flask(__name__)
from multiprocessing import Pool, Process
#app.route("/my-app")
def myFunc():
global cache
arg = request.args.get(<my-arg>)
cachekey = str(arg)
print(cachekey)
result = cache.get(cachekey)
if result is not None:
print('Result from cache')
return result
else:
print('object not in Cache...calculate...')
def calcResult():
yield 'worker thread started\n'
with Pool(processes=cores) as parallel_pool:
[...]
yield 'Somewhere in the processing'
temp_result = doSomethingWith(
savetocache = cache.set(cachekey, temp_result, timeout=60*60*24) #timeout in seconds
yield 'saved to cache with key:' + cachekey +'\n'
print(savetocache, flush=True)
yield temp_result
return Response(calcResult(), content_type="text/event-stream")
if __name__ == "__main__":
# Only for debugging while developing
app.run(host='0.0.0.0', debug=True, port=80)

I ran into the same problem. Try changing
return Response(calcResult(), content_type="text/event-stream")
to
return Response(calcResult(), content_type="text/event-stream", headers={'X-Accel-Buffering': 'no'})

Following the answer from #u-rizwan here, I added this to the /etc/nginx/conf.d/mysite.conf and it resolved the problem:
add_header X-Accel-Buffering no;
I have added it under location /, but it is probably a good idea to put it under the specific location of the event stream (I have a low traffic intranet use case here).
Note: Looks like nginx could be stripping this header by default if it comes from the application: https://serverfault.com/questions/937665/does-nginx-show-x-accel-headers-in-response

connecting to flask app over VPN

I am new to Flask and please do not mind if the problem sounds trivial.
I have a Flask app (not written by me) which works fine from the local machine as well as remote machines as well when I am directly connected to the network.
But when I connect to the app over VPN it doesn't work. I am able to ssh on that machine as well as access other servers running on the same machine. It is a physical machine and not a VM
app = Flask(__name__)
def loadAppVariables():
mc = pylibmc.Client(["127.0.0.1"], binary=True,
behaviors={"tcp_nodelay": True,
"ketama": True});
app.mc=mc
def initApp():
app.fNet= {some object }
mc = pylibmc.Client(["127.0.0.1"], binary=True,
behaviors={"tcp_nodelay": True,
"ketama": True});
app.mc=mc;
#app.route('/classify', methods=['POST'])
def classify():
# We will save the file to disk for possible data collection.
imagefile = request.files['imagefile']
processImageFile(imagefile)
#app.route('/')
def index():
return render_template('cindex.html', has_result=False)
#app.before_request
def before_request():
loadAppVariables()
#app.teardown_request
def teardown_request(exception):
storeAppVariables()
if __name__ == '__main__':
initApp();
app.run(debug=False,host='0.0.0.0')
I am running latest Flask version and python 2.7. Can anyone please suggest what may be wrong here ?

It seems that you want to access to local enabled flask over another network.
0.0.0.0 ip is to connect to flask from different machines, but in the same network range. so if your IP isn't in the same range, this fails.
if you want to access your web page from the internet, you should consider to deploy your webapp.

debug Flask server inside Jupyter Notebook

I want to debug small flask server inside jupyter notebook for demo.
I created virtualenv on latest Ubuntu and Python2 (on Mac with Python3 this error occurs as well), pip install flask jupyter.
However, when I create a cell with helloworld script it does not run inside notebook.
from flask import Flask
app = Flask(__name__)
#app.route("/")
def hello():
return "Hello World!"
if __name__ == "__main__":
app.run(debug=True,port=1234)
File
"/home/***/test/local/lib/python2.7/site-packages/ipykernel/kernelapp.py",
line 177, in _bind_socket
s.bind("tcp://%s:%i" % (self.ip, port)) File "zmq/backend/cython/socket.pyx", line 495, in
zmq.backend.cython.socket.Socket.bind
(zmq/backend/cython/socket.c:5653) File
"zmq/backend/cython/checkrc.pxd", line 25, in
zmq.backend.cython.checkrc._check_rc
(zmq/backend/cython/socket.c:10014)
raise ZMQError(errno) ZMQError: Address already in use
NB - I change the port number after each time it fails.
Sure, it runs as a standalone script.
update without (debug=True) it's ok.

I installed Jupyter and Flask and your original code works.
The flask.Flask object is a WSGI application, not a server. Flask uses Werkzeug's development server as a WSGI server when you call python -m flask run in your shell. It creates a new WSGI server and then passes your app as paremeter to werkzeug.serving.run_simple. Maybe you can try doing that manually:
from werkzeug.wrappers import Request, Response
from flask import Flask
app = Flask(__name__)
#app.route("/")
def hello():
return "Hello World!"
if __name__ == '__main__':
from werkzeug.serving import run_simple
run_simple('localhost', 9000, app)
Flask.run() calls run_simple() internally, so there should be no difference here.

The trick is to run the Flask server in a separate thread. This code allows registering data providers. The key features are
Find a free port for the server. If you run multiple instances of the server in different notebooks they would compete for the same port.
The register_data function returns the URL of the server so you can use it for whatever you need.
The server is started on-demand (when the first data provider is registered)
Note: I added the #cross_origin() decorator from the flask-cors package. Else you cannot call the API form within the notebook.
Note: there is no way to stop the server in this code...
Note: The code uses typing and python 3.
Note: There is no good error handling at the moment
import socket
import threading
import uuid
from typing import Any, Callable, cast, Optional
from flask import Flask, abort, jsonify
from flask_cors import cross_origin
from werkzeug.serving import run_simple
app = Flask('DataServer')
#app.route('/data/<id>')
#cross_origin()
def data(id: str) -> Any:
func = _data.get(id)
if not func:
abort(400)
return jsonify(func())
_data = {}
_port: int = 0
def register_data(f: Callable[[], Any], id: Optional[str] = None) -> str:
"""Sets a callback for data and returns a URL"""
_start_sever()
id = id or str(uuid.uuid4())
_data[id] = f
return f'http://localhost:{_port}/data/{id}'
def _init_port() -> int:
"""Creates a random free port."""
# see https://stackoverflow.com/a/5089963/2297345
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.bind(('localhost', 0))
port = sock.getsockname()[1]
sock.close()
return cast(int, port)
def _start_sever() -> None:
"""Starts a flask server in the background."""
global _port
if _port:
return
_port = _init_port()
thread = threading.Thread(target=lambda: run_simple('localhost', _port, app))
thread.start()

Although this question was asked long ago, I come up with another suggestion:
The following code is adapted from how PyCharm starts a Flask console.
import sys
from flask.cli import ScriptInfo
app = None
locals().update(ScriptInfo(create_app=None).load_app().make_shell_context())
print("Python %s on %s\nApp: %s [%s]\nInstance: %s" % (sys.version, sys.platform, app.import_name, app.env, app.instance_path))
Now you can access app and use everything described in the Flask docs on working with the shell

Gunicorn Flask Caching

I have a Flask application that is running using gunicorn and nginx. But if I change the value in the db, the application fails to update in the browser under some conditions.
I have a flask script that has the following commands
from msldata import app, db, models
path = os.path.dirname(os.path.abspath(__file__))
manager = Manager(app)
#manager.command
def run_dev():
app.debug = True
if os.environ.get('PROFILE'):
from werkzeug.contrib.profiler import ProfilerMiddleware
app.config['PROFILE'] = True
app.wsgi_app = ProfilerMiddleware(app.wsgi_app, restrictions=[30])
if 'LISTEN_PORT' in app.config:
port = app.config['LISTEN_PORT']
else:
port = 5000
print app.config
app.run('0.0.0.0', port=port)
print app.config
#manager.command
def run_server():
from gunicorn.app.base import Application
from gunicorn.six import iteritems
# workers = multiprocessing.cpu_count() * 2 + 1
workers = 1
options = {
'bind': '0.0.0.0:5000',
}
class GunicornRunner(Application):
def __init__(self, app, options=None):
self.options = options or {}
self.application = app
super(GunicornRunner, self).__init__()
def load_config(self):
config = dict([(key, value) for key, value in iteritems(self.options) if key in self.cfg.settings and value is not None])
for key, value in iteritems(config):
self.cfg.set(key.lower(), value)
def load(self):
return self.application
GunicornRunner(app, options).run()
Now if i run the server run_dev in debug mode db modifications are updated
if run_server is used the modifications are not seen unless the app is restarted
However if i run like gunicorn -c a.py app:app, the db updates are visible.
a.py contents
import multiprocessing
bind = "0.0.0.0:5000"
workers = multiprocessing.cpu_count() * 2 + 1
Any suggestions on where I am missing something..

I also ran into this situation. Running flask in Gunicorn with several workers and the flask-cache won´t work anymore.
Since you are already using
app.config.from_object('default_config') (or similar filename)
just add this to you config:
CACHE_TYPE = "filesystem"
CACHE_THRESHOLD = 1000000 (some number your harddrive can manage)
CACHE_DIR = "/full/path/to/dedicated/cache/directory/"
I bet you used "simplecache" before...

I was/am seeing the same thing, Only when running gunicorn with flask. One workaround is to set Gunicorn max-requests to 1. However thats not a real solution if you have any kind of load due to the resource overhead of restarting the workers after each request. I got around this by having nginx serve the static content and then changing my flask app to render the template and write to static, then return a redirect to the static file.

Flask-Caching SimpleCache doesn't work w. workers > 1 Gunicorn
Had similar issue using version Flask 2.02 and Flask-Caching 1.10.1.
Everything works fine in development mode until you put on gunicorn with more than 1 worker. One probably reason is that on development there is only one process/worker so weirdly under this restrict circumstances SimpleCache works.
My code was:
app.config['CACHE_TYPE'] = 'SimpleCache' # a simple Python dictionary
cache = Cache(app)
Solution to work with Flask-Caching use FileSystemCache, my code now:
app.config['CACHE_TYPE'] = 'FileSystemCache'
app.config['CACHE_DIR'] = 'cache' # path to your server cache folder
app.config['CACHE_THRESHOLD'] = 100000 # number of 'files' before start auto-delete
cache = Cache(app)

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

FastAPI server running on AWS App Runner fails after 24 hours - python

Related

Mongoengine connecting to atlas No replica set members found yet

nginx buffering flask event-stream in Docker image

connecting to flask app over VPN

debug Flask server inside Jupyter Notebook

Gunicorn Flask Caching

Categories

Resources