I have a REST API backend with python/flask and want to stream the response in an event stream. Everything is running inside a docker container with nginx/uwsgi (https://hub.docker.com/r/tiangolo/uwsgi-nginx-flask/).
The API works fine until it comes to the event stream. Something (probably nginx) seems to be buffering the yielded chunks, because no client receives anything until the server has finished the calculation, and then everything arrives at once.
I tried to adapt the nginx settings (according to the docker image instructions) with an additional config (nginx_streaming.conf) file saying:
server {
    location / {
        include uwsgi_params;
        uwsgi_request_buffering off;
    }
}
dockerfile:
FROM tiangolo/uwsgi-nginx-flask:python3.6
COPY ./app /app
COPY ./nginx_streaming.conf /etc/nginx/conf.d/nginx_streaming.conf
But I am not really familiar with nginx settings and not sure what I am doing here. At the very least, this does not work. Any suggestions?
My server side implementation:
from flask import Flask
from flask import stream_with_context, request, Response
from werkzeug.contrib.cache import SimpleCache

cache = SimpleCache()
app = Flask(__name__)
from multiprocessing import Pool, Process


@app.route("/my-app")
def myFunc():
    global cache
    arg = request.args.get(<my-arg>)
    cachekey = str(arg)
    print(cachekey)
    result = cache.get(cachekey)
    if result is not None:
        print('Result from cache')
        return result
    else:
        print('object not in Cache...calculate...')

        def calcResult():
            yield 'worker thread started\n'
            with Pool(processes=cores) as parallel_pool:
                [...]
                yield 'Somewhere in the processing'
                temp_result = doSomethingWith(
                savetocache = cache.set(cachekey, temp_result, timeout=60*60*24)  # timeout in seconds
                yield 'saved to cache with key:' + cachekey + '\n'
                print(savetocache, flush=True)
                yield temp_result

        return Response(calcResult(), content_type="text/event-stream")


if __name__ == "__main__":
    # Only for debugging while developing
    app.run(host='0.0.0.0', debug=True, port=80)
I ran into the same problem. Try changing
return Response(calcResult(), content_type="text/event-stream")
to
return Response(calcResult(), content_type="text/event-stream", headers={'X-Accel-Buffering': 'no'})
Following the answer from @u-rizwan here, I added this to /etc/nginx/conf.d/mysite.conf and it resolved the problem:
add_header X-Accel-Buffering no;
I have added it under location /, but it is probably a good idea to put it under the specific location of the event stream (I have a low traffic intranet use case here).
Note: Looks like nginx could be stripping this header by default if it comes from the application: https://serverfault.com/questions/937665/does-nginx-show-x-accel-headers-in-response
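Separately from the buffering header, a `text/event-stream` response is expected to carry framed messages: one or more `data:` lines followed by a blank line. The following is a minimal sketch of that framing, not the asker's code. `format_sse`, `calc_result_events`, and `STREAM_HEADERS` are hypothetical names, and the `Cache-Control` entry is an assumption added alongside the `X-Accel-Buffering` header suggested above.

```python
def format_sse(data, event=None):
    """Frame one Server-Sent Events message (hypothetical helper)."""
    lines = []
    if event is not None:
        lines.append("event: " + event)
    # Each line of the payload needs its own "data:" field.
    for part in str(data).splitlines() or [""]:
        lines.append("data: " + part)
    # A blank line terminates the message.
    return "\n".join(lines) + "\n\n"


# Headers asking intermediaries not to buffer or cache the stream.
# "Cache-Control" is an assumption; "X-Accel-Buffering" comes from the answer.
STREAM_HEADERS = {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    "X-Accel-Buffering": "no",
}


def calc_result_events(steps):
    """Yield one framed event per progress step, then a final result event."""
    for step in steps:
        yield format_sse(step, event="progress")
    yield format_sse("done", event="result")
```

In a Flask view, a generator like this could be passed to `Response(...)` together with `headers=STREAM_HEADERS`, in the same way as the `headers=` argument shown in the answer above.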
I have a FastAPI server configured with Gunicorn, deployed on AWS App Runner. When I access the endpoint, it works perfectly. However, when I access the same endpoint after 24 hours, I get a 502 Bad Gateway error, and nothing is logged in CloudWatch from that point on. Once I redeploy the application, it works fine again.
I suspect my Gunicorn configuration, rather than AWS App Runner, is somehow shutting down my API after some time, but I have not found a solution. My Gunicorn setup is shown below. Any help will be appreciated.
from fastapi import FastAPI
import uvicorn
from fastapi.middleware.cors import CORSMiddleware
from gunicorn.app.base import BaseApplication
import os
import multiprocessing

api = FastAPI()


def number_of_workers():
    print((multiprocessing.cpu_count() * 2) + 1)
    return (multiprocessing.cpu_count() * 2) + 1


class StandaloneApplication(BaseApplication):
    def __init__(self, app, options=None):
        self.options = options or {}
        self.application = app
        super().__init__()

    def load_config(self):
        config = {
            key: value for key, value in self.options.items()
            if key in self.cfg.settings and value is not None
        }
        for key, value in config.items():
            self.cfg.set(key.lower(), value)

    def load(self):
        return self.application


@api.get("/test")
async def root():
    return 'Success'


if __name__ == "__main__":
    if os.environ.get('APP_ENV') == "development":
        uvicorn.run("api:api", host="0.0.0.0", port=2304, reload=True)
    else:
        options = {
            "bind": "0.0.0.0:2304",
            "workers": number_of_workers(),
            "accesslog": "-",
            "errorlog": "-",
            "worker_class": "uvicorn.workers.UvicornWorker",
            "timeout": "0"
        }
        StandaloneApplication(api, options).run()
I had the same problem. After a lot of trial and error, two changes seemed to resolve it for me.
1. Set uvicorn's --timeout-keep-alive to 65. For gunicorn this parameter is --keep-alive. I'm assuming the Application Load Balancer returns a 502 if uvicorn closes the TCP socket before the ALB does.
2. Change the App Runner health check to use HTTP rather than a TCP ping to manage container recycling. Currently the AWS UI doesn't allow you to make this change, so you will have to use the AWS CLI. Use any active URL path for the check; in your case, /test:
aws apprunner update-service --service-arn <arn> --health-check-configuration Protocol=HTTP,Path=/test
Change 2 alone might be enough to resolve the issue.
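For the programmatic Gunicorn setup in the question, the keep-alive change could look roughly like the sketch below. It assumes Gunicorn's setting name `keepalive` (the config-file counterpart of `--keep-alive`) and reuses the 65-second value from the answer; `build_gunicorn_options` is a hypothetical helper, and whether the value reaches uvicorn through `UvicornWorker` is worth verifying for the versions in use.

```python
import multiprocessing


def build_gunicorn_options(keep_alive_seconds=65, port=2304):
    """Return an options dict for a BaseApplication subclass (hypothetical helper)."""
    workers = multiprocessing.cpu_count() * 2 + 1
    return {
        "bind": "0.0.0.0:{}".format(port),
        "workers": workers,
        "accesslog": "-",
        "errorlog": "-",
        "worker_class": "uvicorn.workers.UvicornWorker",
        "timeout": 0,
        # Assumed setting name: "keepalive" is Gunicorn's key for --keep-alive (seconds).
        "keepalive": keep_alive_seconds,
    }
```

The returned dictionary would replace the hand-written `options` in the question, for example `StandaloneApplication(api, build_gunicorn_options()).run()`.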
I'm using a Flask server for RESTful web services and python-socketio for bi-directional communication between the server and the client, to keep track of download progress on the backend.
I take the variable sio declared in server.py and pass it as a parameter into a new object, which uses it to emit messages to the client about its progress in downloading a file on the server.
sio = socketio.Server(async_mode='threading')
omics_env = None


@sio.on('init', namespace='/guardiome')
def init(sid, data):
    global omics_env
    if omics_env == None:
        omics_env = Environment(socket=sio)
        omics_env.conda.download_conda()
        omics_env.data_management.download_omics_data()
The issue is that while the file is downloading on the Python server, it is supposed to emit a message to the client every time it has written 1 percent of the data to file, but the client does not always receive an emit for each percent.
It will usually report progress up to 18%, hold off for a while, then report 40%, skipping the emits between 18% and 40%.
Some might say the internet connection is probably lagging, but I added print statements in the download function right above the emit call, and they show that every 1 percent of the data is being downloaded and written.
I have also checked online for other resources. Some mentioned using eventlet and doing something like this at the highest level of the server code:
import eventlet
eventlet.monkey_patch()
But that doesn't lead to the code emitting at all.
Others have mentioned using a message queue like redis, but I can't use redis and I plan on turning the whole python code into a binary executable for it to be completely portable on linux platform to communicate with a local client.
Here is my server.py
import socketio
import eventlet.wsgi
from environment import Environment
from flask import Flask, jsonify, request, send_file
from flask_cors import CORS

omics_env = None
sio = socketio.Server(async_mode='threading')
app = Flask(__name__)
CORS(app)


@sio.on('init', namespace='/guardiome')
def init(sid, data):
    global omics_env
    if omics_env == None:
        omics_env = Environment(socket=sio)
        omics_env.conda.download_conda()
        omics_env.data_management.download_omics_data()
        omics_env.logger.info('_is_ready()')
    sio.emit(
        event='init',
        data={'status': True, 'information': None},
        namespace='/guardiome')


try:
    # wrap Flask application with engineio's middleware
    app.wsgi_app = socketio.Middleware(sio, app.wsgi_app)
    # Launch the server with socket integration
    app.run(port=8008, debug=False, threaded=True)
finally:
    pass
    # LOGGER.info('Exiting ...')
Here is the download_w_progress function, into which I pass sio as the reporter parameter:
def download_w_progress(url, path, reporter=None):
    ssl._create_default_https_context = ssl._create_unverified_context
    r = requests.get(url, stream=True)

    # Helper lambda functions
    progress_report = lambda current, total: int((current/total)*100)
    raw_percent = lambda current, total: (current/total)*100

    # TODO(mak3): Write lambda function for reporting amount of file downloaded
    # in MB, KB, GB, or whatever
    with open(path, 'wb') as f:
        total_length = int(r.headers.get('content-length'))
        progress_count = 0
        chunk_size = 1024
        # Used to cut down on emit the same rounded percentage number
        previous_percent = -1
        # Read and write the file in chunks to its destination
        for chunk in r.iter_content(chunk_size=1024):
            progress_dict = {
                "percent": progress_report(progress_count, total_length)
            }
            if reporter != None:
                # Limit the number of emits sent to prevent
                # to socket from overworking
                if progress_dict["percent"] != previous_percent:
                    reporter.emit(event="environment", namespace="/guardiome", data=progress_dict)
            # TODO(mak3): Remove or uncomment in production
            if progress_dict["percent"] != previous_percent:
                print(progress_dict["percent"], end='\r')
            progress_count += chunk_size
            previous_percent = progress_dict["percent"]
            if chunk:
                f.write(chunk)
                f.flush()
Sorry I missed this question when you posted it.
There are a couple of problems in your code. You are choosing async_mode='threading'. In general it is best to omit this argument and let the server choose the best async mode for the web server you are using. When you add eventlet, for example, the threading mode is not going to work; there is a specific async mode for eventlet.
So my recommendation would be to:
remove the async_mode argument in the socketio.Server() constructor
install eventlet in your virtual environment
replace the app.run() section in your script with code that starts the eventlet server, or given that you are using Flask, use the Flask-SocketIO extension, which already has this code built in.
add a sio.sleep(0) call inside the loop where you read your file. This will give eventlet a chance to keep all tasks running smoothly.
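The last recommendation can be sketched as follows. This is an illustration under assumptions, not the asker's function: `report_progress` and `FakeReporter` are hypothetical names, and the fake reporter only mimics the `emit` and `sleep` methods that `socketio.Server` provides so the loop can run without a server. It also counts the bytes actually received instead of the nominal chunk size.

```python
def report_progress(chunks, total_length, reporter):
    """Emit once per whole percent and yield control after every chunk."""
    received = 0
    previous_percent = -1
    for chunk in chunks:
        # Count the bytes actually received rather than the nominal chunk size.
        received += len(chunk)
        percent = received * 100 // total_length
        if percent != previous_percent:
            reporter.emit(event="environment",
                          data={"percent": percent},
                          namespace="/guardiome")
            previous_percent = percent
        # Hand control back to the async framework so queued emits can go out.
        reporter.sleep(0)


class FakeReporter:
    """Stand-in for socketio.Server that records calls (illustration only)."""

    def __init__(self):
        self.percents = []
        self.sleeps = 0

    def emit(self, event, data, namespace):
        self.percents.append(data["percent"])

    def sleep(self, seconds):
        self.sleeps += 1


reporter = FakeReporter()
report_progress([b"x" * 25] * 4, total_length=100, reporter=reporter)
```

With a real server object in place of `FakeReporter`, the `reporter.sleep(0)` call is what gives eventlet the chance to send each emit before the next chunk is processed.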
I have written a single user application that currently works with Flask internal web server. It does not seem to be very robust and it crashes with all sorts of socket errors as soon as a page takes a long time to load and the user navigates elsewhere while waiting. So I thought to replace it with Apache.
The problem is, my current code is a single program that first launches about ten threads to do stuff, for example set up ssh tunnels to remote servers and zmq connections to communicate with a database located there. Finally it enters run() loop to start the internal server.
I followed all sorts of instructions and managed to get Apache to serve the initial page. However, everything else goes wrong, because I now have no worker threads available and no globally initialised classes, and none of the global variables holding the interfaces to those threads exist.
Obviously I am not a web developer.
How badly "wrong" is my current code? Is there any way to make it work with Apache with a reasonable amount of work? Can I have Apache replace just the run() part and communicate with an already running application? My current app, in a very simplified form (without the data processing threads), is something like this:
from flask import Flask, render_template

comm = None
app = Flask(__name__)


class CommsHandler(object):
    def __init__(self):
        # Init communication links to external servers and databases
        pass

    def request_data(self, request):
        # Use initialised links to request something
        return result


@app.route("/", methods=["GET"])
def mainpage():
    return render_template("main.html")


@app.route("/foo", methods=["GET"])
def foo():
    a = comm.request_data("xyzzy")
    return render_template("foo.html", data=a)


comm = CommsHandler()
app.run()
Or have I done this completely wrong? Now when I remove app.run and just import app class to wsgi script, I do get a response from the main page as it does not need reference to global variable comm.
/foo does not work, as "comm" is an uninitialised variable. And I can see why, of course. I just never thought this would need to be exported to Apache or any other web server.
So the question is: can I launch this application from an rc script at boot, set up its communication links and everything, and have Apache/wsgi call functions of the running application instead of launching a new one?
Hannu
This is a simple Flask app running on the internal server:
from flask import Flask

app = Flask(__name__)


@app.route("/")
def hello():
    return "Hello World!"


if __name__ == "__main__":
    app.run()
To run it behind an Apache server, check out the FastCGI documentation:
from flup.server.fcgi import WSGIServer
from yourapplication import app

if __name__ == '__main__':
    WSGIServer(app).run()
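On the uninitialised `comm` from the question: when a WSGI server imports the module instead of running it as a script, setup code that only executes on the `app.run()` path never runs. One possible approach, sketched below under assumptions, is to create the handler lazily on first use. `get_comm`, `_comm`, and `_comm_lock` are hypothetical names, the `CommsHandler` body is a placeholder, and each server process would still get its own instance.

```python
import threading

_comm = None
_comm_lock = threading.Lock()


class CommsHandler:
    """Placeholder for the question's class; real link setup is elided."""

    instances = 0

    def __init__(self):
        # Track construction so the single-instance behaviour is observable.
        CommsHandler.instances += 1

    def request_data(self, request):
        return "result for " + request


def get_comm():
    """Create the handler on first use, once per server process."""
    global _comm
    if _comm is None:
        with _comm_lock:
            # Re-check inside the lock so concurrent first requests build only one.
            if _comm is None:
                _comm = CommsHandler()
    return _comm
```

A view such as `foo()` would then call `get_comm().request_data("xyzzy")` instead of reading the global directly.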
I have a Flask application that is running using gunicorn and nginx. But if I change the value in the db, the application fails to update in the browser under some conditions.
I have a flask script that has the following commands
from msldata import app, db, models

path = os.path.dirname(os.path.abspath(__file__))
manager = Manager(app)


@manager.command
def run_dev():
    app.debug = True
    if os.environ.get('PROFILE'):
        from werkzeug.contrib.profiler import ProfilerMiddleware
        app.config['PROFILE'] = True
        app.wsgi_app = ProfilerMiddleware(app.wsgi_app, restrictions=[30])
    if 'LISTEN_PORT' in app.config:
        port = app.config['LISTEN_PORT']
    else:
        port = 5000
    print app.config
    app.run('0.0.0.0', port=port)
    print app.config


@manager.command
def run_server():
    from gunicorn.app.base import Application
    from gunicorn.six import iteritems

    # workers = multiprocessing.cpu_count() * 2 + 1
    workers = 1
    options = {
        'bind': '0.0.0.0:5000',
    }

    class GunicornRunner(Application):
        def __init__(self, app, options=None):
            self.options = options or {}
            self.application = app
            super(GunicornRunner, self).__init__()

        def load_config(self):
            config = dict([(key, value) for key, value in iteritems(self.options) if key in self.cfg.settings and value is not None])
            for key, value in iteritems(config):
                self.cfg.set(key.lower(), value)

        def load(self):
            return self.application

    GunicornRunner(app, options).run()
Now, if I run the server with run_dev in debug mode, db modifications show up.
If run_server is used, the modifications are not seen unless the app is restarted.
However, if I run it as gunicorn -c a.py app:app, the db updates are visible.
Contents of a.py:
import multiprocessing
bind = "0.0.0.0:5000"
workers = multiprocessing.cpu_count() * 2 + 1
Any suggestions on where I am missing something..
I also ran into this situation. When running Flask in Gunicorn with several workers, flask-cache won't work anymore.
Since you are already using
app.config.from_object('default_config')  # or a similar filename
just add this to your config:
CACHE_TYPE = "filesystem"
CACHE_THRESHOLD = 1000000  # some number your hard drive can manage
CACHE_DIR = "/full/path/to/dedicated/cache/directory/"
I bet you used "simplecache" before...
I was/am seeing the same thing, but only when running Flask under Gunicorn. One workaround is to set Gunicorn's max-requests to 1. However, that's not a real solution under any kind of load, because of the overhead of restarting a worker after each request. I got around this by having nginx serve the static content and changing my Flask app to render the template, write it to the static directory, and then return a redirect to the static file.
Flask-Caching SimpleCache doesn't work w. workers > 1 Gunicorn
I had a similar issue using Flask 2.02 and Flask-Caching 1.10.1.
Everything works fine in development mode until you put it behind gunicorn with more than 1 worker. One probable reason is that in development there is only one process/worker, and SimpleCache happens to work under that restricted circumstance.
My code was:
app.config['CACHE_TYPE'] = 'SimpleCache' # a simple Python dictionary
cache = Cache(app)
The solution that works with Flask-Caching is to use FileSystemCache. My code now:
app.config['CACHE_TYPE'] = 'FileSystemCache'
app.config['CACHE_DIR'] = 'cache' # path to your server cache folder
app.config['CACHE_THRESHOLD'] = 100000 # number of 'files' before start auto-delete
cache = Cache(app)
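Why the switch helps can be illustrated with a small sketch: an in-memory cache lives inside one worker process, whereas files in a shared directory are visible to all of them. The `TinyFileCache` class below is a hypothetical, stripped-down stand-in (no expiry, no threshold), not Flask-Caching's implementation, and two instances pointing at the same directory stand in for two Gunicorn workers.

```python
import hashlib
import os
import pickle
import tempfile


class TinyFileCache:
    """Minimal file-backed cache (illustration only, no expiry or threshold)."""

    def __init__(self, cache_dir):
        self.cache_dir = cache_dir
        os.makedirs(cache_dir, exist_ok=True)

    def _path(self, key):
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return os.path.join(self.cache_dir, digest)

    def set(self, key, value):
        # Write to a temp file, then rename, so readers never see a partial file.
        fd, tmp_path = tempfile.mkstemp(dir=self.cache_dir)
        with os.fdopen(fd, "wb") as handle:
            pickle.dump(value, handle)
        os.replace(tmp_path, self._path(key))

    def get(self, key):
        try:
            with open(self._path(key), "rb") as handle:
                return pickle.load(handle)
        except FileNotFoundError:
            return None


shared_dir = tempfile.mkdtemp()
worker_a = TinyFileCache(shared_dir)  # stands in for one Gunicorn worker
worker_b = TinyFileCache(shared_dir)  # stands in for another worker
worker_a.set("answer", 42)

# Per-process dictionaries, by contrast, never see each other's entries.
dict_a, dict_b = {}, {}
dict_a["answer"] = 42
```

Here `worker_b.get("answer")` finds the value written by `worker_a`, while `dict_b` stays empty, which mirrors what happens with a per-process cache once more than one worker is running.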
I have a question to ask regarding the performance of my flask app when I incorporated uwsgi and nginx.
My app.view file looks like this:
import app.lib.test_case as test_case
from app import app
import time


@app.route('/<int:test_number>')
def test_case_match(test_number):
    rubbish = test_case.test_case(test_number)
    return "rubbish World!"
My app.lib.test_case file looks like this:
import time


def test_case(test_number):
    time.sleep(30)
    return None
And my config.ini for my uwsgi looks like this:
[uwsgi]
socket = 127.0.0.1:8080
chdir = /home/ubuntu/test
module = app:app
master = true
processes = 2
daemonize = /tmp/uwsgi_daemonize.log
pidfile = /tmp/process_pid.pid
Now, if I run this test case purely through Flask, without uwsgi + nginx, the ab benchmark shows a response in 31 seconds, which is expected owing to the sleep call. What I don't get is that when I run the app through uwsgi + nginx, the response time is 38 seconds, an overhead of about 7 seconds (roughly 23%). Can anyone enlighten me?
time.sleep() does not guarantee an exact sleep duration.
From the documentation of time.sleep(secs):
[…] Also, the suspension time may be longer than requested by an arbitrary amount because of the scheduling of other activity in the system.
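A quick way to see how much of the extra time comes from the sleep itself is to measure it directly. The sketch below is only a measurement aid under assumptions: `measure_sleep_overshoot` is a hypothetical helper, and it uses a short 0.05-second sleep rather than the 30 seconds from the question so that it finishes quickly.

```python
import time


def measure_sleep_overshoot(requested_seconds, repeats=5):
    """Return how much longer each time.sleep() call took than requested."""
    overshoots = []
    for _ in range(repeats):
        start = time.monotonic()
        time.sleep(requested_seconds)
        elapsed = time.monotonic() - start
        overshoots.append(elapsed - requested_seconds)
    return overshoots


overshoots = measure_sleep_overshoot(0.05)
```

Running the same measurement inside the uwsgi worker and in the plain Flask process would show whether the sleep accounts for the difference or whether the overhead lies elsewhere in the stack.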