Flask SocketIO + Gevent - buffering events from external processes - python

I want to emit a Socket.IO event from an asynchronous class in my Flask project, but when I send it, it takes an enormous amount of time before it arrives at the JavaScript client. I am sending it as:
socket_io.emit("event_name", {"foo": "bar"}, broadcast=True, namespace="/com")
The app with SocketIO is initialised as:
app = Flask(__name__, template_folder="templates", static_folder="static", static_url_path="/static")
socketio = SocketIO(app=app, cookie="cookie_name", async_mode=None)
And it is started by this command:
socketio.run(app=app, host="0.0.0.0", port=5000, log_output=False)
My Python library versions are:
# Python == 3.8.5
Flask==1.1.2
Flask-SocketIO==5.0.1
python-engineio==4.0.0
python-socketio==5.0.4
gevent==20.12.1
gevent-websocket==0.10.1
JavaScript SocketIO: v3.0.4
When I emit the event normally from a Socket.IO handler, it works fine. But when I emit the same event from an external process, it takes a long time to arrive.
Does anyone know how I can solve this problem?
Thank you.

The problem was monkey patching combined with 32-bit Python 3. I had to install 64-bit Python 3 and then add this as the first line of the program:
import gevent.monkey; gevent.monkey.patch_all()
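For context, a minimal sketch of the resulting layout, reusing the setup from the question above (nothing new except that the patch runs before every other import):
import gevent.monkey; gevent.monkey.patch_all()  # must run before any other import

from flask import Flask
from flask_socketio import SocketIO

app = Flask(__name__, template_folder="templates", static_folder="static", static_url_path="/static")
# async_mode=None lets Flask-SocketIO auto-detect gevent once it is installed
socketio = SocketIO(app=app, cookie="cookie_name", async_mode=None)

if __name__ == "__main__":
    socketio.run(app=app, host="0.0.0.0", port=5000, log_output=False)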

Related

Cannot get celery to run task with Flask-SocketIO and eventlet monkey patching

Update 2:
The solution is in how monkey patching actually gets done. See my answer below.
Update 1:
The issue is eventlet's monkey patching. Monkey patching is still fairly magical to me, so I don't fully understand exactly why. I can get the Celery task to run if I don't monkey patch eventlet. However, without monkey patching I cannot use a SocketIO instance in another process, because Redis won't work as the SocketIO message queue unpatched. I am trying gevent and still running into issues, but will update with results.
Also note that I had to change the Celery instantiation to Celery("backend") rather than Celery(app.import_name), Celery(app.name), or Celery(__name__) to get the non-monkey-patched task to run. Because I am not using anything from the app context in my task, I don't actually need the make_celery.py module at all and can instantiate Celery directly within backend.py.
I also tried different databases within Redis, thinking it might be a conflict.
I also moved to remote debugging of the Celery task through telnet, as described here. Again, when NOT monkey patching, the task runs and the external socketio object exists, though it cannot communicate with the main server to emit. When I DO monkey patch, the task won't even run.
When using gevent and monkey patching, the application doesn't even start up.
The Goal:
Run a real-time web application that pushes data created in a parallel process to the client via Socket.IO. I am using Flask and Flask-SocketIO.
What I've Tried:
I originally just emitted from an external process as described in the docs (see my original minimum working example, here). However, this proved to be buggy. Specifically, it worked flawlessly when the data-streaming object was instantiated and the external process started within the if __name__ == "__main__": block, but failed when it was instantiated and started on demand from a Socket.IO event. Much research led me to this eventlet issue, which is still open and suggests eventlet and multiprocessing do not play well together. I then tried gevent; it worked for a while but was still buggy when left running for long periods (e.g. 12 hours).
This answer led me to try Celery in my app, and I have been struggling ever since. Specifically, my issue is that the task status shows PENDING for a while (I'm guessing the default timeout amount of time) and then shows FAILURE. When running the worker at debug logging level, the error shows Received unregistered task of type '__main__.stream_data'. I've tried every way I can think of to start the worker and register the task. I am confused because my Celery instance is defined in the same scope as the task definition, and, like the countless tutorials and examples I've found online, I start the worker with celery -A backend.cel worker -l DEBUG to tell it to consume tasks from the Celery instance within the backend.py module (at least that's what I think that command does).
My Current Project State:
.
├── backend.py
├── static
│   └── js
│       ├── main.js
│       ├── socket.io.min.js
│       └── socket.io.min.js.map
└── templates
    └── index.html
backend.py
import eventlet
eventlet.monkey_patch()
# ^^^ COMMENT/UNCOMMENT to get the task to RUN/NOT RUN

from random import randrange
import time

from redis import Redis
from flask import Flask, render_template, request
from flask_socketio import SocketIO
from celery import Celery
from celery.contrib import rdb

def message_queue(db):
    # thought it might be conflicting redis databases so I allowed choice;
    # this was not the issue.
    return f"redis://localhost:6379/{db}"

app = Flask(__name__)
socketio = SocketIO(app, message_queue=message_queue(0))
cel = Celery("backend", broker=message_queue(0), backend=message_queue(0))

@app.route('/')
def index():
    return render_template("index.html")

@socketio.on("start_data_stream")
def start_data_stream():
    socketio.emit("new_data", {"value": 666})  # <<< sanity check, socket server is working here
    stream_data.delay(request.sid)

@cel.task()
def stream_data(sid):
    data_socketio = SocketIO(message_queue=message_queue(0))
    i = 1
    while i <= 100:
        value = randrange(0, 10000, 1) / 100
        data_socketio.emit("new_data", {"value": value})
        i += 1
        time.sleep(0.01)
        # rdb.set_trace()  # <<<< comment/uncomment as needed for debugging, see: https://docs.celeryq.dev/en/latest/userguide/debugging.html
    return i, value

if __name__ == "__main__":
    r = Redis()
    r.flushall()
    if r.ping():
        pass
    else:
        raise Exception("You need redis: https://redis.io/docs/getting-started/installation/. Check that redis-server.service is running!")
    ip = "192.168.1.8"  # insert LAN address here
    port = 8080
    socketio.run(app, host=ip, port=port, use_reloader=False, debug=True)
index.html
<!DOCTYPE html>
<html>
  <head>
    <title>Minimal Example</title>
    <script src="{{ url_for('static', filename='js/socket.io.min.js') }}"></script>
  </head>
  <body>
    <button id="start" onclick="button_handler()">Start Stream</button>
    <span id="data"></span>
    <script type="text/javascript" src="{{ url_for('static', filename='js/main.js') }}"></script>
  </body>
</html>
main.js
var socket = io(location.origin);
var span = document.getElementById("data");

function button_handler() {
    socket.emit("start_data_stream");
}

socket.on("new_data", function(data) {
    span.innerHTML = data.value;
});
dependencies
Package Version
---------------- -------
amqp 5.1.1
async-timeout 4.0.2
bidict 0.22.0
billiard 3.6.4.0
celery 5.2.7
click 8.1.3
click-didyoumean 0.3.0
click-plugins 1.1.1
click-repl 0.2.0
Deprecated 1.2.13
dnspython 2.2.1
eventlet 0.33.2
Flask 2.2.2
Flask-SocketIO 5.3.2
gevent 22.10.2
gevent-websocket 0.10.1
greenlet 2.0.1
itsdangerous 2.1.2
Jinja2 3.1.2
kombu 5.2.4
MarkupSafe 2.1.1
packaging 22.0
pip 22.3.1
prompt-toolkit 3.0.36
python-engineio 4.3.4
python-socketio 5.7.2
pytz 2022.6
redis 4.4.0
setuptools 58.1.0
six 1.16.0
vine 5.0.0
wcwidth 0.2.5
Werkzeug 2.2.2
wrapt 1.14.1
zope.event 4.6
zope.interface 5.5.2
Questions:
Is this still an issue with eventlet? This answer leads me to believe that Celery is the workaround to the eventlet issue, and suggests Celery doesn't even need eventlet to work. However, eventlet seems entrenched in my source, as nothing works on the Redis side if I do not monkey patch. Also, the docs suggest that Flask-SocketIO automatically looks for eventlet, so wouldn't just instantiating the external SocketIO server in the Celery task bring in eventlet? Is there something else I am doing wrong? Perhaps there are better ways to debug the worker and task?
Any help would be greatly appreciated, thanks!
The Flask-SocketIO documentation led me to believe that I needed to patch the entire standard library with:
import eventlet
eventlet.monkey_patch()
However, after reading about eventlet monkey patching here and here, I discovered this is not the case. For redis and flask_socketio I only need to patch the socket library as follows:
import eventlet
eventlet.monkey_patch(all=False, socket=True)
In addition, celery needs to be imported through a patched import as well:
celery = eventlet.import_patched("celery")
This gives the following full code for backend.py:
import eventlet
eventlet.monkey_patch(all=False, socket=True)

from random import randrange
import time

import redis
celery = eventlet.import_patched("celery")
from flask import Flask, render_template, request
from flask_socketio import SocketIO

message_queue = "redis://localhost:6379/0"

app = Flask(__name__)
socketio = SocketIO(app, message_queue=message_queue)
cel = celery.Celery("backend", broker=message_queue, backend=message_queue)

@app.route('/')
def index():
    return render_template("index.html")

@socketio.on("start_data_stream")
def start_data_stream():
    stream_data.delay(request.sid, message_queue)

@cel.task()
def stream_data(sid, message_queue):
    data_socketio = SocketIO(message_queue=message_queue)
    i = 1
    while i <= 100:
        value = randrange(0, 10000, 1) / 100
        data_socketio.emit("new_data", {"value": value})
        i += 1
        time.sleep(0.01)
    return i, value

if __name__ == "__main__":
    r = redis.Redis()
    r.flushall()  # clear the old, abandoned messages from the queue
    if r.ping():
        pass
    else:
        raise Exception("You need redis: https://redis.io/docs/getting-started/installation/. Check that redis-server.service is running!")
    ip = "192.168.1.8"  # insert LAN address here
    port = 8080
    socketio.run(app, host=ip, port=port, use_reloader=False, debug=True)
Package:
Here's the whole working example as a package: https://github.com/jacoblapenna/Eventlet_Flask-SocketIO_Celery.git
NOTE:
This solution is likely also a workaround for the eventlet and multiprocessing issue (discussed here and here), so emitting from an external process wouldn't even need Celery; a sketch of that variant follows below. However, I also plan to distribute other control tasks to other devices on the same LAN, and Celery should be perfect for that.
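For reference, a minimal sketch of that Celery-free variant, assuming the same Redis message queue as above: the emitting process creates a server-less SocketIO instance that only publishes to the queue, and the main server forwards the events to connected clients.
from multiprocessing import Process
from flask_socketio import SocketIO

def external_stream(message_queue):
    # write-only SocketIO instance: no server, it just posts emits to the queue
    io = SocketIO(message_queue=message_queue)
    io.emit("new_data", {"value": 666})

if __name__ == "__main__":
    Process(target=external_stream, args=("redis://localhost:6379/0",)).start()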

wxPython and Flask integration

I am trying to integrate wxPython and Flask into a single application, but I am not sure how to get them to work together, as they both want exclusive use of the main thread.
I am calling the application with:
export FLASK_APP=keypad_controller
python3 -m flask run -p 2020 -h 0.0.0.0 --eager-loading --no-reload
The main code block using Flask is:
from flask import Flask
def create_app(test_config=None):
    app = Flask(__name__)
    return app
I am not sure how to integrate wxPython (below) into the above code. How do I run Flask?
wx_app = wx.App()
main_window = MainWindow(config)
main_window.Show()
wx_app.MainLoop()
Start Flask (app.run()) in a separate thread.
Note that if you want app.run(debug=True), you must also pass use_reloader=False, because things will go off the rails quickly if Flask decides it needs to reload anything from other than the main thread.
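A minimal sketch of that arrangement, assuming the MainWindow and config objects from the question:
import threading

import wx
from flask import Flask

app = Flask(__name__)

def run_flask():
    # use_reloader=False is mandatory when running off the main thread
    app.run(host="0.0.0.0", port=2020, debug=True, use_reloader=False)

if __name__ == "__main__":
    threading.Thread(target=run_flask, daemon=True).start()

    wx_app = wx.App()
    main_window = MainWindow(config)  # MainWindow and config as defined in your app
    main_window.Show()
    wx_app.MainLoop()  # wx keeps the main thread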

Specify number of processes for eventlet wsgi server

I am trying to add websocket functionality to an existing application. The existing structure of the app is
In /server/__init__.py:
from connexion import App
...
connexion_app = App(__name__, specification_dir='swagger/') # Create Connexion App
app = connexion_app.app # Configure Flask Application
...
connexion_app.add_api('swagger.yaml', swagger_ui=True) # Initialize Connexion api
In startserver.py:
from server import connexion_app
connexion_app.run(
    processes=8,
    debug=True
)
In this way, I was able to specify the number of processes. There are some long-running tasks that make it necessary to have as many processes as possible.
I have modified the application to include websocket functionality, as below. It seems that I now have only one process available: once the application attempts to run one of the long-running tasks, all API calls hang. Also, if the long-running task fails, the application is stuck in a hanging state.
In /server/__init__.py:
from connexion import App
import socketio
...
connexion_app = App(__name__, specification_dir='swagger/') # Create Connexion App
sio = socketio.Server() # Create SocketIO for websockets
app = connexion_app.app # Configure Flask Application
...
connexion_app.add_api('swagger.yaml', swagger_ui=True) # Initialize Connexion api
In startserver.py:
import socketio
import eventlet
from server import sio
from server import app
myapp = socketio.Middleware(sio, app)
eventlet.wsgi.server(eventlet.listen(('', 5000)), myapp)
What am I missing here?
(side note: If you have any resources available to better understand the behemoth of the Flask object, please point me to them!!)
Exact answer to question: Eventlet built-in WSGI does not support multiple processes.
Approach to get the best solution for the described problem: share one file containing the absolute minimum code required to reproduce the problem, maybe at https://github.com/eventlet/eventlet/issues or any other way you prefer.
More hopeful, random stuff to poke at: eventlet.monkey_patch(), and isolating Eventlet and the long blocking calls in separate threads or processes; see the sketch below.
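As an illustration of the "separate threads" suggestion (my sketch, not part of the original answer): eventlet's tpool can run a long blocking call on a real OS thread so the green-thread hub keeps serving requests in the meantime.
import time

from eventlet import tpool

def long_blocking_call(seconds):
    # stand-in for work that would otherwise starve the hub
    time.sleep(seconds)
    return seconds

# runs on a thread from eventlet's OS thread pool; the calling
# green thread yields to the hub until the result is ready
result = tpool.execute(long_blocking_call, 2)
print(result)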

Replacing flask internal web server with Apache

I have written a single user application that currently works with Flask internal web server. It does not seem to be very robust and it crashes with all sorts of socket errors as soon as a page takes a long time to load and the user navigates elsewhere while waiting. So I thought to replace it with Apache.
The problem is, my current code is a single program that first launches about ten threads to do stuff, for example setting up ssh tunnels to remote servers and zmq connections to communicate with a database located there. Finally it enters the run() loop to start the internal server.
I followed all sorts of instructions and managed to get Apache to serve the initial page. However, everything goes wrong, as I now don't have any worker threads available, nor any globally initialised classes, and none of the global variables holding interfaces to those threads exist.
Obviously I am not a web developer.
How badly "wrong" is my current code? Is there any way to make it work with Apache with a reasonable amount of work? Can I have Apache just replace the run() part and communicate with the running application? My current app, in a very simplified form (without the data-processing threads), is something like this:
comm = None
app = Flask(__name__)

class CommsHandler(object):
    def __init__(self):
        # init communication links to external servers and databases
        ...

    def request_data(self, request):
        # use the initialised links to request something
        return result

@app.route("/", methods=["GET"])
def mainpage():
    return render_template("main.html")

@app.route("/foo", methods=["GET"])
def foo():
    a = comm.request_data("xyzzy")
    return render_template("foo.html", data=a)

comm = CommsHandler()
app.run()
Or have I done this completely wrong? When I remove app.run() and just import the app object in the wsgi script, I do get a response from the main page, as it does not need a reference to the global variable comm.
/foo does not work, as comm is an uninitialised variable, and I can see why, of course. I just never thought this would need to be exported to Apache or any other web server.
So the question is: can I launch this application somehow in an rc script at boot, set up its communication links and everything, and have Apache/wsgi just call functions of the running application instead of launching a new one?
Hannu
This is a simple app running on Flask's internal server:
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello World!"

if __name__ == "__main__":
    app.run()
To run it on an Apache server, check out the FastCGI docs:
from flup.server.fcgi import WSGIServer
from yourapplication import app

if __name__ == '__main__':
    WSGIServer(app).run()

Is there any way of detecting an automatic reload in flask's debug mode?

I have a Flask app where I'd like to execute some code the first time the app is run, but not on the automatic reloads triggered by debug mode. Is there any way of detecting when a reload is triggered so that I can do this?
To give an example, I might want to open a web browser every time I run the app from Sublime Text, but not when I subsequently edit the files, like so:
import webbrowser

if __name__ == '__main__':
    webbrowser.open('http://localhost:5000')
    app.run(host='localhost', port=5000, debug=True)
You can set an environment variable.
import os

if 'WERKZEUG_LOADED' in os.environ:
    print('Reloading...')
else:
    print('Starting...')
    os.environ['WERKZEUG_LOADED'] = 'TRUE'
I still don't know how to persist a reference that survives the reloading, though.
What about using Flask-Script to kick off a process before you start your server? Something like this (cribbed from their documentation and edited slightly):
# run_devserver.py
import webbrowser

from flask_script import Manager, Server
from myapp import app

manager = Manager(app)
# expose a "runserver" command configured like app.run(host=..., port=...)
manager.add_command("runserver", Server(host='localhost', port=5000, use_debugger=True))

if __name__ == "__main__":
    webbrowser.open('http://localhost:5000')
    manager.run()
I have a Flask app where it's not really practical to change the DEBUG flag or disable reloading, and the app is spun up in a more complex way than just flask run.
@osa's solution didn't work for me with Flask debug on, because it doesn't have enough finesse to pick out the Werkzeug watcher process from the worker process that gets reloaded.
I have this code in my main package's __init__.py (the package that defines the flask app). This code is run by another small module which has from <the_package_name> import app followed by app.run(debug=True, host='0.0.0.0', port=5000). Therefore this code is executed before the app starts.
import logging
import os

import ptvsd

logger = logging.getLogger(__name__)  # added so the snippet is self-contained

my_pid = os.getpid()
if os.environ.get('PPID') == str(os.getppid()):
    logger.debug('Reloading...')
    logger.debug(f"Current process ID: {my_pid}")
    try:
        port = 5678
        ptvsd.enable_attach(address=('0.0.0.0', port))
        logger.debug(f'========================== PTVSD waiting on port {port} ==========================')
        # ptvsd.wait_for_attach()  # Not necessary for my app; YMMV
    except Exception as ex:
        logger.debug(f'PTVSD raised {ex}')
else:
    logger.debug('Starting...')
    os.environ['PPID'] = str(my_pid)
    logger.debug(f"First process ID: {my_pid}")
NB: note the difference between os.getpid() and os.getppid() (the latter gets the parent process's ID).
I can attach at any point and it works great, even if the app has reloaded already before I attach. I can detach and re-attach. The debugger survives a reload.
