Python: deploying an infinite while loop to production and handling failures

I have a script that waits for tasks from a task queue and then runs them. Something like this minimal example:
import time

import redis

cache = redis.Redis(host='127.0.0.1', port=6379)

def main():
    while True:
        message = cache.blpop('QUEUE', timeout=0)
        work(message)

def work(message):
    print(f"beginning work: {message}")
    time.sleep(10)

if __name__ == "__main__":
    main()
I am not using a web server because the script does not need to answer http requests. However I've confused myself a bit about how to make this script robust against errors in production.
With a web server and gunicorn, gunicorn would handle forking a process for each request. If the request causes an error then the worker dies and the request fails but the server continues to run.
How can I achieve this if I'm not running an http server? I could fork a process to perform the "work" function, but the code performing the fork would still be application code.
Is it possible to deploy a non-http server script like mine using Gunicorn? Is there something else I should be using to handle forking processes?
Or is it reasonable to fork inside the application and deploy to production?

what about this:
while True:
    try:
        message = cache.blpop('QUEUE', timeout=0)
        work(message)
    except Exception as e:
        print(e)
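Catching exceptions keeps the loop alive, but it won't survive a crash that takes down the interpreter (a segfault in a C extension, an out-of-memory kill). A minimal stdlib sketch of the fork-per-task idea from the question; the work() body and the timeout value are stand-in assumptions:

```python
import multiprocessing

def work(message):
    # stand-in for the real task from the question
    print(f"beginning work: {message}")

def run_isolated(message, timeout=60):
    # Run work() in a child process, mirroring what gunicorn does per
    # request: a crash (even a segfault) kills only the child, and the
    # supervising loop keeps running.
    p = multiprocessing.Process(target=work, args=(message,))
    p.start()
    p.join(timeout)
    if p.is_alive():
        p.terminate()  # task overran its budget; kill the child
        p.join()
    return p.exitcode  # 0 on success, nonzero if the child died
```

Whether forking inside the application is "reasonable in production" mostly comes down to putting a supervisor (systemd, supervisord, a container restart policy) around the outer loop itself, which is the role gunicorn's master process plays for web workers.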

Related

python : dynamically spawn multithread workers with flask-socket io and python-binance

Hello fellow developers,
I'm trying to create a small webapp that would allow me to monitor multiple Binance accounts from a dashboard and maybe, in the future, perform some small automatic trading actions.
My frontend is implemented with Vue+quasar and my backend server is based on python Flask for the REST api.
What I would like to do is being able to start a background process dynamically when a specific endpoint of my server is called. Once this process is started on the server, I would like it to communicate via websocket with my Vue client.
Right now I can spawn the worker and create the websocket communication, but somehow, I can't figure out how to make all the threads in my worker to work all together. Let me get a bit more specific:
Once my worker is started, I'm trying to create at least two threads. One is the infinite loop allowing me to automate some small actions and the other one is the flask-socketio server that will handle the sockets connections. Here is the code of that worker :
customWorker.py
import json
import os
import threading
import time

import eventlet
from flask import Flask
from flask_socketio import SocketIO, send, emit

# custom class allowing me to communicate with my mongoDB
from db_wrap import DbWrap
from binance.client import Client
from binance.exceptions import BinanceAPIException, BinanceWithdrawException, BinanceRequestException
from binance.websockets import BinanceSocketManager

def process_message(msg):
    print('got a websocket message')
    print(msg)

class customWorker:
    def __init__(self, workerId, sleepTime, dbWrap):
        self.workerId = workerId
        self.sleepTime = sleepTime
        self.socketio = None
        self.dbWrap = DbWrap()
        # this retrieves worker configuration from the database
        self.config = json.loads(self.dbWrap.get_worker(workerId))
        keys = self.dbWrap.get_worker_keys(workerId)
        self.binanceClient = Client(keys['apiKey'], keys['apiSecret'])

    def handle_message(self, data):
        print('My PID is {} and I received {}'.format(os.getpid(), data))
        send(os.getpid())

    def init_websocket_server(self):
        app = Flask(__name__)
        socketio = SocketIO(app, async_mode='eventlet', logger=True, engineio_logger=True, cors_allowed_origins="*")
        eventlet.monkey_patch()
        socketio.on_event('message', self.handle_message)
        self.socketio = socketio
        self.app = app

    def launch_main_thread(self):
        while True:
            print('My PID is {} and workerId {}'.format(os.getpid(), self.workerId))
            if self.socketio is not None:
                info = self.binanceClient.get_account()
                self.socketio.emit('my_account', info, namespace='/')

    def launch_worker(self):
        self.init_websocket_server()
        self.socketio.start_background_task(self.launch_main_thread)
        self.socketio.run(self.app, host="127.0.0.1", port=8001, debug=True, use_reloader=False)
Once the REST endpoint is called, the worker is spawned by calling the birth_worker() method of the "Broker" object available within my server:
from multiprocessing import Process

from custom_worker import customWorker
#...
def create_worker(self, workerid, sleepTime, dbWrap):
    worker = customWorker(workerid, sleepTime, dbWrap)
    worker.launch_worker()

def birth_worker(self, workerid, sleepTime, dbWrap):
    p = Process(target=self.create_worker, args=(workerid, sleepTime, dbWrap))
    p.start()
So when this is done, the worker is launched in a separate process that successfully creates threads and listens for socket connections. But my problem is that I can't use my binanceClient in my main thread. I think it uses threads internally, and the fact that I use eventlet, in particular its monkey_patch() function, breaks it. When I try to call the binanceClient.get_account() method I get the error AttributeError: module 'select' has no attribute 'poll'.
I'm pretty sure it comes from monkey_patch, because if I call get_account() in the init() method of my worker (before patching) it works and I can get the account info. So I guess there is a conflict here that I've been trying to resolve, unsuccessfully.
I've tried using only the thread mode for my socket.io app by passing async_mode='threading', but then my flask-socketio app won't start and listen for sockets, as the line self.socketio.run(self.app, host="127.0.0.1", port=8001, debug=True, use_reloader=False) blocks everything.
I'm pretty sure I have an architecture problem here and that I shouldn't start my app by launching socketio.run. I've been unable to start it with gunicorn, for example, because I need it to be dynamic and callable from my python scripts. I've been struggling to find the proper way to do this, and that's why I'm here today.
Could someone please give me a hint on how this is supposed to be achieved? How can I dynamically spawn a subprocess that manages a socket server thread, an infinite-loop thread and connections with binanceClient? I've been roaming Stack Overflow without success; every piece of advice is welcome, even an architecture reforge.
Here is my environment:
Manjaro Linux 21.0.1
pip-chill:
eventlet==0.30.2
flask-cors==3.0.10
flask-socketio==5.0.1
pillow==8.2.0
pymongo==3.11.3
python-binance==0.7.11
websockets==8.1
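One detail worth checking for the AttributeError above: eventlet documents that monkey_patch() should run as early as possible, before modules that capture references to the unpatched socket/select/threading modules are imported. A module-layout sketch of that ordering, untested against this exact stack:

```python
# customWorker.py -- patch first, import everything else afterwards
import eventlet
eventlet.monkey_patch()  # must run before the imports below bind the
                         # original (unpatched) socket/select modules

from flask import Flask                # noqa: E402
from flask_socketio import SocketIO    # noqa: E402
from binance.client import Client      # noqa: E402
```

In the question's code, monkey_patch() runs inside init_websocket_server(), i.e. after python-binance has already been imported, which matches the observed symptom of get_account() working before patching but not after.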

How to capture ctrl-c for killing a Flask python script

I have the following script, which just boots up a web server serving a dynamically created website. To get dynamic data, the script opens a file to read from.
My concern is: how can I catch CTRL-C killing the python script, so that I can close the file before the script's thread is killed?
I tried the following couple things but neither work:
from flask import Flask, render_template
import time

# Initialize the Flask application
app = Flask(__name__)

fileNames = {}
fileDesc = {}
for idx in range(1, 4):
    fileNames["name{}".format(idx)] = "./name" + str(idx) + ".txt"
    fileDesc["name{}".format(idx)] = open(fileNames["name{}".format(idx)], 'r')

try:
    @app.route('/')
    def index():
        # code for reading data from files
        return render_template('index.html', var1=var1)

    @app.errorhandler(Exception)
    def all_exception_handler(error):
        print("Closing")
        for key, value in fileDesc.items():
            value.close()
        print("Files closed")

    if __name__ == '__main__':
        app.run(
            host="192.168.0.166",
            port=int("8080"),
            debug=True
        )
except KeyboardInterrupt:
    print("Closing")
    for key, value in fileDesc.items():
        value.close()
    print("Files closed")
Thanks in advance.
I am struggling with the same thing in my project. Something that did work for me was using signal to capture CTRL-C.
import sys
import signal

def handler(signal, frame):
    print('CTRL-C pressed!')
    sys.exit(0)

signal.signal(signal.SIGINT, handler)
signal.pause()
When this piece of code is put in the script that runs the Flask app, CTRL-C can be captured. As of now you have to press CTRL-C twice before the handler is executed, though. I'll investigate further and edit the answer if I find something new.
Edit 1
Okay I've done some more research and came up with some other methods, as the above is quite hack 'n slash.
In production, clean-up code such as closing databases or files is done via the @app.teardown_appcontext decorator. See this part of the tutorial.
When using the simple server, you can shut it down via exposing the werkzeug shutdown function. See this post.
Edit 2
I've tested the Werkzeug shutdown function, and it also works together with the teardown_appcontext functions. So I suggest to write your teardown functions using the decorator and writing a simple function that just does the shutdown of the werkzeug server. That way production and development code are the same.
Use atexit to handle this, from: https://stackoverflow.com/a/30739397/5782985
import atexit

# defining function to run on shutdown
def close_running_threads():
    for thread in the_threads:
        thread.join()
    print("Threads complete, ready to finish")

# Register the function to be called on exit
atexit.register(close_running_threads)

# start your process
app.run()

Return HTTP status code from Flask without "returning"

Context
I have a server called "server.py" that functions as a post-commit webhook from GitLab.
Within "server.py", there is a long-running process (~40 seconds)
SSCCE
#!/usr/bin/env python
import time
from flask import Flask, abort, jsonify

debug = True
app = Flask(__name__)

@app.route("/", methods=['POST'])
def compile_metadata():
    # the long running process...
    time.sleep(40)
    # end the long running process
    return jsonify({"success": True})

if __name__ == "__main__":
    app.run(host='0.0.0.0', port=8082, debug=debug, threaded=True)
Problem Statement
GitLab's webhooks expect return codes to be returned quickly. Since my webhook returns after around 40 seconds, GitLab treats the request as failed and retries it, triggering my long running process again in a loop until GitLab gives up after too many attempts.
Question
Am I able to return a status code from Flask back to GitLab, but still run my long running process?
I've tried adding something like:
...
def compile_metadata():
    abort(200)
    # the long running process
    time.sleep(40)
but abort() only supports failure codes.
I've also tried using @after_this_request:
@app.route("/", methods=['POST'])
def webhook():
    @after_this_request
    def compile_metadata(response):
        # the long running process...
        print("Starting long running process...")
        time.sleep(40)
        print("Process ended!")
        # end the long running process
        return jsonify({"success": True})
Normally, flask returns a status code only from python's return statement, but I obviously cannot use that before the long running process, as it would exit the function.
Note: I am not actually using time.sleep(40) in my code; it is there only for the sake of the SSCCE and produces the same result.
Have compile_metadata spawn a thread to handle the long running task, and then return the result code immediately (i.e., without waiting for the thread to complete). Make sure to include some limitation on the number of simultaneous threads that can be spawned.
For a slightly more robust and scalable solution, consider some sort message queue based solution like celery.
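A sketch of the bounded-thread idea using only the stdlib; the names and the 4-worker cap are illustrative assumptions, not from the question:

```python
from concurrent.futures import ThreadPoolExecutor
import time

# cap the number of simultaneous background tasks
executor = ThreadPoolExecutor(max_workers=4)

def long_running_task(data):
    time.sleep(0.1)  # stand-in for the ~40 second job
    return data

def handle_webhook(data):
    # submit the work and return immediately; 202 = "accepted"
    executor.submit(long_running_task, data)
    return {"success": True}, 202
```

ThreadPoolExecutor queues submissions beyond max_workers instead of spawning unbounded threads, which provides the limit on simultaneous threads suggested above.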
For the record, a simple solution might look like:
import time
import threading
from flask import Flask, abort, jsonify

debug = True
app = Flask(__name__)

def long_running_task():
    print('start')
    time.sleep(40)
    print('finished')

@app.route("/", methods=['POST'])
def compile_metadata():
    # the long running process...
    t = threading.Thread(target=long_running_task)
    t.start()
    # end the long running process
    return jsonify({"success": True})

if __name__ == "__main__":
    app.run(host='0.0.0.0', port=8082, debug=debug, threaded=True)
I was able to achieve this by using multiprocessing.dummy.Pool. threading.Thread proved unhelpful, as Flask would still wait for the thread to finish (even with t.daemon = True).
I achieved the result of returning a status code before the long-running task like such:
#!/usr/bin/env python
import time
from flask import Flask, jsonify, request
from multiprocessing.dummy import Pool

debug = True
app = Flask(__name__)
pool = Pool(10)

def compile_metadata(data):
    print("Starting long running process...")
    print(data['user']['email'])
    time.sleep(5)
    print("Process ended!")

@app.route('/', methods=['POST'])
def webhook():
    data = request.json
    pool.apply_async(compile_metadata, [data])
    return jsonify({"success": True}), 202

if __name__ == "__main__":
    app.run(host='0.0.0.0', port=8082, debug=debug, threaded=True)
When you want to return a response from the server quickly, and still do some time consuming work, generally you should use some sort of shared storage like Redis to quickly store all the stuff you need, then return your status code. So the request gets served very quickly.
And have a separate server routinely work that semantic job queue to do the time consuming work. And then remove the job from the queue once the work is done. Perhaps storing the final result in shared storage as well. This is the normal approach, and it scales very well. For example, if your job queue grows too fast for a single server to keep up with, you can add more servers to work that shared queue.
But even if you don't need scalability, it's a very simple design to understand, implement, and debug. If you ever get an unexpected spike in request load, it just means that your separate server will probably be chugging away all night long. And you have peace of mind that if your servers shut down, you won't lose any unfinished work because they're safe in the shared storage.
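The queue pattern described above, sketched with the stdlib's queue.Queue standing in for Redis (a real deployment would use a shared store such as a Redis list so jobs survive restarts; all names here are illustrative):

```python
import queue
import threading

jobs = queue.Queue()   # stand-in for a shared Redis list
results = {}           # stand-in for shared result storage

def handle_request(job_id, payload):
    # store the work and return a status code immediately
    jobs.put((job_id, payload))
    return 202

def worker():
    # a separate worker drains the queue; a job is marked done
    # only after the time-consuming work has finished
    while True:
        job_id, payload = jobs.get()
        if job_id is None:  # shutdown sentinel
            break
        results[job_id] = payload.upper()  # the "time consuming work"
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()
```

With Redis in place of queue.Queue, the worker loop can live in a separate process or on a separate server, and more workers can be added when the queue grows faster than one server can drain it.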
But if you have one server do everything, performing the long running tasks asynchronously in the background, I guess maybe just make sure that the background work is happening like this:
------------ Serving Responses
---- Background Work
And not like this:
---- ---- Serving Responses
---- Background Work
Otherwise it would be possible that if the server is performing some block of work in the background, it might be unresponsive to a new request, depending on how long that time consuming work takes (even under very little request load). But if the client times out and retries, I think you're still safe from performing double work. But you're not safe from losing unfinished jobs.

Python asynchronous request handling while calling subprocess

I have a REST API written in Flask and using the Tornado web server.
When a request comes in, I execute a long running shell command using tornado.process.Subprocess.
Is there a way to keep handling new requests while the shell command is running, and then return the result after the shell command completes?
I am thinking of something like this:
from tornado import gen
from tornado.httpserver import HTTPServer
from tornado.ioloop import IOLoop
from tornado.process import Subprocess
from tornado.wsgi import WSGIContainer

@gen.coroutine
def callSubProcess(commandString):
    p = Subprocess(commandString)
    yield p.wait_for_exit()
    raise gen.Return(someResult)

@app.route(url, methods=['POST'])
@gen.coroutine
def process():
    result = yield callSubProcess(commandString)
    raise gen.Return(result)

http_server = HTTPServer(WSGIContainer(app))
http_server.listen(5000)
IOLoop.instance().start()
But this doesn't seem to work:
I just get 'Future' object is not callable when making a request. What is the problem here?
https://github.com/celery/celery is exactly what you need.
Celery lets you execute the shell command asynchronously, so you'll be able to handle all incoming requests.
To use coroutines, you must use tornado.web.Application, not tornado.wsgi.WSGIContainer. Tornado does not magically give WSGI frameworks like Flask the ability to handle asynchronous operations and coroutines.
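For comparison, the non-blocking-subprocess idea itself is straightforward once the whole stack is async; a stdlib asyncio sketch, not the Tornado API from the question:

```python
import asyncio

async def call_subprocess(command):
    # start the command without blocking the event loop
    proc = await asyncio.create_subprocess_shell(
        command,
        stdout=asyncio.subprocess.PIPE,
    )
    out, _ = await proc.communicate()  # yields until the process exits
    return proc.returncode, out

async def main():
    # other coroutines (e.g. request handlers) keep running
    # during the awaits inside call_subprocess
    return await call_subprocess("echo hello")

rc, out = asyncio.run(main())
```

The crucial difference from the WSGIContainer setup is that every frame between the server and the subprocess call is a coroutine, so the event loop is free while the command runs.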

Using gevent and Flask to implement websocket, how to achieve concurrency?

So I'm using flask_sockets to try to implement a websocket on Flask. With this I hope to notify all connected clients whenever a piece of data has changed. Here's a simplification of my routes/index.py. The issue I have is that when a websocket connection is opened, it stays in the notify_change loop until the socket is closed, and in the meantime other routes like /users can't be accessed.
from flask_sockets import Sockets

sockets = Sockets(app)

@app.route('/users', methods=['GET'])
def users():
    return json.dumps(get_users())

data = "Some value"  # the piece of data to push
is_dirty = False     # A flag which is set by the code that changes the data

@sockets.route("/notifyChange")
def notify_change(ws):
    global is_dirty, data
    while not ws.closed:
        if is_dirty:
            is_dirty = False
            ws.send(data)
This seems a natural consequence of what is essentially a while True: loop; however, I've been looking online for a way around this while still using flask_sockets, and haven't had any luck. I'm running the server on GUnicorn:
flask/bin/gunicorn -b '0.0.0.0:8000' -k flask_sockets.worker app:app
I tried adding threads with --threads 12, but no luck.
Note: I'm only serving up to 4 users at a time, so scaling is not a requirement, and the solution doesn't need to be ultra-optimized.
