Summary
We run into the MySQL “max connections reached” error because many python multiprocessing workers, spread across autoscaled AWS server instances, issue read/write queries against a single AWS RDS database instance with a limited “max connections” setting. We could beef up the RDS instance type (this shows approximately how many concurrent connections each instance type allows) and get a higher connection limit, but at some point those connections would also be exhausted if we scale up enough new server instances with new workers.
Questions
Is there a way to create a Connection Pool as a separate service on a separate AWS server instance, so that the python multiprocessing workers across all autoscaled AWS server instances can share the pool and we never exceed the RDS DB max connection limit?
We are able to create a pool with SQLAlchemy (direct link to pool docs) on the first server instance, for example (see the sketch just below these questions), but how would the workers on the other AWS server instances connect to that pool? That is why I suggest creating the pool on a separate AWS server instance that workers from all other servers would connect to.
Are there any libraries that already handle this scenario? If not, implementing it ourselves sounds like a huge effort.
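For context, here is a minimal sketch of what "creating the pool" currently looks like inside a single process (the MySQL URL and the pool numbers are made up for illustration):

from sqlalchemy import create_engine

# Hypothetical RDS endpoint and credentials, for illustration only.
engine = create_engine(
    "mysql+pymysql://user:password@my-rds-endpoint.rds.amazonaws.com:3306/mydb",
    pool_size=5,         # connections kept open per process
    max_overflow=2,      # extra connections allowed during bursts
    pool_recycle=3600,   # recycle connections before the server drops idle ones
    pool_pre_ping=True,  # detect stale connections before handing them out
)

Note that such a pool only caps connections within one process; every multiprocessing worker on every autoscaled instance gets its own pool, which is why the totals in the next section add up.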
Main Components/Concepts of the Current APP
Flask backend. It has a connection pool with the size set to 10, so it never holds more than 10 connections. There is no issue with this part: it is a separate, web-facing component that is unrelated to the ‘python processing’ workers.
Python workers. These are multiprocessing workers that consume messages from the message broker. Whenever a worker gets a message, it opens a DB connection and closes it at the end of the task. We have 4 types of workers and each type runs at least 5 instances (this could be configured to 10, for example, on a larger AWS instance). In the worst case, when all workers open a DB connection at the same time, this means 20 concurrent connections (4 x 5).
Autoscale. We automatically create new server instances for additional workers when there is an overload of messages (tasks). Every new server instance can therefore add another 20 concurrent DB connections in the worst case, when all of its workers connect at the same time. Two server instances means 40 concurrent DB connections in the worst case; 100 servers would mean 2000.
flask_app.py
from flask import Flask, request
from flask_cors import CORS
from flask_jwt_extended import JWTManager
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config.from_pyfile('../api.conf')
CORS(app)
jwt = JWTManager(app)
db = SQLAlchemy(app)

app.logger.info("[SQLPOOLSTATUS] pool size = {}".format(db.engine.pool.status()))

@app.route('/upload', methods=['POST'])
def api_upload_file():
    log_request(request)
    payload = request.get_json()
    # --- database read and write ---
    img_rec = db.session.query(Table).filter(Table.id == payload.get("img_id")).all()
    user_rec = db.session.query(Table2).filter(Table2.id == payload.get("user_id")).first()
    # --- some more code that writes records for the table ---
    db.session.add(record)
    db.session.commit()
    return json_response
worker.py
from models import Image, Upload, File, PDF, Album, Account
import os, sys, signal
import socket
import multiprocessing
import time
import pika
from sqlalchemy.orm import sessionmaker
from utils import *

def run_priority(workerid, stop_event):
    connection = amqp_connect()
    channel = connection.channel()
    amqp_init_queue(channel)
    channel.queue_declare(queue=queue, durable=True, exclusive=False, auto_delete=True)
    channel.queue_bind(routing_key=routing_key, queue=queue, exchange=exchange)
    method_frame, header_frame, body = channel.basic_get(queue)

    # --- Establish the database connection (kept separate from the AMQP connection) ---
    engine = db_engine()
    db_connection = engine.connect()
    Session = sessionmaker(bind=engine)
    session = Session()

    # --- Do some database operations ---
    record = session.query(Table).first()
    try:
        session.add(new_record)
        session.commit()
    except Exception as e:
        session.rollback()
if __name__ == '__main__':
    stop_event = multiprocessing.Event()
    workers = []
    workerid = 0
    try:
        default_handler = signal.getsignal(signal.SIGINT)
        signal.signal(signal.SIGINT, signal.SIG_IGN)

        workercount = int(config.get('backend', 'priority_upload_workers'))
        for x in range(workercount):
            worker = multiprocessing.Process(target=run_priority, args=(workerid, stop_event))
            workers.append(worker)
            worker.daemon = True
            worker.start()

        workercount = int(config.get('backend', 'upload_workers'))
        for x in range(workercount):
            worker = multiprocessing.Process(target=run, args=(workerid, stop_event))
            workers.append(worker)
            worker.daemon = True
            worker.start()

        signal.signal(signal.SIGTERM, upload_sigterm_handler)
        signal.signal(signal.SIGINT, default_handler)
        monitor_worker(workers)
    except Exception as e:
        # some code to handle exceptions
        pass
Tried: I created a Flask application with a SQLAlchemy pool as a separate service, but the challenge is that I would need to rewrite the SQLAlchemy ORM queries everywhere in the worker code. Is there a better way to tackle the problem?
Expectation: any alternative solution or suggestion for sharing a connection pool globally across all multiprocessing workers, so that database connections stay within the pool's limit and never exceed it.
Any links or resources would be helpful.
Related
I have a Flask webapp running on Heroku. There are functions that require more than 30 seconds to process data, and for those tasks I use Heroku background jobs with Redis, which has a 20 connection limit. However, these tasks are only available to specific users.
My understanding is that Redis opens a connection as soon as I initiate the Queue, regardless of whether a job was actually queued and processed.
Here's my import and Queue initiation:
from rq import Queue
from rq.job import Job
from worker import conn as rconn
q = Queue(connection=rconn)
And here's my worker file:
import os
import urllib
from redis import Redis
from rq import Worker, Queue, Connection
listen = ['high', 'default', 'low']
redis_url = os.getenv('REDIS_URL')
urllib.parse.uses_netloc.append('redis')
url = urllib.parse.urlparse(redis_url)
conn = Redis(host=url.hostname, port=url.port, db=0, password=url.password)
if __name__ == '__main__':
    with Connection(conn):
        worker = Worker(map(Queue, listen))
        worker.work()
I am looking for a way to initiate the Redis connection only for users with a specific access level, so the app won't hit the connection error.
Does it make sense to initiate the Queue from the user_login function as a global variable, like this:
if check_password_hash(db_pwd, pwd) and acces_level == 4:
    global q
    q = Queue(connection=rconn)
I have the below code in a standalone script that uses the Django ORM (outside Django) with multithreading.
import threading

MAX_THREADS = 30
semaphore = threading.Semaphore(value=MAX_THREADS)

def process_book(book_id):
    semaphore.acquire()
    book = Books.objects.get(id=book_id)
    # Do some time taking stuff here
    book.save()
    semaphore.release()

books = Books.objects.all()
threads = []
for book in books:
    book_id = book.id
    t = threading.Thread(target=process_book, args=[book_id])
    t.start()
    threads.append(t)
for t in threads:
    t.join()
Once the number of threads reaches the max_connections setting of Postgres (which is 100 by default), further calls start failing with:
OperationalError: FATAL: remaining connection slots are reserved for non-replication superuser connections
Researching this, I found solutions that use a connection pooler like PgBouncer. However, that only stalls new connections until existing ones become available, and once PgBouncer's query_wait_timeout is hit I get:
OperationalError at / query_wait_timeout: server closed the connection unexpectedly. This probably means the server terminated abnormally before or while processing the request.
I understand this is happening because the threads are not closing the DB connections they open, but I am not sure how to close the connections made by the ORM calls. Is there something I could do differently in the above code flow to reduce the number of connections?
I need to do a get() on individual instances in order to update them, because .update() or .save() won't work on queryset items.
This updates the field on all the books in the database in one call:
Books.objects.bulk_update(books, ['field_to_update'])
This updates each single book:
def process_book(book_id):
    semaphore.acquire()
    Books.objects.filter(id=book_id).update(field_to_update=new_value)
    # Do some time taking stuff here
    semaphore.release()
Just close the database connections at the end of your threads
from django import db

def process_book(book_id):
    semaphore.acquire()
    book = Books.objects.get(id=book_id)
    # Do some time taking stuff here
    book.save()
    semaphore.release()
    db.connections.close_all()
Hello fellow developers,
I'm trying to create a small webapp that would allow me to monitor multiple Binance accounts from a dashboard and maybe, in the future, perform some small automatic trading actions.
My frontend is implemented with Vue + Quasar and my backend server is based on Python Flask for the REST API.
What I would like to do is to be able to start a background process dynamically when a specific endpoint of my server is called. Once this process is started on the server, I would like it to communicate with my Vue client via websocket.
Right now I can spawn the worker and create the websocket communication, but somehow I can't figure out how to make all the threads in my worker work together. Let me be a bit more specific:
Once my worker is started, I try to create at least two threads. One is an infinite loop that automates some small actions, and the other is the flask-socketio server that handles the socket connections. Here is the code of that worker:
customWorker.py
import os
import time
from flask import Flask
from flask_socketio import SocketIO, send, emit
import threading
import json
import eventlet

# custom class allowing me to communicate with my mongoDB
from db_wrap import DbWrap

from binance.client import Client
from binance.exceptions import BinanceAPIException, BinanceWithdrawException, BinanceRequestException
from binance.websockets import BinanceSocketManager

def process_message(msg):
    print('got a websocket message')
    print(msg)

class customWorker:
    def __init__(self, workerId, sleepTime, dbWrap):
        self.workerId = workerId
        self.sleepTime = sleepTime
        self.socketio = None
        self.dbWrap = DbWrap()
        # this retrieves the worker configuration from the database
        self.config = json.loads(self.dbWrap.get_worker(workerId))
        keys = self.dbWrap.get_worker_keys(workerId)
        self.binanceClient = Client(keys['apiKey'], keys['apiSecret'])

    def handle_message(self, data):
        print('My PID is {} and I received {}'.format(os.getpid(), data))
        send(os.getpid())

    def init_websocket_server(self):
        app = Flask(__name__)
        socketio = SocketIO(app, async_mode='eventlet', logger=True, engineio_logger=True, cors_allowed_origins="*")
        eventlet.monkey_patch()
        socketio.on_event('message', self.handle_message)
        self.socketio = socketio
        self.app = app

    def launch_main_thread(self):
        while True:
            print('My PID is {} and workerId {}'.format(os.getpid(), self.workerId))
            if self.socketio is not None:
                info = self.binanceClient.get_account()
                self.socketio.emit('my_account', info, namespace='/')

    def launch_worker(self):
        self.init_websocket_server()
        self.socketio.start_background_task(self.launch_main_thread)
        self.socketio.run(self.app, host="127.0.0.1", port=8001, debug=True, use_reloader=False)
Once the REST endpoint is called, the worker is spawned by calling the birth_worker() method of a "Broker" object available within my server:
from multiprocessing import Process
from custom_worker import customWorker
#...
def create_worker(self, workerid, sleepTime, dbWrap):
    worker = customWorker(workerid, sleepTime, dbWrap)
    worker.launch_worker()

def birth_worker(self, workerid, sleepTime, dbWrap):
    p = Process(target=self.create_worker, args=(workerid, sleepTime, dbWrap))
    p.start()
So when this is done, the worker is launched in a separate process that successfully creates the threads and listens for socket connections. But my problem is that I can't use my binanceClient in my main thread. I think it uses threads, and the fact that I use eventlet, in particular the monkey_patch() function, breaks it. When I try to call the binanceClient.get_account() method I get the error AttributeError: module 'select' has no attribute 'poll'.
I'm pretty sure it comes from monkey_patch, because if I call get_account() in the __init__() method of my worker (before patching) it works and I can get the account info. So I guess there is a conflict here that I've been trying to resolve, unsuccessfully.
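If the conflict really is about patch ordering (that is an assumption on my part, not something verified against these exact library versions), the usual eventlet pattern is to monkey-patch at the very top of the module, before anything else imports socket or select. A minimal sketch of what the top of customWorker.py would look like in that case:

import eventlet
eventlet.monkey_patch()  # patch the standard library before any other import pulls in socket/select

import os
import time
import json
from flask import Flask
from flask_socketio import SocketIO, send, emit
from binance.client import Client  # now sees the already-patched modules

The rest of the file would stay the same, minus the eventlet.monkey_patch() call inside init_websocket_server().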
I've tried using only the thread mode for my socket.io app with async_mode='threading', but then my flask-socketio app won't start and listen for sockets, because the line self.socketio.run(self.app, host="127.0.0.1", port=8001, debug=True, use_reloader=False) blocks everything.
I'm pretty sure I have an architecture problem here and that I shouldn't start my app by calling socketio.run. I've been unable to start it with gunicorn, for example, because I need it to be dynamic and callable from my python scripts. I've been struggling to find the proper way to do this and that's why I'm here today.
Could someone please give me a hint on how this is supposed to be achieved? How can I dynamically spawn a subprocess that manages a socket server thread, an infinite loop thread and the connections with the binanceClient? I've been roaming Stack Overflow without success; every piece of advice is welcome, even an architecture reforge.
Here is my environment:
Manjaro Linux 21.0.1
pip-chill:
eventlet==0.30.2
flask-cors==3.0.10
flask-socketio==5.0.1
pillow==8.2.0
pymongo==3.11.3
python-binance==0.7.11
websockets==8.1
I am working on a REST web service built with Flask which needs to query a Cassandra database. The most expensive part of the logic is creating the connection to the Cassandra cluster.
What do I need to do with Flask so that I do not have to create the connection to the Cluster on every request?
You should not create a new connection on every request; instead, create one connection object per process.
If you are running your Flask application with uWSGI, I suggest using the @postfork decorator.
Say you are spawning 4 processes with uWSGI: then one session is created for each process, right after that process is spawned.
from uwsgidecorators import postfork
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider
from cassandra.query import dict_factory
from cassandra.policies import RoundRobinPolicy

session = None
hosts = ["127.0.0.1", "127.0.0.2"]
keyspace = "mykeyspace"
# placeholder credentials; the original snippet did not show how auth_provider was built
auth_provider = PlainTextAuthProvider(username="cassandra", password="cassandra")

def get_new_session():
    global cluster
    cluster = Cluster(hosts, protocol_version=4, auth_provider=auth_provider,
                      control_connection_timeout=None, max_schema_agreement_wait=10,
                      port=9042, load_balancing_policy=RoundRobinPolicy())
    s = cluster.connect(keyspace)
    s.row_factory = dict_factory
    return s

# initialize one session in every process spawned by uwsgi
@postfork
def connect():
    global session
    session = get_new_session()
    session.row_factory = dict_factory
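A possible way to consume that per-process session from a Flask view, assuming the view lives in the same module as the @postfork hook (the route, table and column names below are made up for illustration):

from flask import Flask, jsonify

app = Flask(__name__)

@app.route('/users/<user_id>')
def get_user(user_id):
    # 'session' is the module-level Cassandra session created by the @postfork hook above
    rows = session.execute("SELECT * FROM users WHERE id = %s", (user_id,))
    return jsonify(list(rows))

Because the session is created right after uWSGI forks, each worker process ends up with exactly one Cassandra session instead of one per request.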
How can I manage my rabbit-mq connection in a Pyramid app?
I would like to reuse a connection to the queue throughout the web application's lifetime. Currently I open and close a connection to the queue for every publish call.
But I can't find any "global" services definition in Pyramid. Any help appreciated.
Pyramid does not need a "global services definition" because you can trivially do that in plain Python:
db.py:
connection = None

def connect(url):
    global connection
    connection = FooBarBaz(url)
your startup file (__init__.py)
from db import connect

if __name__ == '__main__':
    connect(DB_CONNSTRING)
elsewhere:
from db import connection
...
connection.do_stuff(foo, bar, baz)
Having a global (any global) is going to cause problems if you ever run your app in a multi-threaded environment, but it is perfectly fine if you run multiple processes, so it's not a huge restriction. If you need to work with threads, the recipe can be extended to use thread-local variables. Here's another example, which also connects lazily, when the connection is needed for the first time.
db.py:
import threading

connections = threading.local()

def get_connection():
    if not hasattr(connections, 'this_thread_connection'):
        connections.this_thread_connection = FooBarBaz(DB_STRING)
    return connections.this_thread_connection
elsewhere:
from db import get_connection
get_connection().do_stuff(foo, bar, baz)
Another common problem with long-lived connections is that the application won't auto-recover if, say, you restart RabbitMQ while your application is running. You'll need to somehow detect dead connections and reconnect.
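As a rough illustration of that detect-and-reconnect idea with pika (assuming a BlockingConnection; the host is just a placeholder), the lazy getter above could be extended like this:

import pika

connection = None

def get_connection():
    global connection
    # is_open becomes False once the broker has closed the connection,
    # e.g. after a RabbitMQ restart, so we transparently reconnect.
    if connection is None or not connection.is_open:
        connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
    return connection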
It looks like you can attach objects to the request with add_request_method.
Here's a little example app using that method to make one and only one connection to a socket on startup, then make the connection available to each request:
from wsgiref.simple_server import make_server
from pyramid.config import Configurator
from pyramid.response import Response

def index(request):
    return Response('I have a persistent connection: {} with id {}'.format(
        repr(request.conn).replace("<", "&lt;"),
        id(request.conn),
    ))

def add_connection():
    import socket
    s = socket.socket()
    s.connect(("google.com", 80))
    print("I should run only once")
    def inner(request):
        return s
    return inner

if __name__ == '__main__':
    config = Configurator()
    config.add_route('index', '/')
    config.add_view(index, route_name='index')
    config.add_request_method(add_connection(), 'conn', reify=True)
    app = config.make_wsgi_app()
    server = make_server('0.0.0.0', 8080, app)
    server.serve_forever()
You'll need to be careful about threading / forking in this case though (each thread / process will need its own connection). Also, note that I am not very familiar with Pyramid; there may be a better way to do this.
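If the app does run threaded, the thread-local recipe from the first answer can be folded into the same closure; a rough sketch (the target host is just the placeholder from the example above):

import socket
import threading

def add_connection():
    local = threading.local()
    def inner(request):
        # each worker thread lazily opens, and then reuses, its own socket
        if not hasattr(local, 'conn'):
            local.conn = socket.socket()
            local.conn.connect(("google.com", 80))
        return local.conn
    return inner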