Django & Redis: How do I properly use connection pooling?

Django & Redis: How do I properly use connection pooling? - python

I have a Redis server which I query on almost every Django view for fetching some cached data. I've done some reading on some stackoverflow questions and learned that making a new Redis connection via r = redis.StrictRedis(host='localhost', port=6379, db=0) for every single web request is bad and that I should be using connection pooling.
Here is the approach I came up with for connection pooling in Django:
In settings.py so I can pull it up easily in any Django view as this is like a global variable:
# Redis Settings
import redis
REDIS_CONN_POOL_1 = redis.ConnectionPool(host='localhost', port=6379, db=0)
In some views.py:
from django.conf import settings
REDIS_CONN_POOL_1 = settings.REDIS_POOL_1
r = redis.Redis(connection_pool=REDIS_CONN_POOL_1)
r.get("foobar") # Whatever operation
So, my question is: Is this the right way to do connection pooling in Django? Are there any better approaches you guys use for those have experienced a similar scenario like this? This is probably better than my old approach of opening and closing a new connection to redis on every request.
EDIT: Gathered my understanding about why it's wrong to open a new connection on every request from this stackoverflow question.

A better approach would be to setup redis as your Django's cache backend with Django redis cache app. It gives you a done solution for your problem and you can use Django's official cache library to reach redis whenever you want get or set cached information. You can also avoid compatibility issues in your application if you decide to change your cache backend to something else.
Here's an easy to follow tutorial:
Using Redis as Django's session store and cache backend

Related

SQLAlchemy used in Flask, Session management implementation

As I cannot use the Flask-SQLAlchemy due to models definitions and use of the database part of the app in other contexts than Flask, I found several ways to manage sessions and I am not sure what to do.
One thing that everyone seems to agree (including me) is that a new session should be created at the beginning of each request and be committed + closed when the request has been processed and the response is ready to be sent back to the client.
Currently, I implemented the session management that way:
I have a database initialization python script which creates the engine (engine = create_engine(app.config["MYSQL_DATABASE_URI"])) and defines the session maker Session = sessionmaker(bind=engine, expire_on_commit=False).
In another file I defined two function decorated with flask's before_request and teardown_request applications decorators.
#app.before_request
def create_db_session():
g.db_session = Session()
#app.teardown_request
def close_db_session(exception):
try:
g.db_session.commit()
except:
g.db_session.rollback()
finally:
g.db_session.close()
I then use the g.db_session when I need to perform queries: g.db_session.query(models.User.user_id).filter_by(username=username)
Is this a correct way to manage sessions ?
I also took a look at the scoped sessions proposed by SQLAlchemy and this might be anotherway of doing things, but I am not sure about how to change my system to use scoped sessions...
If I understood it well, I would not use the g variable, but I would instead always refer to the Session definition declared by Session = scoped_session(sessionmaker(bind=engine, expire_on_commit=False)) and I would not need to initialize a new session explicitly when a request arrives.
I could just perform my queries as usual with Session.query(models.User.user_id).filter_by(username=username) and I would just need to remove the session when the request ends:
#app.teardown_request
def close_db_session(exception):
Session.commit()
Session.remove()
I am a bit lost with this session management topic and I would need help to understand how to manage sessions. Is there a real difference between the two approaches above?

Your approach of managing the session via flask.g is completely acceptable to my point of view. Whatever we are trying to do with SQLAlchemy, one must remember the basic principles:
Always clean up after yourself. At web application runtime, if you spawn a lot of sessions without .close()ing them, this will eventually lead to connection overflow at your DB instance. You are handling this by calling finally: session.close()
Maintain session independence. It's not good if various application contexts ( requests, threads, etc..) share the same session instance, because it's not deterministic. You are doing this by ensuring only one session runs per one request.
The scoped_session can be considered as just an alternative of flask.g - it ensures that within one thread, each call to the Session() constructor returns the same object - https://docs.sqlalchemy.org/en/13/orm/contextual.html#unitofwork-contextual
It's a SQLA batteries included version of your session management code.
So far, if you are using Flask, which is a synchronous framework, I don't think you will have any issues with this setup.

Django redis LPUSH / RPUSH

I am using the django-redis backend and the django.core.cache.cache module.
The django cache module does not seem to support proper functionality of pushing to lists and manipulating certain data structures.
The implied implementation used to update a list in the django cache module:
my_list = cache.get('my_list')
my_list.append('my value')
cache.set('my_list', my_list)
This approach is not efficient because the entire list is being loaded into the application server's memory.
Redis has support for the LPUSH / RPUSH commands to dynamically update a list. However, it doesn't look like these methods are available in the django cache module.
The official python redis client seems to implement these methods.
Is there any reason why django wouldn't offer this implementation? I'm asking out of my curiosity. Possibly I missed some details?

It does support raw client and command access, for that you would have to get access to raw client instead of using django cache.
https://github.com/jazzband/django-redis#raw-client-access
In some situations your application requires access to a raw Redis client to use some advanced features that aren't exposed by the Django cache interface. To avoid storing another setting for creating a raw connection, django-redis exposes functions with which you can obtain a raw client reusing the cache connection string: get_redis_connection(alias).
Code example:
>>> from django_redis import get_redis_connection
>>> con = get_redis_connection("default")
>>> con
<redis.client.StrictRedis object at 0x2dc4510>
>>> con.lpush('mylist',1)

Do I authenticate at database level, at Flask User level, or both?

I have an MS-SQL deployed on AWS RDS, that I'm writing a Flask front end for.
I've been following some intro Flask tutorials, all of which seem to pass the DB credentials in the connection string URI. I'm following the tutorial here:
https://medium.com/#rodkey/deploying-a-flask-application-on-aws-a72daba6bb80#.e6b4mzs1l
For deployment, do I prompt for the DB login info and add to the connection string? If so, where? Using SQLAlchemy, I don't see any calls to create_engine (using the code in the tutorial), I just see an initialization using config.from_object, referencing the config.py where the SQLALCHEMY_DATABASE_URI is stored, which points to the DB location. Trying to call config.update(dict(UID='****', PASSWORD='******')) from my application has no effect, and looking in the config dict doesn't seem to have any applicable entries to set for this purpose. What am I doing wrong?
Or should I be authenticating using Flask-User, and then get rid of the DB level authentication? I'd prefer authenticating at the DB layer, for ease of use.

The tutorial you are using uses Flask-Sqlalchemy to abstract the database setup stuff, that's why you don't see engine.connect().
Frameworks like Flask-Sqlalchemy are designed around the idea that you create a connection pool to the database on launch, and share that pool amongst your various worker threads. You will not be able to use that for what you are doing... it takes care of initializing the session and things early in the process.
Because of your requirements, I don't know that you'll be able to make any use of things like connection pooling. Instead, you'll have to handle that yourself. The actual connection isn't too hard...
engine = create_engine('dialect://username:password#host/db')
connection = engine.connect()
result = connection.execute("SOME SQL QUERY")
for row in result:
# Do Something
connection.close()
The issue is that you're going to have to do that in every endpoint. A database connection isn't something you can store in the session- you'll have to store the credentials there and do a connect/disconnect loop in every endpoint you write. Worse, you'll have to either figure out encrypted sessions or server side sessions (without a db connection!) to prevent keeping those credentials in the session from becoming a horrible security leak.
I promise you, it will be easier both now and in the long run to figure out a simple way to authenticate users so that they can share a connection pool that is abstracted out of your app endpoints. But if you HAVE to do it this way, this is how you will do it. (make sure you are closing those connections every time!)

Setup Cassandra DB in django using cqlengine but without using django-cassandra-engine

I'm a Django beginner and have developed 1 app using mysql as primary DB, but in my next project I have to use Cassandra db using https://github.com/cqlengine/cqlengine but do not use https://github.com/r4fek/django-cassandra-engine (which is a wrapper over cqlengine?).
I dont have any clue How do I start? I mean how and where should I create db connection and then create models in models.py file?
Should I create connection in init.py file?in views.py? what would be the most efficient way?
would be great(for future readers too) if someone provide a simple configuration and a model.

The twissandra demo should be a good example of how to build an app using Cassandra and Django.
In this implementation there is no models.py and the connection is maintained in the file cass.py.
You'll see cass.py also hosts all the functions required to return data from the C* database and make objects which are used by the system. This is where you would swap out the api requests with your CqlEngine code.
I hope these resources get you pointed in the right direction

Rustyrazorblade shows an easy way to accomplish this via his CQLEngine tutorial branch HERE.
You can easily setup the connection by doing something like this in your_app_project/models/connection.py:
from cqlengine import management
from cqlengine.connection import setup
def connect():
setup(["127.0.0.1", "127.0.1.1", "127.0.1.2"], "tutorial", retry_connect=True)
management.create_keyspace("tutorial", replication_factor=1, strategy_class="SimpleStrategy")
In this example: "tutorial" is the keyspace we are using, strategy_class is the replication strategy your C* instance is using, replication_factor is the amount of replications that will be stored throughout the ring, 127.0.0.1 is a Cassandra cluster node IP address (you can pass this a list or a string) and retry_connect specifies whether or not you would like it to attempt to reconnect if there is a connection failure.
From here, it is very easy for new C* users to get confused. You can call this anytime Before syncing the C* tables or using a C* query.
So, you'll want to do something like:
from cqlengine.management import sync_table
from models.connection import connect
from models.somemodels import MyCassandraModel
# This will fire off our previously setup 'connect' method
connect()
# This will setup the Model as a table in your C* DB
sync_table(MyCassandraModel)
You can even drop this into manage.py, just as long as that CQLEngine setup() is properly executed.

Using Flask-SQLAlchemy Event API to broadcast to Flask-SocketIO?

I'm a novice developing a simple multi-user game (think Minesweeper) using Flask for the API backend and AngularJS for the frontend. I've followed tutorials to structure the Angular/Flask app and I've coded up a RESTful API using Flask-Restless.
Now I'd like to push events to all the clients when game data is changed in the database (as it is by a POST to one the Restless endpoints). I was looking at using the SqlAlchemy event.listen API to call the Flask-SocketIO emit function to broadcast the data to clients. Is this an appropriate method to accomplish what I'm trying to do? Are there drawbacks to this approach?

#CESCO's reply works great if all of your computation is being done in the same process. You could also use this syntax (see full source code here):
#sa.models_committed.connect_via(app)
def on_models_committed(sender, changes):
for obj, change in changes:
print 'SQLALCHEMY - %s %s' % (change, obj)
Read on if you're interested in subscribing to all updates to a database ...
That won't work if your database is being updated from another process, however.
models_committed only works in the same process where the commit comes from (it's not a DB-level notification, it's sqlalchemy after committing to the DB)
https://github.com/mitsuhiko/flask-sqlalchemy/issues/369#issuecomment-170272020
I wrote a little demo app showing how to use any of Redis, ZeroMQ or socketIO_client to communicate real-time updates to your server. It might be helpful for anyone else trying to deal with outside database access.
Also, you could look into subscribing to postgres events:
https://blog.andyet.com/2015/04/06/postgres-pubsub-with-json/

Thats the barebone version of the code you want. I would start testing from there. You will have to figure out what you mean by change, so you can atach the right SqlAlchemy event to the type of change you are doing.
User database after insetion listen event
from sqlalchemy import event
from app import socketio
def after_insert_listener(mapper, connection, target):
socketio.emit('namespace response',{'data': data[0]},
namespace='/namespace')
print(target.id_user)
event.listen(User, 'after_insert', after_insert_listener)
SocketIO

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.