Django redis LPUSH / RPUSH - python

I am using the django-redis backend and the django.core.cache.cache module.
The Django cache module does not seem to support pushing to lists or manipulating other Redis data structures.
The implied way to update a list through the Django cache module is:
my_list = cache.get('my_list')
my_list.append('my value')
cache.set('my_list', my_list)
This approach is not efficient because the entire list is being loaded into the application server's memory.
Redis has support for the LPUSH / RPUSH commands to dynamically update a list. However, it doesn't look like these methods are available in the django cache module.
The official python redis client seems to implement these methods.
Is there a reason why Django doesn't offer these operations? I'm asking out of curiosity; perhaps I've missed some details?

django-redis does support raw client and command access; for that you have to obtain the raw client instead of going through the Django cache interface.
https://github.com/jazzband/django-redis#raw-client-access
In some situations your application requires access to a raw Redis client to use some advanced features that aren't exposed by the Django cache interface. To avoid storing another setting for creating a raw connection, django-redis exposes functions with which you can obtain a raw client reusing the cache connection string: get_redis_connection(alias).
Code example:
>>> from django_redis import get_redis_connection
>>> con = get_redis_connection("default")
>>> con
<redis.client.StrictRedis object at 0x2dc4510>
>>> con.lpush('mylist',1)
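To flesh that out a little, here is a sketch of list operations through the raw client (the "default" alias and the key name are illustrative; lpush, rpush, lrange and llen are standard redis-py methods):
from django_redis import get_redis_connection

con = get_redis_connection("default")  # reuses the "default" cache connection settings

con.rpush("my_list", "first", "second")  # append to the tail
con.lpush("my_list", "zeroth")           # prepend to the head
items = con.lrange("my_list", 0, -1)     # read the whole list (values come back as bytes)
length = con.llen("my_list")
# Note: keys written through the raw client bypass the cache's key prefixing/versioning,
# so cache.get("my_list") will not see this key.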

Related

Is storing data in "thread local storage" in a Django application safe, in cases of concurrent requests?

I have seen in many places that using thread-local storage to store data in a Django application is not good practice.
But this is the only way I could store my request object. I need to store it because my application has a complex structure, and I can't keep passing the request object through every function call or class initialization.
I need the cookies and headers from my request object to be passed to some API calls I'm making at different places in the application.
I'm using this for reference:
https://blndxp.wordpress.com/2016/03/04/django-get-current-user-anywhere-in-your-code-using-a-middleware/
So I'm using a middleware, as mentioned in the reference.
And this is how the request is stored:
from threading import local
_thread_locals = local()
_thread_locals.request = request
And this is how the data is fetched:
getattr(_thread_locals, "request", None)
So is the data stored in the thread local to that particular request? Or if another request is handled at the same time, will both requests share the same data? (Which is certainly not what I want.)
Or is there a newer way of dealing with this old problem of storing the request object globally?
Note: I'm also using async in places in my Django application (if that matters).
Yes, using thread-local storage in Django is safe.
Django uses one thread to handle each request. Django also uses thread-local data itself, for instance for storing the currently activated locale. While appservers such as Gunicorn and uwsgi can be configured to utilize multiple threads, each request will still be handled by a single thread.
However, there have been conflicting opinions on whether using thread-locals is an elegant and well-designed solution. The reasons against using thread-locals boil down to the same reasons why global variables are considered bad practice. This answer discusses a number of them.
Still, storing the request object in thread-local data has become a widely used pattern in the Django community. There is even an app Django-CRUM that contains a CurrentRequestUserMiddleware class and the functions get_current_user() and get_current_request().
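For illustration, the thread-local pattern behind such middleware can be sketched roughly as follows (names here are illustrative, and this is not the actual django-crum implementation):
from threading import local

_thread_locals = local()

def get_current_request():
    return getattr(_thread_locals, "request", None)

class ThreadLocalRequestMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        _thread_locals.request = request
        try:
            return self.get_response(request)
        finally:
            # Clear the reference so a reused worker thread never leaks a stale request.
            _thread_locals.request = None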
Note that as of version 3.0, Django has started to implement asynchronous support. I'm not sure what its implications are for apps like Django-CRUM. For the foreseeable future, however, thread-locals can safely be used with Django.

Python vs. Node.js Event Payloads in Firebase Cloud Functions

I am in the process of writing a Cloud Function for Firebase using the Python option. I am interested in Firebase Realtime Database triggers; in other words, I want to listen for events that happen in my Realtime Database.
The Python environment provides the following signature for handling Realtime Database triggers:
def handleEvent(data, context):
    """Triggered by a change to a Firebase RTDB reference.

    Args:
        data (dict): The event payload.
        context (google.cloud.functions.Context): Metadata for the event.
    """
This is looking good. The data parameter provides two dictionaries: 'data', holding the value before the change, and 'delta', holding the changed bits.
The confusion kicks in when comparing this signature with the Node.js environment. Here is a similar signature from the Node.js world:
exports.handleEvent = functions.database.ref('/path/{objectId}/').onWrite((change, context) => { /* ... */ });
In this signature, the change parameter is pretty powerful and it seems to be of type firebase.database.DataSnapshot. It has nice helper methods such as hasChild() or numChildren() that provide information about the changed object.
The question is: Does Python environment have a similar DataSnapshot object? With Python, do I have to query the database to get the number of children for example? It really isn't clear what Python environment can and can't do.
Related API/Reference/Documentation:
Firebase Realtime DB Triggers: https://cloud.google.com/functions/docs/calling/realtime-database
DataSnapshot Reference: https://firebase.google.com/docs/reference/js/firebase.database.DataSnapshot
The Python runtime currently doesn't have a similar object structure. The firebase-functions SDK is actually doing a lot of work for you in creating objects that are easy to consume. Nothing similar happens in the Python environment; you essentially get a fairly raw view of the data payload contained in the event that triggered your function.
If you write Realtime Database triggers for Node without the Firebase SDK, the situation is similar: you get a very basic object with properties much like the Python dictionary.
This is the reason why using firebase-functions along with the Firebase SDK is the preferred environment for writing triggers for Firebase products. The developer experience is superior: it does a bunch of convenient work for you. The downside is that you pay the cost of loading and initializing the Firebase Admin SDK on cold start.
Note that it might be possible for you to parse the event and create your own convenience objects using the Firebase Admin SDK for Python.
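For example, assuming the 'data'/'delta' payload shape described above, a Python trigger can compute rough equivalents of the DataSnapshot helpers by hand (a sketch only; 'some_key' is an illustrative key name):
def handleEvent(data, context):
    before = data.get('data') or {}   # value before the write
    delta = data.get('delta') or {}   # fields that changed

    # Hand-rolled equivalents of hasChild() / numChildren() on the delta:
    has_child = 'some_key' in delta
    num_children = len(delta) if isinstance(delta, dict) else 0

    print('Resource: %s' % context.resource)
    print('has_child=%s, num_children=%d' % (has_child, num_children))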

Global state in a WSGI hosted Flask application

Assume a Flask application that allows building an object (server-side) through a number of steps (wizard-like; client-side).
I would like to create an initial object server-side and build it up step by step from the client-side input, keeping the object 'alive' throughout the whole build process. A unique ID will be associated with each new object / wizard.
Since the Flask application is served through WSGI on Apache, requests can be handled by multiple instances of the Flask application / multiple threads.
How do I keep this object alive server-side, or in other words, how do I keep some kind of global state?
I'd like to keep the object in memory, not serialize/deserialize it to/from disk. No cookies either.
Edit:
I'm aware of the Flask.g object, but since it is per-request it is not a valid solution.
Perhaps it is possible to use some kind of cache layer, e.g.:
from werkzeug.contrib.cache import SimpleCache
cache = SimpleCache()
Is this a valid solution? Does this layer live across multiple app instances?
You're looking for sessions.
You said you don't want to use cookies, but did you mean you don't want to store the data in a cookie, or are you avoiding cookies entirely? For the former case, take a look at server-side sessions, e.g. Flask-KVSession:
Instead of storing data on the client, only a securely generated ID is stored on the client, while the actual session data resides on the server.
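To make that concrete, a minimal server-side session sketch with Flask-KVSession might look like this (based on the library's documented quickstart; DictStore is in-process only, so a Redis- or database-backed simplekv store would be needed to share state across multiple WSGI processes):
from flask import Flask, session
from simplekv.memory import DictStore
from flask_kvsession import KVSessionExtension

app = Flask(__name__)
app.secret_key = 'change-me'

store = DictStore()
KVSessionExtension(store, app)  # replaces Flask's cookie session with server-side storage

@app.route('/wizard/step', methods=['POST'])
def wizard_step():
    # Only the session ID travels in the cookie; the wizard object lives server-side.
    obj = session.get('wizard_object', {})
    obj['step'] = obj.get('step', 0) + 1
    session['wizard_object'] = obj
    return str(obj['step'])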

Django & Redis: How do I properly use connection pooling?

I have a Redis server that I query on almost every Django view to fetch some cached data. I've read some Stack Overflow questions and learned that making a new Redis connection via r = redis.StrictRedis(host='localhost', port=6379, db=0) for every single web request is bad, and that I should be using connection pooling.
Here is the approach I came up with for connection pooling in Django:
In settings.py, so I can pull it up easily in any Django view, since it acts like a global variable:
# Redis Settings
import redis
REDIS_CONN_POOL_1 = redis.ConnectionPool(host='localhost', port=6379, db=0)
In some views.py:
import redis
from django.conf import settings

# Reuse the pool created in settings (the attribute name must match REDIS_CONN_POOL_1).
REDIS_CONN_POOL_1 = settings.REDIS_CONN_POOL_1
r = redis.Redis(connection_pool=REDIS_CONN_POOL_1)
r.get("foobar")  # whatever operation
So, my question is: is this the right way to do connection pooling in Django? Are there better approaches for those of you who have experienced a similar scenario? This has to be better than my old approach of opening and closing a new connection to Redis on every request.
EDIT: Gathered my understanding about why it's wrong to open a new connection on every request from this stackoverflow question.
A better approach would be to set up Redis as your Django cache backend with a Redis cache app such as django-redis. That gives you a ready-made solution for your problem, and you can use Django's official cache API to get or set cached data whenever you want. You can also avoid compatibility issues in your application if you later decide to switch your cache backend to something else.
Here's an easy-to-follow tutorial:
Using Redis as Django's session store and cache backend
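For reference, configuring django-redis (the backend used in the first question above) as the cache backend is a small settings change; a sketch, with illustrative connection details (pooling is handled by the backend itself):
# settings.py
CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379/0",
        "OPTIONS": {
            "CLIENT_CLASS": "django_redis.client.DefaultClient",
        },
    }
}

# Anywhere in the project:
from django.core.cache import cache
cache.set("foobar", "some value", timeout=60)
cache.get("foobar")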

Use mock MongoDB server for unit test

I have to write nose tests for Python code that uses a MongoDB store. Is there a Python library that lets me initialize a mock in-memory MongoDB server?
I am using continuous integration, so I want my tests to be independent of any running MongoDB server.
Is there a way to mock a MongoDB server in memory so the code can be tested without connecting to a real Mongo server?
Thanks in advance!
You could try: https://github.com/vmalloc/mongomock, which aims to be a small library for mocking pymongo collection objects for testing purposes.
However, I'm not sure that the cost of just running mongodb would be prohibitive compared to ensuring some mocking library is feature complete.
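For a quick idea of what that looks like, a minimal mongomock sketch (mongomock mirrors the pymongo API; the database, collection, and document names here are arbitrary):
import mongomock

client = mongomock.MongoClient()          # drop-in stand-in for pymongo.MongoClient
collection = client['testdb']['widgets']

collection.insert_one({'name': 'gear', 'teeth': 12})
assert collection.count_documents({'teeth': {'$gte': 10}}) == 1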
I don’t know about Python, but I had a similar concern with C#. I decided to just run a real instance of Mongo on my workstation pointed at an empty directory. It’s not great because the code isn’t isolated but it’s fast and easy.
Only the data access layer actually calls Mongo during the test. The rest can rely on the mocks of the data access layer. I didn’t feel like faking Mongo was worth the effort when really I want to verify the interaction with Mongo is correct anyway.
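To sketch that layering with the standard library's unittest.mock (greet_user and get_user are purely illustrative names):
from unittest import mock

def greet_user(dal, user_id):
    # Business logic that depends on the data access layer, not on Mongo directly.
    user = dal.get_user(user_id)
    return "Hello, %s" % user["name"]

def test_greet_user():
    dal = mock.Mock()
    dal.get_user.return_value = {"name": "Ada"}
    assert greet_user(dal, 42) == "Hello, Ada"
    dal.get_user.assert_called_once_with(42)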
You can use Ming, which provides an in-memory MongoDB (pymongo) connection replacement.
import ming
mg = ming.create_datastore('mim://')
mg.conn # is the connection
mg.db # is a db with no name
mg.conn.somedb.somecol
# >> mim.Collection(mim.Database(somedb), somecol)
col = mg.conn.somedb.somecol
col.insert({'a': 1})
# >> ObjectId('5216ac3fe0323a1218f4e9aa')
col.find().count()
# >> 1
I am also using pymongo and MockupDB is working very well for my purpose (integration tests).
Using it is as simple as:
from mockupdb import MockupDB
from pymongo import MongoClient

server = MockupDB()
port = server.run()  # start the mock server on a free port

client = MongoClient(server.uri)  # connect pymongo to the mock server

# Point the module under test at the mock client instead of a real server.
import module_i_want_to_patch
module_i_want_to_patch.client = client
You can check the official tutorial for MockupDB here
