Using RedShift as an additional Django Database - python

I have two databases defined: default, which is a regular MySQL backend, and redshift (using a postgres backend). I would like to use RedShift as a read-only database that is just used for django-sql-explorer.
Here is the router I have created in my_project/common/routers.py:
class CustomRouter(object):
    def db_for_read(self, model, **hints):
        return 'default'

    def db_for_write(self, model, **hints):
        return 'default'

    def allow_relation(self, obj1, obj2, **hints):
        db_list = ('default', )
        if obj1._state.db in db_list and obj2._state.db in db_list:
            return True
        return None

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        return db == 'default'
And my settings.py references it like so:
DATABASE_ROUTERS = ['my_project.common.routers.CustomRouter', ]
The problem occurs when invoking makemigrations: Django throws an error indicating that it is trying to create django_* tables in RedShift (and obviously failing, because the postgres type serial is not supported by RedShift):
...
raise MigrationSchemaMissing("Unable to create the django_migrations table (%s)" % exc)
django.db.migrations.exceptions.MigrationSchemaMissing: Unable to create the django_migrations table (Column "django_migrations.id" has unsupported type "serial".)
So my question is two-fold:
1. Is it possible to completely disable Django Management for a database, but still use the ORM?
2. Barring Read-Only Replicas, why has Django not considered it an acceptable use case to support read-only databases?
Related Questions
- Column 'django_migrations.id' has unsupported type 'serial' [ with Amazon Redshift]

I just discovered that this is the result of a bug. It's been addressed in a few PRs, most notably: https://github.com/django/django/pull/7194
So, to answer my own questions:
1. No. It's not currently possible. The best solution is to use a custom Database Router in combination with a read-only DB account and have allow_migrate() return False in the router.
2. The best solution is to upgrade to Django >= 1.10.4 and not use a Custom Database Router, which stops the bug. However, this is a caveat if you have any other databases defined, such as a Read-Replica.
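A minimal sketch of the router described in the first point, assuming the second connection is registered under the alias 'redshift' as in the question. The class name and the REDSHIFT_ALIAS constant are illustrative and not part of Django; the router is a plain class, so it can be exercised without a project.

```python
REDSHIFT_ALIAS = "redshift"  # assumed alias from the question's DATABASES setting


class ReadOnlyRedshiftRouter:
    """Keep ORM traffic on 'default' and never migrate the Redshift alias."""

    def db_for_read(self, model, **hints):
        return "default"

    def db_for_write(self, model, **hints):
        return "default"

    def allow_relation(self, obj1, obj2, **hints):
        # No opinion: let Django apply its default same-database check.
        return None

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        if db == REDSHIFT_ALIAS:
            return False  # never create or alter tables on Redshift
        return None  # no opinion for every other alias


router = ReadOnlyRedshiftRouter()
print(router.allow_migrate("redshift", "auth"))  # False
print(router.allow_migrate("default", "auth"))   # None
```

With a router like this, any ORM query meant for RedShift has to name the alias explicitly, for example through .using('redshift').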

Related

Django: Exclude apps from python manage.py migrate when using multiple databases

QUESTION :
How to exclude the logs migrations from the default database, when using multiple databases in Django.
I want this to be automated. I started overriding the migrate command.
I am using the default database for all models in my application and I need a new database, logs, for only one model (the model is in a different app, logs).
I successfully connected the application with both databases. I am also using a Router to control the operations:
class LogRouter:
    route_app_labels = {'logs'}

    def db_for_read(self, model, **hints):
        ...

    def db_for_write(self, model, **hints):
        ...

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        """
        Make sure the logs app only appear in the
        'logs' database.
        """
        if app_label in self.route_app_labels:
            return db == 'logs'
        if db != 'default':
            """
            If the database is not default, do not apply the migrations to the other
            database.
            """
            return False
        return None
With allow_migrate I am faking the logs migrations in the default database which is updating the table django_migrations with the logs migration.
Also with
if db != 'default':
    """
    If the database is not default, do not apply the migrations to the other database.
    """
    return False
I am faking the migrations from the default database in the logs database and again the django_migrations table is updated with all the default database migrations.
This is a fine solution, but I want to achieve the following:
- The logs migrations are ignored in the default database, including in its django_migrations table.
- The migrations for the default database are ignored in the logs database, including in its django_migrations table.
To achieve this, I tried overriding the migrate command:
from django.core.management.commands import migrate


class Command(migrate.Command):
    def handle(self, *args, **options):
        super(Command, self).handle(*args, **options)
        # this is equal to python manage.py migrate logs --database=logs
        # This will execute only the logs migrations in the logs database
        options['app_label'] = options['database'] = 'logs'
        super(Command, self).handle(*args, **options)
With this code I am fixing the logs database, but the default still tries to execute the logs migrations (it is writing them down in the django_migrations table)
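To see which combinations the router above actually permits, its allow_migrate logic can be exercised on its own. This is only a sketch: the function is a standalone copy of the method from the question, and 'auth' stands in for any app other than logs.

```python
ROUTE_APP_LABELS = {"logs"}


def allow_migrate(db, app_label):
    """Standalone copy of the allow_migrate logic from LogRouter above."""
    if app_label in ROUTE_APP_LABELS:
        return db == "logs"
    if db != "default":
        return False
    return None  # no opinion, so Django allows the migration


# Print the decision for every (database, app) pair.
for db in ("default", "logs"):
    for app_label in ("logs", "auth"):
        print(db, app_label, allow_migrate(db, app_label))
```

A False result only makes migrate skip the schema operations for that model. As the question observes, the migration itself is still recorded in that database's django_migrations table.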

Creating an object with more than 1 foreign key in a Django project with multiple databases causing error

In my Django project, I have 2 databases. Everything works perfectly except when I try to create a model with 2 foreign keys. The database router locks up and raises Cannot assign "<FKObject: fk_object2>": the current database router prevents this relation, even though both foreign keys come from the same database and I have yet to save the object in question. Code below:
1 fk_object1 = FKObject.objects.using("database2").get(pk=1)
2 fk_object2 = FKObject.objects.using("database2").get(pk=2)
3
4 object = Object()
5 object.fk_object1 = fk_object1
6 object.fk_object2 = fk_object2
7 object.save(using="database2")
The problem arises on line 6, before the object is even saved into the database so I'm assuming that Django somehow calls Object() with database1 even though it hasn't been specified yet.
Does anyone know how to deal with this?
So I ended up finding a work around as such:
As it turns out, my suspicions were only partially true. Calling Model() does not cause Django to assume it is to use the default database but setting a foreign key does. This would explain why my code would error out at line 6 and not at line 5 as by this point in time, Django already assumes that you're using the default database and as fk_object2 is called from database2, it errors out for fear of causing an inter-database relation.
To get around this, I used threading.current_thread() as so:
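import threading

from django.core.management.base import BaseCommand


class Command(BaseCommand):
    # Runs when the command module is imported: tag this thread with the target database
    threading.current_thread().db_name = "database2"

    def handle(self, *args, **kwargs):
        # Do work here
        pass


class DatabaseRouter(object):
    def db_for_read(self, model, **hints):
        try:
            # Look the thread up on every call so the router sees the calling thread's tag
            db_name = threading.current_thread().db_name
        except AttributeError:
            return "default"
        print("Using {}".format(db_name))
        return db_name

    def db_for_write(self, model, **hints):
        try:
            db_name = threading.current_thread().db_name
        except AttributeError:
            return "default"
        print("Using {}".format(db_name))
        return db_name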
This way, my 2nd database is used every time, thereby avoiding any possible relation inconsistencies.
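A variation on the same idea, sketched with threading.local and a context manager so the override is limited to one block of code and is undone afterwards. The helper use_database and the class ThreadLocalRouter are hypothetical names, not part of Django.

```python
import threading
from contextlib import contextmanager

_state = threading.local()  # each thread gets its own db_name attribute


@contextmanager
def use_database(alias):
    """Route every query issued inside the block to `alias` (hypothetical helper)."""
    previous = getattr(_state, "db_name", None)
    _state.db_name = alias
    try:
        yield
    finally:
        _state.db_name = previous  # restore whatever was active before


class ThreadLocalRouter:
    def _current(self):
        # Fall back to "default" when no override is active in this thread.
        return getattr(_state, "db_name", None) or "default"

    def db_for_read(self, model, **hints):
        return self._current()

    def db_for_write(self, model, **hints):
        return self._current()


router = ThreadLocalRouter()
with use_database("database2"):
    inside = router.db_for_write(model=None)
outside = router.db_for_write(model=None)
print(inside, outside)  # database2 default
```

Because the state lives in threading.local, an override set in one thread is not visible to routers called from another thread.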

Django Db routing

I am trying to run my Django application with two db's (1 master, 1 read replica). My problem is if I try to read right after a write the code explodes. For example:
p = Product.objects.create()
Product.objects.get(id=p.id)
OR
if the user is redirected to the Product's details page right after the write.
The code runs way faster than the read replica. And if the read operation uses the replica the code crashes, because it didn't update in time.
Is there any way to avoid this? For example, the db to read being chosen by request instead of by operation?
My Router is identical to Django's documentation:
import random


class PrimaryReplicaRouter(object):
    def db_for_read(self, model, **hints):
        """
        Reads go to a randomly-chosen replica.
        """
        return random.choice(['replica1', 'replica2'])

    def db_for_write(self, model, **hints):
        """
        Writes always go to primary.
        """
        return 'primary'

    def allow_relation(self, obj1, obj2, **hints):
        """
        Relations between objects are allowed if both objects are
        in the primary/replica pool.
        """
        db_list = ('primary', 'replica1', 'replica2')
        if obj1._state.db in db_list and obj2._state.db in db_list:
            return True
        return None

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        """
        All non-auth models end up in this pool.
        """
        return True
Solved it with:
class Model(models.Model):
    objects = models.Manager()    # objects only accesses master
    sobjects = ReplicasManager()  # sobjects accesses either master or replicas

    class Meta:
        abstract = True  # so Django doesn't create a table
Make every model extend this one instead of models.Model, and then use objects or sobjects depending on whether I want to access only master or either master or replicas.
Depending on the size of the data and the application I'd tackle this with either of the following methods:
1. Database pinning:
Extend your database router to allow pinning functions to specific databases. For example:
from customrouter.pinning import use_master

@use_master
def save_and_fetch_foo():
    ...
A good example of that can be seen in django-multidb-router. Of course you could just use this package as well.
2. Use a model manager to route queries to specific databases:
class MyManager(models.Manager):
    def get_queryset(self):
        qs = CustomQuerySet(self.model)
        if self._db is not None:
            qs = qs.using(self._db)
        return qs
3. Write a middleware that routes your requests to master/slave automatically.
This is basically the same as the pinning method, but you wouldn't specify when to run GET requests against master.
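The pinning idea from the list above can be sketched in plain Python. This is not how django-multidb-router is implemented; use_primary and PinningRouter are hypothetical names, and the aliases follow the router in the question.

```python
import functools
import random
import threading

_pin = threading.local()  # per-thread flag: are reads pinned to the primary?


def use_primary(func):
    """Hypothetical decorator: pin all reads to the primary while func runs."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        previous = getattr(_pin, "pinned", False)
        _pin.pinned = True
        try:
            return func(*args, **kwargs)
        finally:
            _pin.pinned = previous
    return wrapper


class PinningRouter:
    replicas = ["replica1", "replica2"]

    def db_for_read(self, model, **hints):
        if getattr(_pin, "pinned", False):
            return "primary"  # read your own writes while pinned
        return random.choice(self.replicas)

    def db_for_write(self, model, **hints):
        return "primary"


router = PinningRouter()


@use_primary
def save_and_fetch():
    # In a real view this would create a row and read it straight back.
    return router.db_for_read(model=None)


print(save_and_fetch())                 # primary
print(router.db_for_read(model=None))   # one of the replicas
```

A middleware could set and clear the same flag once per request, for example for every request that follows a write, instead of decorating individual functions.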
In a master/replica configuration, newly written data takes a few milliseconds to replicate to the replica servers/databases,
so a read issued immediately after a write may not return the correct result.
Instead of reading from a replica, you can read from the master immediately after a write by adding using('primary') to your get query.

Single django app to use multiple sqlite3 files for database

I have two models in my django app and I want their tables/databases to be stored in separate db/sqlite3 files rather than in the default 'db.sqlite3' file.
Eg:
my models.py has two classes Train and Bus, I want them to be stored in train.db and bus.db
Of course you could always just use Train.objects.using('train') for your calls and that would select the correct database (assuming you defined a database called train in your settings.py).
If you don't want to do that, I had a similar problem and I adjusted my solution to your case. It was partially based on this blog article and the Django documentation for database routers is here.
With this solution your current database will not be affected, however your current data will also not be transferred to the new databases. Depending on your version of Django you need to either include allow_syncdb or the right version of allow_migrate.
In settings.py:
DATABASES = {
    'default': {
        'NAME': 'db.sqlite3',
        'ENGINE': 'django.db.backends.sqlite3',
    },
    'train': {
        'NAME': 'train.db',
        'ENGINE': 'django.db.backends.sqlite3',
    },
    'bus': {
        'NAME': 'bus.db',
        'ENGINE': 'django.db.backends.sqlite3',
    },
}
# Dotted path to the router class; this assumes database_router.py lives inside yourapp
DATABASE_ROUTERS = ['yourapp.database_router.DatabaseAppsRouter']
DATABASE_APPS_MAPPING = {'train': 'train', 'bus': 'bus'}
In a new file called database_router.py:
from django.conf import settings


class DatabaseAppsRouter(object):
    """
    A router to control all database operations on models for different
    databases.

    In case a model is not set in settings.DATABASE_APPS_MAPPING, the router
    will fallback to the `default` database.

    Settings example:

    DATABASE_APPS_MAPPING = {'model_name1': 'db1', 'model_name2': 'db2'}
    """

    def db_for_read(self, model, **hints):
        """Point all read operations to the specific database."""
        return settings.DATABASE_APPS_MAPPING.get(model._meta.model_name, None)

    def db_for_write(self, model, **hints):
        """Point all write operations to the specific database."""
        return settings.DATABASE_APPS_MAPPING.get(model._meta.model_name, None)

    def allow_relation(self, obj1, obj2, **hints):
        """Have no opinion on whether the relation should be allowed."""
        return None

    # Keep only the one method below that matches your Django version.

    def allow_syncdb(self, db, model):  # if using Django version <= 1.6
        """Have no opinion on whether the model should be synchronized with the db."""
        return None

    def allow_migrate(self, db, model):  # if using Django version 1.7
        """Have no opinion on whether migration operation is allowed to run."""
        return None

    def allow_migrate(self, db, app_label, model_name=None, **hints):  # if using Django version 1.8 or later
        """Have no opinion on whether migration operation is allowed to run."""
        return None
(Edit: this is also what Joey Wilhelm suggested)
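How the mapping above resolves can be checked without a Django project. In this sketch fake_model is a stand-in exposing only the _meta.model_name attribute the router reads (Django stores it as the lowercased class name), and 'Ticket' is a hypothetical model that is not in the mapping.

```python
from types import SimpleNamespace

DATABASE_APPS_MAPPING = {"train": "train", "bus": "bus"}


def fake_model(name):
    """Stand-in exposing only the _meta.model_name attribute the router reads."""
    return SimpleNamespace(_meta=SimpleNamespace(model_name=name.lower()))


def db_for(model):
    # Same lookup as db_for_read/db_for_write above. None means "no opinion",
    # so Django falls back to the 'default' alias.
    return DATABASE_APPS_MAPPING.get(model._meta.model_name, None)


for name in ("Train", "Bus", "Ticket"):
    print(name, db_for(fake_model(name)))
```

Train and Bus resolve to their own files, while any unmapped model ends up in db.sqlite3 through the default alias.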

How can I use different databases for different models

I have a model called Requests which I want to save in a different database than the default Django database.
The reason is that this table is going to record every request for analytics, so it will grow very quickly. As I am taking database backups hourly, I don't want to increase the db size just for that table.
So I was thinking of putting it in a separate DB so that I don't have to back it up as often.
The docs show this:
https://docs.djangoproject.com/en/dev/topics/db/multi-db/
def db_for_read(self, model, **hints):
    """
    Reads go to a randomly-chosen slave.
    """
    return random.choice(['slave1', 'slave2'])

def db_for_write(self, model, **hints):
    """
    Writes always go to master.
    """
    return 'master'
Now I am not sure how I can check whether my model is Requests and then choose database A, or else database B.
Models are just classes - so check, if you have right class. This example should work for you:
from analytics.models import Requests

def db_for_read(self, model, **hints):
    """
    Reads go to default database, unless it is about requests
    """
    if model is Requests:
        return 'database_A'
    else:
        return 'database_B'

def db_for_write(self, model, **hints):
    """
    Writes go to default database, unless it is about requests
    """
    if model is Requests:
        return 'database_A'
    else:
        return 'database_B'
If you wish, though, you can also use one of some other techniques (such as checking model.__name__ or looking at model._meta).
One note, though: the requests should not have foreign keys connecting them to models in other databases. But you probably already know that.
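The model._meta variant mentioned above could look like the following sketch. It identifies the model by app label and model name, so the router module does not need to import the model class. RequestsRouter, is_requests and the 'shop' app are hypothetical; the alias names come from the answer, and the stand-in objects exist only so the sketch runs outside a Django project.

```python
from types import SimpleNamespace

ANALYTICS_DB = "database_A"  # alias names taken from the answer above
DEFAULT_DB = "database_B"


def is_requests(model):
    """Identify the Requests model through _meta instead of importing it."""
    meta = model._meta
    return meta.app_label == "analytics" and meta.model_name == "requests"


class RequestsRouter:
    def db_for_read(self, model, **hints):
        return ANALYTICS_DB if is_requests(model) else DEFAULT_DB

    def db_for_write(self, model, **hints):
        return ANALYTICS_DB if is_requests(model) else DEFAULT_DB

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Create the requests table only in database_A, everything else only in database_B.
        if app_label == "analytics" and model_name == "requests":
            return db == ANALYTICS_DB
        return db == DEFAULT_DB


# Stand-in objects so the sketch runs without a Django project.
Requests = SimpleNamespace(_meta=SimpleNamespace(app_label="analytics", model_name="requests"))
Product = SimpleNamespace(_meta=SimpleNamespace(app_label="shop", model_name="product"))

router = RequestsRouter()
print(router.db_for_write(Requests), router.db_for_write(Product))  # database_A database_B
```

The allow_migrate method is an addition to the answer: without it, migrate would create the requests table in both databases.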
