How do I enforce domain integrity in a Django app transparently?

Here's the situation. I've got an app with multiple users, and each user belongs to a group/company. There is a company field on all models, meaning there's a corresponding company_id column in every table in the DB. I want to transparently enforce that, when a user tries to access any object, they are always restricted to objects within their "domain," i.e. their group/company. I could go through every query and add a filter that says .filter(company=user.company), but I'm hoping there's a better way to do this at a lower level so it's transparent to whoever is coding the higher-level logic.
Does anyone have experience with this and/or can point me to a good resource on how to approach this? I'm assuming this is a fairly common requirement.

You could do something like this:
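from django.db import models
from django.db.models.query import QuerySet
class DomainQuerySet(QuerySet):
    def applicable(self, user=None):
        # Without a user, return the unrestricted queryset.
        if user is None:
            return self
        return self.filter(company=user.company)
class DomainManager(models.Manager):
    # This method was named get_query_set before Django 1.6.
    def get_queryset(self):
        return DomainQuerySet(self.model, using=self._db)
    def __getattr__(self, name):
        # Forward unknown public attributes (such as applicable) to the queryset.
        if name.startswith('_'):
            raise AttributeError(name)
        return getattr(self.get_queryset(), name)
class MyUser(models.Model):
    company = models.ForeignKey('Company', on_delete=models.CASCADE)
    objects = DomainManager()
MyUser.objects.applicable(user)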
Since we are using querysets, the query is chainable so you could also do:
MyUser.objects.applicable().filter(**kwargs)
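To see why the manager can forward applicable and why the result stays chainable, here is a framework-free sketch. FakeQuerySet, FakeManager, and the dict-based rows are hypothetical stand-ins rather than Django classes, and applicable takes a company name instead of a user object; only the delegation pattern mirrors the answer above.

```python
class FakeQuerySet:
    """In-memory stand-in for a queryset: filtering returns a new FakeQuerySet."""

    def __init__(self, rows):
        self.rows = list(rows)

    def filter(self, **conditions):
        # Keep rows whose fields equal every keyword condition.
        kept = [
            row for row in self.rows
            if all(row.get(key) == value for key, value in conditions.items())
        ]
        return FakeQuerySet(kept)

    def applicable(self, company=None):
        # No company means no domain restriction, as in applicable() above.
        if company is None:
            return self
        return self.filter(company=company)


class FakeManager:
    """Stand-in for a manager that forwards unknown attributes to its queryset."""

    def __init__(self, rows):
        self._rows = rows

    def get_queryset(self):
        return FakeQuerySet(self._rows)

    def __getattr__(self, name):
        # Only reached when normal lookup fails; private names are refused
        # so that copying or pickling cannot recurse.
        if name.startswith("_"):
            raise AttributeError(name)
        return getattr(self.get_queryset(), name)


rows = [
    {"name": "alice", "company": "acme", "active": True},
    {"name": "bob", "company": "acme", "active": False},
    {"name": "carol", "company": "globex", "active": True},
]
objects = FakeManager(rows)
acme_active = objects.applicable("acme").filter(active=True)
names = [row["name"] for row in acme_active.rows]
```

Note that this pattern still relies on every caller remembering to call applicable; it shortens the filter but does not enforce it.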

Related

Auto delete a Django object from the database based on DateTimeField

Let's imagine a simple Food model with a name and an expiration date, my goal is to auto delete the object after the expiration date is reached.
I want the objects deleted from the database (PostgreSQL in my case) as soon as exp_date is reached. I would rather not filter by exp_date__gt=datetime.datetime.now() in my code and then have cron/Celery periodically run a script that filters by exp_date__lt=datetime.datetime.now() and deletes the matches.
class Food(models.Model):
    name = models.CharField(max_length=200)
    exp_date = models.DateTimeField()
*I could do it with a vanilla view when the object is accessed via an endpoint or even with the DRF like so :
class GetFood(APIView):
    def check_date(self, food):
        """Delete the food if it has expired and return False; otherwise return True."""
        if food.exp_date <= datetime.datetime.now():
            food.delete()
            return False
        return True
    def get(self, request, *args, **kwargs):
        id = self.kwargs["id"]
        if Food.objects.filter(pk=id).exists():
            food = Food.objects.get(pk=id)
            if not self.check_date(food):
                return Response({"error": "not found"}, status.HTTP_404_NOT_FOUND)
            else:
                name = food.name
                return Response({"food": name}, status.HTTP_200_OK)
        else:
            return Response({"error": "not found"}, status.HTTP_404_NOT_FOUND)
but it would not delete the object if no one tries to access it via an endpoint.
*I could also set up a cron job with a script that queries the database for every Food object whose expiration date is earlier than today and deletes them, or even set up Celery. It would only need to run once a day if I were using DateField, but as I am using DateTimeField it would need to run every minute (every second for the needs of my project).
*I've also thought of a fancy workaround with a post_save signal with a while loop like :
@receiver(post_save, sender=Food)
def delete_after_exp_date(sender, instance, created, **kwargs):
    if created:
        while instance.exp_date > datetime.datetime.now():
            pass
        else:
            instance.delete()
I don't know if it'd work but it seems very inefficient (if someone could please confirm)
Voila, thanks in advance if you know some ways or some tools to achieve what I want to do, thanks for reading !

I would advise against deleting the objects, or at least against deleting them at the exact moment they expire. Scheduling tasks is cumbersome. Even if you manage to schedule this, the time when the items are actually removed will always be slightly off from the scheduled time. It also means you make an extra query per element instead of removing the items in bulk. Furthermore, scheduling is inherently more complicated: you need something to persist the schedule. If the expiration date of some food is changed later, you need extra logic to "cancel" the current schedule and create a new one. It also makes the system less "reliable": besides the web server, the scheduler daemon has to run. If the daemon fails for some reason, expired food is no longer removed and your queries will return it anyway.
Therefore it might be better to combine filtering the records such that you only retrieve food that did not expire, and remove at some regular interval Food that has expired. You can easily filter the objects with:
from django.db.models.functions import Now
Food.objects.filter(exp_date__gt=Now())
to retrieve Food that is not expired. To make it more efficient, you can add a database index on the exp_date field:
class Food(models.Model):
    name = models.CharField(max_length=200)
    exp_date = models.DateTimeField(db_index=True)
If you need to filter often, you can even work with a Manager [Django-doc]:
from django.db import models
from django.db.models.functions import Now
class FoodManager(models.Manager):
    # self is required here; zero-argument super() needs it.
    def get_queryset(self, *args, **kwargs):
        return super().get_queryset(*args, **kwargs).filter(
            exp_date__gt=Now()
        )
class Food(models.Model):
    name = models.CharField(max_length=200)
    exp_date = models.DateTimeField(db_index=True)
    objects = FoodManager()
Now if you work with Food.objects you automatically filter out all Food that is expired.
Besides that you can make a script that for example runs daily to remove the Food objects that have expired:
from django.db.models.functions import Now
# _base_manager bypasses FoodManager, so expired rows are visible here.
Food._base_manager.filter(exp_date__lte=Now()).delete()
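The combination described in this answer (hide expired rows at read time, purge them in bulk at some later point) can also be sketched without Django. The example below uses the standard-library sqlite3 module; the food table, the helper names, and the fixed "now" timestamp are assumptions made for illustration only.

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE food (name TEXT, exp_date TEXT)")
# Index the expiry column, mirroring db_index=True in the answer.
conn.execute("CREATE INDEX food_exp_date ON food (exp_date)")

now = datetime(2024, 1, 1, 12, 0, 0)  # fixed timestamp so the example is repeatable
conn.executemany(
    "INSERT INTO food VALUES (?, ?)",
    [
        ("milk", (now - timedelta(hours=1)).isoformat()),  # already expired
        ("rice", (now + timedelta(days=30)).isoformat()),  # still good
    ],
)


def visible_food(conn, at):
    # Read path: expired rows are hidden even though they are still stored.
    rows = conn.execute(
        "SELECT name FROM food WHERE exp_date > ? ORDER BY name",
        (at.isoformat(),),
    )
    return [name for (name,) in rows]


def purge_expired(conn, at):
    # Cleanup path: one bulk DELETE, run at whatever interval is convenient.
    cursor = conn.execute("DELETE FROM food WHERE exp_date <= ?", (at.isoformat(),))
    return cursor.rowcount


visible_before = visible_food(conn, now)
deleted = purge_expired(conn, now)
visible_after = visible_food(conn, now)
```

Because readers never see expired rows, the purge can run late, or fail for a while, without changing what the application returns.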

Update to the accepted answer. get_queryset must declare self as its first parameter. If self is missing, or if you define the method outside the class, zero-argument super() fails with "RuntimeError: super(): no arguments". I found this answer helpful.
As per PEP 3135, which introduced the "new super":
The new syntax:
super()
is equivalent to:
super(__class__, <firstarg>)
where __class__ is the class that the method was defined in, and <firstarg> is the first parameter of the method (normally self for instance methods, and cls for class methods).
While super is not a reserved word, the parser recognizes the use of super in a method definition and only passes in the __class__ cell when this is found. Thus, calling a global alias of super without arguments will not necessarily work.
As such, you will still need to include self:
class FoodManager(models.Manager):
    def get_queryset(self, *args, **kwargs):
        return super().get_queryset(*args, **kwargs).filter(
            exp_date__gt=Now()
        )
Just something to keep in mind.
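The failure can be reproduced without Django. In this small sketch, Base, Broken, and Fixed are made-up classes standing in for models.Manager and two variants of FoodManager.

```python
class Base:
    def get_queryset(self):
        return ["apple", "bread"]


class Broken(Base):
    # No named first parameter: zero-argument super() has nothing to bind to.
    def get_queryset(*args, **kwargs):
        return super().get_queryset(*args, **kwargs)


class Fixed(Base):
    # With self declared, super() resolves to super(Fixed, self).
    def get_queryset(self, *args, **kwargs):
        return super().get_queryset(*args, **kwargs)


def call_broken():
    """Return the error message raised by the broken variant, or None."""
    try:
        Broken().get_queryset()
    except RuntimeError as exc:
        return str(exc)
    return None


broken_message = call_broken()
fixed_result = Fixed().get_queryset()
```

The broken variant fails only when it is called, not when the class is defined, which is why the mistake tends to surface late.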

How to assign Django object ownership without explicitly declaring an owner field on all models?

I'm currently trying to figure out per user object permissions for our Django website API.
I have several models with sensitive information, that I need to be able to filter on a user basis.
For a simplified example of one of the models:
Restaurant, main customer of the website.
User, each user gets assigned a restaurant when the user account is created. As such, a restaurant can have many users and they all should only be able to access that restaurant's information.
Oven, which belongs to a specific restaurant. A restaurant can have many ovens.
Recipe, which belongs to an oven. An oven can have many different recipes.
Recipe Results, which belong to a recipe. There can be many different Recipe Results belonging to the same Recipe (different ingredients tried, etc).
There are at least 12+ different models. All models from a particular restaurant have to be hidden from other restaurants, we don't want them to be able to look at other restaurant recipes after all!
Not all models have a user = models.ForeignKey(User)
Without having to go into each one of my models and declaring owner = models.ForeignKey(User), is there a way to filter them in my API List Views and Detail Views?
Currently my List API View looks like this (simplified example):
class RecipeResultsListAPIView(ListAPIView):
    queryset = RecipeResults.objects.all()
    queryset = queryset.prefetch_related('oven')
    serializer_class = RecipeResultsListSerializer
    filter_backends = (DjangoFilterBackend,)
    filter_fields = ('id', 'time', 'oven', 'recipe_name', 'recipe_description')
    pagination_class = ExpertPageNumberPagination
    def list(self, request):
        user = User.objects.get(username=request.user)
        restaurant = Restaurant.objects.get(user=user)
        ovens = Oven.objects.filter(restaurant=restaurant)
        queryset = RecipeResults.objects.filter(oven__in=ovens)
        serializer = RecipeResultsListSerializer(queryset, many=True, context={'request': request})
        return Response(serializer.data)
And the model for that looks like this:
class RecipeResults(models.Model):
    time = models.DateTimeField()
    oven = models.ForeignKey(Oven, on_delete=models.CASCADE)
    recipe_name = models.CharField(max_length=20)
    recipe_description = models.CharField(max_length=50)
    def __str__(self):
        return str(self.time) + ': ' + self.recipe_name + ' = ' + self.recipe_description
    def __key(self):
        return self.oven, self.time, self.recipe_name
    def __eq__(self, y):
        return isinstance(y, self.__class__) and self.__key() == y.__key()
    def __hash__(self):
        return hash(self.__key())
    class Meta:
        unique_together = (('time', 'recipe_name', 'oven'),)
Specifically looking at the modified list method, currently this works properly to filter API call results to display only those Recipe Results that belong to the user that is logged in.
What I'm trying to figure out is if there's an easier way to do this, as for each model I would have to trace back ownership to the specific restaurant which would get confusing fast as I have 12+ different models.
What I'm not sure is if declaring "owner = models.ForeignKey(User)" on each of those models is the way to go. It feels like it would create many extra steps when retrieving the data.
I have also tried
class IsOwnerOrAdmin(BasePermission):
    """
    Custom permission to only allow owners of an object to see and edit it.
    Admin users however have access to all.
    """
    def has_object_permission(self, request, view, obj):
        # Permissions are only allowed to the owner of the snippet
        if request.user.is_staff:
            return True
        return obj.user == request.user
But this didn't seem to filter properly, and besides, not all of the models have a user field assigned to them.
Please keep in mind I'm a junior developer and I'm learning a lot as I go. I'm only working on the API side of the company. The website and schema is already a work in progress and other systems depend on it, and so I'm trying not to modify the schema or models too much (I would like to avoid this if possible, but will do it if it's the only way). I was also brought in just to work on the API at first. The company understands I'm a junior developer and I'm extremely grateful to have been given the opportunity to grow while learning this project, but this one issue seems to be giving me a lot more trouble than actually building the rest of the API for the website.
I would greatly appreciate any help I can get with this!

I think you might benefit from model inheritance in this case.
You can define a base model for your owner-affected objects.
An example can look like:
class OwnedModel(models.Model):
    # on_delete is required from Django 2.0 onwards.
    owner = models.ForeignKey(User, on_delete=models.CASCADE)
    class Meta:
        abstract = True
Then you can simply add this as the base for your other models:
class SomeModel(OwnedModel):
    """
    This class already has the owner field
    """
A big downside of this approach is that you will still need a migration that will alter every involved table.
If you aren't allowed to do that, you might be able to do it with a loose, non relational approach, for example with django's permission model. You can assign automatically generated permission strings, eg: myapp.mymodel.pkey:
A final alternative is a third-party app that handles this for you: django-guardian
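The benefit of a shared base class is that a single generic check then works for every model. The sketch below illustrates this with plain dataclasses instead of Django models; Owned, Recipe, Oven, and visible_to are hypothetical names, and owners are plain strings rather than User rows.

```python
from dataclasses import dataclass


@dataclass
class Owned:
    """Plain-Python stand-in for the abstract OwnedModel: it only carries an owner."""
    owner: str


@dataclass
class Recipe(Owned):
    name: str


@dataclass
class Oven(Owned):
    label: str


def visible_to(user, objects, is_staff=False):
    # One check covers every subclass because they all inherit `owner`.
    if is_staff:
        return list(objects)
    return [obj for obj in objects if obj.owner == user]


items = [
    Recipe("ann", "sourdough"),
    Oven("bob", "deck oven"),
    Recipe("bob", "bagel"),
]
bob_items = visible_to("bob", items)
all_items = visible_to("ann", items, is_staff=True)
```

In the question's setting the owner would more naturally be the restaurant than an individual user, since all users of a restaurant share access.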

How are ModelFields assigned in Django Models?

When we define a model in django we write something like..
class Student(models.Model):
name = models.CharField(max_length=64)
age = models.IntegerField()
...
where name = models.CharField() implies that name would be an object of models.CharField. When we have to make an object of Student, we simply do:
my_name = "John Doe"
my_age = 18
s = Student.objects.create(name=my_name, age=my_age)
where my_name and my_age are string and integer data types respectively, and not an object of models.CharField/models.IntegerField. Although while assigning the values the respective validations are performed (like checking on the max_length for CharField)
I'm trying to build similar models for an abstraction of Neo4j over Django but not able to get this workflow. How can I implement this ?
Found a similar question but didn't find it helpful enough.

How things work
First, each field on your model has its own validation. For example, CharField defines _check_max_length_attribute, and its check method also calls super() to run the basic checks common to every Field.
With that in mind, we move on to the create method, which is a separate and much more complicated thing. For a single object, the basic operations are:
Create a Python object
Call save()
save() uses a lot of getattr calls and does tons of validation
Commit to the DB; if anything goes wrong in the DB, raise it to the user
Third, when you query an object, Django first gets the data from the DB and then (after a long process) sets that data on the object.
Simple Example
class BasicCharField:
    def __init__(self, max_len):
        self.max_len = max_len
    def validate(self, value):
        if value > self.max_len:
            raise ValueError('the value must be lower than {}'.format(self.max_len))
class BasicModel:
    score = BasicCharField(max_len=4)
    @staticmethod
    def create(**kwargs):
        obj = BasicModel()
        obj.score = kwargs['score']
        obj.save()
        return obj
    def save(self):
        # Lots of validations here
        BasicModel.score.validate(self.score)
        # DB commit here
BasicModel.create(score=5)
And, as expected:
>>> ValueError: the value must be lower than 4
Obviously I had to simplify things to fit this into a few lines of code. You can improve it a lot, for example by iterating over the attributes instead of hardcoding them like obj.score = ...
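One way to remove the hardcoding is to let the model discover its own fields. The sketch below is one possible shape for that, not how Django itself is implemented: Field, MaxValueField, Model, and Student are invented names, and it relies on __set_name__, which requires Python 3.6 or later.

```python
class Field:
    """Base field: remembers its attribute name and validates on demand."""

    def __set_name__(self, owner, name):
        # Called automatically when the owning class body is executed.
        self.name = name

    def validate(self, value):
        pass


class MaxValueField(Field):
    def __init__(self, max_value):
        self.max_value = max_value

    def validate(self, value):
        if value > self.max_value:
            raise ValueError(f"{self.name} must not exceed {self.max_value}")


class Model:
    def __init__(self, **kwargs):
        # Instance attributes shadow the class-level Field objects.
        for name, value in kwargs.items():
            setattr(self, name, value)

    @classmethod
    def fields(cls):
        # Walk the class hierarchy so inherited fields are found too.
        found = {}
        for klass in reversed(cls.__mro__):
            for name, attr in vars(klass).items():
                if isinstance(attr, Field):
                    found[name] = attr
        return found

    def save(self):
        for name, field in self.fields().items():
            if name not in self.__dict__:
                raise ValueError(f"{name} is required")
            field.validate(self.__dict__[name])
        # A real implementation would write to the database here.

    @classmethod
    def create(cls, **kwargs):
        obj = cls(**kwargs)
        obj.save()
        return obj


class Student(Model):
    age = MaxValueField(max_value=120)
    score = MaxValueField(max_value=4)


ok = Student.create(age=18, score=3)
```

Calling Student.create(age=18, score=5) raises ValueError, because save() validates every declared field before anything would be written.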

Dynamically add properties to a django model

I have a Django model where a lot of fields are choices. So I had to write a lot of "is_something" properties of the class to check whether the instance value is equal to some choice value. Something along the lines of:
class MyModel(models.Model):
    some_choicefield = models.IntegerField(choices=SOME_CHOICES)
    @property
    def is_some_value(self):
        return self.some_choicefield == SOME_CHOICES.SOME_CHOICE_VALUE
    # a lot of these...
In order to automate this and spare me a lot of redundant code, I thought about patching the instance at creation, with a function that adds a bunch of methods that do the checks.
The code became as follows (I'm assuming there's a "normalize" function that makes the label of the choice a usable function name):
def dynamic_add_checks(instance, field):
    if hasattr(field, 'choices'):
        choices = getattr(field, 'choices')
        for (value, label) in choices:
            def fun(instance):
                return getattr(instance, field.name) == value
            normalized_func_name = "is_%s_%s" % (field.name, normalize(label))
            # fun is called right away, so the attribute stores a boolean.
            setattr(instance, normalized_func_name, fun(instance))
class MyModel(models.Model):
    def __init__(self, *args, **kwargs):
        super(MyModel, self).__init__(*args, **kwargs)
        dynamic_add_checks(self, self._meta.get_field('some_choicefield'))
    some_choicefield = models.IntegerField(choices=SOME_CHOICES)
Now, this works but I have the feeling there is a better way to do it. Perhaps at class creation time (with metaclasses or in the new method)? Do you have any thoughts/suggestions about that?

Well, I am not sure how to do this your way, but in such cases I think the way to go is simply to create a new model that holds your choices and change the field to a ForeignKey. This is simpler to code and manage.
You can find a lot of basic information in the Django docs: Models: Relationships. From there, many links expand on the various topics. Beyond that, I believe it just needs a bit of imagination, and maybe some trial and error in the beginning.

I came across a similar problem where I needed to add a large number of properties at runtime to provide backward compatibility while changing model fields. There are two standard ways to handle this:
First, use a custom metaclass in your models, which inherits from the models' default metaclass.
Second, use class decorators. Class decorators sometimes provide an easy alternative to metaclasses, unless you have to do something before the class is created, in which case you have to go with metaclasses.
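As a rough illustration of the class-decorator route, the sketch below adds one is_<field>_<label> property per choice. It uses a plain class rather than a Django model; add_choice_checks, STATUS_CHOICES, and Article are hypothetical names, and the label normalisation is a simple assumption.

```python
def add_choice_checks(field_name, choices):
    """Class decorator: add an is_<field>_<label> property for every choice."""

    def decorate(cls):
        for value, label in choices:
            # Bind value as a default argument so each property keeps its own
            # value instead of the last one seen by the loop.
            def check(self, _value=value):
                return getattr(self, field_name) == _value

            normalized = label.lower().replace(" ", "_")
            setattr(cls, "is_%s_%s" % (field_name, normalized), property(check))
        return cls

    return decorate


STATUS_CHOICES = [(1, "Draft"), (2, "In Review"), (3, "Published")]


@add_choice_checks("status", STATUS_CHOICES)
class Article:
    def __init__(self, status):
        self.status = status


article = Article(status=2)
in_review = article.is_status_in_review
is_draft = article.is_status_draft
```

Because the properties live on the class, they are created once at import time instead of on every instance, and they stay correct if the field value changes later.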

I bet you know Django fields with choices provided will automatically have a display function.
Say you have a field defined like this:
category = models.SmallIntegerField(choices=CHOICES)
You can simply call a function called get_category_display() to access the display value. Here is the Django source code of this feature:
https://github.com/django/django/blob/baff4dd37dabfef1ff939513fa45124382b57bf8/django/db/models/base.py#L962
https://github.com/django/django/blob/baff4dd37dabfef1ff939513fa45124382b57bf8/django/db/models/fields/__init__.py#L704
So we can follow this approach to achieve our dynamically set property goal.
Here is my scenario. It is a little different from yours, but in the end it comes down to the same thing:
I have two classes, Course and Lesson. Lesson has a ForeignKey field to Course, and I want to add a property named cached_course to Lesson that tries to get the Course from the cache first and falls back to the database on a cache miss.
Here is a typical solution:
from django.db import models
class Course(models.Model):
    # some fields
    pass
class Lesson(models.Model):
    course = models.ForeignKey(Course)
    @property
    def cached_course(self):
        key = key_func()
        course = cache.get(key)
        if not course:
            course = get_model_from_db()
            cache.set(key, course)
        return course
Turns out I have so many ForeignKey fields to cache, so here is the code following the similar approach of Django get_FIELD_display feature:
from functools import partial
from django.db import models
class CachedForeignKeyField(models.ForeignKey):
    def contribute_to_class(self, cls, name, **kwargs):
        # Start the MRO lookup from this class so no parent implementation is skipped.
        super(CachedForeignKeyField, self).contribute_to_class(cls, name, **kwargs)
        # functools.partial replaces django.utils.functional.curry, which was removed in Django 3.0.
        setattr(cls, "cached_%s" % self.name,
                property(partial(cls._cached_FIELD, field=self)))
class BaseModel(models.Model):
    def _cached_FIELD(self, field):
        value = getattr(self, field.attname)
        Model = field.related_model
        # cache.get_model is the author's own helper, not part of Django's cache API.
        return cache.get_model(Model, pk=value)
    class Meta:
        abstract = True
class Course(BaseModel):
    # some fields
    pass
class Lesson(BaseModel):
    course = CachedForeignKeyField(Course)
By defining a custom CachedForeignKeyField that overrides the contribute_to_class method, together with a BaseModel class that provides a _cached_FIELD method, every CachedForeignKeyField automatically gets a matching cached_FIELD property.
Too good to be true, bravo!

Django remove bulk-delete

This is a very simple question: Is there any good way to disable calling a bulk-delete (through querysets of course) on all models in an entire Django project?
The reasoning for this is under the premise that completely deleting data is almost always a poor choice, and an accidental bulk-delete can be detrimental.

As the comments on your question say, you have to create a subclass for each of these elements:
The model manager
Queryset class
BaseModel
After some searching, I found a great example here; all credit goes to Akshay Shah, the blog's author. Before looking at the code, be aware of this caveat:
However, it inevitably leads to data corruption. The problem is simple: using a Boolean to store deletion status makes it impossible to enforce uniqueness constraints in your database.
from django.db import models
from django.db.models.query import QuerySet
class SoftDeletionQuerySet(QuerySet):
    def delete(self):
        # Bulk delete bypasses individual objects' delete methods.
        return super(SoftDeletionQuerySet, self).update(alive=False)
    def hard_delete(self):
        return super(SoftDeletionQuerySet, self).delete()
    def alive(self):
        return self.filter(alive=True)
    def dead(self):
        return self.exclude(alive=True)
class SoftDeletionManager(models.Manager):
    def __init__(self, *args, **kwargs):
        self.alive_only = kwargs.pop('alive_only', True)
        super(SoftDeletionManager, self).__init__(*args, **kwargs)
    def get_queryset(self):
        if self.alive_only:
            return SoftDeletionQuerySet(self.model).filter(alive=True)
        return SoftDeletionQuerySet(self.model)
    def hard_delete(self):
        return self.get_queryset().hard_delete()
class SoftDeletionModel(models.Model):
    alive = models.BooleanField(default=True)
    objects = SoftDeletionManager()
    all_objects = SoftDeletionManager(alive_only=False)
    class Meta:
        abstract = True
    def delete(self):
        self.alive = False
        self.save()
    def hard_delete(self):
        super(SoftDeletionModel, self).delete()
Basically, it adds an alive field that records whether the row has been deleted, and updates that field when the delete() method is called.
Of course, this method only works on projects where you can modify the code base.
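The uniqueness caveat quoted before the code is easy to reproduce. The following sketch uses the standard-library sqlite3 module; the account table and the try_signup helper are assumptions for illustration and are not part of the blog's code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (email TEXT UNIQUE, alive INTEGER NOT NULL DEFAULT 1)"
)
conn.execute("INSERT INTO account (email) VALUES ('a@example.com')")

# "Delete" the row the soft way: it stays in the table with alive = 0.
conn.execute("UPDATE account SET alive = 0 WHERE email = 'a@example.com'")


def try_signup(conn, email):
    """Try to insert a new account and report whether the database allowed it."""
    try:
        conn.execute("INSERT INTO account (email) VALUES (?)", (email,))
        return "created"
    except sqlite3.IntegrityError:
        # The hidden soft-deleted row still occupies the unique slot.
        return "conflict"


result = try_signup(conn, "a@example.com")
live_rows = conn.execute(
    "SELECT COUNT(*) FROM account WHERE alive = 1"
).fetchone()[0]
```

From the application's point of view the address is free, yet the insert is rejected. Dropping the UNIQUE constraint would allow the insert but would also allow duplicate live rows, which is the trade-off the caveat warns about.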

There are nice off-the-shelf applications that allow for restoring deleted models (if that is what you are interested in). Here are the ones I have used:
Django softdelete: https://github.com/scoursen/django-softdelete I used it more
Django reversion: https://github.com/etianen/django-reversion this one is updated more often, and allows you to revert to any version of your model (not only after delete, but as well after update).
If you really want to forbid bulk deletes, I'd discourage you from this approach as it will:
Break expectations about application behaviour. If I call MyModel.objects.all().delete(), I want the table to be empty afterwards.
Break existing applications.
If you want to do it anyway, please follow the advice from the comment:
I'm guessing this would involve subclassing QuerySet and changing the delete method to your liking, subclassing the default manager and have it use your custom query set, subclassing model - create an abstract model and have it use your custom manager and then finally have all your models subclass your custom abstract model.
