In my Django project, every entity deleted by the user must be soft deleted by setting the current datetime on its deleted_at field. My models look like this: Trip <-> TripDestination <-> Destination (a many-to-many relation). In other words, a Trip can have multiple Destinations.
When I delete a Trip, the SoftDeleteManager filters out all the deleted trips. However, if I request all the destinations of a trip (using get_object_or_404(Trip, pk = id)), I also get the deleted ones (i.e. TripDestination rows regardless of whether deleted_at is null or not). I really don't understand why, since all my models inherit from LifeTimeTrackingModel and use the SoftDeleteManager.
Can someone please help me understand why the SoftDeleteManager isn't working for the n:m relation?
class SoftDeleteManager(models.Manager):
    def get_query_set(self):
        query_set = super(SoftDeleteManager, self).get_query_set()
        return query_set.filter(deleted_at__isnull = True)

class LifeTimeTrackingModel(models.Model):
    created_at = models.DateTimeField(auto_now_add = True)
    updated_at = models.DateTimeField(auto_now = True)
    deleted_at = models.DateTimeField(null = True)

    objects = SoftDeleteManager()
    all_objects = models.Manager()

    class Meta:
        abstract = True

class Destination(LifeTimeTrackingModel):
    city_name = models.CharField(max_length = 45)

class Trip(LifeTimeTrackingModel):
    name = models.CharField(max_length = 250)
    destinations = models.ManyToManyField(Destination, through = 'TripDestination')

class TripDestination(LifeTimeTrackingModel):
    trip = models.ForeignKey(Trip)
    destination = models.ForeignKey(Destination)
Resolution
I filed bug 17746 in the Django bug tracker. Thanks to Caspar for his help on this.
It looks like this behaviour comes from the ManyToManyField choosing to use its own manager, which the Related objects reference mentions, because when I make up some instances of my own and soft-delete them using your model code (via the manage.py shell), everything works as intended.
Unfortunately it doesn't mention how you can override that manager. I spent about 15 minutes searching through the ManyToManyField source (django/db/models/fields/related.py) but haven't tracked down where it instantiates its manager.
To get the behaviour you are after, you should specify use_for_related_fields = True on your SoftDeleteManager class as specified by the documentation on controlling automatic managers:
class SoftDeleteManager(models.Manager):
    use_for_related_fields = True

    def get_query_set(self):
        query_set = super(SoftDeleteManager, self).get_query_set()
        return query_set.filter(deleted_at__isnull = True)
This works as expected: I can define a Trip with 2 Destinations, each linked through a TripDestination, and if I set a Destination's deleted_at value to datetime.datetime.now(), that Destination no longer appears in the list given by mytrip.destinations.all(), which as near as I can tell is what you are after.
However, the docs also specifically say do not filter the query set by overriding get_query_set() on a manager used for related fields, so if you run into problems later, bear this in mind as a possible cause.
To enable filtering by the deleted_at field of the Destination and Trip models, setting use_for_related_fields = True on the SoftDeleteManager class is enough. As per Caspar's answer, this no longer returns deleted Destinations for trip_object.destinations.all().
However, from your comments it looks like you would also like to filter out Destinations that are linked to a Trip via a TripDestination object with a set deleted_at field, i.e. a soft delete on the through instance.
Let's clarify how the managers work here. Related managers are managers of the remote model, not of the through model:
trip_object.destinations.some_method() calls the default Destination manager.
destination_object.trip_set.some_method() calls the default Trip manager.
The TripDestination manager is never called.
You can reach it with trip_object.destinations.through.objects.some_method(), if you really want to. Now, what I would do is add an instance method Trip.get_destinations, and a similar Destination.get_trips, that filters out deleted connections.
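A minimal sketch of that approach, using the models from the question (the mirror-image get_trips on Destination would look the same):

class Trip(LifeTimeTrackingModel):
    name = models.CharField(max_length = 250)
    destinations = models.ManyToManyField(Destination, through = 'TripDestination')

    def get_destinations(self):
        # Destination.objects already hides soft-deleted destinations;
        # the extra filter also drops links whose TripDestination row
        # has been soft deleted.
        return Destination.objects.filter(
            tripdestination__trip = self,
            tripdestination__deleted_at__isnull = True)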
If you insist on using the manager to do the filtering it gets more complicated:
class DestinationManager(models.Manager):
    use_for_related_fields = True

    def get_query_set(self):
        query_set = super(DestinationManager, self).get_query_set()
        # When used as a related manager, Django sets attributes such as
        # "through" and "core_filters" on the manager instance; use them
        # to drop links whose through row is soft deleted.
        if hasattr(self, "through"):
            through_objects = self.through.objects.filter(
                destination_id=query_set.filter(**self.core_filters).get().id,
                trip_id=self._fk_val,
                deleted_at__isnull=True)
            query_set = query_set.filter(
                id__in=through_objects.values("destination_id"))
        return query_set.filter(deleted_at__isnull = True)
The same would have to be done for a TripManager, and the two would differ. You may want to check the performance and look at django/db/models/fields/related.py for reference.
Modifying the get_query_set method of the default manager may also hamper the ability to dump and back up the database, and the documentation discourages it. Writing a Trip.get_destinations method, as sketched above, is the safer alternative.
Related
I created a Checkout model in my project, with a CheckoutType to handle the requests, but now I need a Profile type that basically exposes many of the fields on Checkout. The problem is that Checkout and Profile will be retrieved by users with very different permissions: the first one requires the right permissions, while the second one must not. So I went with creating two types:
Checkout:
class CheckoutType(ModelType):
    class Meta:
        model = Checkout
        interfaces = [graphene.relay.Node]
        connection_class = CountableConnection
        permissions = ['app.view_checkout']
        filter_fields = {
            'zone': ['exact'],
            'vehicle__mark': ['exact'],
            'status': ['exact']
        }
Profile:
class ProfileFilter(django_filters.FilterSet):
    class Meta:
        model = Checkout
        fields = ['zone', 'status']

    @property
    def qs(self):
        # The query context can be found in self.request.
        return super(ProfileFilter, self).qs.filter(salesman=self.request.user)

class ProfileType(ModelType):
    class Meta:
        model = Checkout
        interfaces = [graphene.relay.Node]
        connection_class = CountableConnection
        filterset_class = ProfileFilter
The thing here is that the first one shouldn't filter and should just be a regular schema, while the second one should filter by the user that made the request; that, plus the permissions, is the reason I use two types. But as soon as I implemented this, all the tests I had for CheckoutType started to fail, since it seems Relay tries to use the ProfileType. I searched a little, and it seems that Relay only allows one type per model in Django, so this approach doesn't seem possible, but I'm not sure how to override the CheckoutType in another schema, or how to make a second type with different permissions and different filters. Does someone know if this is possible?
Just in case someone is in the same boat, I think I found a way to make it work, but with a different approach; I just modified the CheckoutType a little:
class CheckoutType(ModelType):
    # Meta as before, but without the permissions entry

    @classmethod
    def get_queryset(cls, queryset, info):
        if info.context.user.has_perm('app.view_checkout'):
            return queryset
        return queryset.filter(salesman=info.context.user)
Basically I remove the permission from the Meta, since I don't want to check it there, and then I override get_queryset() to check whether the user has the permission: if so, just return the normal queryset; if not, filter it (and do anything else you want for people without the permission). I'm not sure if there's a better way, but it definitely did the job.
I have a case where a user needs to update one instance together with adding/editing the m2m-related objects on that instance.
Here is my solution:
# models.py
class AdditionalAction(SoftDeletionModel):
    ADDITIONAL_CHOICES = (
        ('to_bring', 'To bring'),
        ('to_prepare', 'To prepare'),
    )
    title = models.CharField(max_length=50)
    type = models.CharField(choices=ADDITIONAL_CHOICES, max_length=30)

class Event(models.Model):
    title = models.CharField(max_length=255)
    actions = models.ManyToManyField(AdditionalAction, blank=True)

# serializers.py
class MySerializer(serializers.ModelSerializer):
    def update(self, instance, validated_data):
        actions_data = validated_data.pop('actions')
        # Use atomic block to rollback if anything raised Exception
        with transaction.atomic():
            # update main object
            updated_instance = super().update(instance, validated_data)

            actions = []
            # Loop over m2m relation data and
            # create/update each action instance based on id present
            for action_data in actions_data:
                action_kwargs = {
                    'data': action_data
                }
                id = action_data.get('id', False)
                if id:
                    action_kwargs['instance'] = AdditionalAction.objects.get(id=id)

                actions_ser = ActionSerializerWrite(**action_kwargs)
                actions_ser.is_valid(raise_exception=True)
                actions.append(actions_ser.save())

            updated_instance.actions.set(actions)

        return updated_instance
Can anyone suggest a better solution?
P.S. Actions can be created or updated in this case, so I can't just use many=True on the serializer, because it also needs an instance to update.
Using a for loop with save() here will be a killer if you have a long list, or if there are actions triggered on save, etc. I'd try to avoid it.
You may be better off using the ORM's update() with a where clause: https://docs.djangoproject.com/en/2.0/topics/db/queries/#updating-multiple-objects-at-once and even re-reading the updated objects from the database after the write.
For creating new actions you could use bulk_create: https://docs.djangoproject.com/en/2.0/ref/models/querysets/#bulk-create
There is also this one: https://github.com/aykut/django-bulk-update (disclaimer: I am not a contributor or author of the package).
You have to be aware of the cons of this method: if you use any post_save/pre_save signals, those will not be triggered by update().
In general, running multiple saves will hammer the database, and you might end up with hard-to-diagnose deadlocks. In one of the projects I worked on, moving from save() in a loop to update() decreased response time from thirty-something seconds to under 10, where the longest remaining operations were sending emails.
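A rough sketch of how the update() method could look with this advice applied (assumptions: the payload items carry an optional id as in the question, per-item serializer validation is skipped for brevity, and bulk_create only returns primary keys on PostgreSQL):

def update(self, instance, validated_data):
    actions_data = validated_data.pop('actions')
    with transaction.atomic():
        updated_instance = super().update(instance, validated_data)

        existing = [a for a in actions_data if a.get('id')]
        new = [a for a in actions_data if not a.get('id')]

        # One UPDATE per existing item: no extra SELECT, no save() signals.
        for action_data in existing:
            AdditionalAction.objects.filter(id=action_data['id']).update(
                title=action_data['title'], type=action_data['type'])

        # A single INSERT for all new items.
        created = AdditionalAction.objects.bulk_create(
            [AdditionalAction(title=a['title'], type=a['type']) for a in new])

        ids = [a['id'] for a in existing] + [obj.id for obj in created]
        updated_instance.actions.set(AdditionalAction.objects.filter(id__in=ids))
    return updated_instance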
I'm trying to introduce field-level permissions in my app that would effectively hide/nullify model field values from some users, while showing them to others. A user would need to be able to do something like this:
class MyRestrictedModel(HypotheticalMixin, models.Model):
    public = CharField(max_length=128)
    restricted = RestrictedCharField(
        max_length=128,
        permitted_groups=("group1",)
    )

user1 = User.objects.get(pk=1)  # in group1
user2 = User.objects.get(pk=2)  # NOT in group1

model_instance1 = MyRestrictedModel.objects.get(pk=1).restrict(user1)
model_instance2 = MyRestrictedModel.objects.get(pk=1).restrict(user2)

print(model_instance1.public)      # "this is public data"
print(model_instance1.restricted)  # "this is restricted data"
print(model_instance2.public)      # "this is public data"
print(model_instance2.restricted)  # None
I think I might be able to hack something together to get this working, but I'd hate to do that work if something more robust and community-accepted was available, so I thought I'd ask here. Does such a thing exist?
You will want to add an attribute, method, groups, etc. so you know whether a user is restricted or not. Assuming you have a user.is_restricted attribute:
from django.db.models import CharField, F, Value

class RestrictManager(models.Manager):
    def by_user(self, user):
        queryset = super(RestrictManager, self).get_queryset()
        if user.is_restricted:
            # expose a NULL placeholder instead of the real column;
            # field_to_show is a queryset-only field (not on any model)
            queryset = queryset.annotate(
                field_to_show=Value(None, output_field=CharField()))
        else:
            queryset = queryset.annotate(field_to_show=F('secret_field'))
        return queryset
class MyRestrictedModel(models.Model):
    field1 = models.CharField...
    restricted_objects = RestrictManager()
In your code:
q = MyRestrictedModel.restricted_objects.by_user(self.request.user)
# Now use q as usual, q.all(), q.get(...), q.filter(...)
You can of course add more methods like by_group etc., and even set objects = RestrictManager() to replace the default objects manager.
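A by_group variant could follow the same shape (a sketch; it assumes you pass the user in and check their group membership, reusing the field_to_show/secret_field names from above):

    def by_group(self, user, group_name):
        queryset = super(RestrictManager, self).get_queryset()
        if not user.groups.filter(name=group_name).exists():
            # not in the group: hide the value behind a NULL placeholder
            return queryset.annotate(
                field_to_show=Value(None, output_field=CharField()))
        return queryset.annotate(field_to_show=F('secret_field'))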
Probably you should use a Django package that deals specifically with detailed permissions; there is a whole category of such packages. The right one for you, which has field-level permission management, is django-permissions. But there are others too.
I'm getting multiple objects with prefetched relations from my db:
datei_logs = (DateiLog.objects.filter(user=request.user)
              .order_by("-pk")
              .prefetch_related('transfer_logs'))
transfer_logs refers to this:
class TransferLog(models.Model):
    datei_log = models.ForeignKey("DateiLog", related_name="transfer_logs")
    status = models.CharField(
        max_length=1,
        choices=LOG_STATUS_CHOICES,
        default='Good'
    )
    server_name = models.CharField(max_length=100, blank=True, default="(no server)")
    server = models.ForeignKey('Server')

    class Meta:
        verbose_name_plural = "Transfer-Logs"

    def __unicode__(self):
        return self.server_name
Now I want to get all the TransferLogs that have a status of "Good". But I think if I do this:
datei_logs[0].transfer_logs.filter(...)
It queries the db again! Since this happens on a website with many log entries I end up with 900 Queries!
I use:
datei_logs[0].transfer_logs.count()
as well, and it causes lots of queries to the db too!
What can I do to "just get everything" and then just query an object that holds all the information instead of the db?
Since you're on Django 1.7 you can use the new Prefetch() objects to specify the queryset you want to use for the related lookup.
from django.db.models import Prefetch

queryset = TransferLog.objects.filter(status='Good')

datei_logs = (DateiLog.objects.filter(user=request.user)
              .order_by("-pk")
              .prefetch_related(Prefetch('transfer_logs',
                                         queryset=queryset,
                                         to_attr='good_logs')))
Then you can access datei_logs[0].good_logs and check len(datei_logs[0].good_logs).
If you're interested in multiple statuses, you can just use multiple Prefetch objects. But if you're going to get all the logs anyway, you might as well stick to your original query and then split the logs up in Python, rather than calling filter().
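That last option (one prefetch, then split in Python) might look like this sketch, reusing the queryset from the question:

datei_logs = (DateiLog.objects.filter(user=request.user)
              .order_by("-pk")
              .prefetch_related('transfer_logs'))

for datei_log in datei_logs:
    # .all() on a prefetched relation reuses the cached rows: no new query.
    logs = list(datei_log.transfer_logs.all())
    good_logs = [log for log in logs if log.status == 'Good']
    good_count = len(good_logs)   # len() instead of .count() avoids another query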
Django 1.2.5
Python: 2.5.5
My admin list for a sports model has just become really slow (5 minutes for 400 records). It was returning in a second or so until we got to 400 games, 50-odd teams and 2 sports.
I have fixed it in an awful way, so I'd like to see if anyone has seen this before. My app looks like this:
models:

Sport(models.Model)
    name

Venue(models.Model)
    name

Team(models.Model)
    name

Fixture(models.Model)
    date
    sport = models.ForeignKey(Sport)
    venue = models.ForeignKey(Venue)

TeamFixture(Fixture)
    team1 = models.ForeignKey(Team, related_name="Team 1")
    team2 = models.ForeignKey(Team, related_name="Team 2")

admin:

TeamFixture_ModelAdmin(ModelAdmin)
    list_display = ('date', 'sport', 'venue', 'team1', 'team2',)
If I remove any foreign keys from list_display then it's quick; as soon as I add any foreign key it's slow.
I fixed it by using non-foreign-key attributes, computed in the model's __init__, so this works:
models:

TeamFixture(Fixture)
    team1 = models.ForeignKey(Team, related_name="Team 1")
    team2 = models.ForeignKey(Team, related_name="Team 2")

    sport_name = ""
    venue_name = ""
    team1_name = ""
    team2_name = ""

    def __init__(self, *args, **kwargs):
        super(TeamFixture, self).__init__(*args, **kwargs)
        self.sport_name = self.sport.name
        self.venue_name = self.venue.name
        self.team1_name = self.team1.name
        self.team2_name = self.team2.name

admin:

TeamFixture_ModelAdmin(ModelAdmin)
    list_display = ('date', 'sport_name', 'venue_name', 'team1_name', 'team2_name',)
Administration for all other models is fine with several thousand records at the moment, and all views in the actual site are functioning fine.
It's driving me crazy: list_select_related is set to True, yet adding a foreign key to User in the list_display generates one query per row in the admin, which makes the listing slow. Since select_related is True, the Django admin shouldn't issue this query for each row.
What is going on?
The first thing I would look at is the database calls. If you haven't done so already, install django-debug-toolbar. That awesome tool lets you inspect all the SQL queries made for the current request. I assume there are lots of them; if you look at them, you will know where to look for the problem.
One problem I have run into myself: when the __unicode__ method of a model uses a foreign key, that leads to one database hit per instance. I know of two ways to overcome this problem:
use select_related, which usually is your best bet;
make your __unicode__ return a static string and override the save method to update this string accordingly (see the sketch below).
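The second option could look roughly like this (a sketch with a made-up cached_label field; adapt the field names to your model):

class TeamFixture(Fixture):
    team1 = models.ForeignKey(Team, related_name="Team 1")
    team2 = models.ForeignKey(Team, related_name="Team 2")
    # denormalised label so __unicode__ never follows a foreign key
    cached_label = models.CharField(max_length=255, editable=False, default="")

    def save(self, *args, **kwargs):
        self.cached_label = u"%s vs %s" % (self.team1.name, self.team2.name)
        super(TeamFixture, self).save(*args, **kwargs)

    def __unicode__(self):
        return self.cached_label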
This is a very old problem with the Django admin and foreign keys. What happens is that whenever the admin loads an object's form, it fetches every object behind each foreign key. So let's say you are loading a fixture with some teams (say about 100 of them): it will pull in all 100 teams in one go. You can try to optimize this with raw_id_fields. Instead of fetching everything at once, this limits the number of calls and only queries when an event is triggered (i.e. when you are selecting a team).
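In its simplest form that is just an option on the ModelAdmin (a sketch based on the admin from the question; note it changes the edit-form widgets, not the change list itself):

class TeamFixture_ModelAdmin(admin.ModelAdmin):
    list_display = ('date', 'sport', 'venue', 'team1', 'team2',)
    # plain id boxes with a lookup popup instead of select boxes
    # that load every Team
    raw_id_fields = ('team1', 'team2')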
If that seems a bit like a UI mess you can try using this class:
"""
For Raw_id_field to optimize django performance for many to many fields
"""
class RawIdWidget(ManyToManyRawIdWidget):
def label_for_value(self, value):
values = value.split(',')
str_values = []
key = self.rel.get_related_field().name
for v in values:
try:
obj = self.rel.to._default_manager.using(self.db).get(**{key: v})
x = smart_unicode(obj)
change_url = reverse(
"admin:%s_%s_change" % (obj._meta.app_label, obj._meta.object_name.lower()),
args=(obj.pk,)
)
str_values += ['<strong>%s</strong>' % (change_url, escape(x))]
except self.rel.to.DoesNotExist:
str_values += [u'No input or index in the db']
return u', '.join(str_values)
class ImproveRawId(admin.ModelAdmin):
raw_id_fields = ('created_by', 'updated_by')
def formfield_for_dbfield(self, db_field, **kwargs):
if db_field.name in self.raw_id_fields:
kwargs.pop("request", None)
type = db_field.rel.__class__.__name__
kwargs['widget'] = RawIdWidget(db_field.rel, site)
return db_field.formfield(**kwargs)
return super(ImproveRawId, self).formfield_for_dbfield(db_field, **kwargs)
Just make sure that you inherit the class properly, something like TeamFixture_ModelAdmin(ImproveRawId). This will most likely give you a pretty nice performance boost in your Django admin.
I fixed my problem by setting list_select_related to the list of related model fields instead of just True
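For the admin above that would be something like the sketch below (passing a sequence of fields instead of True requires Django 1.6 or later):

class TeamFixture_ModelAdmin(admin.ModelAdmin):
    list_display = ('date', 'sport', 'venue', 'team1', 'team2',)
    # join these four tables in the change-list query instead of
    # issuing a separate query per row
    list_select_related = ('sport', 'venue', 'team1', 'team2')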