Recursive delete foreign keys for Django object - python

Suppose I have a model called Person with a foreign key to Clothes, which in turn has a foreign key to Material, and so on:
class Person(models.Model):
    clothes = models.ForeignKey('Clothes', on_delete=models.PROTECT)
    jokes = models.ManyToManyField(to='Jokes')
class Clothes(models.Model):
    fabric = models.ForeignKey('Material', on_delete=models.PROTECT)
class Material(models.Model):
    plant = models.ForeignKey('Plant', on_delete=models.PROTECT)
If I want to delete a Person, I also have to delete the Clothes, Jokes, and Material objects attached to it. Is there a way to recursively detect all the foreign keys so that I can delete them?

The django.db.models.deletion.Collector is suited for this task. It is what Django uses under the hood to cascade deletions.
You can use it this way:
from django.db.models.deletion import Collector
collector = Collector(using='default')  # You may specify another database
collector.collect([some_instance])
for model, instance in collector.instances_with_model():
    # Our instance has already been deleted, trying again would result in an error
    if instance == some_instance:
        continue
    instance.delete()
For more information about the Collector class, you can refer to this question:
How to show related items using DeleteView in Django?
As mentioned in the comments, using on_delete=models.CASCADE would be the best solution, but if you do not have control over that, this should work.
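To illustrate what "recursively detecting" linked objects means, here is a minimal pure-Python sketch of a depth-first traversal over a reference graph. It is not Django's Collector: the references dict and the object names are hypothetical stand-ins for the Person, Clothes, Material, and Plant rows in the question, and it assumes nothing else references those rows.

```python
def collect_for_deletion(root, references):
    """Return root and everything reachable from it, referrers first."""
    seen = set()
    postorder = []

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for target in references.get(node, ()):
            visit(target)
        postorder.append(node)

    visit(root)
    # Reverse post-order puts each object before the objects it points to,
    # so a PROTECT-ed target is only reached after its referrers are gone.
    return postorder[::-1]


# Hypothetical data: each key points at the objects it holds a reference to.
references = {
    "person": ["clothes", "joke"],
    "clothes": ["material"],
    "material": ["plant"],
}
order = collect_for_deletion("person", references)
print(order)  # ['person', 'joke', 'clothes', 'material', 'plant']
```

Deleting in this order removes the Person before the Clothes it points to, which is what a PROTECT relationship requires.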

Related

PROTECT vs RESTRICT for on_delete (Django)

I read the Django documentation about PROTECT and RESTRICT for use with "on_delete".
PROTECT
Prevent deletion of the referenced object by raising ProtectedError, a
subclass of django.db.IntegrityError.
Example:
class MyModel(models.Model):
    field = models.ForeignKey(YourModel, on_delete=models.PROTECT)
RESTRICT
Prevent deletion of the referenced object by raising RestrictedError
(a subclass of django.db.IntegrityError). Unlike PROTECT, deletion of
the referenced object is allowed if it also references a different
object that is being deleted in the same operation, but via a CASCADE
relationship.
Example:
class MyModel(models.Model):
    field = models.ForeignKey(YourModel, on_delete=models.RESTRICT)
To some degree I understand the difference between PROTECT and RESTRICT, but not exactly. What exactly is the difference between them, and when should I use each?
Based on the Django documentation, RESTRICT allows you to delete your referenced object in some special situations. For instance:
class Artist(models.Model):
    name = models.CharField(max_length=10)
class Album(models.Model):
    artist = models.ForeignKey(Artist, on_delete=models.CASCADE)
class Song(models.Model):
    artist = models.ForeignKey(Artist, on_delete=models.CASCADE)
    album = models.ForeignKey(Album, on_delete=models.RESTRICT)
As you can see, if you create an album instance and then a song instance with the same artist (so that you have a song and an album by the same artist), you can delete that artist without any problem, because the deletion also removes the related objects: both song and album reference the artist through CASCADE. But if you had defined PROTECT instead of RESTRICT, like:
class Song(models.Model):
    artist = models.ForeignKey(Artist, on_delete=models.CASCADE)
    album = models.ForeignKey(Album, on_delete=models.PROTECT)
you could not delete your artist instance, because the artist's album is referenced by this song through a PROTECT relationship. If you ask me, I would say RESTRICT is another version of PROTECT with fewer limitations on object deletion. If this explanation is not clear so far, I would recommend the example from the Django documentation itself:
Artist can be deleted even if that implies deleting an Album which is referenced by a Song, because Song also references Artist itself through a cascading relationship. For example:
artist_one = Artist.objects.create(name='artist one')
artist_two = Artist.objects.create(name='artist two')
album_one = Album.objects.create(artist=artist_one)
album_two = Album.objects.create(artist=artist_two)
song_one = Song.objects.create(artist=artist_one, album=album_one)
song_two = Song.objects.create(artist=artist_one, album=album_two)
album_one.delete()  # Raises RestrictedError.
artist_two.delete()  # Raises RestrictedError.
artist_one.delete()  # Succeeds and returns (4, {'Song': 2, 'Album': 1, 'Artist': 1})
Which on_delete option to use depends on your design and your constraints on deleting objects. If you simply want to protect an object from deletion regardless of what else is being deleted, PROTECT is the better choice: with RESTRICT, Django has to check whether every restricting object is also being removed through a cascade in the same operation, and that extra work may slow deletion down.
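The rules above can be sketched as a toy simulation in plain Python. This is not Django's implementation: plan_delete, the two exception classes, and the string object names are all hypothetical, and the relations list simply mirrors the artists, albums, and songs created in the documentation example.

```python
class ToyProtectedError(Exception):
    """Stand-in for Django's ProtectedError."""


class ToyRestrictedError(Exception):
    """Stand-in for Django's RestrictedError."""


def plan_delete(root, relations):
    """Return the set of objects removed when root is deleted, or raise."""
    to_delete = set()
    restricted = set()
    pending = [root]
    while pending:
        obj = pending.pop()
        if obj in to_delete:
            continue
        to_delete.add(obj)
        for child, parent, rule in relations:
            if parent != obj:
                continue
            if rule == "CASCADE":
                pending.append(child)           # child goes down with parent
            elif rule == "PROTECT":
                raise ToyProtectedError(child)  # blocked unconditionally
            elif rule == "RESTRICT":
                restricted.add(child)           # decided once collection ends
    # RESTRICT only blocks if the referencing child survives the operation.
    blocked = restricted - to_delete
    if blocked:
        raise ToyRestrictedError(sorted(blocked))
    return to_delete


def make_relations(album_rule):
    # (child, parent, on_delete rule of the child's foreign key)
    return [
        ("album_one", "artist_one", "CASCADE"),
        ("album_two", "artist_two", "CASCADE"),
        ("song_one", "artist_one", "CASCADE"),
        ("song_two", "artist_one", "CASCADE"),
        ("song_one", "album_one", album_rule),
        ("song_two", "album_two", album_rule),
    ]


restrict = make_relations("RESTRICT")
protect = make_relations("PROTECT")

# With RESTRICT, artist_one can go: both songs are cascaded in the same run.
print(sorted(plan_delete("artist_one", restrict)))

for root, relations in [("album_one", restrict),
                        ("artist_two", restrict),
                        ("artist_one", protect)]:
    try:
        plan_delete(root, relations)
    except (ToyProtectedError, ToyRestrictedError) as exc:
        print(root, "blocked by", type(exc).__name__)
```

In this sketch the only difference between the two rules is when the check happens: PROTECT fails as soon as a referencing child is found, while RESTRICT waits until the full set of objects to delete is known.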
Depending on the requirements of real-world applications, we use both, for different purposes.
PROTECT never deletes and always raises an error, whereas RESTRICT (introduced in Django 3.1) allows the deletion in some cases, but not all.
PROTECT example:
According to how to prevent deletion,
class Employee(models.Model):
    # max_length is required by CharField; 100 is an arbitrary value
    name = models.CharField(max_length=100, unique=True)
class Project(models.Model):
    name = models.CharField(max_length=100, unique=True)
    employees = models.ForeignKey(Employee, on_delete=models.PROTECT)
PROTECT explanation: Think of it from a real-world perspective. There are many Employees, and an Employee can have multiple Projects. If we deleted an Employee who has Projects associated with them, those Project objects would be left behind, which is wrong. An Employee who has done any Projects must not be deleted, and that is why we use PROTECT: it prevents the deletion of any Employee object that has one or more Project objects associated with it.
You need to understand CASCADE first before understanding RESTRICT:
CASCADE example:
class Artist(models.Model):
    name = models.CharField(max_length=10)
class Album(models.Model):
    artist = models.ForeignKey(Artist, on_delete=models.CASCADE)
CASCADE explanation: Think of it from a real-world perspective. There are many Artists, and an Artist can have multiple Albums. If we want to delete an Artist together with his or her Albums, we use CASCADE. Remember, CASCADE deletes. It always deletes.
RESTRICT example:
class Artist(models.Model):
    name = models.CharField(max_length=10)
class Album(models.Model):
    artist = models.ForeignKey(Artist, on_delete=models.CASCADE)
class Song(models.Model):
    artist = models.ForeignKey(Artist, on_delete=models.CASCADE)
    album = models.ForeignKey(Album, on_delete=models.RESTRICT)
RESTRICT explanation: Now think once again from a real-world perspective. An Artist has zero or more Albums, and an Album has zero or more Songs. There is no problem deleting an Artist who has no Albums or an Album that has no Songs, because in that case there is no relationship to worry about.
The deletion problem arises when an Artist has multiple Albums and an Album has multiple Songs. Here's how:
In the simple case, RESTRICT and PROTECT work the same way.
But PROTECT involves two levels: Parent and Child. If the Child (Album) must not be deleted, the Parent (Artist) must not be deleted either. In other words, we use PROTECT when we do not want the Child (Album) to disappear because its Parent (Artist) was deleted. PROTECT blocks the deletion outright.
RESTRICT involves three levels: Parent, Child, and Grandchild. RESTRICT (a limiting condition or measure) blocks the deletion of objects only up to a certain limit.
To see why we use RESTRICT, you need a real-world scenario.
Let's say there are multiple Artists, each Artist has multiple Albums, and each Album has multiple Songs. See the code below:
>>> artist_one = Artist.objects.create(name='artist one')
>>> artist_two = Artist.objects.create(name='artist two')
>>> album_one = Album.objects.create(artist=artist_one)
>>> album_two = Album.objects.create(artist=artist_two)
>>> song_one = Song.objects.create(artist=artist_one, album=album_one)
>>> song_two = Song.objects.create(artist=artist_one, album=album_two)
>>> album_one.delete()
# Raises RestrictedError.
>>> artist_two.delete()
# Raises RestrictedError.
>>> artist_one.delete()
(4, {'Song': 2, 'Album': 1, 'Artist': 1})
Note that, in the code above,
song_one and song_two belong to the same Artist but to different Albums, and those Albums belong to different Artists.
One Song can be sung, written, or shared by one or more Artists.
One Song can be in many Albums sung or written by one or more Artists.
One Album can contain many Songs sung or written by different Artists.
How RESTRICT works:
In the real world, if we have to delete an Artist, all of his Albums and the Songs in those Albums should be deleted too, but only when none of the Songs in his Albums share a relationship with other Artists. In other words, when all the Songs reference the same Artist, the deletion of the Artist, Albums, and Songs goes ahead.
Note that we can't delete artist_two, because song_two, which belongs to artist_one, is in his album_two.
In simple words: if the artist of a Song and the artist of that Song's Album are the same, RESTRICT allows the deletion.
Taking @Roham's example:
class Artist(models.Model):
    name = models.CharField(max_length=10)
class Album(models.Model):
    artist = models.ForeignKey(Artist, on_delete=models.CASCADE)
class Song(models.Model):
    artist = models.ForeignKey(Artist, on_delete=models.CASCADE)
    album = models.ForeignKey(Album, on_delete=models.RESTRICT)
Here both RESTRICT and PROTECT are supposed to stop the deletion of an Album instance that is referenced by a Song instance. But in one special case only RESTRICT allows the Album instance to be deleted: when the Artist instance is being deleted at the same time (the artist reference must be the same for both the album and the song). If you use PROTECT, it blocks the deletion regardless. I hope this simple explanation helps you.
The short answer in simple words is:
CASCADE: deleting the parent also deletes the child.
SET_NULL: the parent can be deleted and the child is kept, with its reference set to NULL.
PROTECT: the parent can never be deleted while a child references it.
RESTRICT: the parent can be deleted only if every child that references it is also being deleted in the same operation through a CASCADE relationship (which makes sure no other instances are involved).

Django many to many relation, include all IDs in queryset in both directions

I have 2 models connected via M2M relation
class Paper(models.Model):
    title = models.CharField(max_length=70)
    authors = models.ManyToManyField(B, related_name='papers')
class Author(models.Model):
    name = models.CharField(max_length=70)
Is there a way to include authors as all related authors' IDs (and maybe name somehow)?
Is there a way to include papers IDs as reverse relation (and maybe title as well)?
Author.objects.all().annotate(related_papers=F('papers'))
This only adds the ID of one paper, the first one it finds, I think.
Furthermore, changing related_papers to papers gives an error:
ValueError: The annotation 'papers' conflicts with a field on the model.
From what I understand in your comments, you're using DRF. I will give you 2 answers.
1) If you're talking about model serializer, you can use PrimaryKeyRelatedField :
from rest_framework import serializers
class AuthorSerializer(serializers.ModelSerializer):
    papers = serializers.PrimaryKeyRelatedField(many=True, read_only=True)
    class Meta:
        model = Author
        fields = ['name', 'papers']
class PaperSerializer(serializers.ModelSerializer):
    class Meta:
        model = Paper
        fields = '__all__'
This will return the IDs for the other side of the relationship whether you're on Paper or Author side. That will return the primary keys, not a representation of the object itself.
2) Now you're also talking about performance (e.g. database hit at each iteration).
Django (not DRF-specific) has a queryset method to handle preloading related objects. It's called prefetch_related.
For example, if you know you are going to need the attributes of the related objects and want to avoid re-querying the database, do the following:
Author.objects.all().prefetch_related('papers')
# papers will be already loaded, thus won't need another database hit if you iterate over them.
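Conceptually, "all related IDs in both directions" is just a grouping of the many-to-many link rows. Here is a minimal pure-Python sketch of that grouping, assuming the links are available as (paper_id, author_id) pairs; the function name and the sample IDs are made up for illustration. In a Django project the pairs might come from the relation's through table, and prefetch_related or the serializer field shown earlier would normally do this work for you.

```python
from collections import defaultdict


def group_links(links):
    """Group (paper_id, author_id) pairs in both directions."""
    authors_by_paper = defaultdict(list)
    papers_by_author = defaultdict(list)
    for paper_id, author_id in links:
        authors_by_paper[paper_id].append(author_id)
        papers_by_author[author_id].append(paper_id)
    return dict(authors_by_paper), dict(papers_by_author)


# Hypothetical link rows: paper 1 has two authors, author 10 has two papers.
links = [(1, 10), (1, 11), (2, 10)]
authors_by_paper, papers_by_author = group_links(links)
print(authors_by_paper)  # {1: [10, 11], 2: [10]}
print(papers_by_author)  # {10: [1, 2], 11: [1]}
```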
Actually, it has already been implemented for you. You should include a many-to-many relationship to Author in your Paper model like this:
class Paper(models.Model):
    title = models.CharField(max_length=70)
    authors = models.ManyToManyField(Author, related_name='papers')
That lets you add Author objects to the related set using p.authors.add(u), assuming that p is an instance of the Paper model and u is an instance of the Author model.
You can access all related authors of a Paper instance using p.authors.all().
You can access all related papers of an Author instance using u.papers.all().
Both return a QuerySet that you can operate on.
See this documentation page to learn more.

Django, update the object after a prefetch_related

I have the following models:
class Publisher(models.Model):
    name = models.CharField(max_length=30)
class Book(models.Model):
    title = models.CharField(max_length=100)
    # on_delete is required since Django 2.0; CASCADE was the old default
    publisher = models.ForeignKey(Publisher, on_delete=models.CASCADE)
In my views.py, When I want to show the publisher page, I also want to show their books, so I usually do something like this:
publisher = Publisher.objects.prefetch_related('book_set').filter(pk=id).first()
Then, after some processing I also do some work with the books
for book in publisher.book_set.all():
    foo()
This works great, but I have one problem. If there is a book added between the query and the for loop, the publisher.book_set.all() won't have the newly added books because it was prefetched.
Is there a way to update the publisher object?
You can delete the entire prefetch cache on the instance:
if hasattr(publisher, '_prefetched_objects_cache'):
    del publisher._prefetched_objects_cache
If you only want to delete a particular prefetched relation:
if hasattr(publisher, '_prefetched_objects_cache'):
    publisher._prefetched_objects_cache.pop('book_set', None)
There's a nicer way to clear prefetched attributes, using only public Django APIs, which is the refresh_from_db() method:
# Reloads all fields from the object as well as clearing prefetched attributes
publisher.refresh_from_db()
# Clears one prefetched attribute, will be re-fetched on next access
publisher.refresh_from_db(fields=['book_set'])
for book in publisher.book_set.all():
    foo()  # same processing as in the question
It is also possible to drop all prefetch_related lookups from a queryset. From the Django docs:
To clear any prefetch_related behavior, pass None as a parameter:
non_prefetched = qs.prefetch_related(None)

How to find all Django foreign key references to an instance

How do you find all direct foreign key references to a specific Django model instance?
I want to delete a record, but I want to maintain all child records that refer to it, so I'm trying to "swap out" the reference to the old record with a different one before I delete it.
This similar question references the Collector class. I tried:
from django.db.models.deletion import Collector
obj_to_delete = MyModel.objects.get(id=blah)
new_obj = MyModel.objects.get(id=blah2)
collector = Collector(using='default')
collector.collect([obj_to_delete])
for other_model, other_data in collector.field_updates.iteritems():
    for (other_field, _value), other_instances in other_data.iteritems():
        # Why is this necessary?
        if other_field.rel.to is not type(obj_to_delete):
            continue
        for other_instance in other_instances:
            setattr(other_instance, other_field.name, new_obj)
            other_instance.save()
# All FK references should be gone, so this should be safe to delete.
obj_to_delete.delete()
However, this seems to have two problems:
1) Sometimes collector.field_updates contains references to models and fields that have nothing to do with my target obj_to_delete.
2) My final obj_to_delete.delete() call fails with IntegrityErrors complaining about remaining records that still refer to it, records that weren't caught by the collector.
What am I doing wrong?
I just need a way to lookup all FK references to a single model instance. I don't need any kind of fancy dependency lookup like what's used in Django's standard deletion view.
You can use Django's reverse foreign key support.
Say you have two models, like so:
class Foo(models.Model):
    name = models.CharField(max_length=10)
class Bar(models.Model):
    descr = models.CharField(max_length=100)
    # on_delete is required since Django 2.0; CASCADE was the old default
    foo = models.ForeignKey(Foo, on_delete=models.CASCADE)
Then you know you can do bar_instance.foo to access the Foo object it points to. But you can also use the reverse foreign key on a Foo instance to get all the Bar objects that point to it, e.g. foo.bar_set.all().
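To apply the reverse-accessor idea to every model at once, one option is to walk the reverse relations that Django records on the model's _meta. The following is only a sketch under assumptions: find_fk_referrers and repoint_references are hypothetical helper names, and the code relies on _meta.related_objects and the relation attributes get_accessor_name(), one_to_one, many_to_many, and field.name behaving as they do in recent Django versions, so check them against the version you run.

```python
def find_fk_referrers(instance):
    """Yield (referring_object, fk_field_name) for each direct reference."""
    for rel in instance._meta.related_objects:
        if rel.many_to_many:
            continue  # M2M links live in a join table, not on the referrer
        accessor = rel.get_accessor_name()
        if rel.one_to_one:
            # A missing reverse one-to-one row raises an AttributeError
            # subclass, so getattr's default covers that case.
            referrer = getattr(instance, accessor, None)
            referrers = [] if referrer is None else [referrer]
        else:
            referrers = getattr(instance, accessor).all()
        for referrer in referrers:
            yield referrer, rel.field.name


def repoint_references(old_obj, new_obj):
    """Make every direct FK reference to old_obj point at new_obj."""
    # Materialise first so saving does not interfere with iteration.
    for referrer, field_name in list(find_fk_referrers(old_obj)):
        setattr(referrer, field_name, new_obj)
        referrer.save(update_fields=[field_name])
```

With the variables from the question, the intended use would be repoint_references(obj_to_delete, new_obj) followed by obj_to_delete.delete(). Relations declared with a related_name ending in '+' have no reverse accessor and are not covered by this sketch.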
Personally, I think the best option is to avoid the cascaded deletion.
Declaring the foreign keys in the related models with the proper Django option, e.g.
on_delete=models.SET_NULL
should suffice.
Borrowing the sample models from @Joseph's answer:
class Foo(models.Model):
    name = models.CharField(max_length=10)
class Bar(models.Model):
    descr = models.CharField(max_length=100)
    foo = models.ForeignKey(Foo, blank=True, null=True, on_delete=models.SET_NULL)
As described in the official Django docs, here are the predefined behaviours you can use and experiment with:
SET_NULL: Set the ForeignKey null; this is only possible if null is True.
SET_DEFAULT: Set the ForeignKey to its default value; a default for the ForeignKey must be set.
SET(): Set the ForeignKey to the value passed to SET(), or if a callable is passed in, the result of calling it. In most cases, passing a callable will be necessary to avoid executing queries at the time your models.py is imported:
from django.conf import settings
from django.contrib.auth import get_user_model
from django.db import models
def get_sentinel_user():
    return get_user_model().objects.get_or_create(username='deleted')[0]
class MyModel(models.Model):
    user = models.ForeignKey(settings.AUTH_USER_MODEL,
                             on_delete=models.SET(get_sentinel_user))
DO_NOTHING: Take no action. If your database backend enforces referential integrity, this will cause an IntegrityError unless you manually add an SQL ON DELETE constraint to the database field.

Django: find all reverse references by foreign keys

Well, now I'm using Django 1.6+
And I have a model:
class FileReference(models.Model):
    # some data fields
    # ...
    pass
class Person(models.Model):
    avatar = models.ForeignKey(FileReference, related_name='people_with_avatar')
class House(models.Model):
    images = models.ManyToManyField(FileReference, related_name='houses_with_images')
class Document(models.Model):
    attachment = models.OneToOneField(FileReference, related_name='document_with_attachment')
So many other models will have a foreign key referring to the FileReference model.
But sometimes a referring object is deleted and its FileReference object is left behind.
I want to delete the FileReference objects that no foreign key references any more.
But the foreign keys are spread over many places.
Is there an efficient way to find all the references, i.e. to get the reference count of a given model object?
I stumbled upon this question and I have a solution for you. Note that django==1.6 is no longer supported, so this solution is only expected to work on django>=1.9.
Let's say we are talking about 2 of the models for now:
class FileReference(models.Model):
    pass
class Person(models.Model):
    avatar = models.ForeignKey(FileReference, related_name='people_with_avatar', on_delete=models.CASCADE)
As you can see in the ForeignKey.on_delete documentation, when you delete a FileReference object, the Person objects that reference it are deleted as well.
Now for your question: how do we do the reverse? We want the FileReference object to be removed when a Person is deleted.
We will do that using post_delete signal:
from django.db.models.signals import post_delete
def delete_reverse(sender, **kwargs):
    try:
        if kwargs['instance'].avatar:
            kwargs['instance'].avatar.delete()
    except Exception:
        # Ignore errors, e.g. when the FileReference is already gone
        pass
post_delete.connect(delete_reverse, sender=Person)
What we did there was delete the FileReference held in the avatar field when a Person is deleted. Notice that the try/except block is there to prevent looping exceptions.
Extra:
The above solution will work on all future objects. If you want to remove all of the past objects without a reference do the following:
In your package add the following file and directories: management/commands/remove_unused_file_reference.py
from django.core.management.base import BaseCommand
# Relative import: this file lives in <app>/management/commands/
from ...models import FileReference, Person
class Command(BaseCommand):
    def handle(self, *args, **options):
        file_references = FileReference.objects.all()
        file_reference_mapping = {file_reference.id: file_reference for file_reference in file_references}
        persons = Person.objects.all()
        # avatar_id reads the stored key without fetching each FileReference
        person_avatar_mapping = {person.avatar_id: person for person in persons}
        for file_reference_id, file_reference in file_reference_mapping.items():
            if file_reference_id not in person_avatar_mapping:
                file_reference.delete()
When you are done, call: python manage.py remove_unused_file_reference
This is the basic idea; you can change it to a bulk delete, etc.
I hope this helps someone out there. Good luck!
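The command above only checks Person, while the question has three referencing models. The underlying idea generalises to a set difference: collect the referenced IDs from every referencing model and subtract them from all FileReference IDs. A minimal pure-Python sketch, with a hypothetical function name and made-up ID lists (in Django they could come from values_list(..., flat=True) on each referencing field):

```python
def find_unreferenced_ids(all_ids, *referenced_id_groups):
    """Return the IDs in all_ids that appear in none of the groups."""
    referenced = set()
    for group in referenced_id_groups:
        # Skip None, which a nullable foreign key would contribute.
        referenced.update(i for i in group if i is not None)
    return set(all_ids) - referenced


# Hypothetical data for the three referencing models in the question.
file_ids = [1, 2, 3, 4, 5]
person_avatar_ids = [1, None]
house_image_ids = [2, 2]
document_attachment_ids = [4]

unused = find_unreferenced_ids(
    file_ids, person_avatar_ids, house_image_ids, document_attachment_ids
)
print(sorted(unused))  # [3, 5]
```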
