Django cannot delete single object after rewriting model.Manager method

I am trying to override the get_by_natural_key method on a Django manager (models.Manager). After adding the abstract model (NexchangeModel), I can delete all objects through the queryset, but I cannot delete a single object.
This works:
SmsToken.objects.all().delete()
This does not:
SmsToken.objects.last().delete()
Code:
from django.db import models
from core.common.models import SoftDeletableModel, TimeStampedModel, UniqueFieldMixin
class NexchangeManager(models.Manager):
    def get_by_natural_key(self, param):
        qs = self.get_queryset()
        lookup = {qs.model.NATURAL_KEY: param}
        return self.get(**lookup)
class NexchangeModel(models.Model):
    class Meta:
        abstract = True
    objects = NexchangeManager()
class SmsToken(NexchangeModel, SoftDeletableModel, UniqueFieldMixin):
    sms_token = models.CharField(
        max_length=4, blank=True)
    user = models.ForeignKey(User, related_name='sms_token')
    send_count = models.IntegerField(default=0)

When you call SmsToken.objects.all().delete(), you are calling the queryset's delete method.
When you call SmsToken.objects.last().delete(), you are calling the instance's delete method.
Since Django 1.9, the queryset's delete method returns the number of items deleted.
Changed in Django 1.9:
The return value describing the number of objects deleted was added.
But on instance delete method Django already knows only one row will be deleted.
Also note that the queryset's delete method and the instance's delete method are different.
The delete()[on a queryset] method does a bulk delete and does not call any delete() methods on your models[instance method]. It does, however, emit the pre_delete and post_delete signals for all deleted objects (including cascaded deletions).
So you cannot rely on the return value of the method to check whether the delete worked. But in line with Python's philosophy, "it's easier to ask for forgiveness than permission", you can rely on exceptions to tell whether a delete worked the way it should. Django's ORM will raise proper exceptions and do proper rollbacks in case of any failure.
So you can do this:
try:
    instance.delete()  # or queryset.delete()
except Exception as e:
    # some code to notify failure, then re-raise to propagate it
    raise
# You can omit this try..except entirely if you simply want exceptions to propagate.
Note: I am catching the generic Exception because the only code in my try block is the delete call. If you add other code there, catch specific exceptions only.

I assume that SoftDeletableModel comes from the django-model-utils package? If so, the purpose of that model is to mark instances with an is_removed field rather than actually deleting them. So it's to be expected that calling delete() on a model instance—which is what you get from last()—wouldn't actually delete anything.
SoftDeletableModel provides an objects attribute with a manager that limits its results to non-removed objects, and overrides delete() to mark objects as removed instead of actually deleting them.
The problem is that you've defined your own manager as objects, so the SoftDeletableModel manager isn't being used. Your custom manager is actually bulk deleting objects from the database, contrary to the goal of doing a soft delete. The way to resolve this is to have your custom manager inherit from SoftDeletableManagerMixin:
class NexchangeManager(SoftDeletableManagerMixin, models.Manager):
    # your custom code
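To make the difference concrete, here is a minimal sketch in plain Python. It does not use Django or django-model-utils; Row, PlainQuerySet and SoftDeleteQuerySet are hypothetical stand-ins that only mimic the behaviour described above: an instance-level delete() that flags the row, a plain bulk delete that removes rows without calling the instance method, and a soft-delete-aware bulk delete.

```python
class Row:
    """Toy stand-in for a model instance with a soft-delete flag."""

    def __init__(self, pk):
        self.pk = pk
        self.is_removed = False

    def delete(self):
        # Instance-level delete: only flags the row, like a soft-delete model.
        self.is_removed = True


class PlainQuerySet:
    """Toy stand-in for the queryset of a plain manager."""

    def __init__(self, table):
        self.table = table

    def last(self):
        return self.table[-1] if self.table else None

    def delete(self):
        # Bulk delete: removes rows outright and never calls Row.delete().
        count = len(self.table)
        self.table.clear()
        return count


class SoftDeleteQuerySet(PlainQuerySet):
    """Toy stand-in for a queryset that soft-deletes in bulk."""

    def delete(self):
        # Flag every row that is not flagged yet; keep all rows stored.
        count = 0
        for row in self.table:
            if not row.is_removed:
                row.is_removed = True
                count += 1
        return count


table = [Row(1), Row(2), Row(3)]

PlainQuerySet(table).last().delete()        # instance delete: flag only
print(len(table), table[-1].is_removed)     # 3 True

print(SoftDeleteQuerySet(table).delete())   # 2 (rows 1 and 2 newly flagged)
print(len(table))                           # 3

print(PlainQuerySet(table).delete())        # 3 (rows really removed)
print(len(table))                           # 0
```

With the plain queryset the rows are gone after a bulk delete, which is the hard delete the question observes; the soft-delete queryset keeps them and only sets the flag.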

Related

Run code when "foreign" object is added to set

I have a foreign key relationship in my Django (v3) models:
class Example(models.Model):
    title = models.CharField(max_length=200)  # this is irrelevant for the question here
    not_before = models.DateTimeField(auto_now_add=True)
    ...
class ExampleItem(models.Model):
    myParent = models.ForeignKey(Example, on_delete=models.CASCADE)
    execution_date = models.DateTimeField(auto_now_add=True)
    ....
Can I have code running/triggered whenever an ExampleItem is "added to the list of items in an Example instance"? What I would like to do is run some checks and, depending on the concrete Example instance possibly alter the ExampleItem before saving it.
To illustrate
Let's say the Example's not_before date dictates that an ExampleItem's execution_date must not be earlier than not_before. I would like to check whether the to-be-saved ExampleItem's execution_date violates this condition. If so, I would want to either change the execution_date to make it valid or throw an exception (whichever is easier). The same goes for a duplicate execution_date (i.e. if the respective Example already has an ExampleItem with the same execution_date).
So, in a view, I have code like the following:
def doit(request, example_id):
    # get the relevant `Example` object
    example = get_object_or_404(Example, pk=example_id)
    # create a new `ExampleItem`
    itm = ExampleItem()
    # set the item's parent
    itm.myParent = example  # <- this should trigger my validation code!
    itm.save()  # <- (or this???)
The thing is, this view is not the only way to create new ExampleItems; I also have an API, for example, that can do the same (not to mention that a user could add ExampleItems manually via the REPL). Preferably, the validation code should not be duplicated in all the places where new ExampleItems can be created.
I was looking into signals (Django docs), specifically pre_save and post_save (of ExampleItem), but I think pre_save is too early while post_save is too late... m2m_changed also looks interesting, but I do not have a many-to-many relationship.
What would be the best/correct way to handle these requirements? They seem to be rather common, I imagine. Do I have to restructure my model?

The obvious solution here is to put this code in the ExampleItem.save() method - just beware that Model.save() is not invoked by some queryset bulk operations.
Using signal handlers on your own app's models is actually an antipattern - the goal of signals is to allow your app to hook into another app's lifecycle without having to change that other app's code.
Also (unrelated), you can populate your newly created model instances directly via their initializers, i.e.:
itm = ExampleItem(myParent=example)
itm.save()
and you can even save them directly:
# creates a new instance, populate it AND save it
itm = ExampleItem.objects.create(myParent=example)
This will still invoke your model's save method so it's safe for your use case.
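A minimal sketch of the save()-based approach, written with plain Python stand-ins rather than real Django models so that it runs on its own. The class names mirror the question; the clean() helper, the items list and the use of ValueError are assumptions made for illustration (in Django the check would live in ExampleItem.save(), and would more likely raise ValidationError).

```python
from datetime import datetime, timedelta


class Example:
    """Stand-in for the parent model."""

    def __init__(self, not_before):
        self.not_before = not_before
        self.items = []  # stands in for the reverse foreign-key set


class ExampleItem:
    """Stand-in for the child model; all validation sits behind save()."""

    def __init__(self, my_parent, execution_date):
        self.my_parent = my_parent
        self.execution_date = execution_date

    def clean(self):
        # Rule 1: the item must not run before the parent's not_before.
        if self.execution_date < self.my_parent.not_before:
            raise ValueError("execution_date is earlier than the parent's not_before")
        # Rule 2: no other item of the same parent may share the date.
        for other in self.my_parent.items:
            if other is not self and other.execution_date == self.execution_date:
                raise ValueError("duplicate execution_date")

    def save(self):
        # Every code path that saves an item goes through the same checks.
        self.clean()
        if self not in self.my_parent.items:
            self.my_parent.items.append(self)


start = datetime(2024, 1, 1)
parent = Example(not_before=start)
ExampleItem(parent, start + timedelta(days=1)).save()  # accepted

try:
    ExampleItem(parent, start - timedelta(days=1)).save()  # too early
except ValueError as exc:
    print("rejected:", exc)
```

Because the checks run inside save(), a view, an API endpoint and a REPL session all hit the same validation, with the caveat already mentioned that bulk queryset operations bypass save().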

modify `objects` to always return a subset of objects?

I have an Events table which contains various types of events. I care only about one of those types. As a result, every query I write begins with
Events.objects.filter(event_type="the_type").\
    etc(...).etc(...)
Obviously this is repetitive and easy to forget. Is there a way to use a custom Manager so that the objects attribute always returns a specific subset of the rows, without explicitly asking for it? Or any other way to restrict the model to specific subset of the rows??

Yes, we can make a manager like:
from django.db import models
class EventManager(models.Manager):
    def get_queryset(self):
        return super(EventManager, self).get_queryset().filter(event_type="the_type")
and then add the manager to the Event class:
class Event(models.Model):
    # ...
    objects = EventManager()
Note however that some parts of Django will not use .objects, but ._base_manager, and thus still return the entire set. Furthermore my own experience with overriding the .objects manager is that it turns out to cause a lot of harm, for example if you want to set an attribute of all events, then writing Event.objects.all().update(foo='bar') will only update the events with the_type as type, whereas the code suggests otherwise.
Personally I think it is better to construct a manager with a different name, that at least hints that something is filtered, for example:
class Event(models.Model):
    # ...
    all_events = models.Manager()
    type_events = EventManager()
Here Event.objects no longer exists; instead you write Event.all_events or Event.type_events, so the code clearly indicates which subset you are taking.
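The pitfall with a silently filtered manager can be illustrated without Django. The sketch below is only a toy: ToyManager is a hypothetical stand-in that optionally pre-filters rows by event_type, and its update() mimics a bulk update over whatever the manager exposes.

```python
class ToyManager:
    """Stand-in for a manager: exposes rows, optionally pre-filtered by event_type."""

    def __init__(self, rows, event_type=None):
        self._rows = rows
        self._event_type = event_type

    def all(self):
        if self._event_type is None:
            return list(self._rows)
        return [r for r in self._rows if r["event_type"] == self._event_type]

    def update(self, **changes):
        # Applies the changes only to the rows this manager can see.
        matched = self.all()
        for row in matched:
            row.update(changes)
        return len(matched)


rows = [{"event_type": "the_type"}, {"event_type": "other"}]
all_events = ToyManager(rows)
type_events = ToyManager(rows, event_type="the_type")

print(type_events.update(foo="bar"))  # 1
print(all_events.update(seen=True))   # 2
```

With explicit names, a reader of type_events.update(...) can tell that only one kind of event is touched; the same call spelled objects.update(...) would hide that.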

In django, how to delete all related objects when deleting a certain type of instances?

I first tried to override the delete() method but that doesn't work for QuerySet's bulk delete method. It should be related to pre_delete signal but I can't figure it out. My code is as following:
def _pre_delete_problem(sender, instance, **kwargs):
    instance.context.delete()
    instance.stat.delete()
But this handler seems to be called over and over, and the program ends up in an infinite loop.
Can someone please help me?

If the class has foreign keys (or related objects) they are deleted by default like a DELETE CASCADE in sql.
You can change the behavior using the on_delete argument when defining the ForeignKey in the class, but by default it is CASCADE.
You can check the docs here.
The pre_delete signal still fires, but a bulk delete does not call the model's delete() method, since it does not delete on an object-by-object basis.

In your case, using the post_delete signal instead of pre_delete should fix the infinite loop issue. Due to a ForeignKey's on_delete default value of cascade, using pre_delete logic this way will trigger the instance.context object to call delete on instance, which will then call instance.context, and so forth.
Using this approach:
from django.db.models.signals import post_delete
def _post_delete_problem(sender, instance, **kwargs):
    instance.context.delete()
    instance.stat.delete()
post_delete.connect(_post_delete_problem, sender=Foo)
This can do the cleanup you want.

If you'd like a quick one-off to delete an instance and all of its related objects and those related objects' objects and so on without having to change the DB schema, you can do this -
def recursive_delete(to_del):
    """Recursively delete an object, all of its protected related
    instances, those instances' protected instances, and so on.
    """
    from django.db.models import ProtectedError
    while True:
        try:
            to_del_pk = to_del.pk
            if to_del_pk is None:
                return  # unsaved object
            to_del.delete()
            print(f"Deleted {to_del.__class__.__name__} with pk {to_del_pk}: {to_del}")
        except ProtectedError as e:
            for protected_ob in e.protected_objects:
                recursive_delete(protected_ob)
Be careful, though!
I'd only use this to help with debugging in one-off scripts (or on the shell) with test databases that I don't mind wiping. Relationships aren't always obvious and if something is protected, it's probably for a reason.

Django: how to do get_or_create() in a threadsafe way?

In my Django app very often I need to do something similar to get_or_create(). E.g.,
User submits a tag. Need to see if that tag already is in the database. If not, create a new record for it. If it is, just update the existing record.
But looking into the doc for get_or_create() it looks like it's not threadsafe. Thread A checks and finds Record X does not exist. Then Thread B checks and finds that Record X does not exist. Now both Thread A and Thread B will create a new Record X.
This must be a very common situation. How do I handle it in a threadsafe way?

Since 2013 or so, get_or_create is atomic, so it handles concurrency nicely:
This method is atomic assuming correct usage, correct database
configuration, and correct behavior of the underlying database.
However, if uniqueness is not enforced at the database level for the
kwargs used in a get_or_create call (see unique or unique_together),
this method is prone to a race-condition which can result in multiple
rows with the same parameters being inserted simultaneously.
If you are using MySQL, be sure to use the READ COMMITTED isolation
level rather than REPEATABLE READ (the default), otherwise you may see
cases where get_or_create will raise an IntegrityError but the object
won’t appear in a subsequent get() call.
From: https://docs.djangoproject.com/en/dev/ref/models/querysets/#get-or-create
Here's an example of how you could do it:
Define a model with either unique=True:
class MyModel(models.Model):
    slug = models.SlugField(max_length=255, unique=True)
    name = models.CharField(max_length=255)
MyModel.objects.get_or_create(slug=<user_slug_here>, defaults={"name": <user_name_here>})
... or by using unique_together:
class MyModel(models.Model):
    prefix = models.CharField(max_length=3)
    slug = models.SlugField(max_length=255)
    name = models.CharField(max_length=255)
    class Meta:
        unique_together = ("prefix", "slug")
MyModel.objects.get_or_create(prefix=<user_prefix_here>, slug=<user_slug_here>, defaults={"name": <user_name_here>})
Note how the non-unique fields are in the defaults dict, NOT among the unique fields in get_or_create. This will ensure your creates are atomic.
Here's how it's implemented in Django: https://github.com/django/django/blob/fd60e6c8878986a102f0125d9cdf61c717605cf1/django/db/models/query.py#L466 - Try creating an object, catch an eventual IntegrityError, and return the copy in that case. In other words: handle atomicity in the database.

This must be a very common situation. How do I handle it in a threadsafe way?
Yes.
The "standard" solution in SQL is to simply attempt to create the record. If it works, that's good. Keep going.
If an attempt to create a record gets a "duplicate" exception from the RDBMS, then do a SELECT and keep going.
Django, however, has an ORM layer, with its own cache. So the logic is inverted to make the common case work directly and quickly and the uncommon case (the duplicate) raise a rare exception.
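The "attempt the insert, fall back to a SELECT on a duplicate" pattern described above can be sketched with Python's built-in sqlite3 module. This is an illustration under assumptions, not Django's implementation: the tag table and the get_or_create_tag helper are hypothetical names, and the approach depends on the UNIQUE constraint being enforced by the database.

```python
import sqlite3


def get_or_create_tag(conn, name):
    """Return (row_id, created) for the tag called `name`."""
    try:
        with conn:  # commits on success, rolls back if the insert fails
            cur = conn.execute("INSERT INTO tag (name) VALUES (?)", (name,))
        return cur.lastrowid, True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected the insert: the row already exists,
        # possibly created by a concurrent writer, so read it instead.
        row = conn.execute("SELECT id FROM tag WHERE name = ?", (name,)).fetchone()
        return row[0], False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)")

first = get_or_create_tag(conn, "django")
second = get_or_create_tag(conn, "django")
print(first, second)  # (1, True) (1, False)
```

The database, not the application, decides which writer wins, so two concurrent callers cannot both end up creating the row.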

Try the transaction.commit_on_success decorator on the callable in which you call get_or_create(**kwargs). (In Django 1.6 and later, transaction.atomic replaces commit_on_success.)
"Use the commit_on_success decorator to use a single transaction for all the work done in a function. If the function returns successfully, then Django will commit all work done within the function at that point. If the function raises an exception, though, Django will roll back the transaction."
Apart from that, in concurrent calls to get_or_create, both threads try to get the object with the arguments passed to it (except for the "defaults" argument, which is a dict used during the create call in case get() fails to retrieve any object). If the lookup fails, both threads try to create the object, resulting in duplicate objects unless a unique or unique_together constraint is enforced at the database level on the field(s) used in the get() call.
This is similar to this post:
How do I deal with this race condition in django?

So many years have passed, but nobody has written about threading.Lock. If you don't have the opportunity to make migrations for unique together, for legacy reasons, you can use locks or threading.Semaphore objects. Here is the pseudocode:
from concurrent.futures import ThreadPoolExecutor
from threading import Lock
_lock = Lock()
def get_staff(data: dict):
    _lock.acquire()
    try:
        staff, created = MyModel.objects.get_or_create(**data)
        return staff
    finally:
        _lock.release()
with ThreadPoolExecutor(max_workers=50) as pool:
    pool.map(get_staff, get_list_of_some_data())

Django Manager Chaining

I was wondering if it was possible (and, if so, how) to chain together multiple managers to produce a query set that is affected by both of the individual managers. I'll explain the specific example that I'm working on:
I have multiple abstract model classes that I use to provide small, specific functionality to other models. Two of these models are a DeleteMixin and a GlobalMixin.
The DeleteMixin is defined as such:
class DeleteMixin(models.Model):
    deleted = models.BooleanField(default=False)
    objects = DeleteManager()
    class Meta:
        abstract = True
    def delete(self):
        self.deleted = True
        self.save()
Basically it provides a pseudo-delete (the deleted flag) instead of actually deleting the object.
The GlobalMixin is defined as such:
class GlobalMixin(models.Model):
    is_global = models.BooleanField(default=True)
    objects = GlobalManager()
    class Meta:
        abstract = True
It allows any object to be defined as either a global object or a private object (such as a public/private blog post).
Both of these have their own managers that affect the queryset that is returned. My DeleteManager filters the queryset to only return results that have the deleted flag set to False, while the GlobalManager filters the queryset to only return results that are marked as global. Here is the declaration for both:
class DeleteManager(models.Manager):
    def get_query_set(self):
        return super(DeleteManager, self).get_query_set().filter(deleted=False)
class GlobalManager(models.Manager):
    def globals(self):
        return self.get_query_set().filter(is_global=1)
The desired functionality would be to have a model extend both of these abstract models and gain the ability to return only the results that are both non-deleted and global. I ran a test case on a model with 4 instances: one was global and non-deleted, one was global and deleted, one was non-global and non-deleted, and one was non-global and deleted.
If I call SomeModel.objects.all(), I get instances 1 and 3 (the two non-deleted ones - great!).
If I call SomeModel.objects.globals(), I get an error that DeleteManager doesn't have a globals attribute. This assumes my model is declared as SomeModel(DeleteMixin, GlobalMixin); if I reverse the order, I don't get the error, but the deleted instances are not filtered out.
If I change GlobalMixin to attach GlobalManager to globals instead of objects (so the new command would be SomeModel.globals.globals()), I get instances 1 and 2 (the two globals), while my intended result would be to get only instance 1 (the global, non-deleted one).
I wasn't sure if anyone had run into any situation similar to this and had come to a result. Either a way to make it work in my current thinking or a re-work that provides the functionality I'm after would be very much appreciated. I know this post has been a little long-winded. If any more explanation is needed, I would be glad to provide it.
Edit:
I have posted the eventual solution I used to this specific problem below. It is based on the link to Simon's custom QuerySetManager.

See this snippet on Djangosnippets: http://djangosnippets.org/snippets/734/
Instead of putting your custom methods in a manager, you subclass the queryset itself. It's very easy and works perfectly. The only issue I've had is with model inheritance, you always have to define the manager in model subclasses (just: "objects = QuerySetManager()" in the subclass), even though they will inherit the queryset. This will make more sense once you are using QuerySetManager.
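Why putting the methods on the queryset helps can be seen in a small Django-free sketch. ToyQuerySet and not_deleted() are hypothetical names; the only point being illustrated is that each method returns a new object of the same class, so the custom filters stay available after every step and can be chained in any order. The four rows mirror the test case from the question.

```python
class ToyQuerySet:
    """Minimal stand-in for a queryset: every filter returns a new ToyQuerySet."""

    def __init__(self, rows):
        self.rows = list(rows)

    def filter(self, **conditions):
        kept = [
            row for row in self.rows
            if all(row[key] == value for key, value in conditions.items())
        ]
        # Return the same class so custom methods survive chaining.
        return type(self)(kept)

    def not_deleted(self):
        return self.filter(deleted=False)

    def globals(self):
        return self.filter(is_global=True)


rows = [
    {"id": 1, "is_global": True, "deleted": False},
    {"id": 2, "is_global": True, "deleted": True},
    {"id": 3, "is_global": False, "deleted": False},
    {"id": 4, "is_global": False, "deleted": True},
]

result = ToyQuerySet(rows).not_deleted().globals()
print([row["id"] for row in result.rows])  # [1]
```

A method defined only on a manager is lost as soon as the manager hands back a plain queryset, which is why objects.all().globals() cannot work in the manager-only design.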

Here is the specific solution to my problem using the custom QuerySetManager by Simon that Scott linked to.
from django.db import models
from django.contrib import admin
from django.db.models.query import QuerySet
from django.core.exceptions import FieldError
class MixinManager(models.Manager):
    def get_query_set(self):
        try:
            return self.model.MixinQuerySet(self.model).filter(deleted=False)
        except FieldError:
            return self.model.MixinQuerySet(self.model)
class BaseMixin(models.Model):
    admin = models.Manager()
    objects = MixinManager()
    class MixinQuerySet(QuerySet):
        def globals(self):
            try:
                return self.filter(is_global=True)
            except FieldError:
                return self.all()
    class Meta:
        abstract = True
class DeleteMixin(BaseMixin):
    deleted = models.BooleanField(default=False)
    class Meta:
        abstract = True
    def delete(self):
        self.deleted = True
        self.save()
class GlobalMixin(BaseMixin):
    is_global = models.BooleanField(default=True)
    class Meta:
        abstract = True
Any mixin in the future that wants to add extra functionality to the query set simply needs to extend BaseMixin (or have it somewhere in its hierarchy). Any time I filter the query set down, I wrap the filter in a try/except in case that field doesn't actually exist (i.e., the model doesn't extend that mixin). The global filter is invoked using globals(), while the delete filter is automatically invoked (if something is deleted, I never want it to show). Using this system allows for the following types of commands:
TemporaryModel.objects.all() # If extending DeleteMixin, no deleted instances are returned
TemporaryModel.objects.all().globals() # Filter out the private instances (non-global)
TemporaryModel.objects.filter(...) # Ditto about excluding deleteds
One thing to note is that the delete filter won't affect admin interfaces, because the default Manager is declared first (making it the default). I don't remember when they changed the admin to use Model._default_manager instead of Model.objects, but any deleted instances will still appear in the admin (in case you need to un-delete them).

I spent a while trying to come up with a way to build a nice factory to do this, but I'm running into a lot of problems with that.
The best I can suggest to you is to chain your inheritance. It's not very generic, so I'm not sure how useful it is, but all you would have to do is:
class GlobalManager(DeleteManager):
    def globals(self):
        return self.get_query_set().filter(is_global=1)
class GlobalMixin(DeleteMixin):
    is_global = models.BooleanField(default=True)
    objects = GlobalManager()
    class Meta:
        abstract = True
If you want something more generic, the best I can come up with is to define a base Mixin and Manager that redefines get_query_set() (I'm assuming you only want to do this once; things get pretty complicated otherwise) and then pass a list of fields you'd want added via Mixins.
It would look something like this (not tested at all):
class DeleteMixin(models.Model):
    deleted = models.BooleanField(default=False)
    class Meta:
        abstract = True
def create_mixin(base_mixin, **kwargs):
    class wrapper(base_mixin):
        class Meta:
            abstract = True
    for k in kwargs.keys():
        setattr(wrapper, k, kwargs[k])
    return wrapper
class DeleteManager(models.Manager):
    def get_query_set(self):
        return super(DeleteManager, self).get_query_set().filter(deleted=False)
def create_manager(base_manager, **kwargs):
    class wrapper(base_manager):
        pass
    for k in kwargs.keys():
        setattr(wrapper, k, kwargs[k])
    return wrapper
Ok, so this is ugly, but what does it get you? Essentially, it's the same solution, but much more dynamic, and a little more DRY, though more complex to read.
First you create your manager dynamically:
def globals(inst):
    return inst.get_query_set().filter(is_global=1)
GlobalDeleteManager = create_manager(DeleteManager, globals=globals)
This creates a new manager which is a subclass of DeleteManager and has a method called globals.
Next, you create your mixin model:
GlobalDeleteMixin = create_mixin(DeleteMixin,
    is_global=models.BooleanField(default=False),
    objects=GlobalDeleteManager())
Like I said, it's ugly. But it means you don't have to redefine globals(). If you want a different type of manager to have globals(), you just call create_manager again with a different base. And you can add as many new methods as you like. Same for the manager, you just keep adding new functions that will return different querysets.
So, is this really practical? Maybe not. This answer is more an exercise in (ab)using Python's flexibility. I haven't tried using this, though I do use some of the underlying principles of dynamically extending classes to make things easier to access.
Let me know if anything is unclear and I'll update the answer.

Another option worth considering is the PassThroughManager:
https://django-model-utils.readthedocs.org/en/latest/managers.html#passthroughmanager

You should use QuerySet instead of Manager.
See Documentation here.
