Django: Transaction and select_for_update()

Django: Transaction and select_for_update() - python

In my Django application I have the following two models:
class Event(models.Model):
capacity = models.PositiveSmallIntegerField()
def get_number_of_registered_tickets():
return EventRegistration.objects.filter(event__exact=self).aggregate(total=Coalesce(Sum('number_tickets'), 0))['total']
class EventRegistration(models.Model):
time = models.DateTimeField(auto_now_add=True)
event = models.ForeignKey(Event, on_delete=models.CASCADE)
number_tickets = models.PositiveSmallIntegerField(validators=[MinValueValidator(1)])
The method get_number_of_registered_tickets() do I need at several places in my application (e.g. template rendering). So I thought it makes sense to put it into the model also because it's related to it and I often heard it's good to have "fat models and lightweight views".
My problem now:
In order to prevent that two people want to register for the event in parallel, I have to use locking. Example: Let's say there's one ticket left to register for. Now, to people are on my website and click "Register" simultaneously. Under unforunate circumstances, it could happen that both requests are valid and now I have more registrations than capacity.
I'm relatively new to Django, but reading through the docs, I thought that select_for_update() should be the solution, am I right here (I use PostgreSQL, so that should be supported)?
However, the docs also say that using select_for_update() is only valid within a transcation.
Evaluating a queryset with select_for_update() in autocommit mode on
backends which support SELECT ... FOR UPDATE is a
TransactionManagementError error because the rows are not locked in
that case. If allowed, this would facilitate data corruption and could
easily be caused by calling code that expects to be run in a
transaction outside of one.
My idea was now to change my model method get_number_of_registered_tickets() and add select_for_update():
def get_number_of_registered_tickets():
return EventRegistration.objects.select_for_update().filter(event__exact=self).aggregate(total=Coalesce(Sum('number_tickets'), 0))['total']
Different questions now:
Is using select_for_update() the right solution to my problem?
Does it mean that I cannot use the method get_number_of_registered_tickets() in different views/templates now, given that it seems to only work within a transaction? Do I have to violate DRY here and copy and paste the query with select_for_update() to another place in my code?
I tested it locally and Django does not raise the TransactionManagementError while being in autocommit mode (not using any transactions). What could be the reason or do I misunderstand something?

Doing select_for_update() on an EventRegistration queryset isn't the way to go. That locks the specified rows, but presumably the conflict you're trying to prevent involves creating new EventRegistrations. Your lock won't prevent that.
Instead you can acquire a lock on the Event. Something like:
class Event(models.Model):
...
#transaction.atomic
def reserve_tickets(self, number_tickets):
list(Event.objects.filter(id=self.id).select_for_update()) # force evaluation
if self.get_number_of_registered_tickets() + number_tickets <= self.capacity:
# create EventRegistration
else:
# handle error
Note that this uses the transaction.atomic decorator to make sure you are running inside a transaction.

Note that in multiple database environment you must have atomic and select_for_update on the same db, otherwise it wont work
with transaction.atomic(using='dbwrite'):
Model.objects.using('dbwrite').select_for_update()....

select_for_update is a database function option implemented using Django. Whenever you are writing an operation to do some update, database (in your case POSTGRES) takes care of the reliable transactions that adhere to these ACID properties.
To me your approach seems correct. And the last question answer would be to test this using a time.sleep delay.
You can do a select operation then in the next line put a time.sleep(10) while this occurs hit the api to make another transaction. You will be able to find the
TransactionManagementError

Related

How and where should I create this transactional method?

I'm using Python 3.7. In a service class, I have these statements ...
article.first_appeared_date = datetime.now(timezone.utc)
article.save()
ArticleStat.objects.save_main_article(article)
The first pair of statements updates an attribute for a single object and the second statement creates a bunch of separate objects, using the first object. What i would like is for the whole thing to be executed as a transaction, whereby everything succeeds or no changes to the database occur if something fails. I'm unclear as to best practices in Python. Where would a method like this go? Does putting it in a manager class make it transactional?

from django.db import transaction
with transaction.atomic():
article.first_appeared_date = datetime.now(timezone.utc)
article.save()
ArticleStat.objects.save_main_article(article)
You could implement a method for this in some of your models or even in a view which you call from your service. Most of the time, you want to put code like this in some view (or function that is called in a view).
However, if what you want is execute ArticleStat.objects.save_main_article(article) every time you save an article you should look at Django signals, specifically post_save signal.
Take a look to the docs for transactions here: https://docs.djangoproject.com/en/2.1/topics/db/transactions/

Django Queryset Thread Safe Read

I've been reading up a bit on concurrency with database transactions in Django.
My question is if you are simply getting querysets to use in a Django view for instance and just essentially perform read operations without any intention of making changes to the database, should the lines of code getting the querysets still be in a transaction, knowing the models could be updated in another thread?
If I seem to not be understanding something, feel free to let me know.
For example..
def some_function():
ricky_obj = Model.objects.filter(name='Ricky')
# maybe another thread deletes an object with the name bob at this very time.
bob_obj = Model.objects.filter(name='Bob')
do_some_stuff_here()
return

UPDATE doesnt use Model save method

If I do the folllowing:
obj = Model.objects.get(pk=2)
object.field = 'new value'
object.save()
It runs the custom save method that I have written in django.
However, if I do a normal update statement:
Model.objects.filter(pk=2).update(field='new value')
It does not use the custom save method. My question here is two-fold:
1) Why was that decision made in django -- why doesn't every 'save' implement the custom save method.
2) Is there a codebase-wide way to make sure that no update statements are ever made? Basically, I just want to ensure that the custom save method is always run whenever doing a save within the django ORM. How would this be possible?

I'm not a Django developer, but I dabble from time to time and no one else has answered yet.
Why was that decision made in django -- why doesn't every 'save' implement the custom save method.
I'm going to guess here that this is done as a speed optimization for the common case of just performing a bulk update. update works on the SQL level so it is potentially much faster than calling save on lots of objects, each one being its own database transaction.
Is there a codebase-wide way to make sure that no update statements are ever made? Basically, I just want to ensure that the custom save method is always run whenever doing a save within the django ORM. How would this be possible?
You can use a custom manager with a custom QuerySet that raises some Exception whenever update is called. Of course, you can always loop over the Queryset and call save on each object if you need the custom behavior.
Forbidding Update on a Model
from django.db import models
class NoUpdateQuerySet(models.QuerySet):
"""Don't let people call update! Muahaha"""
def update(self, **kwargs):
# you should raise a more specific Exception.
raise Exception('You shall not update; use save instead.')
class Person(models.Model):
first_name = models.CharField(max_length=50)
last_name = models.CharField(max_length=50)
# setting the custom manager keeps people from calling update.
objects = NoUpdateQuerySet.as_manager()
You would just need to set the NoUpdateQuerySet as a manager for each model you don't want to update. I don't really think it's necessary to set a custom QuerySet though; if it were me I would just not call update, and loop through the objects that need to be saved whenever I need to. You may find a time when you want to call update, and this would end up being very annoying.
Forbidding Update Project-Wide
If you really really decide you hate update, you can just monkey-patch the update method. Then you can be completely certain it's not being called. You can monkey-patch it in your project's settings.py, since you know that module will be imported:
def no_update(self, **kwargs):
# probably want a more specific Exception
raise Exception('NO UPDATING HERE')
from django.db.models.query import QuerySet
QuerySet.update = no_update
Note that the traceback will actually be pretty confusing, since it will point to a function in settings.py. I'm not sure how much, if ever, update is used by other apps; this could have unintended consequences.

How to filter autocompletion results in django grappelli?

We have a soft delete scheme where we just mark things as deleted and then filter the deleted ones out in various places. I'm trying to figure out how to filter the deleted ones out of the grapelli autocomplete suggestions.

In the end I went with this:
from grappelli.views.related import AutocompleteLookup
class YPAutocompleteLookup(AutocompleteLookup):
""" patch grappelli's autocomplete to let us control the queryset
by creating a autocomplete_queryset function on the model """
def get_queryset(self):
if hasattr(self.model, "autocomplete_queryset"):
qs = self.model.autocomplete_queryset()
else:
qs = self.model._default_manager.all()
qs = self.get_filtered_queryset(qs)
qs = self.get_searched_queryset(qs)
return qs.distinct()
It can be installed by overriding the relevant url:
url(r'^grappelli/lookup/autocomplete/$', YPAutocompleteLookup.as_view(), name="grp_autocomplete_lookup"),
Make sure this is ahead of Grappelli in your urls.

If your working with the Admin site, you should take advantage of the ModelAdmin.queryset function:
https://docs.djangoproject.com/en/1.5/ref/contrib/admin/#django.contrib.admin.ModelAdmin.queryset
As I found out, changing the default model manager to restrict the results is a bad idea, causing all kinds of nasty problems. For example: preventing syncdb, shell or shell_plus from running. Making it impossible to add the first record to a blank db. The exact errors depend upon what your restricting, but you are bound to get a few.
What is needed here is a way to tell Grappelli the name of the queryset manager to use. Passed in or a setting perhaps?
You can specify a simple (constant or related field) filter using ForeignKey.limit_choices_to. Grappelli grabs this value and sends it in the GET as param 'query_string'.
However, this might not be enough. I posted a request to the Grappelli repo I use to add a way to specify the record manger to use, or just automatically use the admin queryset (ModelAdmin.queryset).
My post is here:
https://github.com/sehmaschine/django-grappelli/issues/362

It looks like you can pass extra search params into the ajax autocompleter somehow. Likely a frontend hack needed.
https://github.com/sehmaschine/django-grappelli/blob/master/grappelli/views/related.py#L101
OR
You can make the default Manager for the models return an already filtered list, and have places that need to explicitly see deleted items remove that restriction.
This would likely make the default case much easier for you across the board.

Django: Get notified of all changes in database

I was wandering if there's a way to get notified of any changes to objects in a Django database. Right now I just need an email if anybody adds or changes anything but it would be best if I could hook a function triggered by any change and could decide what to do.
Is there an easy way to do it in Django?

Two ideas come to mind:
Override the predefined model method for saving.
Use a signal like post_save.
Here is a good article that talks about the difference between the two things listed above and when to use them:
Django signals vs. custom save()-method
The article was written near the end of 2007, three days after the release of Django 0.96.1. However, I believe the advice the author gives still applies today.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Django: Transaction and select_for_update() - python

Note that in multiple database environment you must have atomic and select_for_update on the same db, otherwise it wont work with transaction.atomic(using='dbwrite'): Model.objects.using('dbwrite').select_for_update()....

Related

How and where should I create this transactional method?

Django Queryset Thread Safe Read

UPDATE doesnt use Model save method

How to filter autocompletion results in django grappelli?

Django: Get notified of all changes in database

Categories

Resources