Django Queryset Thread Safe Read - python

I've been reading up a bit on concurrency with database transactions in Django.
My question: if you are simply fetching querysets in a Django view and only performing read operations, with no intention of changing the database, should the lines that fetch the querysets still be inside a transaction, given that the models could be updated in another thread?
If I seem to be misunderstanding something, feel free to let me know.
For example:
def some_function():
    ricky_obj = Model.objects.filter(name='Ricky')
    # maybe another thread deletes an object with the name bob at this very time.
    bob_obj = Model.objects.filter(name='Bob')
    do_some_stuff_here()
    return

Related

How and where should I create this transactional method?

I'm using Python 3.7. In a service class, I have these statements ...
article.first_appeared_date = datetime.now(timezone.utc)
article.save()
ArticleStat.objects.save_main_article(article)
The first pair of statements updates an attribute of a single object, and the third statement creates a bunch of separate objects using the first object. What I would like is for the whole thing to be executed as a transaction, so that either everything succeeds or no changes are made to the database if something fails. I'm unclear as to best practices in Python. Where would a method like this go? Does putting it in a manager class make it transactional?
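from datetime import datetime, timezone
from django.db import transaction
with transaction.atomic():
    # everything inside this block is committed together or rolled back together
    article.first_appeared_date = datetime.now(timezone.utc)
    article.save()
    ArticleStat.objects.save_main_article(article)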
You could implement a method for this in one of your models, or even in a view that you call from your service. Most of the time, you want to put code like this in a view (or in a function that is called from a view).
However, if what you want is to execute ArticleStat.objects.save_main_article(article) every time you save an article, you should look at Django signals, specifically the post_save signal.
Take a look at the docs for transactions here: https://docs.djangoproject.com/en/2.1/topics/db/transactions/
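To get a feel for the all-or-nothing behaviour without a Django project, here is a rough analogy using only the standard library. It is not Django code: sqlite3's connection context manager stands in for transaction.atomic() (both commit on a clean exit and roll back when an exception escapes the block), and the table names, the publish() helper and the simulated failure are made up for illustration.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, first_appeared_date TEXT)")
conn.execute("CREATE TABLE article_stat (article_id INTEGER NOT NULL, hits INTEGER NOT NULL)")
conn.execute("INSERT INTO article (id) VALUES (1)")
conn.commit()


def publish(connection, article_id, fail=False):
    """Hypothetical helper: update the article and create its stat row together."""
    # Like transaction.atomic(), "with connection" commits if the block
    # finishes and rolls back if an exception propagates out of it.
    with connection:
        connection.execute(
            "UPDATE article SET first_appeared_date = ? WHERE id = ?",
            (datetime.now(timezone.utc).isoformat(), article_id),
        )
        if fail:
            # Simulate the second step failing.
            raise RuntimeError("could not save stats")
        connection.execute(
            "INSERT INTO article_stat (article_id, hits) VALUES (?, 0)",
            (article_id,),
        )


def snapshot(connection):
    """Return (first_appeared_date of article 1, number of stat rows)."""
    date = connection.execute(
        "SELECT first_appeared_date FROM article WHERE id = 1"
    ).fetchone()[0]
    stats = connection.execute("SELECT COUNT(*) FROM article_stat").fetchone()[0]
    return date, stats


try:
    publish(conn, 1, fail=True)
except RuntimeError:
    pass
after_failure = snapshot(conn)  # the UPDATE was rolled back along with the rest

publish(conn, 1)
after_success = snapshot(conn)  # both changes are committed together
```

After the failed call neither the date nor the stat row is present; after the successful call both are.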

Django: Transaction and select_for_update()

In my Django application I have the following two models:
class Event(models.Model):
    capacity = models.PositiveSmallIntegerField()
    def get_number_of_registered_tickets(self):
        return EventRegistration.objects.filter(event__exact=self).aggregate(total=Coalesce(Sum('number_tickets'), 0))['total']

class EventRegistration(models.Model):
    time = models.DateTimeField(auto_now_add=True)
    event = models.ForeignKey(Event, on_delete=models.CASCADE)
    number_tickets = models.PositiveSmallIntegerField(validators=[MinValueValidator(1)])
I need the method get_number_of_registered_tickets() in several places in my application (e.g. template rendering), so I thought it made sense to put it on the model, both because it relates to the model and because I have often heard that it's good to have "fat models and lightweight views".
My problem now:
To prevent two people from registering for the last tickets of an event at the same time, I have to use locking. Example: say there is one ticket left. Two people are on my website and click "Register" simultaneously. Under unfortunate circumstances, both requests could pass validation, and I would end up with more registrations than capacity.
I'm relatively new to Django, but from reading through the docs I thought that select_for_update() should be the solution. Am I right (I use PostgreSQL, so it should be supported)?
However, the docs also say that using select_for_update() is only valid within a transaction:
Evaluating a queryset with select_for_update() in autocommit mode on
backends which support SELECT ... FOR UPDATE is a
TransactionManagementError error because the rows are not locked in
that case. If allowed, this would facilitate data corruption and could
easily be caused by calling code that expects to be run in a
transaction outside of one.
My idea was now to change my model method get_number_of_registered_tickets() and add select_for_update():
def get_number_of_registered_tickets(self):
    return EventRegistration.objects.select_for_update().filter(event__exact=self).aggregate(total=Coalesce(Sum('number_tickets'), 0))['total']
Different questions now:
Is using select_for_update() the right solution to my problem?
Does it mean that I cannot use the method get_number_of_registered_tickets() in different views/templates now, given that it seems to only work within a transaction? Do I have to violate DRY here and copy and paste the query with select_for_update() to another place in my code?
I tested it locally and Django does not raise the TransactionManagementError while being in autocommit mode (not using any transactions). What could be the reason or do I misunderstand something?
Doing select_for_update() on an EventRegistration queryset isn't the way to go. That locks the specified rows, but presumably the conflict you're trying to prevent involves creating new EventRegistrations. Your lock won't prevent that.
Instead you can acquire a lock on the Event. Something like:
class Event(models.Model):
    ...
    @transaction.atomic
    def reserve_tickets(self, number_tickets):
        list(Event.objects.filter(id=self.id).select_for_update())  # force evaluation
        if self.get_number_of_registered_tickets() + number_tickets <= self.capacity:
            pass  # create EventRegistration
        else:
            pass  # handle error
Note that this uses the transaction.atomic decorator to make sure you are running inside a transaction.
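Why the lock has to cover both the count and the insert can be shown with a plain-Python analogy. This is not Django code: threading.Lock stands in for the row lock that select_for_update() takes on the Event, threads stand in for concurrent requests, and the FakeEvent class is invented for the illustration.

```python
import threading


class FakeEvent:
    """Hypothetical in-memory stand-in for the Event model."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.registrations = []        # stands in for EventRegistration rows
        self._lock = threading.Lock()  # stands in for the row lock on Event

    def get_number_of_registered_tickets(self):
        return sum(self.registrations)

    def reserve_tickets(self, number_tickets):
        # The check and the insert happen under one lock, so no other
        # "request" can slip in between reading the count and registering.
        with self._lock:
            if self.get_number_of_registered_tickets() + number_tickets <= self.capacity:
                self.registrations.append(number_tickets)
                return True
            return False


event = FakeEvent(capacity=5)
results = []


def request():
    results.append(event.reserve_tickets(1))


# Twenty concurrent "requests" compete for five tickets.
threads = [threading.Thread(target=request) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()

accepted = results.count(True)
```

With the lock in place, the number of accepted requests never exceeds the capacity, however the threads interleave.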
Note that in a multiple-database environment, atomic and select_for_update must target the same database, otherwise it won't work:
with transaction.atomic(using='dbwrite'):
    Model.objects.using('dbwrite').select_for_update()....
select_for_update() is a database feature (SELECT ... FOR UPDATE) that Django exposes through the ORM. Whenever you write an update operation, the database (PostgreSQL in your case) takes care of running it as a reliable transaction that adheres to the ACID properties.
To me your approach seems correct. As for the last question, you can test it with a time.sleep delay.
Do a select, put a time.sleep(10) on the next line, and while it is sleeping hit the API to start another transaction. You should then be able to see the TransactionManagementError.

UPDATE doesn't use Model save method

If I do the following:
obj = Model.objects.get(pk=2)
obj.field = 'new value'
obj.save()
It runs the custom save method that I have written in Django.
However, if I do a normal update statement:
Model.objects.filter(pk=2).update(field='new value')
It does not use the custom save method. My question here is two-fold:
1) Why was that decision made in django -- why doesn't every 'save' implement the custom save method.
2) Is there a codebase-wide way to make sure that no update statements are ever made? Basically, I just want to ensure that the custom save method is always run whenever doing a save within the django ORM. How would this be possible?
I'm not a Django developer, but I dabble from time to time and no one else has answered yet.
Why was that decision made in django -- why doesn't every 'save' implement the custom save method.
I'm going to guess here that this is done as a speed optimization for the common case of just performing a bulk update. update works on the SQL level so it is potentially much faster than calling save on lots of objects, each one being its own database transaction.
Is there a codebase-wide way to make sure that no update statements are ever made? Basically, I just want to ensure that the custom save method is always run whenever doing a save within the django ORM. How would this be possible?
You can use a custom manager with a custom QuerySet that raises some Exception whenever update is called. Of course, you can always loop over the Queryset and call save on each object if you need the custom behavior.
Forbidding Update on a Model
from django.db import models
class NoUpdateQuerySet(models.QuerySet):
    """Don't let people call update! Muahaha"""
    def update(self, **kwargs):
        # you should raise a more specific Exception.
        raise Exception('You shall not update; use save instead.')
class Person(models.Model):
    first_name = models.CharField(max_length=50)
    last_name = models.CharField(max_length=50)
    # setting the custom manager keeps people from calling update.
    objects = NoUpdateQuerySet.as_manager()
You would just need to set the NoUpdateQuerySet as a manager for each model you don't want to update. I don't really think it's necessary to set a custom QuerySet though; if it were me I would just not call update, and loop through the objects that need to be saved whenever I need to. You may find a time when you want to call update, and this would end up being very annoying.
Forbidding Update Project-Wide
If you really really decide you hate update, you can just monkey-patch the update method. Then you can be completely certain it's not being called. You can monkey-patch it in your project's settings.py, since you know that module will be imported:
def no_update(self, **kwargs):
    # probably want a more specific Exception
    raise Exception('NO UPDATING HERE')
from django.db.models.query import QuerySet
QuerySet.update = no_update
Note that the traceback will actually be pretty confusing, since it will point to a function in settings.py. I'm not sure how much, if ever, update is used by other apps; this could have unintended consequences.

Django MVT design: Should I have all the code in models or views?

I'm pretty novice so I'll try to explain in a way that you can understand what I mean.
I'm coding a simple application in Django to track cash operations, track amounts, etc.
So I have an Account model (with an amount field to track how much money is in it) and an Operation model (with an amount field as well).
I've created a model helper called Account.add_operation(amount). Here is my question:
Should the code that creates the new Operation go inside Account.add_operation(amount), or should I do it in the views?
And should I call the save() method in the models (for example at the end of Account.add_operation()), or must it be called in the views?
What's the best approach, to have code inside the models or inside the views?
Thanks for your attention and your patience.
Maybe you could use the rule "skinny controllers, fat models" to decide. In Django that would be "skinny views".
To save related objects (Operation, in your case), I'd do it in the save() method or use the pre_save signal.
Hope this helps.
Experienced Django users seem to always err on the side of putting code in models. In part, that's because it's a lot easier to unit test models - they're usually pretty self-contained, whereas views touch both models and templates.
Beyond that, I would just ask yourself if the code pertains to the model itself or whether it's specific to the way it's being accessed and presented in a given view. I don't entirely understand your example (I think you're going to have to post some code if you want more specific help), but everything you mention sounds to me like it belongs in the model. That is, creating a new Operation sounds like it's an inherent part of what it means to do something called add_operation()!
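As a framework-free sketch of what that could look like: the poster's code isn't shown, so the fields and the plain classes below are assumptions standing in for the real Django models. In Django the two writes would be save()/create() calls, ideally inside one transaction.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class Operation:
    """Hypothetical stand-in for the Operation model."""
    amount: Decimal


@dataclass
class Account:
    """Hypothetical stand-in for the Account model."""
    amount: Decimal = Decimal("0")
    operations: list = field(default_factory=list)

    def add_operation(self, amount):
        """Record the operation and update the balance in one place."""
        operation = Operation(amount=Decimal(amount))
        self.operations.append(operation)
        self.amount += operation.amount
        return operation


account = Account()
account.add_operation("100.00")
account.add_operation("-30.50")
```

A view then only has to call account.add_operation(...), and the logic can be unit-tested without a request or a template.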

Django - alternative to subclassing User?

I am using the standard User model (django.contrib.auth) which comes with Django. I have made some of my own models in a Django application and created a relationship between like this:
from django.db import models
from django.contrib.auth.models import User
class GroupMembership(models.Model):
    user = models.ForeignKey(User, null = True, blank = True, related_name='memberships')
    #other irrelevant fields removed from example
So I can now do this to get all of a user's current memberships:
user.memberships.all()
However, I want to be able to do a more complex query, like this:
user.memberships.all().select_related('group__name')
This works fine but I want to fetch this data in a template. It seems silly to try to put this sort of logic inside a template (and I can't seem to make it work anyway), so I want to create a better way of doing it. I could sub-class User, but that doesn't seem like a great solution - I may in future want to move my application into other Django sites, and presumably if there was any another application that sub-classed User I wouldn't be able to get it to work.
Is the best to create a method inside GroupMembership called something like get_by_user(user)? Would I be able to call this from a template?
I would appreciate any advice anybody can give on structuring this - sorry if this is a bit long/vague.
First, calling select_related with arguments doesn't change which objects you get back. It's a hint that the related-object cache should be populated by the same query.
You would never call select_related in a template, only a view function. And only when you knew you needed all those related objects for other processing.
"Is the best to create a method inside GroupMembership called something like get_by_user(user)?"
You have this. I'm not sure what's wrong with it.
GroupMembership.objects.filter(user=some_user)
"Would I be able to call this from a template?"
No. That's what view functions are for.
groups = GroupMembership.objects.filter(user=some_user)
Then you provide the groups object to the template for rendering.
Edit
This is one line of code; it doesn't seem that onerous a burden to include this in all your view functions.
If you want this to appear on every page, you have lots of choices that do not involve repeating this line of code:
A view function can call another function.
You might want to try callable objects instead of simple functions; these can subclass a common callable object that fills in this information.
You can add a template context processor to put this into the context of all templates that are rendered.
You could write your own decorator to assure that this is done in every view function that has the decorator.
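For instance, the context-processor option might look roughly like this. A context processor is just a function that takes the request and returns a dict, so the sketch needs no Django imports. The function name, the "memberships" context key and the assumption that GroupMembership has a group foreign key are mine, and the function would still have to be listed under context_processors in the TEMPLATES setting.

```python
def memberships(request):
    """Hypothetical context processor: expose the user's memberships to every template."""
    user = getattr(request, "user", None)
    if user is None or not user.is_authenticated:
        # Anonymous visitors (or requests without a user) get an empty list.
        return {"memberships": []}
    # related_name='memberships' comes from the question's ForeignKey;
    # 'group' is assumed to be a foreign key on GroupMembership.
    return {"memberships": user.memberships.select_related("group")}
```

Templates can then loop over memberships directly, without each view having to pass it in.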
