How and where should I create this transactional method?

How and where should I create this transactional method? - python

I'm using Python 3.7. In a service class, I have these statements ...
article.first_appeared_date = datetime.now(timezone.utc)
article.save()
ArticleStat.objects.save_main_article(article)
The first pair of statements updates an attribute for a single object and the second statement creates a bunch of separate objects, using the first object. What i would like is for the whole thing to be executed as a transaction, whereby everything succeeds or no changes to the database occur if something fails. I'm unclear as to best practices in Python. Where would a method like this go? Does putting it in a manager class make it transactional?

from django.db import transaction
with transaction.atomic():
article.first_appeared_date = datetime.now(timezone.utc)
article.save()
ArticleStat.objects.save_main_article(article)
You could implement a method for this in some of your models or even in a view which you call from your service. Most of the time, you want to put code like this in some view (or function that is called in a view).
However, if what you want is execute ArticleStat.objects.save_main_article(article) every time you save an article you should look at Django signals, specifically post_save signal.
Take a look to the docs for transactions here: https://docs.djangoproject.com/en/2.1/topics/db/transactions/

Related

Django: Transaction and select_for_update()

In my Django application I have the following two models:
class Event(models.Model):
capacity = models.PositiveSmallIntegerField()
def get_number_of_registered_tickets():
return EventRegistration.objects.filter(event__exact=self).aggregate(total=Coalesce(Sum('number_tickets'), 0))['total']
class EventRegistration(models.Model):
time = models.DateTimeField(auto_now_add=True)
event = models.ForeignKey(Event, on_delete=models.CASCADE)
number_tickets = models.PositiveSmallIntegerField(validators=[MinValueValidator(1)])
The method get_number_of_registered_tickets() do I need at several places in my application (e.g. template rendering). So I thought it makes sense to put it into the model also because it's related to it and I often heard it's good to have "fat models and lightweight views".
My problem now:
In order to prevent that two people want to register for the event in parallel, I have to use locking. Example: Let's say there's one ticket left to register for. Now, to people are on my website and click "Register" simultaneously. Under unforunate circumstances, it could happen that both requests are valid and now I have more registrations than capacity.
I'm relatively new to Django, but reading through the docs, I thought that select_for_update() should be the solution, am I right here (I use PostgreSQL, so that should be supported)?
However, the docs also say that using select_for_update() is only valid within a transcation.
Evaluating a queryset with select_for_update() in autocommit mode on
backends which support SELECT ... FOR UPDATE is a
TransactionManagementError error because the rows are not locked in
that case. If allowed, this would facilitate data corruption and could
easily be caused by calling code that expects to be run in a
transaction outside of one.
My idea was now to change my model method get_number_of_registered_tickets() and add select_for_update():
def get_number_of_registered_tickets():
return EventRegistration.objects.select_for_update().filter(event__exact=self).aggregate(total=Coalesce(Sum('number_tickets'), 0))['total']
Different questions now:
Is using select_for_update() the right solution to my problem?
Does it mean that I cannot use the method get_number_of_registered_tickets() in different views/templates now, given that it seems to only work within a transaction? Do I have to violate DRY here and copy and paste the query with select_for_update() to another place in my code?
I tested it locally and Django does not raise the TransactionManagementError while being in autocommit mode (not using any transactions). What could be the reason or do I misunderstand something?

Doing select_for_update() on an EventRegistration queryset isn't the way to go. That locks the specified rows, but presumably the conflict you're trying to prevent involves creating new EventRegistrations. Your lock won't prevent that.
Instead you can acquire a lock on the Event. Something like:
class Event(models.Model):
...
#transaction.atomic
def reserve_tickets(self, number_tickets):
list(Event.objects.filter(id=self.id).select_for_update()) # force evaluation
if self.get_number_of_registered_tickets() + number_tickets <= self.capacity:
# create EventRegistration
else:
# handle error
Note that this uses the transaction.atomic decorator to make sure you are running inside a transaction.

Note that in multiple database environment you must have atomic and select_for_update on the same db, otherwise it wont work
with transaction.atomic(using='dbwrite'):
Model.objects.using('dbwrite').select_for_update()....

select_for_update is a database function option implemented using Django. Whenever you are writing an operation to do some update, database (in your case POSTGRES) takes care of the reliable transactions that adhere to these ACID properties.
To me your approach seems correct. And the last question answer would be to test this using a time.sleep delay.
You can do a select operation then in the next line put a time.sleep(10) while this occurs hit the api to make another transaction. You will be able to find the
TransactionManagementError

UPDATE doesnt use Model save method

If I do the folllowing:
obj = Model.objects.get(pk=2)
object.field = 'new value'
object.save()
It runs the custom save method that I have written in django.
However, if I do a normal update statement:
Model.objects.filter(pk=2).update(field='new value')
It does not use the custom save method. My question here is two-fold:
1) Why was that decision made in django -- why doesn't every 'save' implement the custom save method.
2) Is there a codebase-wide way to make sure that no update statements are ever made? Basically, I just want to ensure that the custom save method is always run whenever doing a save within the django ORM. How would this be possible?

I'm not a Django developer, but I dabble from time to time and no one else has answered yet.
Why was that decision made in django -- why doesn't every 'save' implement the custom save method.
I'm going to guess here that this is done as a speed optimization for the common case of just performing a bulk update. update works on the SQL level so it is potentially much faster than calling save on lots of objects, each one being its own database transaction.
Is there a codebase-wide way to make sure that no update statements are ever made? Basically, I just want to ensure that the custom save method is always run whenever doing a save within the django ORM. How would this be possible?
You can use a custom manager with a custom QuerySet that raises some Exception whenever update is called. Of course, you can always loop over the Queryset and call save on each object if you need the custom behavior.
Forbidding Update on a Model
from django.db import models
class NoUpdateQuerySet(models.QuerySet):
"""Don't let people call update! Muahaha"""
def update(self, **kwargs):
# you should raise a more specific Exception.
raise Exception('You shall not update; use save instead.')
class Person(models.Model):
first_name = models.CharField(max_length=50)
last_name = models.CharField(max_length=50)
# setting the custom manager keeps people from calling update.
objects = NoUpdateQuerySet.as_manager()
You would just need to set the NoUpdateQuerySet as a manager for each model you don't want to update. I don't really think it's necessary to set a custom QuerySet though; if it were me I would just not call update, and loop through the objects that need to be saved whenever I need to. You may find a time when you want to call update, and this would end up being very annoying.
Forbidding Update Project-Wide
If you really really decide you hate update, you can just monkey-patch the update method. Then you can be completely certain it's not being called. You can monkey-patch it in your project's settings.py, since you know that module will be imported:
def no_update(self, **kwargs):
# probably want a more specific Exception
raise Exception('NO UPDATING HERE')
from django.db.models.query import QuerySet
QuerySet.update = no_update
Note that the traceback will actually be pretty confusing, since it will point to a function in settings.py. I'm not sure how much, if ever, update is used by other apps; this could have unintended consequences.

Call a function after saving a model

I'm working on a Django application connected to a LDAP server. Here's the trick i'm trying to do.
I have a model called system containing some information about computers. When i add a new system, the model generates a unique UUID, like and AutoField. The point is that this parameter is generated when saving, and only the first time.
After saved, i need a function to keep that UUID and create a new object on my LDAP.
Since i didn't know a lot about signals, i tried overriding the model save function in this way:
def save(self):
# import needed modules
import ldap
import ldap.modlist as modlist
[--OPERATIONS ON THE LDAP--]
super(System, self).save()
In this way, if i modify an existing system all work as should, because its UUID has been generated already. But if i try adding a new system i find the error UUID is None, and i can't work with an empty variable on LDAP (also it would be useless, don't u think so?)
It seems i need to call the function that work on LDAP after the system is saved, and so after an UUID has been generated. I tried to unserstand how to create a post_save function but i couldn't get it.
How can i do that?
Thanks

As you stated on your own, you do need signals, it will allow your code to stay more clean and seperate logic between parts.
The usual approach is to place signals just at the end of your models file:
# Signals
from django.dispatch import receiver
#receiver(models.signals.post_save, sender=YourModel)
def do_something(sender, instance, created, **kwargs):
....
On the above example we connect the post_save signal with the do_something function, this is performed through the decorator #receiver, sender of the decorator points to your Model Class.
Inside your function you have instance which holds the current instance of the model and the created flag which allows you to determine if this is a new record or an old (if the Model is being updated).

Signals would be excellent for something like this, but moving the line super(System, self).save() to the top of the save method might work as well. That means you first save the instance, before passing the saved object to the LDAP.

Mocking a Django Queryset in order to test a function that takes a queryset

I have a utility function in my Django project, it takes a queryset, gets some data from it and returns a result. I'd like to write some tests for this function. Is there anyway to 'mock' a QuerySet? I'd like to create an object that doesn't touch the database, and i can provide it with a list of values to use (i.e. some fake rows) and then it'll act just like a queryset, and will allow someone to do field lookups on it/filter/get/all etc.
Does anything like this exist already?

For an empty Queryset, I'd go simply for using none as keithhackbarth has already stated.
However, to mock a Queryset that will return a list of values, I prefer to use a Mock with a spec of the Model's manager. As an example (Python 2.7 style - I've used the external Mock library), here's a simple test where the Queryset is filtered and then counted:
from django.test import TestCase
from mock import Mock
from .models import Example
def queryset_func(queryset, filter_value):
"""
An example function to be tested
"""
return queryset.filter(stuff=filter_value).count()
class TestQuerysetFunc(TestCase):
def test_happy(self):
"""
`queryset_func` filters provided queryset and counts result
"""
m_queryset = Mock(spec=Example.objects)
m_queryset.filter.return_value = m_queryset
m_queryset.count.return_value = 97
result = func_to_test(m_queryset, '__TEST_VALUE__')
self.assertEqual(result, 97)
m_queryset.filter.assert_called_once_with(stuff='__TEST_VALUE__')
m_queryset.count.assert_called_once_with()
However, to fulfil the question, instead of setting a return_value for count, this could easily be adjusted to be a list of model instances returned from all.
Note that chaining is handled by setting the filter to return the mocked queryset:
m_queryset.filter.return_value = m_queryset
This would need to be applied for any queryset methods used in the function under test, e.g. exclude, etc.

Of course you can mock a QuerySet, you can mock anything.
You can create an object yourself, and give it the interface you need, and have it return any data you like. At heart, mocking is nothing more than providing a "test double" that acts enough like the real thing for your tests' purposes.
The low-tech way to get started is to define an object:
class MockQuerySet(object):
pass
then create one of these, and hand it to your test. The test will fail, likely on an AttributeError. That will tell you what you need to implement on your MockQuerySet. Repeat until your object is rich enough for your tests.

I am having the same issue, and it looks like some nice person has written a library for mocking QuerySets, it is called mock-django and the specific code you will need is here https://github.com/dcramer/mock-django/blob/master/mock_django/query.py I think you can then just patch you models objects function to return one of these QuerySetMock objects that you have set up to return something expected!

For this I use Django's .none() function.
For example:
class Location(models.Model):
name = models.CharField(max_length=100)
mock_locations = Location.objects.none()
This is the method used frequently in Django's own internal test cases. Based on comments in the code
Calling none() will create a queryset that never returns any objects and no
+query will be executed when accessing the results. A qs.none() queryset
+is an instance of ``EmptyQuerySet``.

Try out the django_mock_queries library that lets you mock out the database access, and still use some of the Django query set features like filtering.
Full disclosure: I contributed some features to the project.

Have you looked into FactoryBoy? https://factoryboy.readthedocs.io/en/latest/orms.html
It's a fixtures replacement tool with support for the django orm - factories basically generate orm-like objects (either in memory or in a test database).
Here's a great article for getting started: https://www.caktusgroup.com/blog/2013/07/17/factory-boy-alternative-django-testing-fixtures/

One first advice would be to split the function in two parts, one that creates the queryset
and one that manipulates the output of it. In this way testing the second part is straightforward.
For the database problem, I investigated if django uses sqlite-in-memory and I found out that
recent version of django uses the sqlite -in-memory database, from The django unittest page:
When using the SQLite database engine the tests will by default use an
in-memory database (i.e., the database will be created in memory,
bypassing the filesystem entirely!).
Mocking the QuerySet object will not make you exercise its full logic.

You can mock like this:
#patch('django.db.models.query.QuerySet')
def test_returning_distinct_records_for_city(self, mock_qs):
self.assertTrue(mock_qs.called)

Not that I know of, but why not use an actual queryset? The test framework is all set up to allow you to create sample data within your test, and the database is re-created on every test, so there doesn't seem to be any reason not to use the real thing.

Django - alternative to subclassing User?

I am using the standard User model (django.contrib.auth) which comes with Django. I have made some of my own models in a Django application and created a relationship between like this:
from django.db import models
from django.contrib.auth.models import User
class GroupMembership(models.Model):
user = models.ForeignKey(User, null = True, blank = True, related_name='memberships')
#other irrelevant fields removed from example
So I can now do this to get all of a user's current memberships:
user.memberships.all()
However, I want to be able to do a more complex query, like this:
user.memberships.all().select_related('group__name')
This works fine but I want to fetch this data in a template. It seems silly to try to put this sort of logic inside a template (and I can't seem to make it work anyway), so I want to create a better way of doing it. I could sub-class User, but that doesn't seem like a great solution - I may in future want to move my application into other Django sites, and presumably if there was any another application that sub-classed User I wouldn't be able to get it to work.
Is the best to create a method inside GroupMembership called something like get_by_user(user)? Would I be able to call this from a template?
I would appreciate any advice anybody can give on structuring this - sorry if this is a bit long/vague.

First, calling select_related and passing arguments, doesn't do anything. It's a hint that cache should be populated.
You would never call select_related in a template, only a view function. And only when you knew you needed all those related objects for other processing.
"Is the best to create a method inside GroupMembership called something like get_by_user(user)?"
You have this. I'm not sure what's wrong with it.
GroupMembership.objects.filter( user="someUser" )
"Would I be able to call this from a template?"
No. That's what view functions are for.
groups = GroupMembership.objects.filter( user="someUser" )
Then you provide the groups object to the template for rendering.
Edit
This is one line of code; it doesn't seem that onerous a burden to include this in all your view functions.
If you want this to appear on every page, you have lots of choices that do not involve repeating this line of code..
A view function can call another function.
You might want to try callable objects instead of simple functions; these can subclass a common callable object that fills in this information.
You can add a template context processor to put this into the context of all templates that are rendered.
You could write your own decorator to assure that this is done in every view function that has the decorator.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

How and where should I create this transactional method? - python

Related

Django: Transaction and select_for_update()

UPDATE doesnt use Model save method

Call a function after saving a model

Mocking a Django Queryset in order to test a function that takes a queryset

Django - alternative to subclassing User?

Categories

Resources