Django object locking - python

I'm using Django 1.8 and I have a model
class ModelA(models.Model):
    some_field = models.PositiveIntegerField()
Now, in my view I want to add a new ModelA object, but only if there are fewer than x entries for that value already.
def my_view(request):
    # Using the value of 4 here just as an example
    c = ModelA.objects.filter(some_field=4).count()
    # Check if fewer than (x=20) objects with this field already
    if c < 20:
        # Fewer, so create one
        new_model = ModelA(some_field=4)
        new_model.save()
    else:
        # Return a message saying "too many"
        pass
From my understanding, more than one thread could be running this view at once. Thread 1 may perform the count and see fewer than 20, then another thread saves a new object, and then thread 1 saves its own object, so there end up being more than 20.
Is there some way to have the view be
def my_view(request):
    get_a_lock_on_model(ModelA)
    c = ModelA.objects....
    # Rest of the code the same
    release_lock_on_model(ModelA)
Or is there some other way I should be thinking about doing this? There are only ever inserts, never updates or deletes.
Thanks!

In order to do this you need to lock the entire table, and how to do that depends on the RDBMS you are using; it will involve raw SQL. An alternative approach is to do the count after you have saved your record:
def my_view(request):
    new_model = ModelA(some_field=4)
    new_model.save()
    try:
        # Order by pk so that index 20 is always the 21st row inserted
        c = ModelA.objects.filter(some_field=4).order_by('pk')[20]
        if c.pk == new_model.pk:
            c.delete()
            # Return a message saying "too many"
    except IndexError:
        pass
With this approach the threads do not get in each other's way: each thread is responsible for deleting the extra item that it added. Instead of deleting, you can wrap the view in transaction.atomic and roll back if the count is greater than 20.
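The roll-back variant can be sketched outside Django. This is only an illustration that uses the standard-library sqlite3 module in place of the ORM; the table name modela, the helper add_if_room and the use of BEGIN IMMEDIATE (SQLite's way of taking the write lock up front) are assumptions made for the example, not part of the answer above.

```python
import sqlite3

LIMIT = 20  # same example limit as in the question


def add_if_room(conn, value, limit=LIMIT):
    """Insert a row, then roll back if that insert pushed the count over the limit."""
    # BEGIN IMMEDIATE takes SQLite's write lock right away, so another writer
    # cannot slip in between the insert and the count.
    conn.execute("BEGIN IMMEDIATE")
    try:
        conn.execute("INSERT INTO modela (some_field) VALUES (?)", (value,))
        (count,) = conn.execute(
            "SELECT COUNT(*) FROM modela WHERE some_field = ?", (value,)
        ).fetchone()
        if count > limit:
            conn.execute("ROLLBACK")  # too many: undo our own insert
            return False
        conn.execute("COMMIT")
        return True
    except Exception:
        conn.execute("ROLLBACK")
        raise


# isolation_level=None puts the connection in autocommit mode, so the
# explicit BEGIN / COMMIT / ROLLBACK statements above are the only ones issued.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE modela (id INTEGER PRIMARY KEY, some_field INTEGER NOT NULL)"
)

results = [add_if_room(conn, 4) for _ in range(25)]
total = conn.execute(
    "SELECT COUNT(*) FROM modela WHERE some_field = 4"
).fetchone()[0]
```

Here the first 20 calls succeed and the remaining ones are rolled back. On PostgreSQL or MySQL the locking statement would differ, which is the RDBMS dependence the answer mentions.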

Tested on Django 1.10.x and postgres:
models.py:
class ModelA(models.Model):
    some_field = models.PositiveIntegerField()
    active = models.BooleanField()
And:
from django.db.models.expressions import RawSQL
n = 42
maximum = 3
raw_sql = RawSQL('select (select count(*) from fooapp_modela where some_field=%s) < %s', (n, maximum))
while True:
    o = ModelA.objects.create(some_field=n, active=raw_sql)
    o.refresh_from_db()
    print(o.id, o.active)
    if not o.active:
        # o.delete()
        break
Caveat: by default, while one transaction is active, transactions on other connections cannot see the rows it has inserted until it commits, so try to avoid creating rows inside a complex transaction. I believe this means that this method is not completely bulletproof :-( More info: https://www.postgresql.org/docs/9.6/static/transaction-iso.html .
A more robust solution might include a db constraint (probably unique_together):
class ModelA(models.Model):
    some_field = models.PositiveIntegerField()
    ordinal = models.IntegerField()
    class Meta:
        unique_together = (
            ('some_field', 'ordinal'),
        )
#...
raw_sql = RawSQL('select count(*) + 1 from fooapp_modela where some_field=%s', (n,))
o = ModelA.objects.create(some_field=n, ordinal=raw_sql) # retry a few times on IntegrityError
o.refresh_from_db()
print(o.id, o.ordinal)
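To see how such a constraint behaves, here is a minimal sketch using the standard-library sqlite3 module instead of Django and Postgres. The UNIQUE constraint plays the role of unique_together; the CHECK constraint that caps ordinal at a maximum, the helper create_with_ordinal and the retry count are assumptions added for illustration and are not in the answer above.

```python
import sqlite3

MAX_PER_VALUE = 3  # hypothetical limit, mirroring `maximum` in the answer

conn = sqlite3.connect(":memory:")
conn.execute(
    f"""
    CREATE TABLE modela (
        id INTEGER PRIMARY KEY,
        some_field INTEGER NOT NULL,
        ordinal INTEGER NOT NULL CHECK (ordinal BETWEEN 1 AND {MAX_PER_VALUE}),
        UNIQUE (some_field, ordinal)
    )
    """
)


def create_with_ordinal(conn, value, retries=3):
    """Try to insert a row whose ordinal is the current count + 1.

    Returns True on success and False if every attempt hit a constraint.
    A UNIQUE violation (another writer took the same ordinal) may succeed
    on retry; a CHECK violation (limit reached) will keep failing.
    """
    for _ in range(retries):
        try:
            with conn:  # commits on success, rolls back on an exception
                conn.execute(
                    "INSERT INTO modela (some_field, ordinal) "
                    "SELECT ?, COUNT(*) + 1 FROM modela WHERE some_field = ?",
                    (value, value),
                )
            return True
        except sqlite3.IntegrityError:
            continue
    return False


results = [create_with_ordinal(conn, 42) for _ in range(5)]
ordinals = [
    row[0]
    for row in conn.execute(
        "SELECT ordinal FROM modela WHERE some_field = 42 ORDER BY ordinal"
    )
]
```

The point of the design is that the database, not the application, rejects the extra row, so the limit holds even if two writers compute the same ordinal.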

Related

Access several tables with only one query

I have the following schema in my Django application:
class SomeModel(models.Model):
    value = models.CharField(max_length=30)
class AbstractModel(models.Model):
    someModel = models.ForeignKey(SomeModel)
    class Meta:
        abstract = True
class A(AbstractModel):
    anotherValue = models.CharField(max_length=5)
class B(AbstractModel):
    anotherValue = models.CharField(max_length=5)
class C(AbstractModel):
    anotherValue = models.CharField(max_length=5)
class D(AbstractModel):
    anotherValue = models.CharField(max_length=5)
class E(AbstractModel):
    anotherValue = models.CharField(max_length=5)
With this layout, I need the most efficient way to query all objects from models A, B, C, D and E with a given id of SomeModel. I know that I cannot execute a query in an abstract model, so right now, what I do is query each model separately like this:
A.objects.filter(someModel__id=id)
B.objects.filter(someModel__id=id)
C.objects.filter(someModel__id=id)
D.objects.filter(someModel__id=id)
E.objects.filter(someModel__id=id)
Obviously this approach is quite slow, because I need to make 5 different queries each time I want to know all those objects. So my question is, is there a way to optimize this kind of query?
UPDATE:
I have tried the union method like this:
qs1 = A.objects.filter(**filters) # hits DB
qs2 = B.objects.filter(**filters) # hits DB
qs3 = C.objects.filter(**filters) # hits DB
qs4 = D.objects.filter(**filters) # hits DB
qs5 = E.objects.filter(**filters) # hits DB
qs1.union(qs2, qs3, qs4, qs5) # hits DB
That's actually 6 hits to the database!! I would like only one!
I checked this by printing the number of queries made:
from django.conf import settings
settings.DEBUG = True
from django.db import connection
print(len(connection.queries))
You may use the union method, but what do you want to achieve? If you want to fetch five objects by one pk and be sure that they are strictly related to each other, you could use a OneToOne relationship.
In the first case you just need to make a query; in the second case you must create a new migration and may need to rebuild your tables.
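For reference, a union over several tables boils down to a single SQL statement. The sketch below shows that with the standard-library sqlite3 module rather than Django; the table and column names (a, b, c, somemodel_id, anothervalue) are stand-ins for the models in the question, and the trace callback is only there to count statements. Note also that Django querysets are lazy, so building one with filter() does not by itself run a query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for table in ("a", "b", "c"):
    conn.execute(
        f"CREATE TABLE {table} ("
        "id INTEGER PRIMARY KEY, somemodel_id INTEGER, anothervalue TEXT)"
    )
conn.execute("INSERT INTO a (somemodel_id, anothervalue) VALUES (1, 'a1'), (2, 'a2')")
conn.execute("INSERT INTO b (somemodel_id, anothervalue) VALUES (1, 'b1')")
conn.execute("INSERT INTO c (somemodel_id, anothervalue) VALUES (2, 'c2')")

# Record every statement executed from here on.
statements = []
conn.set_trace_callback(statements.append)

UNION_SQL = """
SELECT 'a' AS source, id, anothervalue FROM a WHERE somemodel_id = :sid
UNION ALL
SELECT 'b', id, anothervalue FROM b WHERE somemodel_id = :sid
UNION ALL
SELECT 'c', id, anothervalue FROM c WHERE somemodel_id = :sid
ORDER BY source, id
"""
rows = conn.execute(UNION_SQL, {"sid": 1}).fetchall()
conn.set_trace_callback(None)
```

The literal source column records which table each row came from, which is something a plain union of model querysets does not give you directly.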

Django: Transactions and how to avoid wrong counting?

I am currently struggling with a topic connected to transactions. I implemented a discount functionality. Whenever a sale is made with a discount code, the counter redeemed_quantity is increased by + 1.
Now I thought about this case: what if two or more users redeem a discount at the same time? Assume redeemed_quantity is 10. User 1 buys the product and redeemed_quantity increases by 1 to 11. User 2 clicks 'Pay' at the same time and redeemed_quantity again increases by 1 to 11, even though it should be 12. I learned about @transaction.atomic, but I think the way I implemented it here will not help me with what I am actually trying to prevent. Can anyone help me with that?
view.py
class IndexView(TemplateView):
    template_name = 'website/index.html'
    initial_price_of_course = 100000  # TODO: Move to settings
    def check_discount_and_get_price(self):
        discount_code_get = self.request.GET.get('discount')
        discount_code = Discount.objects.filter(code=discount_code_get).first()
        if discount_code:
            discount_available = discount_code.available()
            if not discount_available:
                messages.add_message(
                    self.request,
                    messages.WARNING,
                    'Discount not available anymore.'
                )
        if discount_code and discount_available:
            return discount_code, self.initial_price_of_course - discount_code.value
        else:
            return discount_code, self.initial_price_of_course
    def get_context_data(self, **kwargs):
        context = super().get_context_data(**kwargs)
        context['stripe_pub_key'] = settings.STRIPE_PUB_KEY
        discount_object, course_price = self.check_discount_and_get_price()
        context['course_price'] = course_price
        return context
    @transaction.atomic
    def post(self, request, *args, **kwargs):
        stripe.api_key = settings.STRIPE_SECRET_KEY
        token = request.POST.get('stripeToken')
        email = request.POST.get('stripeEmail')
        discount_object, course_price = self.check_discount_and_get_price()
        charge = stripe.Charge.create(
            amount=course_price,
            currency='EUR',
            description='My Description',
            source=token,
            receipt_email=email,
        )
        if charge.paid:
            if discount_object:
                discount_object.redeemed_quantity += 1
                discount_object.save()
            order = Order(
                total_gross=course_price,
                discount=discount_object
            )
            order.save()
        return redirect('website:index')
models.py
class Discount(TimeStampedModel):
    code = models.CharField(max_length=20)
    value = models.IntegerField()  # Smallest currency unit, as amount charged
    max_quantity = models.IntegerField()
    redeemed_quantity = models.IntegerField(default=0)
    def available(self):
        available_quantity = self.max_quantity - self.redeemed_quantity
        if available_quantity > 0:
            return True
class Order(TimeStampedModel):
    total_gross = models.IntegerField()
    discount = models.ForeignKey(
        Discount,
        on_delete=models.PROTECT,  # Can't delete discount if used.
        related_name='orders',
        null=True,
    )
You can hand the increment over to the database, which avoids the race condition in your code, by using Django's F expression:
from django.db.models import F
# ...
discount_object.redeemed_quantity = F('redeemed_quantity') + 1
discount_object.save()
From the docs with a completely analogous example:
Although reporter.stories_filed = F('stories_filed') + 1 looks like a normal Python assignment of value to an instance attribute, in fact it’s an SQL construct describing an operation on the database.
When Django encounters an instance of F(), it overrides the standard Python operators to create an encapsulated SQL expression; in this case, one which instructs the database to increment the database field represented by reporter.stories_filed.
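The difference between the two styles of update can be shown deterministically. The following is a sketch with the standard-library sqlite3 module rather than Django; the table layout is a simplified stand-in for the Discount model, and the interleaving of the two "users" is simulated by hand. The second half uses the kind of SQL an F() increment is meant to produce, with the addition done inside the UPDATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE discount ("
    "id INTEGER PRIMARY KEY, max_quantity INTEGER, redeemed_quantity INTEGER)"
)
conn.execute("INSERT INTO discount VALUES (1, 100, 10)")
conn.commit()


def read_quantity():
    return conn.execute(
        "SELECT redeemed_quantity FROM discount WHERE id = 1"
    ).fetchone()[0]


# Read-modify-write in Python: both "requests" read 10 before either writes.
seen_by_user_1 = read_quantity()
seen_by_user_2 = read_quantity()
conn.execute(
    "UPDATE discount SET redeemed_quantity = ? WHERE id = 1", (seen_by_user_1 + 1,)
)
conn.execute(
    "UPDATE discount SET redeemed_quantity = ? WHERE id = 1", (seen_by_user_2 + 1,)
)
lost_update_result = read_quantity()  # 11: one increment was lost

# Reset, then let the database do the addition itself.
conn.execute("UPDATE discount SET redeemed_quantity = 10 WHERE id = 1")
INCREMENT_SQL = (
    "UPDATE discount SET redeemed_quantity = redeemed_quantity + 1 WHERE id = 1"
)
conn.execute(INCREMENT_SQL)
conn.execute(INCREMENT_SQL)
atomic_result = read_quantity()  # 12: both increments are applied
```

Because the second form never copies the old value into Python, there is no stale value to write back.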
Django is synchronous code, which means that every request you make to the server is processed individually. This problem could arise when there are multiple server workers (for example uWSGI workers), but again, it's practically impossible for this to happen. We run a webshop application with multiple workers and something like this has never happened.
But back to the question - if you want to query the database to increase a value by one, see schwobaseggl's answer.
The last thing is that I think you misunderstand what transaction.atomic() does. Simply put, if the function exits with an error, it rolls back any queries the function made to the database, restoring the state from when the function was called. See this answer and this piece of documentation. Maybe it will clear some things up.
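The roll-back-on-error behaviour described here can be illustrated with the standard-library sqlite3 module, whose connection context manager commits when the block succeeds and rolls back when it raises. This is a loose analogy to transaction.atomic, not Django code; the orders table, place_order and its fail flag are hypothetical names for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total_gross INTEGER)")
conn.commit()


def place_order(total_gross, fail=False):
    with conn:  # commit if the block succeeds, roll back if it raises
        conn.execute("INSERT INTO orders (total_gross) VALUES (?)", (total_gross,))
        if fail:
            # Hypothetical failure after the insert, e.g. a later step erroring out
            raise RuntimeError("something went wrong after the insert")


place_order(100000)
try:
    place_order(90000, fail=True)
except RuntimeError:
    pass

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]  # 1
```

Only the first order survives. Note that this guarantees all-or-nothing writes within one call; on its own it does not stop two concurrent calls from reading the same old value.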

Django ORM get jobs with top 3 scores for each model_used

Models.py:
class ScoringModel(models.Model):
    title = models.CharField(max_length=64)
class PredictedScore(models.Model):
    job = models.ForeignKey('Job')
    candidate = models.ForeignKey('Candidate')
    model_used = models.ForeignKey('ScoringModel')
    score = models.FloatField()
    created_at = models.DateField(auto_now_add=True)
    modified_at = models.DateTimeField(auto_now=True)
serializers.py:
class MatchingJobsSerializer(serializers.ModelSerializer):
    job_title = serializers.CharField(source='job.title', read_only=True)
    class Meta:
        model = PredictedScore
        fields = ('job', 'job_title', 'score', 'model_used', 'candidate')
To fetch the top 3 jobs, I tried the following code:
queryset = PredictedScore.objects.filter(candidate=candidate)
jobs_serializer = MatchingJobsSerializer(queryset, many=True)
jobs = jobs_serializer.data
top_3_jobs = heapq.nlargest(3, jobs, key=lambda item: item['score'])
It's giving me the top 3 jobs for the whole set, which contains all the models.
I want to fetch the jobs with top 3 scores for a given candidate for each model used.
So, it should return the top 3 matching jobs with each ML model for the given candidate.
I followed this answer https://stackoverflow.com/a/2076665/2256258 . It gives the latest cake entry for each bakery, but I need the top 3.
I read about annotations in django ORM but couldn't get much about this issue. I want to use DRF serializers for this operations. This is a read only operation.
I am using Postgres as database.
What should be the Django ORM query to perform this operation?
Make the database do the work. You don't need annotations either as you want the objects, not the values or manipulated values.
To get a set of all scores for a candidate (not split by model_used) you would do:
queryset = PredictedScore.objects.filter(candidate=candidate).order_by('-score')[:3]
jobs_serializer = MatchingJobsSerializer(queryset, many=True)
jobs = jobs_serializer.data
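What you're proposing isn't particularly well suited to the Django ORM, annoyingly. I think you may need to make separate queries for each model_used. A nicer solution (untested for this example) is to hook Q queries together, as per this answer.
The example there uses tags, but I think it holds:
from django.db.models import Q
# Let's get a distinct list of the model_used ids for this candidate
all_models_used = (PredictedScore.objects.filter(candidate=candidate)
                   .order_by().values_list('model_used', flat=True).distinct())
q_objects = Q()  # Create an empty Q object to start with
for m in all_models_used:
    # A Q object cannot be sliced, so fetch the pks of the top 3 per model first
    top_pks = (PredictedScore.objects.filter(candidate=candidate, model_used=m)
               .order_by('-score').values_list('pk', flat=True)[:3])
    q_objects |= Q(pk__in=list(top_pks))  # 'or' the Q objects together
queryset = PredictedScore.objects.filter(q_objects, candidate=candidate)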
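If a single query for all of the candidate's scores is acceptable, the grouping can also be done in Python, extending the heapq.nlargest call from the question. This is a sketch on plain dictionaries shaped like the serializer output; the helper top_n_per_group and the sample data are made up for illustration.

```python
import heapq
from collections import defaultdict


def top_n_per_group(rows, group_key, score_key, n=3):
    """Group rows by row[group_key] and keep the n highest-scoring rows per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)
    return {
        group: heapq.nlargest(n, items, key=lambda r: r[score_key])
        for group, items in groups.items()
    }


# Sample data standing in for jobs_serializer.data
jobs = [
    {"job": 1, "model_used": 1, "score": 0.9},
    {"job": 2, "model_used": 1, "score": 0.4},
    {"job": 3, "model_used": 1, "score": 0.7},
    {"job": 4, "model_used": 1, "score": 0.8},
    {"job": 5, "model_used": 1, "score": 0.1},
    {"job": 1, "model_used": 2, "score": 0.5},
    {"job": 2, "model_used": 2, "score": 0.6},
]

top_jobs = top_n_per_group(jobs, "model_used", "score")
top_job_ids = {model: [j["job"] for j in rows] for model, rows in top_jobs.items()}
```

For this sample data top_job_ids is {1: [1, 4, 3], 2: [2, 1]}. The trade-off is that all rows for the candidate are transferred from the database before being trimmed.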

Django - Checking for two models if their primary keys match

I have 2 models (sett, data_parsed), and data_parsed has a foreign key to sett.
class sett(models.Model):
    setid = models.IntegerField(primary_key=True)
    block = models.ForeignKey(mapt, related_name='sett_block')
    username = models.ForeignKey(mapt, related_name='sett_username')
    ts = models.IntegerField()
    def __unicode__(self):
        return str(self.setid)
class data_parsed(models.Model):
    setid = models.ForeignKey(sett, related_name='data_parsed_setid', primary_key=True)
    block = models.CharField(max_length=2000)
    username = models.CharField(max_length=2000)
    time = models.IntegerField()
    def __unicode__(self):
        return str(self.setid)
The data_parsed model should have the same number of rows as sett, but there is a possibility that they are not in "sync".
To detect this, I basically do these two steps:
Check if sett.objects.all().count() == data_parsed.objects.all().count()
This works great as a fast check, and it takes literally seconds on 1 million rows.
If they are not the same, I check all of the sett model's pks, excluding the ones already found in data_parsed.
sett.objects.select_related().exclude(
setid__in = data_parsed.objects.all().values_list('setid', flat=True)).iterator():
Basically what this does is select all the objects in sett that exclude all the setid already in data_parsed. This method "works", but it will take around 4 hours for 1 million rows.
Is there a faster way to do this?
Finding setts without data_parsed using the reverse relation:
sett.objects.filter(data_parsed_setid__isnull=True)
If I am getting it right, you are trying to keep a list of processed objects in another model by setting a foreign key.
You have only one data_parsed object for every sett object, so a many-to-one relationship is not needed. You could use a one-to-one relationship and then check which objects have that field empty.
With a foreign key you could try to filter using the reverse query, but that is at object level so I doubt that works.
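In SQL terms, "rows of sett with no matching data_parsed row" is an anti-join, which a database can usually answer with one pass over indexed keys. A minimal sketch with the standard-library sqlite3 module, assuming simplified two-column-free versions of the tables (only the key columns are kept):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sett (setid INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE data_parsed (setid INTEGER PRIMARY KEY REFERENCES sett(setid))"
)
conn.executemany("INSERT INTO sett VALUES (?)", [(i,) for i in range(1, 6)])
conn.executemany("INSERT INTO data_parsed VALUES (?)", [(1,), (2,), (4,)])

# Keep every sett row, attach the matching data_parsed row if there is one,
# and then keep only the rows where nothing matched.
MISSING_SQL = """
SELECT s.setid
FROM sett AS s
LEFT JOIN data_parsed AS d ON d.setid = s.setid
WHERE d.setid IS NULL
ORDER BY s.setid
"""
missing = [row[0] for row in conn.execute(MISSING_SQL)]  # [3, 5]
```

Whether the ORM emits exactly this shape for a given filter is worth checking by printing the queryset's query before relying on it.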

How to make an auto-filled and auto-incrementing field in django admin

[Update: Changed question title to be more specific]
Sorry if I didn't make the question very well, I can't figure how to do this:
class WhatEver():
    number = model.IntegerField('Just a Field', default=callablefunction)
    ...
Where callablefunction does this query:
from myproject.app.models import WhatEver
def callablefunction():
    no = WhatEver.objects.count()
    return no + 1
I want to automatically write the next number, and I don't know how to do it.
I have errors from callablefunction stating that it cannot import the model, and I think there must be an easier way to do this. There's no need even to use this, but I can't figure how to do it with the pk number.
I've googled about this and the only thing I found was to use the save() method for auto incrementing the number... but I wanted to show it in the <textfield> before saving...
What would you do?
Got it! I hope this will help everyone who has problems making an auto-filled and auto-incrementing field in Django. The solution is:
class Cliente(models.Model):
    """This is the client data model, it holds all client information. This
    docstring has to be improved."""
    def number():
        no = Cliente.objects.count()
        if no == None:
            return 1
        else:
            return no + 1
    clientcode = models.IntegerField(_('Code'), max_length=6, unique=True, \
                                     default=number)
    [... here goes the rest of your model ...]
Take note:
The number function doesn't take any arguments (not even self)
It's written BEFORE everything in the model
This was tested on django 1.2.1
This function will automatically fill the clientcode field with the next number (i.e. If you have 132 clients, when you add the next one the field will be filled with clientcode number 133)
I know that this is absurd for most of the practical situations, since the PK number is also auto-incrementing, but there's no way to autofill or take a practical use for it inside the django admin.
[update: as I stated in my comment, there's a way to use the primary key for this, but it will not fill the field before saving]
Every Django model already has an auto-generated primary key:
id = models.AutoField(primary_key=True)
It seems you are trying to duplicate an already existing behavior, just use the object primary key.
I, too, came across this problem. My instance of it was customer.number, which was relative to the customer's Store. I was tempted to use something like:
# Don't do this:
class Customer(models.Model):
    # store = ...
    number = models.IntegerField(default=0)
    def save(self, *args, **kwargs):
        if self.number == 0:
            try:
                self.number = self.store.customer_set.count() + 1
            except Exception:
                self.number = 1
        super(Customer, self).save(*args, **kwargs)
The above can cause several problems: Say there were 10 Customers, and I deleted customer number 6. The next customer to be added would be (seemingly) the 10th customer, which would then become a second Customer #10. (This could cause big errors in get() querysets)
What I ended up with was something like:
class Store(models.Model):
    customer_number = models.IntegerField(default=1)
class Customer(models.Model):
    store = models.ForeignKey(Store)
    number = models.IntegerField(default=0)
    def save(self, *args, **kwargs):
        if self.number == 0:
            self.number = self.store.customer_number
            # Advance the store's counter so the next customer gets a fresh number
            self.store.customer_number += 1
            self.store.save()
        super(Customer, self).save(*args, **kwargs)
PS:
You threw out several times that you wanted this field filled in "before". I imagine you wanted it filled in before saving so that you can access it. To that I would say: this method allows you to access store.customer_number to see the next number to come.
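The per-store counter idea can be sketched without Django to show why it survives deletions. The example uses the standard-library sqlite3 module; the table layout, the add_customer helper, the UNIQUE constraint and the BEGIN IMMEDIATE transaction (so that reading and advancing the counter happen as one step) are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit transactions
conn.execute(
    "CREATE TABLE store ("
    "id INTEGER PRIMARY KEY, customer_number INTEGER NOT NULL DEFAULT 1)"
)
conn.execute(
    "CREATE TABLE customer ("
    "id INTEGER PRIMARY KEY, "
    "store_id INTEGER NOT NULL REFERENCES store(id), "
    "number INTEGER NOT NULL, "
    "UNIQUE (store_id, number))"
)
conn.execute("INSERT INTO store (id) VALUES (1)")


def add_customer(store_id):
    """Give the new customer the store's next number and advance the counter."""
    conn.execute("BEGIN IMMEDIATE")
    try:
        (number,) = conn.execute(
            "SELECT customer_number FROM store WHERE id = ?", (store_id,)
        ).fetchone()
        conn.execute(
            "UPDATE store SET customer_number = customer_number + 1 WHERE id = ?",
            (store_id,),
        )
        conn.execute(
            "INSERT INTO customer (store_id, number) VALUES (?, ?)",
            (store_id, number),
        )
        conn.execute("COMMIT")
        return number
    except Exception:
        conn.execute("ROLLBACK")
        raise


first_three = [add_customer(1) for _ in range(3)]  # [1, 2, 3]
conn.execute("DELETE FROM customer WHERE store_id = 1 AND number = 2")
after_delete = add_customer(1)  # 4, not a reused 3
```

Because the counter only ever moves forward, deleting a customer leaves a gap instead of producing a duplicate number.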
You have errors in code, that's why you can't import it:
from django.db import models
class WhatEver(models.Model):
    number = models.IntegerField('Just a Field', default=0)
and Yuval A is right about auto-incrementing: you don't even need to declare such a field. Just use the pk or id, they mean the same unless there's a composite pk in the model:
> w = Whatever(number=10)
> w
<Whatever object>
> w.id
None
> w.save()
> w.id
1
[update] Well, I haven't tried a callable as a default. I think if you fix these errors, it must work.
