Why this is making two queries instead of one? - python

View:
transfer_details = TransferDetail.objects.filter(user=request.user).select_related('transfermethod_set')
print transfer_details.filter(method__name='PayPal')
Models:
class TransferMethod(models.Model):
name = models.CharField(max_length=30)
...
class TransferDetail(models.Model):
data = models.TextField()
...
method = models.ForeignKey(TransferMethod)
user = models.ForeignKey(User)
I expect transfer_details QuerySet from line one to be used without further database calls.
What I am missing?
UPDATE 1
So I discovered when I have these two lines there are no additional queries:
x = transfer_details.filter(method__name='PayPal')
x2 = transfer_details.filter(method__name='Something')
But when I add the following two lines, it's making 2 DB queries:
list(x[:1])
list(x2[:1])
What's happening under the hood and how I can avoid the extra calls?
UPDATE 2
I tried:
transfer_details.get(method__name='PayPal').data
...
It's also making two queries.

Correctly it should be (assuming you also want to get the user data in one query):
transfer_details = TransferDetail.objects.filter(
user=request.user).select_related('method', 'user')
You wouldn't need to select method because when you filter for it in print transfer_details.filter(method__name='PayPal') it should get selected automatically. When you call print TansferDetail's __unicode__ will get invoked, so a reason for additional could be that you're outputting some other related data there (eg. from the Usermodel, which should be solved with the code above...).
To answer your edited question: If you call list on a queryset the queryset gets evaluated, which means the actual query is made.
Don't know if you are accessing request.user at some point before in your code, but if that is not the case it's possible that the second query is the result of getting the user for the current request.

Related

Django .order_by() related field returns too many items

I'm trying to return a list of users that have recently made a post, but the order_by method makes it return too many items.
there is only 2 accounts total, but when I call
test = Account.objects.all().order_by('-posts__timestamp')
[print(i) for i in test]
it will return the author of every post instance, and its duplicates. Not just the two account instances.
test#test.example
test#test.example
test#test.example
test#test.example
foo#bar.example
Any help?
class Account(AbstractBaseUser):
...
class Posts(models.Model):
author = models.ForeignKey('accounts.Account',on_delete=models.RESTRICT, related_name="posts")
timestamp = models.DateTimeField(auto_now_add=True)
title = ...
content = ...
This is totally normal. You should understand how is the SQL query generated.
Yours should look something like that:
select *
from accounts
left join post on post.account_id = account.id
order by post.timestamp
You are effectively selecting every post with its related users. It is normal that you have some duplicated users.
What you could do is ensure that your are selecting distinct users: Account.objects.order_by('-posts__timestamp').distinct('pk')
What I would do is cache this information in the account (directly on the acount model or in another model that has a 1-to-1 relashionship with your users.
Adding a last_post_date to your Account model would allow you to have a less heavy request.
Updating the Account.last_post_date every time a Post is created can be a little tedious, but you can abstract this by using django models signals.

Django import-export, only export one object with related objects

I have a form which enables a user to register on our website. Now I need to export all the data to excel, so I turned towards the import-export package. I have 3 models, Customer, Reference and Contact. The latter two both have a m2m with Customer. I also created Resources for these models. When I use Resource().export() at the end of my done() method in my form view, it exports all existing objects in the database, which is not what I want.
I tried googling this and only got one result, which basically says I need to use before_export(), but I can't find anywhere in the docs how it actually works.
I tried querying my customer manually like:
customer = Customer.objects.filter(pk=customer.id)
customer_data = CustomerResource().export(customer)
which works fine but then I'm stuck with the related references and contacts: reference_data = ReferenceResource().export(customer.references) gives me an TypeError saying 'ManyRelatedManager' object is not iterable. Which makes sense because export() expects an queryset, but I'm not sure if it's possible getting it that way.
Any help very appreciated!
One way is to override get_queryset(), you could potentially try to load all related data in a single query:
class ReferenceResource(resources.ModelResource):
def __init__(self, customer_id):
super().__init__()
self.customer_id = customer_id
def get_queryset(self):
qs = Customer.objects.filter(pk=self.customer.id)
# additional filtering here
return qs
class Meta:
model = Reference
# add fields as appropriate
fields = ('id', )
To handle m2m relationships, you may be able to modify the queryset to add these additional fields.
This isn't the complete answer but it may help you make progress.

Django prefetch_related and N+1 - How is it solved?

I am sitting with a query looking like this:
# Get the amount of kilo attached to products
product_data = {}
for productSpy in ProductSpy.objects.all():
product_data[productSpy.product.product_id] = productSpy.kilo # RERUN
I do not see how I on my last line would be able to use prefetch_related. In the examples in the docs it's very simplified and somehow makes sense, but I do not understand the whole concept enough to see myself out of this. Could I please get explained what's being done and how? I find this very important to understand, and where met by my first N+1 here.
Thank you up front for your time.
models.py
class ProductSpy(models.Model):
created_by = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
product = models.ForeignKey(Product, on_delete=models.CASCADE)
def __str__(self):
return self.kilo
class Product(models.Model):
product_id = models.IntegerField()
name = models.CharField(max_length=150)
def __str__(self):
return self.name
Django fetches related tables at runtime:
each call to productSpy.product will fetch from the table product using productSpy.id
The latency in I/O operation means that this code is highly inefficient. using prefetch_related will fetch product for all the product spy objects in one shot resulting in better performance.
# Get the amount of kilo attached to products
product_data = {}
product_spies = ProductSpy.objects.all()
product_spies.prefetch_related('product')
product_spies.prefetch_related('kilo')
for productSpy in product_spies:
product_data[productSpy.product.product_id] = productSpy.kilo # RERUN
When one writes productSpy.product if the related object is not already fetched, Django makes automatically will make a query to the database to get the related Product instance. Hence if ProductSpy.objects.all() returned N instances by writing productSpy.product in a loop we will be making N more queries which is what we call N + 1 problem.
Moving further although you can use prefetch_related (will use 2 queries in your case) here it would be better for you to use select_related [Django docs] which will use a LEFT JOIN and get you the related instances in 1 query itself:
product_data = {}
queryset = ProductSpy.objects.select_related('product')
for productSpy in queryset:
product_data[productSpy.product.product_id] = productSpy.kilo # No extra queries as we used `select_related`
Note: There seems to be some problem with your logic here though, as multiple ProductSpy instances can have the same Product,
hence your loop might overwrite some values.

Django Add list generated from the text of one field to many to many field

Having a bit of trouble trying to bulk add a list of items to a many to many field and though having tried various things have no clue on how to approach this. I've looked at the Django documentation and cant seem to find what I'm looking for.
Here is the code for my models:
class Subject(models.Model):
noun = models.CharField(max_length=30, null=True, blank=True)
class Knowledge(models.Model):
item_text = models.TextField()
item_subjects = models.ManyToManyField(Subject, null=True, blank=True)
def add_subjects(sender, instance, *args, **kwargs):
if instance.item_info:
item_subjects = classifier.predict_subjects(instance.item_info)
if item_subjects:
....
post_save.connect(add_subjects, sender=Knowledge)
The list is being generated by the classifer.predict_subjects function.
I have tried using the m2m_changed connector and the pre_save and post_save connect. I'm not even sure the many to many field is the right option would it be better to do make a foreign key relationship.
in place of the '...' I have tried this but it doesn't create the relationship between and only saves the last one.
for sub in item_subjects:
subject = Subject(id=instance.id, noun=sub)
subject.save()
I've also tried
instance.item_subjects = item_subjects
and a load more things that I can't really remember, I don't really think I'm in the right ballpark to be honest. Any suggestions?
edit:
ok, so I have got it adding all of the list items but still haven't managed to link these items to the many to many field.
for sub in item_subjects:
subject = Subject.objects.get_or_create(noun=sub)
edit 2:
So doing pretty much exactly the same thing outside of the loop in the Django shell seems to be working and saves the entry but it doesn't inside the function.
>>> k[0].item_subjects.all()
<QuerySet []>
>>> d, b = Subject.objects.get_or_create(noun="cats")
<Subject: cats>
>>> k[0].item_subjects.add(d)
>>> k[0].item_subjects.all()
<QuerySet [<Subject: cats>]>
edit 3
So I took what Robert suggested and it works in the shell just like above just not when using it in the admin interface. The print statements in my code show the array item being updated but it just dosen't persist. I read around and this seems to be a problem to do with the admin form clearing items before saving.
def sub_related_changed(sender, instance, *args, **kwargs):
print instance.item_subjects.all()
if instance.item_info:
item_subjects = classifier.predict_subjects(instance.item_info)
if item_subjects:
for sub in item_subjects:
subject, created = Subject.objects.get_or_create(noun=sub)
instance.item_subjects.add(subject)
print instance.item_subjects.all()
post_save.connect(sub_related_changed, sender=Knowledge)
I have tried using the function as m2m_changed signal as follows:
m2m_changed.connect(model_saved, sender=Knowledge.item_subjects.through)
But this either generates a recursive loop or doesn't fire.
Once you have the subject objects (as you have in your edit), you can add them with
for sub in item_subjects:
subject, created = Subject.objects.get_or_create(noun=sub)
instance.item_subjects.add(subject)
The "item_subjects" attribute is a way of managing the related items. The through relationships are created via the "add" method.
Once you've done this, you can do things like instance.item_subjects.filter(noun='foo') or instance.item_subjects.all().delete() and so on
Documentation Reference: https://docs.djangoproject.com/en/1.11/topics/db/examples/many_to_many/
EDIT
Ahh I didn't realize that this was taking place in the Django Admin. I think you're right that that's the issue. Upon save, the admin calls two methods: The first is model_save() which calls the model's save() method (where I assume this code lives). The second method it calls is "save_related" which first clears out ManyToMany relationships and then saves them based on the submitted form data. In your case, there is no valid form data because you're creating the objeccts on save.
If you put the relevant parts of this code into the save_related() method of the admin, the changes should persist.
I can be more specific about where it should go if you'll post both your < app >/models.py and your < app >/admin.py files.
Reference from another SO question:
Issue with ManyToMany Relationships not updating inmediatly after save

Django aggregation query on related one-to-many objects

Here is my simplified model:
class Item(models.Model):
pass
class TrackingPoint(models.Model):
item = models.ForeignKey(Item)
created = models.DateField()
data = models.IntegerField()
class Meta:
unique_together = ('item', 'created')
In many parts of my application I need to retrieve a set of Item's and annotate each item with data field from latest TrackingPoint from each item ordered by created field. For example, instance i1 of class Item has 3 TrackingPoint's:
tp1 = TrackingPoint(item=i1, created=date(2010,5,15), data=23)
tp2 = TrackingPoint(item=i1, created=date(2010,5,14), data=21)
tp3 = TrackingPoint(item=i1, created=date(2010,5,12), data=120)
I need a query to retrieve i1 instance annotated with tp1.data field value as tp1 is the latest tracking point ordered by created field. That query should also return Item's that don't have any TrackingPoint's at all. If possible I prefer not to use QuerySet's extra method to do this.
That's what I tried so far... and failed :(
Item.objects.annotate(max_created=Max('trackingpoint__created'),
data=Avg('trackingpoint__data')).filter(trackingpoint__created=F('max_created'))
Any ideas?
Here's a single query that will provide (TrackingPoint, Item)-pairs:
TrackingPoint.objects.annotate(max=Max('item__trackingpoint__created')).filter(max=F('created')).select_related('item').order_by('created')
You would have to query for items without TrackingPoints separately.
This isn't directly answer to your question, but in case don't need exactly what you described you might be interested in greatest-n-per-group solution. You can take a look on my answer on similar question:
Django Query That Get Most Recent Objects From Different Categories
-- this should apply directly to your case:
items = Item.objects.annotate(tracking_point_created=Max('trackingpoint__created'))
trackingpoints = TrackingPoint.objects.filter(created__in=[b.tracking_point_created for b in items])
Note that second line can produce ambiguous results if created dates repeat in TrackingPoint model.

Categories

Resources