Getting Unique Foreign Keys in Django? - python

Suppose my model looks like this:
class Farm(models.Model):
    name = ...
class Tree(models.Model):
    farm = models.ForeignKey(Farm)
...and I get a QuerySet of Tree objects. How do I determine what farms are represented in that QuerySet?

http://docs.djangoproject.com/en/dev/ref/models/querysets/#in
# distinct() avoids getting the same farm once per matching tree
Farm.objects.filter(tree__in=TreeQuerySet).distinct()

There might be a better way to do it with the Django ORM that keeps it lazy, but you can get what you want with regular Python (off the top of my head):
>>> set(t.farm for t in qs)

Here is a way to have the database do the work for you:
farms = qs.values_list('farm', flat=True).distinct()
# values_list() is new in Django 1.0
The return value holds the primary keys of the farms rather than Farm instances, so it should evaluate to something like:
[1, 5]
where the farms will be those that have trees in that particular query set. To get Farm instances, pass the result on: Farm.objects.filter(pk__in=farms)
For all farms that have trees, use qs = Tree.objects.all()
Keep in mind that if you add order_by('some_other_column'), distinct() will apply to the distinct combinations of 'farm' and 'some_other_column', because the other column will also be in the SQL query for ordering. The same applies to a default ordering set in the model's Meta; calling order_by() with no arguments clears it. I think it's a limitation (not an intended feature) of the API; it's described in the documentation.
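The order_by() caveat is easier to see at the SQL level. The following is only a sketch using the standard-library sqlite3 module rather than Django; the tree table and its height column are hypothetical stand-ins for the Tree model and 'some_other_column':

```python
import sqlite3

# Hypothetical schema standing in for the Tree model; "height" is an invented column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tree (id INTEGER PRIMARY KEY, farm_id INTEGER, height INTEGER)"
)
conn.executemany(
    "INSERT INTO tree (farm_id, height) VALUES (?, ?)",
    [(1, 10), (1, 12), (5, 10)],
)

# DISTINCT over farm_id alone: one row per farm.
farm_ids = conn.execute(
    "SELECT DISTINCT farm_id FROM tree ORDER BY farm_id"
).fetchall()

# Once the ordering column is part of the SELECT, DISTINCT applies to the pair,
# so farm 1 shows up once per distinct height.
pairs = conn.execute(
    "SELECT DISTINCT farm_id, height FROM tree ORDER BY height, farm_id"
).fetchall()

print(farm_ids)
print(pairs)
```

The second query returns more rows than there are farms, which is the effect described above when an extra ordering column ends up in the query.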

Related

query.group_by in Django 1.9

I am moving code from Django 1.6 to 1.9.
In 1.6 I had this code
models.py
class MyReport(models.Model):
    group_id = models.PositiveIntegerField(blank=False, null=False)
views.py
query = MyReport.objects.filter(owner=request.user).query
query.group_by = ['group_id']
entries = QuerySet(query=query, model=MyReport)
The query would return one object for each 'group_id'; due to the way I use it, any table row with the group_id would do as a representative.
With 1.9 this code is broken. The query after the second line above is:
SELECT "reports_myreport"."group_id", ... etc FROM "reports_myreport" WHERE "reports_myreport"."owner_id" = 1 GROUP BY "reports_myreport"."group_id", "reports_report"."otherfield", ...
Basically it lists all the table fields in the group by clause, making the query return the whole table.
Even though in the debugger I see
query.group_by = ['group_id']
It doesn't look like query.group_by is a method in 1.9, nor do the changelogs of 1.7-1.9 suggest that something changed.
Is there a better way - not depending on internal Django stuff - I can use for my query?
Any way to fix my current query?
You can use order_by() to get the results ordered; in that same query you can order by a second criterion.
If you want to get the groups, you will need to iterate over the collection to retrieve those values.
If you consume all of the results returned by the query, you can consider:
a) itertools.groupby, which does the grouping in memory instead, but you should not use it for large data sets.
b) Manager.raw(), but you will need to write SQL inside Django, like this:
for report in MyReport.objects.raw('SELECT * FROM reports_myreport GROUP BY group_id'):
    print(report)
This will work for large data sets, but you could lose compatibility with some database engines.
Bonus: I recommend that you understand exactly what the old code did before rewriting it.
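As a rough sketch of option a), this is how itertools.groupby could pick one representative per group_id. The rows here are invented stand-in dictionaries; in Django they could come from something like MyReport.objects.filter(...).order_by('group_id').values('id', 'group_id'):

```python
from itertools import groupby
from operator import itemgetter

# Stand-in rows (hypothetical data, not from the original question).
rows = [
    {"id": 1, "group_id": 7},
    {"id": 2, "group_id": 3},
    {"id": 3, "group_id": 7},
    {"id": 4, "group_id": 3},
]

# groupby only merges adjacent items, so sort by the grouping key first.
rows.sort(key=itemgetter("group_id"))

# Keep the first row of each group as its representative.
representatives = [
    next(group) for _, group in groupby(rows, key=itemgetter("group_id"))
]
print(representatives)
```

Everything is held in memory, which is why this only suits small result sets.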

Django ORM values_list with '__in' filter performance

What is the preferred way to filter a queryset with '__in' in Django?
providers = Provider.objects.filter(age__gt=10)
consumers = Consumer.objects.filter(consumer__in=providers)
or
providers_ids = Provider.objects.filter(age__gt=10).values_list('id', flat=True)
consumers = Consumer.objects.filter(consumer__in=providers_ids)
These should be totally equivalent. Underneath the hood Django will optimize both of these to a subselect query in SQL. See the QuerySet API reference on in:
This queryset will be evaluated as subselect statement:
SELECT ... WHERE consumer.id IN (SELECT id FROM ... WHERE _ IN _)
However you can force a lookup based on passing in explicit values for the primary keys by calling list on your values_list, like so:
providers_ids = list(Provider.objects.filter(age__gt=10).values_list('id', flat=True))
consumers = Consumer.objects.filter(consumer__in=providers_ids)
This could be more performant in some cases, for example, when you have few providers, but it will be totally dependent on what your data is like and what database you're using. See the "Performance Considerations" note in the link above.
I agree with Wilduck. However, a couple of notes:
You can combine filters such as these into one, like this:
consumers = Consumer.objects.filter(consumer__age__gt=10)
This would give you the same result set in a single query.
Second, to analyze the generated query, you can use the .query attribute at the end.
Example:
print(Provider.objects.filter(age__gt=10).query)
would print the query the ORM would generate to fetch the result set.
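To make the difference between the two forms concrete, here is a sketch at the SQL level using the standard-library sqlite3 module instead of Django. The provider and consumer tables and their columns are hypothetical stand-ins for the models in the question:

```python
import sqlite3

# Hypothetical tables standing in for Provider and Consumer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE provider (id INTEGER PRIMARY KEY, age INTEGER);
    CREATE TABLE consumer (id INTEGER PRIMARY KEY, provider_id INTEGER);
    INSERT INTO provider (id, age) VALUES (1, 5), (2, 20), (3, 30);
    INSERT INTO consumer (id, provider_id) VALUES (10, 1), (11, 2), (12, 3);
""")

# Form 1: one statement with a subselect
# (roughly what passing a queryset to __in produces).
subselect = conn.execute(
    "SELECT id FROM consumer WHERE provider_id IN "
    "(SELECT id FROM provider WHERE age > 10) ORDER BY id"
).fetchall()

# Form 2: two statements, with the ids pulled into Python first
# (roughly what wrapping values_list() in list() produces).
ids = [row[0] for row in conn.execute("SELECT id FROM provider WHERE age > 10")]
placeholders = ", ".join("?" for _ in ids)
explicit = conn.execute(
    f"SELECT id FROM consumer WHERE provider_id IN ({placeholders}) ORDER BY id",
    ids,
).fetchall()

print(subselect == explicit)
```

Both forms return the same rows; which one is faster depends on the data and the database, as noted above.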

Django: Most Efficient Way To Filter A Queryset Repeatedly

I have a model that looks something like this:
class Item(models.Model):
    name = models.CharField()
    type = models.CharField()
    tags = models.ManyToManyField(Tags)
In order to render a given view, I have a view that presents a list of Items based on type. So in my view, there's a query like:
items = Item.objects.filter(type='type_a')
So that's easy and straightforward. Now I have an additional requirement for the view. To fulfill it, I need to build a dictionary that relates Tags to Items. So the output I am looking for would be something like:
{
    'tag1': [item1, item2, item5],
    'tag2': [item1, item4],
    'tag3': [item3, item5]
}
What would be the most efficient way to do this? Is there any way to do this without going to the database with a new query for each tag?
You can check prefetch_related; it might help you:
This has a similar purpose to select_related, in that both are designed to stop the deluge of database queries that is caused by accessing related objects, but the strategy is quite different... prefetch_related, on the other hand, does a separate lookup for each relationship, and does the ‘joining’ in Python. This allows it to prefetch many-to-many and many-to-one objects, which cannot be done using select_related...
So in the end you will either do multiple queries or use prefetch_related and it will do some Python joins on the objects.
You might do something like this:
# This should require two database queries, one for the items
# and one for all the associated tags.
items = Item.objects.filter(type='type_a').prefetch_related('tags')
# Now massage the data into your desired data structure.
from collections import defaultdict
tag_dict = defaultdict(list)
for item in items:
    # Thanks to prefetch_related this will not hit the database.
    for tag in item.tags.all():
        tag_dict[tag].append(item)
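The snippet above keys the dictionary by Tag instance, while the output asked for in the question uses tag names as keys. A plain-Python sketch of the same grouping keyed by name, with invented stand-in data in place of the prefetched items:

```python
from collections import defaultdict

# Stand-in data (hypothetical): each item name paired with its tag names,
# as item.tags.all() might yield them after prefetch_related('tags').
items = [
    ("item1", ["tag1", "tag2"]),
    ("item2", ["tag1"]),
    ("item3", ["tag3"]),
]

tag_dict = defaultdict(list)
for item_name, tag_names in items:
    for tag_name in tag_names:
        # Group each item under every tag name it carries.
        tag_dict[tag_name].append(item_name)

print(dict(tag_dict))
```

With real models, the key would be whichever Tag attribute holds the name, which the question does not show.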

Django - Following ForeignKey relationships "backward" for entire QuerySet

Is it possible to follow ForeignKey relationships backward for an entire QuerySet?
I mean something like this:
x = table1.objects.select_related().filter(name='foo')
x.table2.all()
when table1 has a ForeignKey to table2.
In
https://docs.djangoproject.com/en/1.2/topics/db/queries/#following-relationships-backward
I can see that it works only with get() and not filter().
Thanks
You basically want to get a QuerySet of a different type from the data you start with.
class Kid(models.Model):
    mom = models.ForeignKey('Mom')
    name = models.CharField…
class Mom(models.Model):
    name = models.CharField…
Let's say you want to get all moms having any son named Johnny.
Mom.objects.filter(kid__name='Johnny')
Let's say you want to get all kids of any Lucy.
Kid.objects.filter(mom__name='Lucy')
You should be able to use something like:
for y in x:
    # table2 stands for the reverse accessor here (table2_set by default)
    y.table2.all()
But you could also call get() with each row's unique value (which will be its id, unless you have specified a different primary key), after finding the rows using a query.
So,
x = table1.objects.select_related().filter(name='foo')
for y in x:
    z = table1.objects.select_related().get(pk=y.id)
    z.table2.all()
should also work.
You can also use values() to fetch specific values of a foreign key reference. With values the select query on the DB will be reduced to fetch only those values and the appropriate joins will be done.
To re-use the example from Krzysztof Szularz:
johnny_moms = Kid.objects.filter(name='Johnny').values('mom__id', 'mom__name').distinct()
This will return a QuerySet of dictionaries holding the requested Mom attributes, queried through the Kid manager.
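To show the shape of what such a values() query hands back, here is a sketch of the equivalent join at the SQL level with the standard-library sqlite3 module. The mom and kid tables and the sample rows are hypothetical stand-ins for the models above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each row be converted to a dict
conn.executescript("""
    CREATE TABLE mom (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE kid (
        id INTEGER PRIMARY KEY,
        name TEXT,
        mom_id INTEGER REFERENCES mom(id)
    );
    INSERT INTO mom (id, name) VALUES (1, 'Lucy'), (2, 'Anna');
    INSERT INTO kid (name, mom_id) VALUES ('Johnny', 1), ('Johnny', 1), ('Pete', 2);
""")

# Only the two requested columns are selected; DISTINCT collapses the
# duplicate produced by Lucy having two kids named Johnny.
rows = conn.execute(
    "SELECT DISTINCT mom.id AS mom__id, mom.name AS mom__name "
    "FROM kid JOIN mom ON kid.mom_id = mom.id "
    "WHERE kid.name = 'Johnny'"
).fetchall()

johnny_moms = [dict(row) for row in rows]
print(johnny_moms)
```

Each result is a plain dictionary of the selected columns, not a Mom instance.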

How to get all children of queryset in django?

I've a queryset result, say, animals, which has a list of animals. There are sub categories of animals and I want to get all the subcategories. i.e.
for single animal, I can use animal.categories which works. Now, I want to somehow do this:
categories = animals.categories
where animals is queryset. How can I achieve this?
There is no way without iterating over the query set, but you can use prefetch_related to speed things up:
all_animals = Animals.objects.prefetch_related('categories')
categories = [animal.categories.all() for animal in all_animals]
There are 2 options:
1) Following your question exactly, you can only do:
categories = []
for animal in animals:
    categories.extend(animal.categories.all())
2) However, I would run a new query on categories, like this (I do not know your exact data model and naming, but I think you get the idea):
categories = Category.objects.filter(animals__in=animals)
(Basis: if animal.categories gives a queryset, that means Category has a many-to-one relation to Animal, and the default name (category_set) has been changed to categories.)
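The two iteration-based approaches above produce differently shaped results: the list comprehension gives one collection per animal, while the extend() loop gives a single flat list. A plain-Python sketch with invented stand-in data in place of the querysets:

```python
# Stand-in data (hypothetical): animal name -> its categories,
# i.e. what animal.categories.all() might yield for each animal.
animal_categories = {
    "ant": ["insect", "small"],
    "ape": ["mammal"],
    "asp": ["reptile", "small"],
}

# List-comprehension form: one entry per animal, so a list of lists.
nested = [cats for cats in animal_categories.values()]

# extend() form: one flat list; duplicates are kept.
flat = []
for cats in animal_categories.values():
    flat.extend(cats)

# Dropping duplicates while preserving first-seen order.
unique = list(dict.fromkeys(flat))
print(unique)
```

Deduplication only matters if the same category can belong to more than one animal, which depends on the actual data model.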
It seems to me that your main query should be a Category query instead of an Animals query. Let's say you get animals like this:
Animals.objects.filter(name__startswith='A')
You could get the categories of these animals with the following query:
Category.objects.filter(animal__name__startswith='A')
You can also fetch the animals along with this query:
Category.objects.filter(animal__name__startswith='A').select_related('animal')
But I recommend using separate queries for categories and animals (this will be more memory efficient but will run two queries).
Note: If you are really using one-to-many, you should consider changing it to many-to-many.
