Assume the following set of models:
class A(models.Model):
pass
class B(models.Model):
pass
class M2M(models.Model):
a = models.ForeignKey(A)
b = models.ForeignKey(B)
A way to filter (this is a part in the chain of a larger application) by some conditions on the links, in naive Django ORM is to do this:
def fun():
as = A.objects.filter("some complex queryset")
m2ms = M2M.objects.filter("some complex B-dependent QS")
return as.filter(id__in=[m.a_id for m in m2ms])
But obviously this produces a rather awful query "id__in", and clearly executes as two queries.
Is there a better way to get Django to produce a proper join?
You should explicitly declare a many-to-many field from A to B through M2M.
class A(models.Model):
bs = models.ManyToManyField('B', through='M2M')
Now you can simply do:
A.objects.filter(condition_on_A='foo').filter(b__condition_on_b='bar')
You would be able to achieve this in a single query. For example, lets say you want to filter only those records which have a value greater than 50 on field x in the model B, you would do:
A.objects.filter("some-filter-criteria", m2m__b__x__gt=50)
You can read more on related name lookups here
Related
If I have 3 models, like:
class Cow(models.Model):
name =
number_of_eyes =
number_of_feet =
color =
class Pig(models.Model):
name =
number_of_eyes =
number_of_feet =
intelligence =
class Horse(models.Model):
name =
number_of_eyes =
number_of_hooves =
weight_capacity =
speed =
And I'm interested in making a single Livestock table in my template that has instances of all 3, but I'm only interested in these columns that all 3 models have:
name
number_of_eyes
number_of_feet (number_of_hooves if Horse)
And we can ignore all other columns.
How can I join them into a single queryset?
The end goal is to get a single virtual table (queryset) that I can do a few other operations on (filter, order_by, slice), and then return the data in just those columns.
Is this possible in the Django ORM?
I think you have two options:
using itertools.chain:
from itertools import chain
cows = Cow.objects.all()
pigs = Pig.objects.all()
horses = Horse.objects.all()
livestock_list = sorted(
chain(cows, pigs, horses),
key=lambda livestock: livestock.created_at, reverse=True)
)
using contenttypes:
from django.contrib.contenttypes.models import ContentType
from django.contrib.contenttypes.fields import GenericForeignKey
class Livestock(models.Model):
content_type = models.ForeignKey(ContentType)
object_id = models.PositiveIntegerField()
content_object = GenericForeignKey('content_type', 'object_id')
created = models.DateTimeField(auto_now_add=True)
class Meta:
ordering = ['-created']
Now you can query Livestock model like any other model in Django, but you can have a foreign key that can refers to n models. that's what contenttypes do.
Livestock.content_object gives you what you want in your case it can be Cow, Pig or Horse.
Just remember to add objects to Livestock model after you create horse, etc instances. you need to add them in 2 models actually. you can do it with signals.
I think the second solution is better.
Apparently this can also be done using a Union, as suggested by Nick ODell:
from django.db.models import F
Cow.objects.filter(...).union(
Pig.objects.filter(...),
Horse.objects.filter(...).annotate(number_of_feet=F("number_of_hooves"))
).values('name', 'number_of_eyes', 'number_of_feet').order_by('name')[:3]
Unfortunately you can't filter on the resulting queryset after the union, so you need to filter each queryset before the union, but other than that, everything seems to work in my quick test.
From what I understand, the difference here from MojixCoder's suggestion of using ContentType is that you don't need to maintain a separate definition of this virtual table in your Django models module. In some cases, that can be an advantage, as you don't need to keep the module updated when you get new models you want to include in your query, but in other cases, it can be a disadvantage, because my way has a lot of typing every time you want to use this query, whereas in MojixCoder's example, you define it once, and your queries would be much shorter.
Edit: Using annotate and union can result in the results being out of order. Special care must be taken to ensure this doesn't happen
I was wondering if there is a shortcut to getting all fields from a Django model and only defining additional fields that are retrieved through a join (or multiple joins).
Consider models like the following:
class A(models.Model):
text = models.CharField(max_length=10, blank=True)
class B(models.Model):
a = models.ForeignKey(A, null=True, on_delete=models.CASCADE)
y = models.PositiveIntegerField(null=True)
Now I can use the values() function like this
B.objects.values('y', 'a__text')
to get tuples containing the specified values from the B model and the actual field from the A model. If I only use
B.objects.values()
I only get tuples containing fields from the B model (i.e., y and the foreign key id a). Let's assume a scenario where B and A have many fields, and I am interested in all of those belonging to B but only in a single field from A. Manually specifying all the field names in the values() call would be possible, but tedious and error-prone.
So is there a way to specify that I want all local fields, but only a (few) specific joined field(s)?
Note: I'm currently using Django 1.11, but if a solution only works with a more recent version I am interested in that too.
You can use prefetch_related for this. See docs:
You want to use performance optimization techniques like deferred
fields:
queryset = Pizza.objects.only('name')
restaurants = Restaurant.objects.prefetch_related(Prefetch('best_pizza', queryset=queryset))
In your case you can do something like this:
from django.db.models import Prefetch
queryset = A.objects.only('text')
b_list = B.objects.prefetch_related(Prefetch('a', queryset=queryset))
Maybe something like this would work in your case?
B.objects.select_related('a').defer('a__field_to_lazy_load');
This will load all fields from both models except the ones you specify in defer(), where you can use the usual Django double underscore convention to traverse the relationship.
The fields you specify in defer() won't be loaded from the db but they will be if you try to access them later on (e.g. in a template).
I'm essentially trying to come up with my own inheritance scheme because Django's inheritance doesn't fit my needs.
I'd like parent table(class) hold common data fields.
sub classess would have its own additional data in a separate table.
class ProductBase(models.Model):
common = models.IntegerField()
def get_price(self):
return some_price
class FooProduct(ProductBase):
# no field because I'm proxy
class Meta:
proxy = True
def get_price(self):
return price_using_different_logic
class FooExtra(models.Model):
base = models.OneToOneField(ProductBase, primary_key=True)
phone = models.CharField(max_length=10)
My question is, would it be able to treat as if Foo has FooExtra's fields?
I'd like to do things like following..
foo = FooProduct.objects.create()
foo.phone = "3333" # as django does with its multiple inheritance
foo.save()
FooProduct.objects.filter(phone="3333")
I'd like to list Products of different kind(data)
I need to list them together, so abstract Base inheritance is out
from the list, I'd like to treat each model as polymorphic model, when iterating over ProductBase.objects.all(), product.get_price() will use appropriate classe's method. (without incurring join if don't have to)
When and only when I want, I retrieve the addtional table data (by something like .select_related('fooextra')
Django-polymorphic is close to what I want, but it is rather obscure what it does so I'm afraid to use it, and I think it fails #3.
If I understand well, you want inheritance and you want the fields that are specific to the child class to be on a separate table.
As far as I know, you don't need a proxy class to achieve that, you could just implement multi-table inheritance as specified in the manual at https://docs.djangoproject.com/en/1.9/topics/db/models/#multi-table-inheritance e.g.:
class Base(models.Model):
common = models.IntegerField()
class Foo(Base):
phone = models.CharField(max_length=10)
This, as explained at the link above, will automatically create a one-to-one relationship. And of course you can do foo.phone = "3333" (where foo is of type Foo) as in your example above. And the neat thing is that you can also access foo.common whereas in your example it would have been foo.base.common.
It doesn't seem like you want anything different to Django's standard inheritance.
class ProductBase(models.Model):
common1 = models.IntegerField()
common2 = models.IntegerField()
class FooProduct(ProductBase):
fooextra = models.IntegerField()
class BarProduct(ProductBase):
barextra = models.IntegerField()
If you create instances of each:
foo1 = FooProduct(common1=1, common2=1, fooextra=1)
foo2 = FooProduct(common1=1, common2=1, fooextra=2)
bar1 = BarProduct(common1=1, common2=1, barextra=1)
bar2 = BarProduct(common1=1, common2=1, barextra=2)
You can loop over all products:
for product in ProductBase.objects.all():
print product.common1, product.common2
From a ProductBase object that is actually a FooProduct, you can get the custom field with:
product.foo.fooextra
From a ProductBase object that is actually a BarProduct, you can get the custom field with:
product.bar.barextra
You can still do querying:
foo = FooProduct.objects.get(fooextra=1)
bar = BarProduct.objects.get(barextra=2)
And you can access the common fields directly on those objects:
foo.common1
bar.common2
You can use the InheritanceManager from django-model-utils if you need more control over querying etc - and this should address point 3, too: ProductBase.objects.filter(...).select_subclasses() would give you the FooProduct and BarProduct objects instead of ProductBase objects.
In order to minimize bugs in competitive access to instances of django models, I'm using "only" method of QuerySet. For example, I have models like:
#models.py
class MyModel1(models.Model):
...
class MyModel2(models.Model):
field_1 = models.ForeignKey(MyModel1)
...
I'm using "only" in this way:
instances = list(MyModel2.objects.filter(...).only('id'))
Will field "field_1_id" of instances be loaded at this moment (not 'field_1', exactly 'field_1_id')?
only() method loads only the id of model MyModel2.
Check out using the values_list() method, which can return a list of values if you have only one field you need and you pass it flat = True:
instances = MyModel2.objects.filter(...).values_list('id', flat=True)
Here is my simplified model:
class Item(models.Model):
pass
class TrackingPoint(models.Model):
item = models.ForeignKey(Item)
created = models.DateField()
data = models.IntegerField()
class Meta:
unique_together = ('item', 'created')
In many parts of my application I need to retrieve a set of Item's and annotate each item with data field from latest TrackingPoint from each item ordered by created field. For example, instance i1 of class Item has 3 TrackingPoint's:
tp1 = TrackingPoint(item=i1, created=date(2010,5,15), data=23)
tp2 = TrackingPoint(item=i1, created=date(2010,5,14), data=21)
tp3 = TrackingPoint(item=i1, created=date(2010,5,12), data=120)
I need a query to retrieve i1 instance annotated with tp1.data field value as tp1 is the latest tracking point ordered by created field. That query should also return Item's that don't have any TrackingPoint's at all. If possible I prefer not to use QuerySet's extra method to do this.
That's what I tried so far... and failed :(
Item.objects.annotate(max_created=Max('trackingpoint__created'),
data=Avg('trackingpoint__data')).filter(trackingpoint__created=F('max_created'))
Any ideas?
Here's a single query that will provide (TrackingPoint, Item)-pairs:
TrackingPoint.objects.annotate(max=Max('item__trackingpoint__created')).filter(max=F('created')).select_related('item').order_by('created')
You would have to query for items without TrackingPoints separately.
This isn't directly answer to your question, but in case don't need exactly what you described you might be interested in greatest-n-per-group solution. You can take a look on my answer on similar question:
Django Query That Get Most Recent Objects From Different Categories
-- this should apply directly to your case:
items = Item.objects.annotate(tracking_point_created=Max('trackingpoint__created'))
trackingpoints = TrackingPoint.objects.filter(created__in=[b.tracking_point_created for b in items])
Note that second line can produce ambiguous results if created dates repeat in TrackingPoint model.