In Django, how do I construct a COUNT query for a ManyToManyField?
My models are as follows, and I want to get all the people whose name starts with A and who are the lord or overlord of at least one Place, and order the results by name.
class Manor(models.Model):
lord = models.ManyToManyField(Person, null=True, related_name="lord")
overlord = models.ManyToManyField(Person, null=True, related_name="overlord")
class Person(models.Model):
name = models.CharField(max_length=100)
So my query should look something like this... but how do I construct the third line?
people = Person.objects.filter(
Q(name__istartswith='a'),
Q(lord.count > 0) | Q(overlord.count > 0) # pseudocode
).order_by('name'))
Actually it's not the count you're interested in here, but just whether or not there are any members in that relationship.
Q(lord__isnull=False) | Q(overlord__isnull=False)
In this case, better resort to raw SQL.
for p in Person.objects.raw('SELECT * FROM myapp_person WHERE...'):
print p
Related
Say I have peewee models like so:
class Users(_BaseModel):
id = AutoField(primary_key=True, null=False, unique=True)
first_name = CharField(null=False)
last_name = CharField(null=False)
# Cut short for clarity
class Cohorts(_BaseModel):
id = AutoField(primary_key=True, null=False, unique=True)
name = CharField(null=False, unique=True)
# Cut short for clarity
class CohortsUsers(_BaseModel):
cohort = ForeignKeyField(Cohorts)
user = ForeignKeyField(Users)
is_primary = BooleanField(default=True)
I need to access easily from the user what cohort they are in and for example the cohort's name.
If a user could be in just one cohort, it would be easy but here, having it be many2many complicates things.
Here's what I got so far, which is pretty ugly and inefficient
Users.select(Users, CohortsUsers).join(CohortsUsers).where(Users.id == 1)[0].cohortsusers.cohort.name
Which will do what I require it to but I'd like to find a better way to do it.
Is there a way to have it so I can do Users.get_by_id(1).cohort.name ?
EDIT: I'm thinking about making methods to access them easily on my Users class but I am not really sure it's the best way of doing it nor how to go about it
If it do it like so, it's quite ugly because of the import inside the method to avoid circular imports
#property
def cohort(self):
from dst_datamodel.cohorts import CohortsUsers
return Users.select(Users, CohortsUsers).join(CohortsUsers).where(Users.id == self.id)[0].cohortsusers.cohort
But having this ugly method allows me to do Users.get_by_id(1).cohort easily
This is all covered in the documentation here: http://docs.peewee-orm.com/en/latest/peewee/relationships.html#implementing-many-to-many
You have a many-to-many relationship, where a user can be in zero, one or many cohorts, and a cohort may have zero, one or many users.
If there is some invariant where a user only has one cohort, then just do this:
# Get all cohorts for a given user id and print their name(s).
q = Cohort.select().join(CohortUsers).where(CohortUsers.user == some_user_id)
for cohort in q:
print(cohort.name)
More specific to your example:
#property
def cohort(self):
from dst_datamodel.cohorts import CohortsUsers
cohort = Cohort.select().join(CohortsUsers).where(CohortUsers.user == self.id).get()
return cohort.name
I have two models: City, and its alias CityAlias. The CityAlias model contains all the names in the City, plus the aliases. What I want is that whenever City is searched by name, the CityAlias model should be queried. This is what I've come up with:
class CityQuerySet(models.QuerySet):
""" If City is searched by name, search it in CityAlias """
def _search_name_in_alias(self, args, kwargs):
for q in args:
if not isinstance(q, models.Q): continue
for i, child in enumerate(q.children):
# q.children is a list of tuples of queries:
# [('name__iexact', 'calcutta'), ('state__icontains', 'bengal')]
if child[0].startswith('name'):
q.children[i] = ('aliases__%s' % child[0], child[1])
for filter_name in kwargs:
if filter_name.startswith('name'):
kwargs['aliases__%s' % filter_name] = kwargs.pop(filter_name)
def _filter_or_exclude(self, negate, *args, **kwargs):
# handles 'get', 'filter' and 'exclude' methods
self._search_name_in_alias(args=args, kwargs=kwargs)
return super(CityQuerySet, self)._filter_or_exclude(negate, *args, **kwargs)
class City(models.Model):
name = models.CharField(max_length=255, db_index=True)
state = models.ForeignKey(State, related_name='cities')
objects = CityQuerySet.as_manager()
class CityAlias(models.Model):
name = models.CharField(max_length=255, db_index=True)
city = models.ForeignKey(City, related_name='aliases')
Example: Kolkata will have an entry in City model, and it will have two entries in the CityAlias model: Kolkata and Calcutta. The above QuerySet allows to use lookups on the name field.
So the following two queries will return the same entry:
City.objects.get(name='Kolkata') # <City: Kolkata>
City.objects.get(name__iexact='calcutta') # <City: Kolkata>
So far so good. But the problem arises when City is a ForeignKey in some other model:
class Trip(models.Model):
destination = models.ForeignKey(City)
# some other fields....
Trip.objects.filter(destination__name='Kolkata').count() # some non-zero number
Trip.objects.filter(destination__name='Calcutta').count() # always returns zero
Django internally handles these joins differently, and doesn't call the get_queryset method of City's manager. The alternative is to call the above query as following:
Trip.objects.filter(destination=City.objects.get(name='Calcutta'))
My question is that can I do something, so that however the City model is searched by name, it always searches in the CityAlias table instead?
Or is there another better way to implement the functionality I require?
I think it is better (and more pythonic) to be explicit in what you ask for throughout instead of trying to do magic in the Manager and thus:
City.objects.get(aliases__name__iexact='calcutta') # side note: this can return many (same in original) so you need to catch that
And:
Trip.objects.filter(destination__aliases__name='Calcutta').count()
I was trying to use Custom Lookups but apparently you cannot add a table to the join list. (Well, you could add an extra({"table": ...}) in the model's manager but it's not an elegant solution).
So I'd propose you:
1) Keep always your 'main/preferred' name city also as a CityAlias. So the metadata of the city will be in City... but all the naming information will be in CityAlias. (and maybe change the names)
In this way all look-ups will happen in that table. You could have a boolean to mark which instance is the original/preferred.
class City(models.Model):
state = models.ForeignKey(State, related_name='cities')
[...]
class CityAlias(models.Model):
city = models.ForeignKey(City, related_name='aliases')
name = models.CharField(max_length=255, db_index=True)
2) If you are thinking about translations... Have you thought about django-modeltranslation app?
In this case, it would create a field for each language and it would be always better than having a join.
3) Or, if you are using PostgreSQL, and you are thinking about "different translations for the same city-name" (and I'm thinking with transliterations from Greek or Russian language), maybe you could use PostgreSQL dictionaries, trigrams with similarities, etc. Or even in this case, the 1st approach.
Speaking of keeping it simple. Why not just give the City model a char field 'CityAlias' that contains the string? If I understand your question correctly, this is the most simple solution if you only need one alias per city. It just looks to me as though you are complicating a simple problem.
class City(models.Model):
name = models.CharField(max_length=255, db_index=True)
state = models.ForeignKey(State, related_name='cities')
alias = models.CharField(max_length=255)
c = City.objects.get(alias='Kolkata')
>>>c.name
Calcutta
>>>c.alias
Kolkata
I have the following models:
class AcademicRecord(models.Model):
record_id = models.PositiveIntegerField(unique=True, primary_key=True)
subjects = models.ManyToManyField(Subject,through='AcademicRecordSubject')
...
class AcademicRecordSubject(models.Model):
academic_record = models.ForeignKey('AcademicRecord')
subject = models.ForeignKey('Subject')
language_group = IntegerCharField(max_length=2)
...
class SubjectTime(models.Model):
time_id = models.CharField(max_length=128, unique=True, primary_key=True)
subject = models.ForeignKey(Subject)
language_group = IntegerCharField(max_length=2)
...
class Subject(models.Model):
subject_id = models.PositiveIntegerField(unique=True,primary_key=True)
...
The academic records have list of subjects each with a language code and the subject times have a subject and language code.
With a given AcademicRecord, how can I get the subject times that matches with the AcademicRecordSubjects that the AcademicRecord has?
This is my approach, but it makes more queries than needed:
# record is the given AcademicRecord
times = []
for record_subject in record.academicrecordsubject_set.all():
matched_times = SubjectTime.objects.filter(subject=record_subject.subject)
current_times = matched_times.filter(language_group=record_subject.language_group)
times.append(current_times)
I want to make the query using django ORM not with raw SQL
SubjectTime language group has to match with Subject's language group aswell
I got it, in part thanks to #Robert Jørgensgaard Eng
My problem was how to do the inner join using more than 1 field, in which the F object came on handly.
The correct query is:
SubjectTime.objects.filter(subject__academicrecordsubject__academic_record=record,
subject__academicrecordsubject__language_group=F('language_group'))
Given an AcademicRecord instance academic_record, it is either
SubjectTime.objects.filter(subject__academicrecordsubject_set__academic_record=academic_record)
or
SubjectTime.objects.filter(subject__academicrecordsubject__academic_record=academic_record)
The results reflect all the rows of the join that these ORM queries become in SQL. To avoid duplicates, just use distinct().
Now this would be much easier, if I had a django shell to test in :)
I've looked at doing a query using an extra and/or annotate but have not been able to get the result I want.
I want to get a list of Products, which has active licenses and also the total number of available licenses. An active license is defined as being not obsolete, in date, and the number of licenses less the number of assigned licenses (as defined by a count on the manytomany field).
The models I have defined are:
class Vendor(models.Model):
name = models.CharField(max_length=200)
url = models.URLField(blank=True)
class Product(models.Model):
name = models.CharField(max_length=200)
vendor = models.ForeignKey(Vendor)
product_url = models.URLField(blank=True)
is_obsolete = models.BooleanField(default=False, help_text="Is this product obsolete?")
class License(models.Model):
product = models.ForeignKey(Product)
num_licenses = models.IntegerField(default=1, help_text="The number of assignable licenses.")
licensee_name = models.CharField(max_length=200, blank=True)
license_key = models.TextField(blank=True)
license_startdate = models.DateField(default=date.today())
license_enddate = models.DateField(null=True, blank=True)
is_obsolete = models.BooleanField(default=False, help_text="Is this licenses obsolete?")
licensees = models.ManyToManyField(User, blank=True)
I have tried filtering by the License model. Which works, but I don't know how to then collate / GROUP BY / aggregate the returned data into a single queryset that is returned.
When trying to filter by procuct, I can quite figure out the query I need to do. I can get bits and pieces, and have tried using a .extra() select= query to return the number of available licenses (which is all I really need at this point) of which there will be multiple licenses associated with a product.
So, the ultimate answer I am after is, how can I retrieve a list of available products with the number of available licenses in Django. I'd rather not resort to using raw as much as possible.
An example queryset that gets all the License details I want, I just can't get the product:
License.objects.annotate(
used_licenses=Count('licensees')
).extra(
select={
'avail_licenses': 'licenses_license.num_licenses - (SELECT count(*) FROM licenses_license_licensees WHERE licenses_license_licensees.license_id = licenses_license.id)'
}
).filter(
is_obsolete=False,
num_licenses__gt=F('used_licenses')
).exclude(
license_enddate__lte=date.today()
)
Thank you in advance.
EDIT (2014-02-11):
I think I've solved it in possibly an ugly way. I didn't want to make too many DB calls if I can, so I get all the information using a License query, then filter it in Python and return it all from inside a manager class. Maybe an overuse of Dict and list. Anyway, it works, and I can expand it with additional info later on without a huge amount of risk or custom SQL. And it also uses some of the models parameters that I have defined in the model class.
class LicenseManager(models.Manager):
def get_available_products(self):
licenses = self.get_queryset().annotate(
used_licenses=Count('licensees')
).extra(
select={
'avail_licenses': 'licenses_license.num_licenses - (SELECT count(*) FROM licenses_license_licensees WHERE licenses_license_licensees.license_id = licenses_license.id)'
}
).filter(
is_obsolete=False,
num_licenses__gt=F('used_licenses')
).exclude(
license_enddate__lte=date.today()
).prefetch_related('product')
products = {}
for lic in licenses:
if lic.product not in products:
products[lic.product] = lic.product
products[lic.product].avail_licenses = lic.avail_licenses
else:
products[lic.product].avail_licenses += lic.avail_licenses
avail_products = []
for prod in products.values():
if prod.avail_licenses > 0:
avail_products.append(prod)
return avail_products
EDIT (2014-02-12):
Okay, this is the final solution I have decided to go with. Uses Python to filter the results. Reduces cache calls, and has a constant number of SQL queries.
The lesson here is that for something with many levels of filtering, it's best to get as much as needed, and filter in Python when returned.
class ProductManager(models.Manager):
def get_all_available(self, curruser):
"""
Gets all available Products that are available to the current user
"""
q = self.get_queryset().select_related().prefetch_related('license', 'license__licensees').filter(
is_obsolete=False,
license__is_obsolete=False
).exclude(
license__enddate__lte=date.today()
).distinct()
# return a curated list. Need further information first
products = []
for x in q:
x.avail_licenses = 0
x.user_assigned = False
# checks licenses. Does this on the model level as it's cached so as to save SQL queries
for y in x.license.all():
if not y.is_active:
break
x.avail_licenses += y.available_licenses
if curruser in y.licensees.all():
x.user_assigned = True
products.append(x)
return q
One strategy would be to get all the product ids from your License queryset:
productIDList = list(License.objects.filter(...).values_list(
'product_id', flat=True))
and then query the products using that list of ids:
Product.objects.filter(id__in=productIDList)
I am using django-follow to allow users to "follow" objects - in this example, Actors in films.
I am pulling back a list of film actors using
actors_user_is_following = Follow.objects.get_follows(Actor).filter(user=request.user.id)
But what I also want to do is suggest films to the user based on the actors they are following. This does not need to be a complex algorithm of what they already like and suggesting relative films, just a simple "because you follow this actor and this actor is in this film, suggest it to the user"
I have this rather clunky way of doing this right now...
context['follows'] = {
'actors': Follow.objects.get_follows(Actor).filter(user=request.user.id),
'genres': Follow.objects.get_follows(Genre).filter(user=request.user.id),
}
actor_ids = []
for actor in context['follows']['actors']:
actor_ids.append(actor.target_artist_id)
genre_ids = []
for artist in context['follows']['genres']:
genre_ids.append(artist.genre_ids)
context['suggested'] = {
'films': Listing.objects.filter(Q(actors__in=actor_ids) | Q(genres__in=genre_ids))
}
Which works, but I'm sure there is a better way of doing it?
Most importantly I also want to show the user why that film as been recommended by displaying the actors or genres it features that the user is following, so the end result might be something like...
film = {
title: 'Dodgeball'
image: '/images/films/dodgeball.jpg'
followed_actors: ['Ben Stiller', 'Vince Vaughn'] #could be multiple
followed_genres: ['Comedy'] #could be multiple
}
Note I would want to return multiple films.
Here's how my models are coded up:
Film Model defined like so:
from django.db import models
from app.actors.models import Actor
from app.genres.models import Genre
class Film(models.Model):
title = models.CharField(max_length=255)
strapline = models.CharField(max_length=255)
slug = models.SlugField(max_length=100)
image_url = models.CharField(max_length=255)
pub_date = models.DateTimeField('date published')
actors = models.ManyToManyField(Actor)
genres = models.ManyToManyField(Genre)
def __unicode__(self):
return self.title
And Actor Model:
from django.db import models
from follow import utils
class Actor(models.Model):
title = models.CharField(max_length=255)
strapline = models.CharField(max_length=255)
image = models.CharField(max_length=255)
image_hero = models.CharField(max_length=255)
bio = models.TextField()
def __unicode__(self):
return self.title
#followable
utils.register(Actor)
Behind the scenes, Follow objects are essentially a many-to-many relationship with fields added each time you register a model.
Your question just talks about actors, but your code also includes genres. It's not especially hard to cover both, I'm just not sure which way is the way you want it.
I think you can get your film objects in one queryset:
films = Film.objects.filter(Q(actors__in=Actor.objects.filter(follow_set__user=request.user)) |
Q(genres__in=Genre.objects.filter(follow_set__user=request.user))).distinct()
As noted in the docs for __in lookups, some database back ends will give you better performance if you evaluate the subqueries before using them:
actor_ids = list(Actor.objects.filter(follow_set__user=request.user).values_list('id', flat=True))
genre_ids = list(Genre.objects.filter(follow_set__user=request.user).values_list('id', flat=True))
films = Film.objects.filter(Q(actors__in=actor_ids) | Q(genres__in=genre_ids)).distinct()
If you just want to return the matching films, I think those are the most concise way to express it.
For the part where you're adding the reasons to the films - I don't see a more elegant way to handle that than to iterate through the films queryset and add the information by hand. I would definitely define the querysets for actor_ids and genre_ids before doing so, although whether or not I evaluated them early would still depend on the db back end.
annotated_films = []
for film in films:
film.followed_actors = film.actors.filter(id__in=actor_ids)
film.followed_genres = film.genres.filter(id__in=genre_ids)
annotated_films.append(film)