I am using django-follow to allow users to "follow" objects - in this example, Actors in films.
I am pulling back a list of film actors using
actors_user_is_following = Follow.objects.get_follows(Actor).filter(user=request.user.id)
But what I also want to do is suggest films to the user based on the actors they are following. This does not need to be a complex algorithm that looks at what they already like and suggests related films, just a simple "because you follow this actor and this actor is in this film, suggest it to the user".
I have this rather clunky way of doing this right now...
context['follows'] = {
    'actors': Follow.objects.get_follows(Actor).filter(user=request.user.id),
    'genres': Follow.objects.get_follows(Genre).filter(user=request.user.id),
}
actor_ids = []
for actor in context['follows']['actors']:
    actor_ids.append(actor.target_artist_id)
genre_ids = []
for artist in context['follows']['genres']:
    genre_ids.append(artist.genre_ids)
context['suggested'] = {
    'films': Listing.objects.filter(Q(actors__in=actor_ids) | Q(genres__in=genre_ids))
}
Which works, but I'm sure there is a better way of doing it?
Most importantly, I also want to show the user why that film has been recommended by displaying the actors or genres it features that the user is following, so the end result might be something like...
film = {
    'title': 'Dodgeball',
    'image': '/images/films/dodgeball.jpg',
    'followed_actors': ['Ben Stiller', 'Vince Vaughn'],  # could be multiple
    'followed_genres': ['Comedy'],  # could be multiple
}
Note I would want to return multiple films.
Here's how my models are coded up:
Film Model defined like so:
from django.db import models
from app.actors.models import Actor
from app.genres.models import Genre
class Film(models.Model):
    title = models.CharField(max_length=255)
    strapline = models.CharField(max_length=255)
    slug = models.SlugField(max_length=100)
    image_url = models.CharField(max_length=255)
    pub_date = models.DateTimeField('date published')
    actors = models.ManyToManyField(Actor)
    genres = models.ManyToManyField(Genre)
    def __unicode__(self):
        return self.title
And Actor Model:
from django.db import models
from follow import utils
class Actor(models.Model):
    title = models.CharField(max_length=255)
    strapline = models.CharField(max_length=255)
    image = models.CharField(max_length=255)
    image_hero = models.CharField(max_length=255)
    bio = models.TextField()
    def __unicode__(self):
        return self.title
# followable
utils.register(Actor)
Behind the scenes, Follow objects are essentially a many-to-many relationship with fields added each time you register a model.
Your question just talks about actors, but your code also includes genres. It's not especially hard to cover both, I'm just not sure which way is the way you want it.
I think you can get your film objects in one queryset:
films = Film.objects.filter(
    Q(actors__in=Actor.objects.filter(follow_set__user=request.user)) |
    Q(genres__in=Genre.objects.filter(follow_set__user=request.user))
).distinct()
As noted in the docs for __in lookups, some database back ends will give you better performance if you evaluate the subqueries before using them:
actor_ids = list(Actor.objects.filter(follow_set__user=request.user).values_list('id', flat=True))
genre_ids = list(Genre.objects.filter(follow_set__user=request.user).values_list('id', flat=True))
films = Film.objects.filter(Q(actors__in=actor_ids) | Q(genres__in=genre_ids)).distinct()
If you just want to return the matching films, I think those are the most concise ways to express it.
For the part where you're adding the reasons to the films - I don't see a more elegant way to handle that than to iterate through the films queryset and add the information by hand. I would definitely define the querysets for actor_ids and genre_ids before doing so, although whether or not I evaluated them early would still depend on the db back end.
annotated_films = []
for film in films:
    film.followed_actors = film.actors.filter(id__in=actor_ids)
    film.followed_genres = film.genres.filter(id__in=genre_ids)
    annotated_films.append(film)
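The "reason" step above boils down to a set intersection between what the user follows and what each film features. Here is a minimal plain-Python sketch of that idea, independent of Django; the function name `explain_suggestions`, the dictionary layout and the sample films are assumptions made for illustration, not part of django-follow or of the models above.

```python
def explain_suggestions(films, followed_actors, followed_genres):
    """Return one dict per suggested film, listing why it was suggested."""
    followed_actors = set(followed_actors)
    followed_genres = set(followed_genres)
    suggestions = []
    for film in films:
        # Intersect what the film features with what the user follows.
        actors = sorted(followed_actors & set(film["actors"]))
        genres = sorted(followed_genres & set(film["genres"]))
        if actors or genres:  # same meaning as the Q(...) | Q(...) filter
            suggestions.append({
                "title": film["title"],
                "followed_actors": actors,
                "followed_genres": genres,
            })
    return suggestions


# Hypothetical sample data standing in for Film rows.
films = [
    {"title": "Dodgeball", "actors": ["Ben Stiller", "Vince Vaughn"], "genres": ["Comedy"]},
    {"title": "Alien", "actors": ["Sigourney Weaver"], "genres": ["Horror"]},
]
suggested = explain_suggestions(films, ["Ben Stiller", "Vince Vaughn"], ["Comedy"])
```

With real models, the per-film `filter()` calls in the loop cost two queries per film, so on long lists it may be worth fetching the related rows up front and intersecting in memory like this.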
Related
Using the models from https://docs.djangoproject.com/en/dev/topics/db/queries/#making-queries with minor modifications:
from django.db import models
class Blog(models.Model):
    name = models.CharField(max_length=100)
class Author(models.Model):
    name = models.CharField(max_length=200)
    joined = models.DateField()
    def __str__(self):
        return self.name
class Entry(models.Model):
    blog = models.ForeignKey(Blog, on_delete=models.CASCADE)
    headline = models.CharField(max_length=255)
    authors = models.ManyToManyField(Author)
    rating = models.IntegerField()
I would like to create a dictionary from Author to Entries, where the Author joined this year, and the Entry has a rating of 4 or better. The structure of the resulting dict should look like:
author_entries = {author1: [set of entries], author2: [set of entries], etc.}
while hitting the database only three or so times (or at least a number of times that is not proportional to the number of Authors or Entries).
My first attempt (db hits == number of authors, 100 authors 100 db-hits):
res = {}
authors = Author.objects.filter(joined__year=date.today().year)
for author in authors:
    res[author] = set(author.entry_set.filter(rating__gte=4))
Second attempt, trying to read the entries in one go:
res = {}
authors = Author.objects.filter(joined__year=date.today().year)
entries = Entry.objects.select_related().filter(rating__gte=4, authors__in=authors)
for author in authors:
    res[author] = {e for e in entries if e.authors.filter(pk=author.pk)}
This one is even worse: 100 authors, 198 db-hits. (The original second attempt used {e for e in entries if author in e.authors}, but Django wouldn't have it.)
The only method I've found involves raw-sql (4 db-hits):
res = {}
_authors = Author.objects.filter(joined__year=date.today().year)
_entries = Entry.objects.select_related().filter(rating__gte=4, authors__in=_authors)
authors = {a.id: a for a in _authors}
entries = {e.id: e for e in _entries}
c = connection.cursor()
c.execute("""
    select entry_id, author_id
    from sampleapp_entry_authors
    where author_id in (%s)
""" % ','.join(str(v) for v in authors.keys()))
res = {a: set() for a in _authors}
for eid, aid in c.fetchall():
    if eid in entries:
        res[authors[aid]].add(entries[eid])
(apologies for using string substitutions in the c.execute(..) call -- I couldn't find the syntax sqlite wanted for a where in ? call).
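On the placeholder syntax: with Python's own `sqlite3` module, a `where ... in` clause takes one `?` per value, so the placeholder list has to be generated to match the number of ids. The sketch below shows that with an in-memory database; the table `entry_authors` and its rows are made up for the example. Note that Django's `connection.cursor()` expects `%s` placeholders regardless of the backend, so there the same idea would use `%s` instead of `?`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entry_authors (entry_id INTEGER, author_id INTEGER);
    INSERT INTO entry_authors VALUES (1, 10), (2, 10), (3, 20), (4, 30);
""")

author_ids = [10, 30]
# One "?" per value; only the placeholders are interpolated, never the data.
placeholders = ", ".join("?" for _ in author_ids)
sql = (
    "SELECT entry_id, author_id FROM entry_authors "
    f"WHERE author_id IN ({placeholders}) ORDER BY entry_id"
)
rows = conn.execute(sql, author_ids).fetchall()
```

The values travel separately from the SQL text, which avoids the quoting and injection problems of building the id list into the string by hand.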
Is there a more Djangoesque way to do this?
I've created a git repo with the code I'm using (https://github.com/thebjorn/revm2m), the tests are in https://github.com/thebjorn/revm2m/blob/master/revm2m/sampleapp/tests.py
You can use a Prefetch-object [Django-doc] for that:
from django.db.models import Prefetch
good_ratings = Prefetch(
'entry_set',
queryset=Entry.objects.filter(rating__gte=4),
to_attr='good_ratings'
)
authors = Author.objects.filter(
joined__year=date.today().year
).prefetch_related(
good_ratings
)
Now the Author objects in authors will have an extra attribute good_ratings (the value of the to_attr of the Prefetch object) that holds the preloaded Entry objects with a rating greater than or equal to four. Note that with to_attr, Django stores the prefetched result in a list rather than a QuerySet.
So you can post-process these like:
res = {
author: set(author.good_ratings)
for author in authors
}
That said, since the Author objects from this QuerySet (not Author objects in general) already carry the attribute, building the dictionary is probably not of much use anyway.
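To make the query count plausible: `prefetch_related` runs a separate lookup for the related objects and does the joining in Python. The following is a rough plain-Python sketch of that joining step, not Django's actual implementation; the function `group_entries_by_author` and the sample rows are assumptions for illustration.

```python
def group_entries_by_author(authors, entries, links):
    """Join authors to pre-filtered entries using (entry_id, author_id) link rows."""
    entries_by_id = {entry["id"]: entry for entry in entries}
    grouped = {author["id"]: [] for author in authors}
    for entry_id, author_id in links:
        # Skip links to authors or entries that were filtered out earlier.
        if author_id in grouped and entry_id in entries_by_id:
            grouped[author_id].append(entries_by_id[entry_id]["headline"])
    return grouped


# Hypothetical rows: authors who joined this year, entries with rating >= 4,
# and the many-to-many link table (entry 12 is low-rated, so it is absent).
authors = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bo"}]
entries = [{"id": 10, "headline": "A", "rating": 5},
           {"id": 11, "headline": "B", "rating": 4}]
links = [(10, 1), (11, 1), (12, 2), (10, 2)]
grouped = group_entries_by_author(authors, entries, links)
```

Each input list corresponds to one query, so the work stays constant in the number of round trips however many authors there are.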
I can't seem to isolate a single record from this query:
subcust = OwnerCustom.objects.get(carcustom=ncset, owner=sset)
This is the error:
OwnerCustom matching query does not exist
In the actual data, there is exactly one matching record in OwnerCustom for each record in CarCustom. It's supposed to be a kind of many-to-many where there are standard differences listed in CarCustom for each Car, and each owner may maintain their own customizations (overrides) or those default OwnerCustom entries.
Note that there are many different Owners of the same Car. And of course, I'm not actually doing cars; this is a renaming from the original purpose.
Here's the relevant models:
class Car(models.Model):
    car_name = models.CharField(max_length=50)
class CarCustom(models.Model):
    car = models.ForeignKey(Car, models.PROTECT)
class Owner(models.Model):
    car = models.ForeignKey(Car, models.PROTECT)
class OwnerCustom(models.Model):
    owner = models.ForeignKey(Owner, models.PROTECT)
    carcustom = models.ForeignKey(CarCustom, models.PROTECT)
    name = models.CharField(max_length=50)
And the code:
car_queryset = Car.objects.filter(car_name="fancy car")
for nset in car_queryset:
    owner_queryset = Owner.objects.filter(car=nset)
    for sset in owner_queryset:
        carcustom_queryset = CarCustom.objects.filter(car=nset)
        for ncset in carcustom_queryset:
            subcust = OwnerCustom.objects.get(carcustom=ncset, owner=sset)
I've tried stuff like:
subcust = OwnerCustom.objects.filter(carcustom=ncset, owner=sset).first()
Which gives me a NoneType, and then tried:
subcust = OwnerCustom.objects.filter(carcustom=ncset, owner=sset)[:1].get()
Which gives "matching query does not exist" and this:
subcust = OwnerCustom.objects.filter(carcustom=ncset, owner=sset)[0]
Gives "list index out of range"
UPDATE: I CAN get a working function by using code like this, but I would think since there is only one (guaranteed by application) matching record possible for OwnerCustom.objects.filter(carcustom=ncset, owner=sset) that I could find a better way to fetch it:
car_queryset = Car.objects.filter(car_name="fancy car")
for nset in car_queryset:
    owner_queryset = Owner.objects.filter(car=nset)
    for sset in owner_queryset:
        carcustom_queryset = CarCustom.objects.filter(car=nset)
        for ncset in carcustom_queryset:
            subcust_queryset = OwnerCustom.objects.filter(carcustom=ncset, owner=sset)
            for subcust in subcust_queryset:
                logger.info(subcust.name)
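One observation on the symptoms above: `get()` raising "matching query does not exist", `first()` returning None, and `[0]` raising an index error all mean the same thing, namely that the filter matched no rows for at least one (carcustom, owner) pair. The nested loop with the inner `for` works because it silently skips those empty pairs. A small plain-Python sketch for finding which pairs are missing is shown below; the function name `missing_pairs` and the ids are hypothetical.

```python
from itertools import product


def missing_pairs(owner_ids, carcustom_ids, existing_pairs):
    """List every (owner, carcustom) combination that has no OwnerCustom row."""
    existing = set(existing_pairs)
    return [
        (owner_id, carcustom_id)
        for owner_id, carcustom_id in product(owner_ids, carcustom_ids)
        if (owner_id, carcustom_id) not in existing
    ]


# Hypothetical ids: two owners and two car customizations for one car,
# but only three of the four combinations exist as OwnerCustom rows.
gaps = missing_pairs([1, 2], [10, 11], [(1, 10), (1, 11), (2, 10)])
```

If such gaps are expected in the data, the lookup has to tolerate an empty result; if they are not, the list points at the rows that need to be created.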
I'm fairly new to Django and have worked through some Test Driven Development. I try to adhere to the principles of TDD, but there are some contexts where I don't know how to proceed (like the model below). I have a model that is very similar to what I show here. Essentially, the idea is to construct a book. Sometimes that book consists of chapters; other times the book has chapters and other books included. So my question is really about how to test my way to a model similar to the one below, with the same functionality. I've tested this model in my Python shell and it outputs what I expected, but I would like something more robust.
I also want to be able to use this model in the core of my project and need to be able to test it as I continue to build on top. What would be some good example unit tests for testing a model like this? Or any advice on where to look for tests that work with ContentType and other abstract models? Thanks!
from django.db import models
from django.contrib.contenttypes import generic
from django.contrib.contenttypes.models import ContentType
class Element(models.Model):
    title = models.TextField()
    class Meta:
        abstract = True
class Chapter(Element):
    body = models.TextField()
    def __unicode__(self):
        return self.body
class Book(Element):
    description = models.TextField()
    def __unicode__(self):
        return self.description
class BookElement(models.Model):
    protocol_id = models.PositiveIntegerField()
    element_content_type = models.ForeignKey(ContentType)
    element_id = models.PositiveIntegerField()
    element = generic.GenericForeignKey('element_content_type', 'element_id')
    # Sort order, heavy things sink.
    element_weight = models.PositiveIntegerField()
    def __unicode__(self):
        # Three values need three placeholders.
        return u'%s %s %s' % (self.protocol_id, self.element, self.element_weight)
Update
Here is a test I worked out for entering elements into the database and retrieving them. It works, but it seems long and tests more than one thing. If there is a better way, I am open to suggestions.
class BookAndChapterModelTest(TestCase):
    def test_saving_and_retrieving_book_elements(self):
        # input data objects and save
        book = Book()
        book.title = "First book"
        book.description = "Testing, round one"
        book.save()
        first_chapter = Chapter()
        first_chapter.title = 'step 1'
        first_chapter.body = 'This is step 1'
        first_chapter.save()
        second_chapter = Chapter()
        second_chapter.title = 'step 2'
        second_chapter.body = 'This is step 2'
        second_chapter.save()
        # link content types to chapter or book model
        chapter_ct = ContentType.objects.get_for_model(first_chapter)
        book_ct = ContentType.objects.get_for_model(book)
        # Assemble BookElement order by weight of chapter or book
        BookElement.objects.create(
            book_id=book.pk,
            element_content_type=chapter_ct,
            element_id=first_chapter.pk,
            element_weight='1')
        BookElement.objects.create(
            book_id=book.pk,
            element_content_type=chapter_ct,
            element_id=second_chapter.pk,
            element_weight='2')
        BookElement.objects.create(
            book_id=book.pk,
            element_content_type=book_ct,
            element_id=book.pk,
            element_weight='3')
        # Test number of elements
        saved_book_element = BookElement.objects.all()
        self.assertEqual(saved_book_element.count(), 3)
        # Test elements are in the proper position based on weighting
        first_book_element = saved_book_element[0]
        self.assertEqual(str(first_book_element), 'This is step 1')
        third_book_element = saved_book_element[2]
        self.assertEqual(str(third_book_element), "Testing, round one")
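One way to shorten a test like this is to move the shared fixture into `setUp` and give each claim its own test method. The sketch below shows the shape with the standard `unittest` module and an in-memory stand-in; `FakeBookElement` and `ordered` are hypothetical names, not part of the models above. In the Django version the fixture would be created in `setUp`, and the ordering would come from an explicit `order_by('element_weight')`, since a plain `.all()` on a model without a default ordering does not guarantee any order.

```python
import unittest
from dataclasses import dataclass


@dataclass
class FakeBookElement:
    """Hypothetical in-memory stand-in for a BookElement row."""
    label: str
    element_weight: int


def ordered(elements):
    """Sort explicitly by weight instead of relying on insertion order."""
    return sorted(elements, key=lambda element: element.element_weight)


class BookElementOrderingTest(unittest.TestCase):
    def setUp(self):
        # Created out of order on purpose so the ordering tests mean something.
        self.elements = [
            FakeBookElement("Testing, round one", 3),
            FakeBookElement("This is step 1", 1),
            FakeBookElement("This is step 2", 2),
        ]

    def test_all_elements_are_present(self):
        self.assertEqual(len(self.elements), 3)

    def test_lightest_element_comes_first(self):
        self.assertEqual(ordered(self.elements)[0].label, "This is step 1")

    def test_heaviest_element_sinks_to_the_end(self):
        self.assertEqual(ordered(self.elements)[-1].label, "Testing, round one")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(BookElementOrderingTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

When one of these fails, the test name already says which property broke, which is harder to see in a single long test.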
factory_boy to the rescue! It makes testing fun.
http://factoryboy.readthedocs.org/en/latest/examples.html
I've looked at doing a query using an extra and/or annotate but have not been able to get the result I want.
I want to get a list of Products that have active licenses, along with the total number of available licenses. An active license is defined as being not obsolete, in date, and having a positive number of licenses left after subtracting the number of assigned licenses (as defined by a count on the many-to-many field).
The models I have defined are:
class Vendor(models.Model):
    name = models.CharField(max_length=200)
    url = models.URLField(blank=True)
class Product(models.Model):
    name = models.CharField(max_length=200)
    vendor = models.ForeignKey(Vendor)
    product_url = models.URLField(blank=True)
    is_obsolete = models.BooleanField(default=False, help_text="Is this product obsolete?")
class License(models.Model):
    product = models.ForeignKey(Product)
    num_licenses = models.IntegerField(default=1, help_text="The number of assignable licenses.")
    licensee_name = models.CharField(max_length=200, blank=True)
    license_key = models.TextField(blank=True)
    # Pass the callable: date.today() would be evaluated once, at import time.
    license_startdate = models.DateField(default=date.today)
    license_enddate = models.DateField(null=True, blank=True)
    is_obsolete = models.BooleanField(default=False, help_text="Is this licenses obsolete?")
    licensees = models.ManyToManyField(User, blank=True)
I have tried filtering by the License model. Which works, but I don't know how to then collate / GROUP BY / aggregate the returned data into a single queryset that is returned.
When trying to filter by product, I can't quite figure out the query I need. I can get bits and pieces, and have tried using an .extra() select= query to return the number of available licenses (which is all I really need at this point); there will be multiple licenses associated with a product.
So, the ultimate answer I am after is, how can I retrieve a list of available products with the number of available licenses in Django. I'd rather not resort to using raw as much as possible.
An example queryset that gets all the License details I want, I just can't get the product:
License.objects.annotate(
    used_licenses=Count('licensees')
).extra(
    select={
        'avail_licenses': 'licenses_license.num_licenses - (SELECT count(*) FROM licenses_license_licensees WHERE licenses_license_licensees.license_id = licenses_license.id)'
    }
).filter(
    is_obsolete=False,
    num_licenses__gt=F('used_licenses')
).exclude(
    license_enddate__lte=date.today()
)
Thank you in advance.
EDIT (2014-02-11):
I think I've solved it in possibly an ugly way. I didn't want to make too many DB calls if I can, so I get all the information using a License query, then filter it in Python and return it all from inside a manager class. Maybe an overuse of Dict and list. Anyway, it works, and I can expand it with additional info later on without a huge amount of risk or custom SQL. And it also uses some of the models parameters that I have defined in the model class.
class LicenseManager(models.Manager):
    def get_available_products(self):
        licenses = self.get_queryset().annotate(
            used_licenses=Count('licensees')
        ).extra(
            select={
                'avail_licenses': 'licenses_license.num_licenses - (SELECT count(*) FROM licenses_license_licensees WHERE licenses_license_licensees.license_id = licenses_license.id)'
            }
        ).filter(
            is_obsolete=False,
            num_licenses__gt=F('used_licenses')
        ).exclude(
            license_enddate__lte=date.today()
        ).prefetch_related('product')
        products = {}
        for lic in licenses:
            if lic.product not in products:
                products[lic.product] = lic.product
                products[lic.product].avail_licenses = lic.avail_licenses
            else:
                products[lic.product].avail_licenses += lic.avail_licenses
        avail_products = []
        for prod in products.values():
            if prod.avail_licenses > 0:
                avail_products.append(prod)
        return avail_products
EDIT (2014-02-12):
Okay, this is the final solution I have decided to go with. Uses Python to filter the results. Reduces cache calls, and has a constant number of SQL queries.
The lesson here is that for something with many levels of filtering, it's best to get as much as needed, and filter in Python when returned.
class ProductManager(models.Manager):
    def get_all_available(self, curruser):
        """
        Gets all available Products that are available to the current user
        """
        q = self.get_queryset().select_related().prefetch_related('license', 'license__licensees').filter(
            is_obsolete=False,
            license__is_obsolete=False
        ).exclude(
            license__enddate__lte=date.today()
        ).distinct()
        # return a curated list. Need further information first
        products = []
        for x in q:
            x.avail_licenses = 0
            x.user_assigned = False
            # checks licenses. Does this on the model level as it's cached so as to save SQL queries
            for y in x.license.all():
                if not y.is_active:
                    break
                x.avail_licenses += y.available_licenses
                if curruser in y.licensees.all():
                    x.user_assigned = True
            products.append(x)
        return q
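The "fetch once, then filter in Python" idea can be shown without Django. The sketch below sums the free seats per product from already-loaded license records, applying the same three rules as in the question (not obsolete, not past its end date, seats remaining). The function `available_by_product`, the dictionary keys and the sample rows are assumptions for illustration, not the asker's model API.

```python
from collections import defaultdict
from datetime import date


def available_by_product(licenses, today):
    """Sum the unassigned seats of every active license, grouped by product."""
    totals = defaultdict(int)
    for lic in licenses:
        expired = lic["end_date"] is not None and lic["end_date"] <= today
        if lic["is_obsolete"] or expired:
            continue  # an inactive license contributes nothing
        free = lic["num_licenses"] - len(lic["licensees"])
        if free > 0:
            totals[lic["product"]] += free
    return dict(totals)


# Hypothetical license rows, as they might look after a single query.
licenses = [
    {"product": "editor", "num_licenses": 5, "licensees": ["ann", "bob"],
     "end_date": None, "is_obsolete": False},
    {"product": "editor", "num_licenses": 1, "licensees": ["cat"],
     "end_date": None, "is_obsolete": False},
    {"product": "editor", "num_licenses": 2, "licensees": [],
     "end_date": date(2014, 1, 1), "is_obsolete": False},
    {"product": "compiler", "num_licenses": 4, "licensees": ["dan"],
     "end_date": None, "is_obsolete": True},
    {"product": "compiler", "num_licenses": 2, "licensees": [],
     "end_date": date(2015, 1, 1), "is_obsolete": False},
]
available = available_by_product(licenses, today=date(2014, 2, 12))
```

Products with no active, unfilled license simply do not appear in the result, which matches the "available products only" requirement.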
One strategy would be to get all the product ids from your License queryset:
productIDList = list(License.objects.filter(...).values_list(
'product_id', flat=True))
and then query the products using that list of ids:
Product.objects.filter(id__in=productIDList)
In Django, how do I construct a COUNT query for a ManyToManyField?
My models are as follows, and I want to get all the people whose name starts with A and who are the lord or overlord of at least one Place, and order the results by name.
class Manor(models.Model):
    lord = models.ManyToManyField(Person, null=True, related_name="lord")
    overlord = models.ManyToManyField(Person, null=True, related_name="overlord")
class Person(models.Model):
    name = models.CharField(max_length=100)
So my query should look something like this... but how do I construct the third line?
people = Person.objects.filter(
    Q(name__istartswith='a'),
    Q(lord.count > 0) | Q(overlord.count > 0)  # pseudocode
).order_by('name')
Actually it's not the count you're interested in here, but just whether or not there are any members in that relationship.
Q(lord__isnull=False) | Q(overlord__isnull=False)
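One caveat with filtering across a many-to-many relation this way: the underlying join returns one row per matching link, so a person who is lord of several manors can appear more than once unless the queryset also calls `.distinct()`. The sketch below reproduces the effect in SQLite with approximately the kind of SQL such a filter translates to; the table names, the exact query text and the data are assumptions for illustration, not Django's generated SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE manor_lord (manor_id INTEGER, person_id INTEGER);
    CREATE TABLE manor_overlord (manor_id INTEGER, person_id INTEGER);
    INSERT INTO person VALUES (1, 'Alice'), (2, 'Adam'), (3, 'Bob'), (4, 'Agnes');
    INSERT INTO manor_lord VALUES (10, 1), (11, 1), (12, 3);
    INSERT INTO manor_overlord VALUES (10, 4);
""")

QUERY = """
    SELECT {distinct} p.name
    FROM person p
    LEFT JOIN manor_lord l ON l.person_id = p.id
    LEFT JOIN manor_overlord o ON o.person_id = p.id
    WHERE p.name LIKE 'a%'
      AND (l.manor_id IS NOT NULL OR o.manor_id IS NOT NULL)
    ORDER BY p.name
"""

# Alice is lord of two manors, so the plain join lists her twice.
with_duplicates = [row[0] for row in conn.execute(QUERY.format(distinct=""))]
deduplicated = [row[0] for row in conn.execute(QUERY.format(distinct="DISTINCT"))]
```

Adam has no manor and Bob fails the name test, so neither appears in either list.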
In this case, better resort to raw SQL.
for p in Person.objects.raw('SELECT * FROM myapp_person WHERE...'):
    print(p)