Remove rows with same ID from two lists in Django - python

I have a list of objects called checkins that records every time a user has checked into something, and another set of objects called flagged_checkins that holds the checkins the user has flagged. Both reference a third table through a location_id.
I'd like to take the two lists of objects and remove any of the checkins whose location_id appears in flagged_checkins.
How do I compare these sets and remove the rows from 'checkins'?

As per this SO question, you can delete the matching rows from the database:
checkins.objects.filter(location_id__in=list(flagged_checkins.objects.values_list('location_id', flat=True))).delete()

If you only want a queryset without those rows, you can use exclude():
checkins.objects.exclude(location_id__in=list(flagged_checkins.objects.values_list('location_id', flat=True)))
This leaves the matching objects out of the result based on your criteria. It does not delete anything at the database level.
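If checkins and flagged_checkins have already been loaded into plain Python lists rather than querysets, the same exclusion can be done in memory. Below is a minimal sketch, assuming each object exposes a location_id attribute. The Checkin dataclass, the without_flagged helper and the sample values are hypothetical stand-ins, not part of the original question.

```python
from dataclasses import dataclass


@dataclass
class Checkin:
    # Hypothetical stand-in for a model instance; only location_id matters here.
    id: int
    location_id: int


def without_flagged(checkins, flagged_checkins):
    """Return the checkins whose location_id is not among the flagged ones."""
    flagged_locations = {f.location_id for f in flagged_checkins}
    return [c for c in checkins if c.location_id not in flagged_locations]


checkins = [Checkin(1, 10), Checkin(2, 20), Checkin(3, 10), Checkin(4, 30)]
flagged_checkins = [Checkin(5, 10)]
remaining = without_flagged(checkins, flagged_checkins)
print([c.id for c in remaining])  # [2, 4]
```

Collecting the flagged location ids into a set first keeps each membership check cheap, so the whole pass is roughly linear in the size of the two lists.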

Related

What is the best way to bulk create in a through table?

This is what my through table looks like:
class ThroughTable(models.Model):
    user = models.ForeignKey(User)
    society = models.ForeignKey(Society)
I am getting 2 lists containing ids of 2 model objects, which have to be added in my through table.
user_list = [1,2,5,6,9]
society_list = [1,2,3,4]
Here I want to create an entry in the through table for each possible pair from these two lists.
I was thinking about using nested loops to iterate and create the objects in the through table, but it seems very naive and has a complexity of n*n.
Is there a better approach to solve this issue?
Django provides a bulk_create() method for inserting many rows at once. It takes an optional batch_size argument: with millions of records you may not be able to insert them all in one go, so the records are split into batches and inserted batch by batch.
ThroughTable.objects.bulk_create(items, batch_size)
From the docs:
bulk_create(objs, batch_size=None, ignore_conflicts=False)
This method inserts the provided list of objects into the database in an efficient manner (generally only 1 query, no matter how many objects there are).
For your case, first create a list with all the possible combinations and then save it:
items = []
for user in user_list:
    for society in society_list:
        # The lists hold primary keys, so assign them to the *_id attributes.
        item = ThroughTable(user_id=user, society_id=society)
        items.append(item)
ThroughTable.objects.bulk_create(items)
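The nested loops can also be written with itertools.product, which yields every pair from the two lists. This is only a sketch: the all_pairs helper is a hypothetical name, and the commented-out bulk_create call assumes the list elements are primary keys of the User and Society models.

```python
from itertools import product


def all_pairs(user_ids, society_ids):
    """Return every (user_id, society_id) combination as a list of tuples."""
    return list(product(user_ids, society_ids))


user_list = [1, 2, 5, 6, 9]
society_list = [1, 2, 3, 4]
pairs = all_pairs(user_list, society_list)
print(len(pairs))  # 5 * 4 = 20 combinations
print(pairs[:3])   # [(1, 1), (1, 2), (1, 3)]

# With the model from the question, the pairs could then be inserted like this
# (assumption: the ids are primary keys, hence user_id / society_id):
# ThroughTable.objects.bulk_create(
#     [ThroughTable(user_id=u, society_id=s) for u, s in pairs]
# )
```

The number of rows is still len(user_list) * len(society_list), since every pair has to exist. The saving comes from bulk_create issuing few queries, not from avoiding the pairing itself.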

Django postgres order_by distinct on field

We have a limitation for order_by/distinct fields.
From the docs: "fields in order_by() must start with the fields in distinct(), in the same order"
Now here is the use case:
class Course(models.Model):
    is_vip = models.BooleanField()
    ...
class CourseEvent(models.Model):
    date = models.DateTimeField()
    course = models.ForeignKey(Course)
The goal is to fetch the courses, ordered by nearest date but vip goes first.
The solution could look like this:
CourseEvent.objects.order_by('-course__is_vip', '-date').distinct('course_id').values_list('course')
But it causes an error because of that limitation.
Yeah I understand why ordering is necessary when using distinct - we get the first row for each value of course_id so if we don't specify an order we would get some arbitrary row.
But what's the purpose of limiting order to the same field that we have distinct on?
If I change order_by to something like ('course_id', '-course__is_vip', 'date',), it gives me one row per course, but the order of the courses has nothing in common with the goal.
Is there any way to bypass this limitation besides walking through the entire queryset and filtering it in a loop?
You can use a nested query using id__in. In the inner query you single out the distinct events and in the outer query you custom-order them:
CourseEvent.objects.filter(
    id__in=CourseEvent.objects
    .order_by('course_id', '-date')
    .distinct('course_id')
).order_by('-course__is_vip', '-date')
From the docs on distinct(*fields):
When you specify field names, you must provide an order_by() in the QuerySet, and the fields in order_by() must start with the fields in distinct(), in the same order.
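To see why two stages are needed, the nested query can be imitated in plain Python: first pick one row per course, then reorder only the picked rows. This is an illustrative sketch on hypothetical sample rows (the dictionaries stand in for CourseEvent records with the course's is_vip flag attached). It is not how the database executes the query.

```python
from datetime import date

# Hypothetical rows standing in for CourseEvent records.
events = [
    {"id": 1, "course_id": 1, "is_vip": False, "date": date(2024, 1, 5)},
    {"id": 2, "course_id": 1, "is_vip": False, "date": date(2024, 3, 1)},
    {"id": 3, "course_id": 2, "is_vip": True, "date": date(2024, 2, 1)},
    {"id": 4, "course_id": 2, "is_vip": True, "date": date(2024, 1, 1)},
    {"id": 5, "course_id": 3, "is_vip": False, "date": date(2024, 4, 1)},
]

# Stage 1 (inner query): order by course_id, then date descending, and keep
# the first row per course_id, as DISTINCT ON (course_id) would.
by_course = sorted(events, key=lambda e: e["date"], reverse=True)
by_course.sort(key=lambda e: e["course_id"])  # stable sort keeps the date order
first_per_course = {}
for event in by_course:
    first_per_course.setdefault(event["course_id"], event)

# Stage 2 (outer query): reorder the selected rows, VIP first, then newest.
selected = sorted(first_per_course.values(), key=lambda e: e["date"], reverse=True)
selected.sort(key=lambda e: e["is_vip"], reverse=True)

print([e["id"] for e in selected])  # [3, 5, 2]
```

Stage 1 must sort by course_id first because that is what groups the rows. The ordering the user actually wants is applied only afterwards, to one row per course.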

How to compare Specific value of two different tables in django?

I have two tables, 'Contact' and 'Subscriber'. I want to compare the Contact_id of both and show only those Contact_id values that are present in Contact but not in Subscriber. The two tables are in two different models.
Something like this should work:
Contact.objects.exclude(
    id__in=Subscriber.objects.all()
).values_list('id', flat=True)
Note that these are actually two SQL queries. I'm sure there are ways to optimize it, but this will usually work fine.
Also, the values_list has nothing to do with selecting the objects; it just modifies the "format" of what is returned (a list of IDs instead of a queryset of objects, but the same database records in both cases).
If you are excluding by some field other than Subscriber.id (e.g. Subscriber.quasy_id):
Contact.objects.exclude(
    id__in=Subscriber.objects.all().values_list('quasy_id', flat=True)
).values_list('id', flat=True)
Edit:
This answer assumes you don't have a relationship between your Contact and Subscriber models. If you do, then see @navit's answer; it is a better choice.
Edit 2:
That flat=True inside exclude is actually not needed.
I assume you have your model like this:
class Subscriber(models.Model):
    contact = models.ForeignKey(Contact)
You can do what you want like this:
my_list = Subscriber.objects.filter(contact=None)
This retrieves Subscribers which don't have a Contact. Retrieving a list of Contacts is straightforward.
If you want to compare the values of fields in two different tables (connected by a ForeignKey), you can use something like this.
I assume the models are like below:
class Contact(models.Model):
    name = models.TextField()
    family = models.TextField()
class Subscriber(models.Model):
    subscriber_name = models.ForeignKey(Contact, on_delete=models.CASCADE)
    subscriber_family = models.TextField()
This would be the query:
from django.db.models import F
# Subscribers whose subscriber_family equals the family of the related Contact.
query = Subscriber.objects.filter(subscriber_family=F('subscriber_name__family'))

Get duplicates in django

I have a name field in my database, and quite a few of the names are duplicates. I want them to be unique. I know I can set unique=True, but that would only help with future entries. I want to know all the current entries with duplicate names. Is there an easy way to print out all the names that are duplicated in the Doctor model?
class Doctor(models.Model):
    name = models.CharField(max_length=1300)
To get rid of all duplicates in your database, you must first ask yourself a question: what to do with them? Remove them? Merge them somehow? Change the name of each duplicate?
After answering that question, simply construct a data migration (with a RunPython operation) that will do the desired operation on each duplicated entry.
To find all names that occur more than once, group by name and count:
from django.db.models import Count
duplicate_names = (
    Doctor.objects.values('name')
    .annotate(count=Count('id'))
    .order_by()
    .filter(count__gt=1)
)
That query returns one dictionary per duplicated name (for example, if you have 3 doctors named "who", you get {'name': 'who', 'count': 3}); names that occur only once are left out.
Having that, for each duplicated name you can get the list of records that share it:
for entry in duplicate_names:
    duplicates = Doctor.objects.filter(name=entry['name']).order_by('id')
    # duplicates[0] is the first record (by id); the rest are its duplicates.
And do something with them.
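Finding duplicated names amounts to counting occurrences per name and keeping the names seen more than once. Here is a minimal sketch of that idea in plain Python. The duplicated_names helper and the sample list are hypothetical; in Django the list could come from Doctor.objects.values_list('name', flat=True).

```python
from collections import Counter


def duplicated_names(names):
    """Map each name that occurs more than once to its number of occurrences."""
    counts = Counter(names)
    return {name: n for name, n in counts.items() if n > 1}


# Hypothetical sample data standing in for the values of Doctor.name.
names = ["who", "strange", "who", "house", "who", "strange"]
print(duplicated_names(names))  # {'who': 3, 'strange': 2}
```

This loads every name into memory, so for a large table the grouping is better left to the database.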
class Doctor(models.Model):
    name = models.CharField(max_length=1300, unique=True)

Django - Following ForeignKey relationships "backward" for entire QuerySet

Is it possible to follow ForeignKey relationships backward for an entire QuerySet?
I mean something like this:
x = table1.objects.select_related().filter(name='foo')
x.table2.all()
when table1 has a ForeignKey to table2.
In
https://docs.djangoproject.com/en/1.2/topics/db/queries/#following-relationships-backward
I can see that it works only with get() and not filter().
Thanks
You basically want to get a QuerySet of a different type from the data you start with.
class Kid(models.Model):
    mom = models.ForeignKey('Mom')
    name = models.CharField…
class Mom(models.Model):
    name = models.CharField…
Let's say you want to get all moms having any son named Johnny.
Mom.objects.filter(kid__name='Johnny')
Let's say you want to get all kids of any Lucy.
Kid.objects.filter(mom__name='Lucy')
You should be able to use something like:
for y in x:
    y.table2.all()
But you could also use get() on each object's unique value (which will be id, unless you have specified a different one), after finding them using a query.
So,
x = table1.objects.select_related().filter(name='foo')
for y in x:
    z = table1.objects.select_related().get(id=y.id)
    z.table2.all()
should also work.
You can also use values() to fetch specific values of a foreign key reference. With values the select query on the DB will be reduced to fetch only those values and the appropriate joins will be done.
To re-use the example from Krzysztof Szularz:
johnny_moms = Kid.objects.filter(name='Johnny').values('mom__id', 'mom__name').distinct()
This returns a QuerySet of dictionaries holding the Mom attributes, obtained through the Kid manager.
