Writing a tuple search with Django ORM - python

I'm trying to write a search based on tuples with the Django ORM syntax.
The final sql statement should look something like:
SELECT * FROM mytable WHERE (field_a,field_b) IN ((1,2),(3,4));
I know I can achieve this in django using the extra keyword:
MyModel.objects.extra(
    where=["(field_a, field_b) IN %s"],
    params=[((1,2),(3,4))]
)
but extra() will be deprecated at some point in Django, so I'd like a pure ORM solution.
Searching the web, I found https://code.djangoproject.com/ticket/33015 and a comment from Simon Charette there. Something like the snippet below looks like it could work, but I can't get it to run.
from django.db.models import Func, lookups

class ExpressionTuple(Func):
    template = '(%(expressions)s)'
    arg_joiner = ","

MyModel.objects.filter(lookups.In(
    ExpressionTuple('field_a', 'field_b'),
    ((1,2),(3,4)),
))
I'm using Django 3.2, but I don't expect Django 4.x to make a big difference here. My database backend is PostgreSQL, in case it matters.
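One portable fallback, which is not taken from the ticket, is to expand the tuple test into OR-ed groups of AND conditions. In Django that would correspond to combining Q objects, for example Q(field_a=1, field_b=2) | Q(field_a=3, field_b=4). The sketch below shows the same expansion with the standard-library sqlite3 module so it runs without a Django project. The helper name tuple_in_clause and the sample rows are hypothetical.

```python
import sqlite3


def tuple_in_clause(columns, pairs):
    """Expand (col_a, col_b) IN (pairs) into OR-ed AND groups plus flat params.

    Hypothetical helper for illustration; column names must come from
    trusted code, only the values are passed as bound parameters.
    """
    if not pairs:
        # An empty IN list matches nothing.
        return "1 = 0", []
    one_match = "(" + " AND ".join(f"{col} = ?" for col in columns) + ")"
    where = " OR ".join(one_match for _ in pairs)
    params = [value for pair in pairs for value in pair]
    return where, params


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, field_a INTEGER, field_b INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable (field_a, field_b) VALUES (?, ?)",
    [(1, 2), (1, 4), (3, 4), (3, 2)],
)

where, params = tuple_in_clause(["field_a", "field_b"], [(1, 2), (3, 4)])
rows = conn.execute(
    f"SELECT field_a, field_b FROM mytable WHERE {where} ORDER BY id", params
).fetchall()
print(where)
print(rows)
```

Only the exact pairs match: (1, 4) and (3, 2) are excluded even though each value appears in some pair, which is the difference from filtering the two columns with separate IN lists.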

Related

Flask-SQLAlchemy Legacy vs New Query Interface

I am trying to update some queries in a web application because, as stated in the Flask-SQLAlchemy documentation:
You may see uses of Model.query or session.query to build queries. That query interface is
considered legacy in SQLAlchemy. Prefer using the session.execute(select(...)) instead.
I have a query:
subnets = db.session.query(Subnet).order_by(Subnet.id).all()
Which is translated into:
SELECT subnet.id AS subnet_id, subnet.name AS subnet_name, subnet.network AS subnet_network, subnet.access AS subnet_access, subnet.date_created AS subnet_date_created
FROM subnet ORDER BY subnet.id
I then loop over the subnets variable in two different places in my view, and it works.
However, when I try to update my query and use the new SQLAlchemy interface:
subnets = db.session.execute(db.select(Subnet).order_by(Subnet.id)).scalars()
I can only loop once; there is nothing left to iterate over in the second loop.
How can I achieve the same result with the new query interface?
As noted in the comments to the question, your second example is not directly comparable to your first example because your second example is missing the .all() at the end.
Side note:
session.scalars(select(Subnet).order_by(Subnet.id)).all()
is a convenient shorthand for
session.execute(select(Subnet).order_by(Subnet.id)).scalars().all()
and is the recommended approach for SQLAlchemy 1.4+.
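The "loops only once" behaviour is typical of one-pass result objects. As an analogy only (this is the standard-library sqlite3 module, not SQLAlchemy), a DB-API cursor is drained by the first loop in the same way, while fetchall() plays the role of .all() and gives a list that can be looped over repeatedly. The table contents are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE subnet (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO subnet (name) VALUES (?)", [("lan",), ("dmz",)])

# A cursor is a one-pass iterator: the first loop consumes every row.
cursor = conn.execute("SELECT name FROM subnet ORDER BY id")
first_pass = [row[0] for row in cursor]
second_pass = [row[0] for row in cursor]  # nothing left to read

# fetchall() copies the rows into a list, which can be reused freely.
rows = conn.execute("SELECT name FROM subnet ORDER BY id").fetchall()
third_pass = [row[0] for row in rows]
fourth_pass = [row[0] for row in rows]

print(first_pass, second_pass)
print(third_pass, fourth_pass)
```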
Check out the 2.0 migration docs for ORM:
https://docs.sqlalchemy.org/en/14/changelog/migration_20.html#migration-orm-usage
It lists some examples that show you how to migrate your code from 1.x style to 2.x style. For example:
get()
1.x:
session.query(User).get(42)
2.x:
session.get(User, 42)
all()
1.x:
session.query(User).all()
2.x:
session.execute(select(User)).scalars().all()
Note
For Flask-SQLAlchemy, you also need to:
Replace session with db.session.
Replace select with db.select or import the select directly from sqlalchemy:
from sqlalchemy import select
session.scalars() is not supported (yet), use session.execute().scalars() instead.

Django: Raw SQL with connection.cursor()

I am a complete newbie to Django. I need to perform the following query and use img.img_loc to populate a list of images in a template:
SELECT img.img_loc, author.surname, author.given_name, author.email
FROM image_full AS img
LEFT JOIN author_contact_zzz AS author ON img.pmcid = author.pmcid
WHERE img.pmcid = 545600
GROUP BY img.img_loc, author.email;
I read the documentation here: https://docs.djangoproject.com/en/1.10/topics/db/sql/
However, I do not understand where the function:
def my_custom_sql(self):
that they describe in the last section is supposed to go (views.py?), or what 'self' is in that case, since my view is not defined as a class.
Thanks!
That looks like a typo. There's no need for self there.
The code can go wherever you like, though.
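To make that concrete, here is a minimal sketch of such a helper written as a plain module-level function with no self parameter. It uses the standard-library sqlite3 module so that it runs on its own; the function name image_locations, the column layout of the two tables and the sample rows are assumptions for the example. In a Django project the cursor would instead come from django.db.connection (for instance with connection.cursor() as cursor:), and the placeholder would be %s rather than sqlite3's ?.

```python
import sqlite3


def image_locations(conn, pmcid):
    """Plain function: it is not a method, so it takes no self argument."""
    cursor = conn.cursor()
    try:
        cursor.execute(
            "SELECT img.img_loc, author.surname, author.given_name, author.email "
            "FROM image_full AS img "
            "LEFT JOIN author_contact_zzz AS author ON img.pmcid = author.pmcid "
            "WHERE img.pmcid = ? "
            "GROUP BY img.img_loc, author.email "
            "ORDER BY img.img_loc",
            [pmcid],  # bound parameter, never string-formatted into the SQL
        )
        return cursor.fetchall()
    finally:
        cursor.close()


# Hypothetical schema and data, only to exercise the function.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE image_full (pmcid INTEGER, img_loc TEXT);
    CREATE TABLE author_contact_zzz (
        pmcid INTEGER, surname TEXT, given_name TEXT, email TEXT
    );
    INSERT INTO image_full VALUES
        (545600, 'fig1.png'), (545600, 'fig2.png'), (1, 'other.png');
    INSERT INTO author_contact_zzz VALUES
        (545600, 'Doe', 'Jane', 'jane@example.org');
""")

rows = image_locations(conn, 545600)
print([row[0] for row in rows])
```

A view would call the function and pass the returned rows (or just the img_loc values) to the template through its context.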

What is the replacement for DateModifierNode in new versions of Django

I want to query on two fields of a model: a date, offset by an int that is used as a timedelta.
model.objects.filter(last_date__gte=datetime.now()-timedelta(days=F('interval')))
is a no-go, as an F() expression cannot be passed into a timedelta.
A little digging, and I discovered DateModifierNode - though it seems it was removed in this commit: https://github.com/django/django/commit/cbb5cdd155668ba771cad6b975676d3b20fed37b (from this now-outdated SO question Django: Using F arguments in datetime.timedelta inside a query)
the commit mentions:
The .dates() queries were implemented by using custom Query, QuerySet,
and Compiler classes. Instead implement them by using expressions and
database converters APIs.
which sounds sensible, and like there should still be a quick easy way - but I've been fruitlessly looking for how to do that for a little too long - anyone know the answer?
In Django 1.10 there's a simpler way to do it, but you need to change the model a little: use a DurationField. My model is as follows:
import datetime
from django.db import models

class MyModel(models.Model):
    timeout = models.DurationField(default=datetime.timedelta(days=7))  # default: one week
    last = models.DateTimeField(auto_now_add=True)
and the query to find objects where last was before now minus timeout is:
MyModel.objects.filter(last__lt=datetime.datetime.now()-F('timeout'))
Ah, answer from the docs: https://docs.djangoproject.com/en/1.9/ref/models/expressions/#using-f-with-annotations
from django.db.models import DateTimeField, ExpressionWrapper, F
Ticket.objects.annotate(
    expires=ExpressionWrapper(
        F('active_at') + F('duration'), output_field=DateTimeField()))
which should make my original query look like
model.objects.annotate(new_date=ExpressionWrapper(F('last_date') + F('interval'), output_field=DateTimeField())).filter(new_date__gte=datetime.now())
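The annotated query moves the interval to the other side of the comparison: last_date >= now - interval becomes last_date + interval >= now. A quick check of that rearrangement in plain Python, with made-up sample values and assuming the interval is already a duration (a timedelta here, a DurationField in the model) rather than an integer number of days:

```python
from datetime import datetime, timedelta

now = datetime(2024, 1, 10, 12, 0)

# Hypothetical rows: (last_date, interval)
rows = [
    (datetime(2024, 1, 8), timedelta(days=3)),       # inside the window
    (datetime(2024, 1, 1), timedelta(days=3)),       # too old
    (datetime(2024, 1, 10, 12, 0), timedelta(0)),    # exactly on the boundary
]

# Form used in the original filter: compare the date with "now minus interval".
original = [last_date >= now - interval for last_date, interval in rows]

# Form used in the annotated query: compare "date plus interval" with now.
rearranged = [last_date + interval >= now for last_date, interval in rows]

print(original)
print(rearranged)
```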

Django F expressions joined field

So I am trying to update my model by running the following:
FooBar.objects.filter(something=True).update(foobar=F('foo__bar'))
but I get the following error:
FieldError: Joined field references are not permitted in this query
if this is not allowed with F expressions...how can I achieve this update?
ticket
Given the information in this ticket, I now understand that this is impossible and will never be implemented in Django. Is there any way to achieve this update, maybe with some workaround? I do not want to use a loop because there are over 10 million FooBar objects, so SQL is much faster than Python.
Django 1.11 adds support for subqueries. You should be able to do:
from django.db.models import Subquery, OuterRef
FooBar.objects.filter(something=True).update(
    foobar=Subquery(FooBar.objects.filter(pk=OuterRef('pk')).values('foo__bar')[:1])
)
Why not use raw SQL here?
Based on this, it will be something like
from django.db import connection
raw_query = '''
update app_foobar set app_foobar.foobar =
(select app_foo.bar from app_foo where app_foo.id = app_foobar.foo_id)
where app_foobar.something = 1;
'''
cursor = connection.cursor()
cursor.execute(raw_query)
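The correlated-subquery UPDATE can be tried out without a Django project. The sketch below runs an equivalent statement against an in-memory SQLite database using the standard-library sqlite3 module. The table and column names follow the answer above, the sample rows are made up, and the SET target is written without a table prefix and the boolean as 1, which may need adjusting for other database engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_foo (id INTEGER PRIMARY KEY, bar INTEGER);
    CREATE TABLE app_foobar (
        id INTEGER PRIMARY KEY,
        foo_id INTEGER REFERENCES app_foo(id),
        foobar INTEGER,
        something INTEGER
    );
    INSERT INTO app_foo (id, bar) VALUES (1, 10), (2, 20);
    INSERT INTO app_foobar (id, foo_id, foobar, something)
    VALUES (1, 1, 0, 1), (2, 2, 0, 1), (3, 1, 0, 0);
""")

# One statement updates every matching row; the subquery is evaluated
# per row of app_foobar and picks the bar value of the related app_foo row.
conn.execute("""
    UPDATE app_foobar
    SET foobar = (
        SELECT app_foo.bar FROM app_foo WHERE app_foo.id = app_foobar.foo_id
    )
    WHERE something = 1
""")

result = conn.execute("SELECT id, foobar FROM app_foobar ORDER BY id").fetchall()
print(result)
```

Row 3 keeps its old value because it does not satisfy the WHERE condition; no Python-level loop over the rows is involved.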
This is the implementation of Georgi Yanchev's answer for two models:
class Foo(models.Model):
    # on_delete is required from Django 2.0 onwards
    bar = models.ForeignKey(Bar, on_delete=models.CASCADE)

Foo.objects \
    .filter(foo_field_1=True) \
    .update(foo_field_2=Subquery(
        Bar.objects \
            .filter(id=OuterRef('bar_id')) \
            .values('bar_field_1')[:1]))
For anyone who wants a simpler way to do this and does not have a huge set of objects, the snippet below should work just fine:
for fooBar in FooBar.objects.filter(something=True):
    fooBar.foobar = fooBar.foo.bar
    fooBar.save(update_fields=['foobar'])
For regular use cases this should not make much of a performance difference, especially if it runs as part of a data migration.
Optionally, you can also use select_related to optimize further.

How do I get the position of a result in the list after an order_by?

I'm trying to find an efficient way to find the rank of an object in the database by its score. My naive solution looks like this:
rank = 0
for q in Model.objects.all().order_by('score'):
    if q.name == 'searching_for_this':
        return rank
    rank += 1
It should be possible to get the database to do the filtering, using order_by:
Model.objects.all().order_by('score').filter(name='searching_for_this')
But there doesn't seem to be a way to retrieve the index for the order_by step after the filter.
Is there a better way to do this? (Using python/django and/or raw SQL.)
My next thought is to pre-compute ranks on insert but that seems messy.
I don't think you can do this in one database query using the Django ORM. But if that doesn't bother you, I would create a custom property on the model:
from django.db import models

class Model(models.Model):
    score = models.IntegerField()
    ...

    @property
    def ranking(self):
        # Number of rows with a lower score, plus one
        count = Model.objects.filter(score__lt=self.score).count()
        return count + 1
You can then use "ranking" anywhere, as if it were a normal field:
print(Model.objects.get(pk=1).ranking)
Edit: This answer is from 2010. Nowadays I would recommend Carl's solution instead.
Using the new Window functions in Django 2.0 you could write it like this...
from django.db.models import F
from django.db.models.expressions import Window
from django.db.models.functions import Rank

# Caveat: SQL evaluates window functions after WHERE, so the rank here
# is computed only among the rows that pass filter().
Model.objects.filter(name='searching_for_this').annotate(
    rank=Window(
        expression=Rank(),
        order_by=F('score').desc()
    ),
)
Use something like this:
obj = Model.objects.get(name='searching_for_this')
rank = Model.objects.filter(score__gt=obj.score).count()
You can pre-compute the ranks and save them on the model if they are used frequently and computing them affects performance.
In raw SQL with a standard-conforming database engine (PostgreSQL, SQL Server, Oracle, DB2, ...), you can just use the SQL-standard RANK function. When this answer was written, RANK was not supported by popular engines such as MySQL and SQLite (window functions arrived later, in MySQL 8.0 and SQLite 3.25), and, perhaps because of that, Django did not "surface" this functionality to the application at the time.
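For the raw SQL route, a count-based rank avoids window functions altogether and so also works on engines that lack RANK. The sketch below uses the standard-library sqlite3 module with a hypothetical table called model and made-up scores; it ranks by descending score and lets tied rows share a rank.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE model (name TEXT PRIMARY KEY, score INTEGER)")
conn.executemany(
    "INSERT INTO model (name, score) VALUES (?, ?)",
    [("a", 50), ("b", 90), ("searching_for_this", 70), ("d", 70)],
)


def rank_of(conn, name):
    """Return the 1-based rank of `name` by descending score.

    Assumes a row with this name exists; the rank is one more than the
    number of rows with a strictly higher score, so ties share a rank.
    """
    row = conn.execute(
        "SELECT 1 + COUNT(*) FROM model "
        "WHERE score > (SELECT score FROM model WHERE name = ?)",
        [name],
    ).fetchone()
    return row[0]


print(rank_of(conn, "searching_for_this"))
```

With an index on score the database can answer this without reading every row into Python, which is the main gain over the loop in the question.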
