Django: SELECT JsonField AS new_name? - python

I have a table, in which some attributes are columns, and some are implemented as a postgres JsonField.
For the columns, I can write eg
Product.objects.values(brand=F('brand_name'))
to implement a SELECT brand_name AS brand query.
I need to do something similar for a JsonField, eg
Product.objects.values(color=F('jsonproperties__color'))
However, the F expressions do not work correctly with JsonFields, and there doesn't seem to be a fix coming anytime soon.
How could I work around this?

Perhaps a simple list comprehension will do what you want:
[{"color": p["jsonproperties"]["color"]} for p in Product.objects.values("jsonproperties")]
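The reshaping step can be tried without a database. This is a minimal sketch, assuming the rows look like the dictionaries that Product.objects.values("jsonproperties") would yield; the sample data and the name rows are made up for illustration. It uses .get so that a product without a color produces None instead of raising KeyError:

```python
# Hypothetical rows, shaped like the dictionaries .values("jsonproperties") yields.
rows = [
    {"jsonproperties": {"color": "red", "size": "M"}},
    {"jsonproperties": {"size": "L"}},  # this product has no "color" key
]

# Lift the nested JSON key to a top-level "color" key.
colors = [{"color": row["jsonproperties"].get("color")} for row in rows]
print(colors)
```

Note that this renaming happens in Python after the rows are fetched, so the whole JSON value is still transferred from the database.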

Related

Django querysets optimization - preventing selection of annotated fields

Let's say I have following models:
class Invoice(models.Model):
    ...

class Note(models.Model):
    invoice = models.ForeignKey(Invoice, related_name='notes', on_delete=models.CASCADE)
    text = models.TextField()
and I want to select Invoices that have some notes. I would write it using annotate/Exists like this:
Invoice.objects.annotate(
    has_notes=Exists(Note.objects.filter(invoice_id=OuterRef('pk')))
).filter(has_notes=True)
This works well enough and returns only Invoices with notes. However, the annotated field ends up in the query result, which I don't need and which hurts performance (the database has to execute the subquery twice).
I realize I could write this using extra(where=) like this:
Invoice.objects.extra(where=['EXISTS(SELECT 1 FROM note WHERE invoice_id=invoice.id)'])
which would result in the ideal SQL, but in general it is discouraged to use extra / raw SQL.
Is there a better way to do this?
You can remove annotations from the SELECT clause using .values() query set method. The trouble with .values() is that you have to enumerate all names you want to keep instead of names you want to skip, and .values() returns dictionaries instead of model instances.
Django internally keeps track of the removed annotations in QuerySet.query.annotation_select_mask, so you can use it to tell Django which annotations to skip even without .values():
class YourQuerySet(QuerySet):
    def mask_annotations(self, *names):
        if self.query.annotation_select_mask is None:
            self.query.set_annotation_mask(set(self.query.annotations.keys()) - set(names))
        else:
            self.query.set_annotation_mask(self.query.annotation_select_mask - set(names))
        return self
Then you can write:
invoices = (Invoice.objects
    .annotate(has_notes=Exists(Note.objects.filter(invoice_id=OuterRef('pk'))))
    .filter(has_notes=True)
    .mask_annotations('has_notes')
)
to drop has_notes from the SELECT clause while still getting filtered Invoice instances. The resulting SQL query will be something like:
SELECT invoice.id, invoice.foo FROM invoice
WHERE EXISTS(SELECT note.id, note.bar FROM note WHERE note.invoice_id = invoice.id) = True
Just note that annotation_select_mask is an internal Django API that can change in future versions without warning.
Ok, I've just noticed in the Django 3.0 docs that they've updated how Exists works, and it can now be used directly in filter:
Invoice.objects.filter(Exists(Note.objects.filter(invoice_id=OuterRef('pk'))))
This will ensure that the subquery will not be added to the SELECT columns, which may result in a better performance.
Changed in Django 3.0:
In previous versions of Django, it was necessary to first annotate and then filter against the annotation. This resulted in the annotated value always being present in the query result, and often resulted in a query that took more time to execute.
Still, if someone knows a better way for Django 1.11, I would appreciate it. We really need to upgrade :(
We can filter for Invoices that have at least one Note, that is, Invoices whose join to notes yields a non-NULL Note, and make the query distinct (so that an Invoice with several notes is not returned more than once).
Invoice.objects.filter(notes__isnull=False).distinct()
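To see why the distinct() matters, here is a small sketch using the standard-library sqlite3 module. The table layout and the SQL are hand-written assumptions that only approximate what the ORM might generate; they are not actual Django output. The sketch compares a plain join, a join with DISTINCT, and an EXISTS filter:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoice (id INTEGER PRIMARY KEY);
    CREATE TABLE note (id INTEGER PRIMARY KEY, invoice_id INTEGER, text TEXT);
    INSERT INTO invoice (id) VALUES (1), (2), (3);
    -- invoice 1 has two notes, invoice 2 has one, invoice 3 has none
    INSERT INTO note (invoice_id, text) VALUES (1, 'a'), (1, 'b'), (2, 'c');
""")

def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

# A join yields one row per (invoice, note) pair, so invoice 1 appears twice.
join_plain = ids(
    "SELECT invoice.id FROM invoice "
    "JOIN note ON note.invoice_id = invoice.id ORDER BY invoice.id"
)

# DISTINCT collapses the duplicates again.
join_distinct = ids(
    "SELECT DISTINCT invoice.id FROM invoice "
    "JOIN note ON note.invoice_id = invoice.id ORDER BY invoice.id"
)

# EXISTS in the WHERE clause never multiplies rows in the first place.
exists_only = ids(
    "SELECT invoice.id FROM invoice WHERE EXISTS "
    "(SELECT 1 FROM note WHERE note.invoice_id = invoice.id) ORDER BY invoice.id"
)

print(join_plain, join_distinct, exists_only)
```

Both the DISTINCT join and the EXISTS filter return each matching invoice once; which one is faster depends on the database and the data, so it is worth checking the query plan on the real tables.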
This is a well-optimized way to get rows from one table whose primary key is referenced from another table. OuterRef is only valid inside a subquery, so the comparison uses F, and the lookup goes through the related_name notes:
Invoice.objects.filter(notes__invoice_id=F('pk'))
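We should be able to clear the annotated field using the method below. Since dict.clear() returns None, keep a reference to the queryset and clear the annotations in a separate statement:
invoices = Invoice.objects.annotate(
    has_notes=Exists(Note.objects.filter(invoice_id=OuterRef('pk')))
).filter(has_notes=True)
invoices.query.annotations.clear()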

What is the equivalent of python 'in' but for sqlalchemy

I have a dictionary that is being used to query the database for a match. I have one single query line that isn't working for me. If I have a list like this:
user['firstnames'] = ['Alice', 'Bob', 'John']
Since this is a list I tried to use something like this:
q.filter(UserTable.firstname in user['firstnames'])
But for whatever reason this doesn't work. However, I know that Bob is in the database. When I manually pull down all the queries I can see the name is there in one of the rows. If I do this instead:
q.filter(UserTable.firstname == user['firstnames'][1]) #Only does Bob
It works. And when I pull all the queries manually, convert each row to a dictionary, and then do a
row[#row_that_matches].firstname in user['firstnames']
that also works. But for some reason using the "in" keyword in sqlalchemy doesn't work as expected. Does anyone know an alternative that can make an sqlalchemy query for something in a list of values?
Use the in_() column method to test a column against a sequence:
q.filter(UserTable.firstname.in_(user['firstnames']))
See the Common Filter Operations section of the Object Relational tutorial:
IN:
query.filter(User.name.in_(['ed', 'wendy', 'jack']))
# works with query objects too:
query.filter(User.name.in_(
    session.query(User.name).filter(User.name.like('%ed%'))
))
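Why a method is needed instead of the in keyword can be illustrated with a toy class. This is only a sketch: Column below is a made-up stand-in, not SQLAlchemy's real class, and it renders plain strings rather than real SQL expressions. The point is that == can be overloaded to return an expression object, whereas Python always reduces the result of in to a plain bool:

```python
class Column:
    """Toy stand-in for an ORM column; NOT SQLAlchemy's implementation."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Operator overloading lets == build an SQL fragment instead of a bool.
        return f"{self.name} = {other!r}"

    def in_(self, values):
        # The 'in' operator cannot return an expression, hence an explicit method.
        rendered = ", ".join(repr(v) for v in values)
        return f"{self.name} IN ({rendered})"


firstname = Column("firstname")
names = ["Alice", "Bob", "John"]

eq_expr = firstname == "Bob"   # an expression (here: a string)
in_expr = firstname.in_(names)  # an expression (here: a string)
python_in = firstname in names  # evaluated by Python, always a plain bool

print(eq_expr)
print(in_expr)
print(type(python_in))
```

With a real ORM column the bool that in produces may differ from this toy, but it is still just True or False, so the filter never sees a condition on the column.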

How to replace columns in sqlalchemy query

I have the query:
q = Session.query(func.array_agg(Order.col))
The compiled query will be:
SELECT array_agg(orders.col) FROM orders
I want to dynamically replace the existing column. After the replacement, the query has to be:
SELECT group_concat(orders.col) FROM orders
I have to use the Session and the model; I can't use SQLAlchemy Core or subqueries. And, of course, there can be other columns, but I need to replace only one. I tried replacing objects in the column_descriptions property, and I tried q.selectable.replace (or something like it; sorry, I don't remember the exact names), but I didn't get the right result.
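The right method is with_entities, which returns a new query with the selected columns replaced, so the result has to be assigned:
q = Session.query(func.array_agg(Order.col))
q = q.with_entities(func.group_concat(Order.col))
The compiled query is then:
SELECT group_concat(orders.col) FROM orders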

Programming error with order by and distinct on Django with PostgreSQL

There are a lot of similar posts but none that hit exactly at what I am trying to get to. I get that a distinct has to use the same fields as order_by, which is fine.
So I have the following query:
q = MyModel.objects.order_by('field1', 'field2', '-tstamp').distinct('field1', 'field2')
Ultimately I am trying to find the latest entry in the table for all combinations of field1 and field2. The order_by does what I think it should, and that's great. But when I do the distinct I get the following error.
ProgrammingError: SELECT DISTINCT ON expressions must match initial ORDER BY expressions
Ultimately this seems like a SQL problem (not a Django one). However looking at the django docs for distinct, it shows what can and can't work. It does say that
q = MyModel.objects.order_by('field1', 'field2', '-tstamp').distinct('field1')
will work (...and it does). But I don't understand why I still get the same error when I add field2, in the same order as in the order_by. Any help would be greatly appreciated.
EDIT: I also notice that if I do
q = MyModel.objects.order_by('field1', 'field2', '-tstamp').distinct('field1', 'field2', 'tstamp') # with or without the - in the order_by
It still raises the same error, though the docs suggest this should work just fine.
I was able to get the query to run properly by using pk's in the order_by():
q = MyModel.objects.order_by('field1__pk', 'field2__pk', '-tstamp').distinct('field1__pk', 'field2__pk')
Apparently, when the fields are not ordinary types (for example, foreign keys to other models), order_by doesn't play nicely with distinct. Using the pk's of the related objects seems to work, however.
Try this one:
q = MyModel.objects.order_by('field1__id', 'field2__id', '-tstamp').distinct('field1__id', 'field2__id')
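What the query is meant to compute (the newest row for each field1/field2 pair) can be sketched in plain Python. This is an illustration of the semantics only, with made-up sample rows; it assumes that DISTINCT ON keeps the first row of each group in the ORDER BY order, which is how PostgreSQL documents it:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows standing in for MyModel records.
rows = [
    {"field1": "a", "field2": "x", "tstamp": 1},
    {"field1": "a", "field2": "x", "tstamp": 3},
    {"field1": "a", "field2": "y", "tstamp": 2},
    {"field1": "b", "field2": "x", "tstamp": 5},
    {"field1": "b", "field2": "x", "tstamp": 4},
]

# Equivalent of ORDER BY field1, field2, tstamp DESC.
ordered = sorted(rows, key=lambda r: (r["field1"], r["field2"], -r["tstamp"]))

# Equivalent of DISTINCT ON (field1, field2): keep the first row of each group.
group_key = itemgetter("field1", "field2")
latest = [next(group) for _, group in groupby(ordered, key=group_key)]

for row in latest:
    print(row)
```

The grouping only works because the rows are sorted by the grouping columns first, which mirrors the database rule that the DISTINCT ON expressions must match the leading ORDER BY expressions.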

Django OR query using Extra and Filter

I am trying to use Django's ORM to generate a query using both extra and filter methods. Something like this:
Model.objects.filter(clauseA).extra(clauseB).all()
This generates a query, but the issue is that everything in the filter clause is AND'd with everything in the extra clause, so the sql looks like:
SELECT * FROM model WHERE clauseA AND clauseB.
My question is, is there a way to change the default combination operator for a query in Django such that the query generated will be:
SELECT * FROM model WHERE clauseA OR clauseB.
Try Q objects:
Model.objects.filter(Q(clauseA) | Q(clauseB))
EDIT
Try this:
Model.objects.filter(clauseA) | Model.objects.extra(clauseB)
It might be easier if you just get rid of the filter clause, and include that filter directly into extra OR'd with your Postgres specific function. I think it is already a limitation of the Django ORM.
You can attempt to create your own Func expression though. Once you have created one for your Postgres specific function, you might be able to use a combination of Func(), F(), and Q() objects to get rid of that nasty .extra() function and chain them nicely.
