I have an Order model and a Payment model. The Payment model has a jsonb column named data.
My Query:
orders = (
Order
.select(Order, Payment.data.alias('payment_data'))
.join(Payment, JOIN_LEFT_OUTER, on=(Order.payment==Payment.id))
.iterator()
)
When I iterate over the above query and access order.payment_data, I get an AttributeError.
But if I write the query below, it gives me the payment_data key in the dict while iterating over the orders:
orders = (
Order
.select(Order, Payment.data.alias('payment_data'))
.join(Payment, JOIN_LEFT_OUTER, on=(Order.payment==Payment.id))
.dicts()
.iterator()
)
Can someone please explain what I am doing wrong in the first query and how I can access order.payment_data?
Thanks
The payment data is probably getting attached to the related payment instance. So instead of order.payment_data you would look up the value using:
order.payment.payment_data
If you want all attributes simply patched directly onto the order, use the objects() query method, which skips the model/relation graph:
orders = (Order
.select(Order, Payment.data.alias('payment_data'))
.join(Payment, JOIN_LEFT_OUTER, on=(Order.payment==Payment.id))
.objects() # Do not make object-graph
.iterator())
for order in orders:
print(order.id, order.payment_data)
This is all covered in the docs: http://docs.peewee-orm.com/en/latest/peewee/relationships.html#selecting-from-multiple-sources
This could be a result of having NULL fields in the joined results. Some records probably have no related payment, and peewee doesn't handle this situation the way you expect.
Check whether your query results contain NULLs in place of payment_data. If so, you should probably check on each iteration whether the order has a payment_data attribute.
A more detailed explanation is on GitHub: https://github.com/coleifer/peewee/issues/1756#issuecomment-430399189
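To see where those NULLs come from, here is a minimal sketch that uses plain sqlite3 rather than peewee. The table and column names are assumptions made up for the illustration; it only shows that a LEFT OUTER JOIN returns a row with NULL payment data for an order that has no payment.

```python
import sqlite3

# In-memory database with hypothetical tables mirroring Order and Payment.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE payment (id INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE "order" (
        id INTEGER PRIMARY KEY,
        payment_id INTEGER REFERENCES payment(id)
    );
    INSERT INTO payment (id, data) VALUES (1, '{"status": "paid"}');
    INSERT INTO "order" (id, payment_id) VALUES (10, 1), (11, NULL);
""")

# LEFT OUTER JOIN keeps order 11 even though it has no payment row.
rows = conn.execute("""
    SELECT o.id, p.data AS payment_data
    FROM "order" AS o
    LEFT OUTER JOIN payment AS p ON o.payment_id = p.id
    ORDER BY o.id
""").fetchall()

for order_id, payment_data in rows:
    # payment_data is None for the order without a payment.
    print(order_id, payment_data)
```

If rows like the second one exist in your data, code that reads the payment data has to cope with the missing value.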
Let's say I have following models:
class Invoice(models.Model):
...
class Note(models.Model):
invoice = models.ForeignKey(Invoice, related_name='notes', on_delete=models.CASCADE)
text = models.TextField()
and I want to select Invoices that have some notes. I would write it using annotate/Exists like this:
Invoice.objects.annotate(
has_notes=Exists(Note.objects.filter(invoice_id=OuterRef('pk')))
).filter(has_notes=True)
This works well enough, filters only Invoices with notes. However, this method results in the field being present in the query result, which I don't need and means worse performance (SQL has to execute the subquery 2 times).
I realize I could write this using extra(where=) like this:
Invoice.objects.extra(where=['EXISTS(SELECT 1 FROM note WHERE invoice_id=invoice.id)'])
which would result in the ideal SQL, but in general it is discouraged to use extra / raw SQL.
Is there a better way to do this?
You can remove annotations from the SELECT clause using .values() query set method. The trouble with .values() is that you have to enumerate all names you want to keep instead of names you want to skip, and .values() returns dictionaries instead of model instances.
Django internally keeps track of removed annotations in QuerySet.query.annotation_select_mask, so you can use it to tell Django which annotations to skip even without .values():
from django.db.models import QuerySet

class YourQuerySet(QuerySet):
    def mask_annotations(self, *names):
        # Drop the given annotation names from the SELECT mask.
        if self.query.annotation_select_mask is None:
            self.query.set_annotation_mask(set(self.query.annotations.keys()) - set(names))
        else:
            self.query.set_annotation_mask(self.query.annotation_select_mask - set(names))
        return self
Then you can write:
invoices = (Invoice.objects
.annotate(has_notes=Exists(Note.objects.filter(invoice_id=OuterRef('pk'))))
.filter(has_notes=True)
.mask_annotations('has_notes')
)
to drop has_notes from the SELECT clause while still getting filtered Invoice instances. The resulting SQL query will be something like:
SELECT invoice.id, invoice.foo FROM invoice
WHERE EXISTS(SELECT note.id, note.bar FROM note WHERE note.invoice_id = invoice.id) = True
Just note that annotation_select_mask is internal Django API that can change in future versions without a warning.
OK, I've just noticed in the Django 3.0 docs that they've updated how Exists works, and it can now be used directly in filter():
Invoice.objects.filter(Exists(Note.objects.filter(invoice_id=OuterRef('pk'))))
This will ensure that the subquery will not be added to the SELECT columns, which may result in a better performance.
Changed in Django 3.0:
In previous versions of Django, it was necessary to first annotate and then filter against the annotation. This resulted in the annotated value always being present in the query result, and often resulted in a query that took more time to execute.
Still, if someone knows a better way for Django 1.11, I would appreciate it. We really need to upgrade :(
We can filter for Invoices whose joined Note is not NULL after a LEFT OUTER JOIN, and make the query distinct (to avoid returning the same Invoice more than once).
Invoice.objects.filter(notes__isnull=False).distinct()
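As a rough illustration of why the distinct() matters, here is a sketch in plain sqlite3 (not Django). The schema is an assumption modelled on the question, and the join is written the way this answer describes it, which is not necessarily the exact SQL Django emits. It compares the EXISTS form with the join form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoice (id INTEGER PRIMARY KEY);
    CREATE TABLE note (
        id INTEGER PRIMARY KEY,
        invoice_id INTEGER NOT NULL REFERENCES invoice(id),
        text TEXT NOT NULL
    );
    INSERT INTO invoice (id) VALUES (1), (2), (3);
    -- Invoice 1 has two notes, invoice 2 has none, invoice 3 has one.
    INSERT INTO note (invoice_id, text) VALUES (1, 'a'), (1, 'b'), (3, 'c');
""")

def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

# EXISTS: each invoice appears at most once.
exists_ids = ids("""
    SELECT invoice.id FROM invoice
    WHERE EXISTS (SELECT 1 FROM note WHERE note.invoice_id = invoice.id)
    ORDER BY invoice.id
""")

# Join without DISTINCT: invoice 1 is repeated once per note.
join_ids = ids("""
    SELECT invoice.id FROM invoice
    LEFT OUTER JOIN note ON note.invoice_id = invoice.id
    WHERE note.id IS NOT NULL
    ORDER BY invoice.id
""")

# Join with DISTINCT: duplicates are collapsed again.
distinct_ids = ids("""
    SELECT DISTINCT invoice.id FROM invoice
    LEFT OUTER JOIN note ON note.invoice_id = invoice.id
    WHERE note.id IS NOT NULL
    ORDER BY invoice.id
""")

print(exists_ids, join_ids, distinct_ids)
```

The join form returns the same invoices as EXISTS only once the duplicates are removed.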
This is the most optimized code if you want to get data from a table whose primary key is referenced from another table:
Invoice.objects.filter(note__invoice_id=OuterRef('pk'),)
We should be able to clear the annotated field using the below method.
Invoice.objects.annotate(
has_notes=Exists(Note.objects.filter(invoice_id=OuterRef('pk')))
).filter(has_notes=True).query.annotations.clear()
Confused working with query object results. I am not using foreign keys in this example.
lookuplocation = aliased(ValuePair)
lookupoccupation = aliased(ValuePair)
persons = db.session.query(Person.lastname, lookuplocation.displaytext, lookupoccupation.displaytext).\
outerjoin(lookuplocation, Person.location == lookuplocation.valuepairid).\
outerjoin(lookupoccupation, Person.occupation1 == lookupoccupation.valuepairid).all()
Results are correct as far as data is concerned. However, when I try to access an individual row of data I have an issue:
persons[0].lastname works as I expected and returns data.
However, there is a person.displaytext in the result but since I aliased the displaytext entity, I get just one result. I understand why I get the result but I need to know what aliased field names I would use to get the two displaytext columns.
The actual SQL statement generated by the above join is as follows:
SELECT person.lastname AS person_lastname, valuepair_1.displaytext AS valuepair_1_displaytext, valuepair_2.displaytext AS valuepair_2_displaytext
FROM person LEFT OUTER JOIN valuepair AS valuepair_1 ON person.location = valuepair_1.valuepairid LEFT OUTER JOIN valuepair AS valuepair_2 ON person.occupation1 = valuepair_2.valuepairid
But none of these "as" field names are available to me in the results.
I'm new to SqlAlchemy so most likely this is a "newbie" issue.
Thanks.
Sorry - RTFM issue - should have been:
lookuplocation.displaytext.label("myfield1"),
lookupoccupation.displaytext.label("myfield2")
After the results are returned, reference the fields by their labels, e.g. persons[0].myfield1 and persons[0].myfield2.
Simple.
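The same idea can be sketched at the SQL level with plain sqlite3 (this is not SQLAlchemy code; the sample data and the loc/occ table aliases are assumptions for the illustration). Giving each displaytext column its own alias is what makes both values addressable by name in the result row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allow access to columns by name
conn.executescript("""
    CREATE TABLE valuepair (valuepairid INTEGER PRIMARY KEY, displaytext TEXT);
    CREATE TABLE person (lastname TEXT, location INTEGER, occupation1 INTEGER);
    INSERT INTO valuepair VALUES (1, 'Boston'), (2, 'Engineer');
    INSERT INTO person VALUES ('Smith', 1, 2);
""")

# The lookup table is joined twice; each displaytext gets a distinct alias.
row = conn.execute("""
    SELECT person.lastname AS lastname,
           loc.displaytext AS myfield1,
           occ.displaytext AS myfield2
    FROM person
    LEFT OUTER JOIN valuepair AS loc ON person.location = loc.valuepairid
    LEFT OUTER JOIN valuepair AS occ ON person.occupation1 = occ.valuepairid
""").fetchone()

print(row["lastname"], row["myfield1"], row["myfield2"])
```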
In simple words, I am trying to write a Django query that returns results similar to this:
select id, max(score), screen, details from results
where user_id=123
group by screen
order by score desc, screen
My code:
results = Results.objects.filter(user=user).values('screen').annotate(score=Max('score')).order_by('-score', 'screen')
But it doesn't return Results objects; it returns JSON-like dicts. I would like to get the results as Results objects, like:
user = User.objects.filter(id=id)
results = Results.objects.filter(user=user).order_by('-score', 'screen')
Can anyone help?
It is the values() call that makes your QuerySet return dicts instead of model instances.
If you want to keep it a QuerySet, but only retrieve the fields from the database that you are interested in, use only(*fields) or defer(*fields)
results = (
    Results.objects.filter(user=user)
    .annotate(score=Max('score'))
    .order_by('-score', 'screen')
    .only('screen')  # not values()
)
If you try to access fields that were deferred on instances it will result in extra db hits. You can verify that these are incomplete Results instances:
> type(results[0])
<class 'app_name.Results_Deferred_foo_bar_7265632c0017c141e84a00d9fdb4760'>
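For reference, here is one way to express "the full row with the highest score per screen" directly in SQL, sketched with plain sqlite3 rather than the Django ORM. The schema and sample rows are assumptions based on the query in the question, and the correlated subquery is just one possible technique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE results (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        screen TEXT,
        score INTEGER,
        details TEXT
    );
    INSERT INTO results (id, user_id, screen, score, details) VALUES
        (1, 123, 'intro', 10, 'first try'),
        (2, 123, 'intro', 30, 'second try'),
        (3, 123, 'boss', 20, 'only try'),
        (4, 999, 'intro', 99, 'other user');
""")

# Keep only rows whose score equals the user's maximum for that screen.
# Note: if two rows tie for the maximum, both are returned.
best_rows = conn.execute("""
    SELECT r.id, r.score, r.screen, r.details
    FROM results AS r
    WHERE r.user_id = ?
      AND r.score = (
          SELECT MAX(r2.score) FROM results AS r2
          WHERE r2.user_id = r.user_id AND r2.screen = r.screen
      )
    ORDER BY r.score DESC, r.screen
""", (123,)).fetchall()

for row in best_rows:
    print(row)
```

Because whole rows are selected, every column (id, details and so on) belongs to the winning row, which a plain GROUP BY screen does not guarantee.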
There are a lot of similar posts but none that hit exactly at what I am trying to get to. I get that a distinct has to use the same fields as order_by, which is fine.
So I have the following query:
q = MyModel.objects.order_by('field1', 'field2', '-tstamp').distinct('field1', 'field2')
Ultimately I am trying to find the latest entry in the table for all combinations of field1 and field2. The order_by does what I think it should, and that's great. But when I do the distinct I get the following error.
ProgrammingError: SELECT DISTINCT ON expressions must match initial ORDER BY expressions
Ultimately this seems like a SQL problem (not a Django one). However looking at the django docs for distinct, it shows what can and can't work. It does say that
q = MyModel.objects.order_by('field1', 'field2', '-tstamp').distinct('field1')
will work (...and it does). But I don't understand that when I add on field2 in the same order as done in the order_by I still get the same result. Any help would be greatly appreciated
EDIT: I also notice that if I do
q = MyModel.objects.order_by('field1', 'field2', '-tstamp').distinct('field1', 'field2', 'tstamp') # with or without the - in the order_by
It still raises the same error, though the docs suggest this should work just fine
Was able to get the query to run properly by using pk's in the order_by()
q = MyModel.objects.order_by('field1__pk', 'field2__pk', '-tstamp').distinct('field1__pk', 'field2__pk')
Apparently, when the fields are related objects rather than ordinary types, order_by() doesn't play nicely with distinct(). However, using the pk's of the related objects seems to work.
Try this one:
q = MyModel.objects.order_by('field1__id', 'field2__id', '-tstamp').distinct('field1__id', 'field2__id')
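To make the intent of the query concrete, here is a pure-Python sketch of what ORDER BY field1, field2, tstamp DESC combined with DISTINCT ON (field1, field2) is meant to produce: the latest row for each (field1, field2) pair. The sample rows are made up, and this only mimics the semantics; it is not how the database executes the query.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows; tstamp is a plain integer for simplicity.
rows = [
    {"field1": 1, "field2": "a", "tstamp": 5},
    {"field1": 1, "field2": "a", "tstamp": 9},
    {"field1": 1, "field2": "b", "tstamp": 3},
    {"field1": 2, "field2": "a", "tstamp": 7},
    {"field1": 2, "field2": "a", "tstamp": 2},
]

# Equivalent of ORDER BY field1, field2, tstamp DESC.
ordered = sorted(rows, key=lambda r: (r["field1"], r["field2"], -r["tstamp"]))

# Equivalent of DISTINCT ON (field1, field2): keep the first row of each group.
# This only works because the sort key starts with the grouping columns.
group_key = itemgetter("field1", "field2")
latest = [next(group) for _, group in groupby(ordered, key=group_key)]

for row in latest:
    print(row["field1"], row["field2"], row["tstamp"])
```

The groupby step depends on the rows already being sorted by the grouping columns, which mirrors the database's requirement that the DISTINCT ON expressions match the leading ORDER BY expressions.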
Help! Can't figure this out! I'm getting an IntegrityError on get_or_create even with a defaults parameter set.
Here's how the model looks stripped down.
class Example(models.Model):
    user = models.ForeignKey(User)
    text = models.TextField()

    def __unicode__(self):
        return "Example"
I run this in Django:
def create_example_model(user, textJson):
    defaults = {"text": textJson.get("text", "undefined")}
    model, created = models.Example.objects.get_or_create(
        user=user,
        id=textJson.get("id", None),
        defaults=defaults)
    if not created:
        model.text = textJson.get("text", "undefined")
        model.save()
    return model
I'm getting an error on the get_or_create line:
IntegrityError: (1062, "Duplicate entry '3020' for key 'PRIMARY'")
It's live so I can't really tell what the input is.
Help? A defaults argument is actually set, so it's not like this other problem, where no defaults are passed, and my model doesn't use unique_together: "Django : get_or_create Raises duplicate entry with together_unique"
I'm using python 2.6, and mysql.
In general, you shouldn't be setting the id for objects yourself; you have to be careful when doing that.
Have you checked to see the value for 'id' that you are putting into the database?
If that doesn't fix your issue, then it may be a database issue. In PostgreSQL there is a special sequence used to increment the IDs, and sometimes it does not get incremented. You can reset it with something like the following:
SELECT setval('tablename_id_seq', (SELECT MAX(id) + 1 FROM tablename));
get_or_create() will try to create a new object if it can't find one that is an exact match to the arguments you pass in.
So what I'm assuming is happening is that a different user has made an object with the id of 3020. Since there is no object with the user/id combo you're requesting, it tries to make a new object with that combo, but fails because a different user has already created an item with the id of 3020.
Hopefully that makes sense. See what the following returns. Might give a little insight as to what has gone on.
models.Example.objects.get(id=3020)
You might need to make 3020 a string in the lookup. I'm assuming a string is coming back from your textJson.get() method.
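The scenario described above can be reproduced with a small sketch in plain sqlite3. The get_or_create function here is a hypothetical stand-in that only mimics the lookup-then-insert behaviour; it is not Django's implementation, and the table layout is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE example (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        text TEXT NOT NULL
    );
    -- Row 3020 already exists and belongs to user 1.
    INSERT INTO example (id, user_id, text) VALUES (3020, 1, 'owned by user 1');
""")

def get_or_create(conn, user_id, row_id, default_text):
    """Hypothetical helper: look up by (id, user_id), insert if nothing matches."""
    row = conn.execute(
        "SELECT id, user_id, text FROM example WHERE id = ? AND user_id = ?",
        (row_id, user_id),
    ).fetchone()
    if row is not None:
        return row, False
    # No exact match on BOTH lookup fields, so an insert is attempted.
    conn.execute(
        "INSERT INTO example (id, user_id, text) VALUES (?, ?, ?)",
        (row_id, user_id, default_text),
    )
    return (row_id, user_id, default_text), True

# Same user and id: the existing row is found, nothing is inserted.
found, created = get_or_create(conn, 1, 3020, "ignored")

# Different user, same id: the lookup misses and the insert hits the primary key.
try:
    get_or_create(conn, 2, 3020, "undefined")
    error = None
except sqlite3.IntegrityError as exc:
    error = exc

print(found, created, type(error).__name__)
```

Looking up by id alone and then checking the owner would avoid the failed insert in this situation.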
One common but little-documented cause of get_or_create() failures is corrupted database indexes.
Django depends on the assumption that there is only one record for a given identifier, and this is in turn enforced using a UNIQUE index on that field in the database. But indexes are constantly being rewritten, and they may get corrupted, e.g. when the database crashes unexpectedly. In that case the index may no longer return information about an existing record, another record with the same field value is added, and as a result you'll hit the IntegrityError each time you try to get or create this particular record.
The solution is, at least in PostgreSQL, to REINDEX this particular index, but you first need to get rid of the duplicate rows programmatically.