How do I order my objects in a random way? - python

I'm using Django with Python 3.7 and PostgreSQL 9.5. I use the following to get all my objects and iterate over them ...
article_set = Article.objects.all()
for article in article_set:
Is there a way to modify my existing query or possibly the loop so that the objects are returned in a random order each time? I would prefer not to make a second query if at all possible.

As is explained in the documentation you can use order_by('?') as follows:
article_set = Article.objects.order_by('?')

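Copy the queryset into a list, shuffle it, and loop over the list like this:
import random
article_list = list(Article.objects.all())
random.shuffle(article_list)  # shuffles in place and returns None
for article in article_list:
    print(article)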
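One detail worth knowing about this approach: random.shuffle reorders a list in place and returns None, so its return value cannot be iterated. A minimal sketch, using a plain list with made-up titles as a stand-in for the evaluated queryset:

```python
import random

# Hypothetical stand-in for list(Article.objects.all()); the titles are made up.
articles = ["alpha", "beta", "gamma", "delta", "epsilon"]

# random.shuffle reorders the list in place and returns None,
# so the return value is not something a for loop can iterate over.
result = random.shuffle(articles)

# random.sample returns a new list in random order and leaves the input as it is.
shuffled_copy = random.sample(articles, k=len(articles))

for article in shuffled_copy:
    print(article)
```

Either way the shuffling happens in Python after a single query, so no second database round trip is needed.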

I'm not too familiar with Django, and correct me if I'm wrong but I'd assume that:
Article.objects.all()
#Roughly equates to:
"SELECT * from articles"
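So I believe you should be able to do the following (Django spells random ordering as '?'; a literal 'RANDOM()' string is treated as a field name and rejected):
Article.objects.all().order_by('?')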
#Thus producing this SQL statement:
"SELECT * from articles ORDER BY RANDOM()"
Which should mix up your output.
Otherwise, I'd say go with the random.shuffle approach.
from random import shuffle
article_list = list(Article.objects.all())  # evaluate the queryset once
shuffle(article_list)  # shuffles in place and returns None
for article in article_list:
    ...  # do work
Let me know if the order_by worked and I can edit accordingly. Like I said I'm not too familiar with Django so this is untested but in theory should work. Thanks!
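To see what database-side random ordering does, here is a small sketch that runs the ORDER BY RANDOM() statement directly. It uses Python's built-in sqlite3 module and a made-up articles table rather than Django and PostgreSQL, so treat it as an illustration of the SQL only:

```python
import sqlite3

# In-memory database with a hypothetical articles table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO articles (title) VALUES (?)",
    [("a",), ("b",), ("c",), ("d",)],
)

# One query returns every row; the database picks the order at random.
rows = conn.execute("SELECT id, title FROM articles ORDER BY RANDOM()").fetchall()

for article_id, title in rows:
    print(article_id, title)
```

The database has to generate a random value per row and sort on it, which can get slow on large tables.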

Related

Django querysets optimization - preventing selection of annotated fields

Let's say I have the following models:
class Invoice(models.Model):
    ...

class Note(models.Model):
    invoice = models.ForeignKey(Invoice, related_name='notes', on_delete=models.CASCADE)
    text = models.TextField()
and I want to select Invoices that have some notes. I would write it using annotate/Exists like this:
Invoice.objects.annotate(
    has_notes=Exists(Note.objects.filter(invoice_id=OuterRef('pk')))
).filter(has_notes=True)
This works well enough, filters only Invoices with notes. However, this method results in the field being present in the query result, which I don't need and means worse performance (SQL has to execute the subquery 2 times).
I realize I could write this using extra(where=) like this:
Invoice.objects.extra(where=['EXISTS(SELECT 1 FROM note WHERE invoice_id=invoice.id)'])
which would result in the ideal SQL, but in general it is discouraged to use extra / raw SQL.
Is there a better way to do this?
You can remove annotations from the SELECT clause using .values() query set method. The trouble with .values() is that you have to enumerate all names you want to keep instead of names you want to skip, and .values() returns dictionaries instead of model instances.
Django internally keeps track of removed annotations in QuerySet.query.annotation_select_mask. So you can use it to tell Django which annotations to skip, even without .values():
class YourQuerySet(QuerySet):
    def mask_annotations(self, *names):
        if self.query.annotation_select_mask is None:
            self.query.set_annotation_mask(set(self.query.annotations.keys()) - set(names))
        else:
            self.query.set_annotation_mask(self.query.annotation_select_mask - set(names))
        return self
Then you can write:
invoices = (Invoice.objects
    .annotate(has_notes=Exists(Note.objects.filter(invoice_id=OuterRef('pk'))))
    .filter(has_notes=True)
    .mask_annotations('has_notes')
)
to drop has_notes from the SELECT clause while still getting filtered Invoice instances. The resulting SQL query will be something like:
SELECT invoice.id, invoice.foo FROM invoice
WHERE EXISTS(SELECT note.id, note.bar FROM notes WHERE note.invoice_id = invoice.id) = True
Just note that annotation_select_mask is internal Django API that can change in future versions without a warning.
Ok, I've just noticed in Django 3.0 docs, that they've updated how Exists works and can be used directly in filter:
Invoice.objects.filter(Exists(Note.objects.filter(invoice_id=OuterRef('pk'))))
This will ensure that the subquery will not be added to the SELECT columns, which may result in a better performance.
Changed in Django 3.0:
In previous versions of Django, it was necessary to first annotate and then filter against the annotation. This resulted in the annotated value always being present in the query result, and often resulted in a query that took more time to execute.
Still, if someone knows a better way for Django 1.11, I would appreciate it. We really need to upgrade :(
We can filter for Invoices that, after a LEFT OUTER JOIN with Note, have a non-NULL Note, and make the query distinct (to avoid returning the same Invoice once per matching Note).
Invoice.objects.filter(notes__isnull=False).distinct()
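To illustrate why the join needs distinct() while EXISTS does not, here is a sketch that runs hand-written SQL against an in-memory SQLite database. The table and column names are assumptions modelled on the question, and the statements are not Django's generated output:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoice (id INTEGER PRIMARY KEY);
    CREATE TABLE note (
        id INTEGER PRIMARY KEY,
        invoice_id INTEGER REFERENCES invoice(id),
        text TEXT
    );
    INSERT INTO invoice (id) VALUES (1), (2), (3);
    -- Invoice 1 has two notes, invoice 2 has none, invoice 3 has one.
    INSERT INTO note (invoice_id, text) VALUES (1, 'a'), (1, 'b'), (3, 'c');
""")

# EXISTS yields each invoice at most once, however many notes it has.
exists_ids = [row[0] for row in conn.execute(
    "SELECT invoice.id FROM invoice "
    "WHERE EXISTS (SELECT 1 FROM note WHERE note.invoice_id = invoice.id) "
    "ORDER BY invoice.id"
)]

# A join yields one row per matching note, so duplicates appear ...
join_ids = [row[0] for row in conn.execute(
    "SELECT invoice.id FROM invoice "
    "INNER JOIN note ON note.invoice_id = invoice.id "
    "ORDER BY invoice.id"
)]

# ... unless DISTINCT collapses them again.
distinct_ids = [row[0] for row in conn.execute(
    "SELECT DISTINCT invoice.id FROM invoice "
    "INNER JOIN note ON note.invoice_id = invoice.id "
    "ORDER BY invoice.id"
)]

print(exists_ids, join_ids, distinct_ids)
```

Both the EXISTS form and the join with DISTINCT return invoices 1 and 3 here; which one is faster depends on the database and the data.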
This is the most optimized code if you want to get data from one table whose primary key is referenced from another table:
Invoice.objects.filter(note__invoice_id=OuterRef('pk'),)
We should be able to clear the annotated field using the below method.
Invoice.objects.annotate(
has_notes=Exists(Note.objects.filter(invoice_id=OuterRef('pk')))
).filter(has_notes=True).query.annotations.clear()

Mongoengine... How can I compare two fields?

For example..
class Example(Document):
up = IntField()
down = IntField()
and.. I want to retrieve documents whose up field is greater or equal to down.
But this is the issue.
My (wrong) query code would be:
Example.objects(up__gte=down)
How can I use a field that resides in mongodb not python code as a queryset value?
Simple answer: not possible. Something like WHERE A = B in SQL is not doable in an efficient way in MongoDB (apart from using the $where clause which should be avoided).
This may be what you wanted:
db.myCollection.find( { $where: "this.credits == this.debits" } );
Have a look at: http://docs.mongodb.org/manual/reference/operator/query/where/
But I do not know how to use it in mongoengine.

Django CharField queryset filter without escaping the MySQL wildcard

I'm looking for a way to do this:
qs = MyModel.objects.filter(mystring__like="____10____")
#Which would create a sql clause
... LIKE '____10____'
instead of behave like this:
qs = MyModel.objects.filter(mystring__icontains="____10____")
#Which creates a sql clause
... LIKE '%\_\_\_\_10\_\_\_\_%'
I know I can use a regex filter, but that's substantially slower and more error prone than just using the built-in mysql wildcard feature (I've tested it directly in mysql, the query strings are long enough that the difference is substantial).
EDIT:
figured out how to do this with the .extra() method with madisvain's help.
qs = MyModel.objects.extra(where=["`mystring` LIKE '____10____'"])
In terms of performance difference, 2000 random queries with the regex approach took 20.5 seconds, with this approach 2000 random queries take 6 seconds.
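The difference between the two LIKE patterns can be reproduced outside Django. The sketch below uses Python's built-in sqlite3 module with a made-up mymodel table, so it only illustrates wildcard versus escaped underscores, not MySQL's exact behaviour or timings:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mymodel (mystring TEXT)")
conn.executemany(
    "INSERT INTO mymodel (mystring) VALUES (?)",
    [("abcd10efgh",), ("____10____",), ("xx10yy",), ("abcd10efghij",)],
)

# Unescaped: every _ matches exactly one arbitrary character, so this
# selects 10-character strings with "10" in positions 5 and 6.
wildcard = [r[0] for r in conn.execute(
    "SELECT mystring FROM mymodel "
    "WHERE mystring LIKE '____10____' "
    "ORDER BY mystring"
)]

# Escaped (what a contains-style lookup effectively asks for):
# the underscores are literal characters.
literal = [r[0] for r in conn.execute(
    r"SELECT mystring FROM mymodel "
    r"WHERE mystring LIKE '%\_\_\_\_10\_\_\_\_%' ESCAPE '\' "
    r"ORDER BY mystring"
)]

print(wildcard)
print(literal)
```

Only the unescaped pattern treats the underscores as single-character wildcards, which is the behaviour the question is after.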
I didn't know Django had a __like lookup (it does not ship one out of the box). In any case, please read this section of the docs first:
https://docs.djangoproject.com/en/dev/topics/db/queries/#escaping-percent-signs-and-underscores-in-like-statements
You should be writing the query like this.
qs = MyModel.objects.filter(mystring__contains="____10____")
Or this.
qs = MyModel.objects.extra(where=["mystring LIKE '____10____'"])
Read more about the extra() method here:
https://docs.djangoproject.com/en/dev/ref/models/querysets/#extra
You can use the manager's raw() method to perform raw SQL queries, so the line would become:
qs = MyModel.objects.raw("SELECT * from MyApp_MyModel where mystring like %s", [variable])
I would suggest the regex filter, or its case-insensitive iregex version. Example:
qs = MyModel.objects.filter(mystring__regex="....10....")
Disadvantages: you need to check regex support for the backend you use, there is a performance impact, and a change of backend could cause problems, because the regular expression syntax is that of the database backend in use.

sort by count in json

I'm using tastypie to create json from my django models however I'm running into a problem that I think should have a simple fix.
I have an object Blogs which has Comment object children. I want to be able to do something like this with my json:
/api/v1/blogs/?order_by=comment_count
But I can't figure out how to sort on a field that's not part of the original comment/ blog model. I create comment_count myself in a dehydrate method that just takes the array of comments and returns comments.count()
Any help would be much appreciated - I can't seem to find any explanation.
If I understood correctly this should help:
Blog.objects.annotate(comment_count=Count('comments')).order_by('comment_count')
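Roughly speaking, annotating with a count and ordering by it corresponds to a GROUP BY query. The sketch below runs such a query by hand with Python's built-in sqlite3 module; the blog and comment tables are assumptions, and the SQL is an approximation rather than Django's exact output:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blog (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE comment (
        id INTEGER PRIMARY KEY,
        blog_id INTEGER REFERENCES blog(id)
    );
    INSERT INTO blog (id, title) VALUES (1, 'one'), (2, 'two'), (3, 'three');
    -- Blog 1 has two comments, blog 2 has three, blog 3 has none.
    INSERT INTO comment (blog_id) VALUES (1), (1), (2), (2), (2);
""")

# The LEFT OUTER JOIN keeps blogs without comments; COUNT(comment.id)
# ignores the NULLs produced for them, so they get a count of 0.
rows = conn.execute(
    "SELECT blog.id, COUNT(comment.id) AS comment_count "
    "FROM blog LEFT OUTER JOIN comment ON comment.blog_id = blog.id "
    "GROUP BY blog.id "
    "ORDER BY comment_count ASC"
).fetchall()

for blog_id, comment_count in rows:
    print(blog_id, comment_count)
```

Because the count is computed in the query itself, the database can sort on it, which a value built later in dehydrate cannot offer.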
You might be able to do it with extra(), something like:
Blog.objects.extra(
    select={
        'entry_count': 'SELECT COUNT(*) FROM blog_entry WHERE blog_entry.blog_id = blog_blog.id'
    },
    order_by=['-entry_count'],
)
I haven't tested this, but it should work. The caveat is it will only work with a relational database.

Really long query

How do you write a long query? Is there a way to optimize it?
I want to run this long, complicated query:
all_accepted_parts = acceptedFragment.objects.filter(fragmentID = fragment.objects.filter(categories = fragmentCategory.objects.filter(id=1)))
but it doesn't work. I get:
Error binding parameter 0 - probably unsupported type.
I would be thankful for any hint on how to optimize it, and even more thankful for a fix :)
If it's not working, you can't optimize it. First make it work.
At first glance, it seems that you have mixed up the concepts of fields, relationships and equality/membership. First go through the docs, then build your query piece by piece in the Python shell (likely from the inside out).
Just a shot in the dark:
all_accepted_parts = acceptedFragment.objects.filter(fragment__in = fragment.objects.filter(categories = fragmentCategory.objects.get(id=1)))
or maybe:
all_accepted_parts = acceptedFragment.objects.filter(fragment__in = fragment.objects.filter(categories = 1))
As others have said, we really need the models, and some explanation of what you're actually trying to achieve.
But it looks like you want to do a related table lookup. Rather than getting all the related objects in a separate nested query, you should use Django's related model syntax to do the join within your query.
Something like:
acceptedFragment.objects.filter(fragment__categories__id = 1)
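The nested-queryset and related-lookup versions ask the database for the same rows in two different shapes: a subquery with IN versus a join. A sketch with hand-written SQL on an in-memory SQLite database, where every table and column name is a guess because the question does not show the models:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fragment (id INTEGER PRIMARY KEY);
    CREATE TABLE fragment_categories (fragment_id INTEGER, category_id INTEGER);
    CREATE TABLE accepted_fragment (id INTEGER PRIMARY KEY, fragment_id INTEGER);
    INSERT INTO fragment (id) VALUES (1), (2), (3);
    -- Fragments 1 and 3 are in category 1, fragment 2 is in category 2.
    INSERT INTO fragment_categories VALUES (1, 1), (2, 2), (3, 1);
    INSERT INTO accepted_fragment (id, fragment_id) VALUES (10, 1), (11, 2), (12, 3);
""")

# Shape of the nested version: filter with IN on a subquery.
nested = [r[0] for r in conn.execute(
    "SELECT id FROM accepted_fragment WHERE fragment_id IN ("
    "  SELECT fragment_id FROM fragment_categories WHERE category_id = 1"
    ") ORDER BY id"
)]

# Shape of the related-lookup version: a join inside a single query.
joined = [r[0] for r in conn.execute(
    "SELECT af.id FROM accepted_fragment AS af "
    "JOIN fragment_categories AS fc ON fc.fragment_id = af.fragment_id "
    "WHERE fc.category_id = 1 ORDER BY af.id"
)]

print(nested, joined)
```

Both shapes select the same accepted fragments here; the related-lookup form is simply the shorter way to say it in the ORM.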
