Django - When is a query made? - python

I want to minimize the number of database queries my application makes, and I am getting more familiar with Django's ORM. In which cases is a query actually executed?
For instance, this format is along the lines of the answer I'm looking for (for example purposes, not accurate to my knowledge):
Model.objects.get()
Always launches a query
Model.objects.filter()
Launches a query if objects is empty only
(...)
I am assuming chained filter operations never make additional requests, but from the docs it looks like filter() does indeed make a database request if it's the first thing called.

If you're using test cases, you can use assertNumQueries(), a custom assertion included in Django's TestCase.
Example:
with self.assertNumQueries(2):
    x = SomeModel.objects.get(pk=1)
    y = x.some_foreign_key_in_object
If the expected number of queries was wrong, you'd see an assertion failed message of the form:
Num queries (expected - actual):
2 : 5
In this example, the foreign key would cause an additional query even though there's no explicit query (get, filter, exclude, etc.).
For this reason, I would take a practical approach: use tests or logging instead of trying to learn every case in which Django is supposed to query.
If you don't use unit tests, you can use this other method, which prints the actual SQL statements sent by Django, so you get an idea of the complexity of each query and not just the number of queries:
(The DEBUG setting must be set to True.)
from django.db import connection
x = SomeModel.objects.get(pk=1)
y = x.some_foreign_key_in_object
print(connection.queries)
The print would show a list of dictionaries, one per query:
[
{'sql': 'SELECT a, b, c, d ... FROM app_some_model', 'time': '0.002'},
{'sql': 'SELECT j, k, ... FROM app_referenced_model JOIN ... blabla ',
'time': '0.004'}
]
Docs on connection.queries.
Of course, you can also combine both methods and print connection.queries in your test cases.

See Django's documentation on when querysets are evaluated: https://docs.djangoproject.com/en/dev/ref/models/querysets/#when-querysets-are-evaluated
Evaluation in this case means that the query is executed. This mostly happens when you try to access the results, e.g. when calling list() or len() on the queryset or iterating over it.
get() in your example doesn't return a queryset but a model object, so its query is executed immediately.

Related

SQLAlchemy escape/sanitize variables without using ORM/parameters

Does SQLAlchemy provide a function to safely quote/escape a literal when constructing raw queries? I've read through the docs and have seen nothing, not sure if it is a protected internal?
I am aware parameterized statements are the correct way to handle this, but due to a fringe case/bug with pyodbc/mssql-driver this is not an option. As a result I am forced to execute a compiled ORM statement with literal_binds=True. Worst case scenario I would just limit the input to A-Z|0-9 or get creative with chained subqueries/aliases but I would prefer to avoid this.
e.g.
variable = sqlalchemy.escape_text(input("User Input"))  # something like this
q = session.query(MyObj).filter(MyObj.colA == variable)
connection.execute(q.selectable.compile(compile_kwargs={'literal_binds': True}), bind=session.bind)
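The fallback mentioned in the question, limiting the input to letters and digits, could look roughly like the sketch below. It is an allow-list check, not an escaping function, and it does not use SQLAlchemy at all. The name require_alphanumeric is hypothetical, and accepting lowercase letters in addition to "A-Z|0-9" is an assumption.

```python
import re

# Assumption: ASCII letters (either case) and digits are the only allowed characters.
_ALLOWED = re.compile(r"[A-Za-z0-9]+")


def require_alphanumeric(value: str) -> str:
    """Return value unchanged if it is purely alphanumeric, else raise ValueError."""
    if not _ALLOWED.fullmatch(value):
        raise ValueError(f"unsupported characters in input: {value!r}")
    return value


accepted = require_alphanumeric("abc123")

try:
    require_alphanumeric("x' OR '1'='1")
    rejected = False
except ValueError:
    rejected = True
```

Rejecting input outright is stricter than escaping it, which is why the question treats it as a last resort.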

Chaining methods with lazy execution in Python

I was studying Django when I found that it chains its query methods, like Post.objects.filter(pk=1).filter(title='first').filter(author='me'), to construct a query without actually executing it, and only executes the query when we try to access and work with its result.
From there I got interested in how this is done, so that I can apply the same approach in my own work. For instance:
Writing code like myProduct.discount('10%').discountLimit('100$').tax('10$').shipping('20$') that is only evaluated when I try to work with it.
Building a custom DB manager for non-Django apps where I can chain my query methods and have the query execute automatically only when I try to access its result (or at least when the chaining ends), so I end up with something like:
#doesn't hit the DB
myPost = Post.objects.select(...).where(...).where(...).limit(...)
#only hit the DB on usage
print(myPost.title)
So my QUESTION is, how can I do so?
Approaches that I thought of but don't like:
I can implement an .execute() method to perform the actual execution, calling it at the tail of the chain or whenever desired: Post.objects.select(x).where(y).offset(z).execute()
I can insert a delay within each of the query builder methods to make sure it is the last in the chain:
import time
class Post:
    def where(self, *conditions):
        me = time.monotonic()
        self.lastCall = me
        # process the inputs here
        self.query += " WHERE ..."
        self.lazyExecute(me)
        return self
    def lazyExecute(self, identifier):
        time.sleep(5)  # wait to see whether another chained call follows
        if self.lastCall == identifier:
            self.executeQuery()
        else:
            pass
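One way to get this behaviour without an explicit .execute() call or a timer is to have every chained method only record what was asked for, and to run the SQL inside the special methods that actually need rows (__iter__, __len__, __getitem__). The sketch below uses the standard sqlite3 module. It is not Django's implementation, and LazyQuery, its where() and limit() methods, and the post table are all hypothetical names.

```python
import sqlite3


class LazyQuery:
    """Hypothetical chainable query that runs only when its rows are needed."""

    def __init__(self, conn, table, conditions=(), params=(), limit=None):
        self._conn = conn
        self._table = table              # trusted identifier, never user input
        self._conditions = tuple(conditions)
        self._params = tuple(params)
        self._limit = limit
        self._cache = None               # filled on first evaluation

    def _clone(self, **changes):
        # Each chained call returns a new object, so a base query stays reusable.
        state = dict(conn=self._conn, table=self._table,
                     conditions=self._conditions, params=self._params,
                     limit=self._limit)
        state.update(changes)
        return LazyQuery(**state)

    def where(self, condition, *params):
        return self._clone(conditions=self._conditions + (condition,),
                           params=self._params + params)

    def limit(self, n):
        return self._clone(limit=n)

    def _fetch(self):
        if self._cache is None:
            sql = f"SELECT * FROM {self._table}"
            if self._conditions:
                sql += " WHERE " + " AND ".join(self._conditions)
            if self._limit is not None:
                sql += f" LIMIT {int(self._limit)}"
            self._cache = self._conn.execute(sql, self._params).fetchall()
        return self._cache

    # Only the methods that need results trigger evaluation.
    def __iter__(self):
        return iter(self._fetch())

    def __len__(self):
        return len(self._fetch())

    def __getitem__(self, index):
        return self._fetch()[index]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT, author TEXT)")
conn.executemany("INSERT INTO post (title, author) VALUES (?, ?)",
                 [("first", "me"), ("second", "me"), ("third", "you")])
conn.commit()

executed = []                            # every statement SQLite runs from now on
conn.set_trace_callback(executed.append)

posts = (LazyQuery(conn, "post")
         .where("author = ?", "me")
         .where("title = ?", "first")
         .limit(1))
queries_after_chaining = len(executed)   # nothing has been sent yet
first_title = posts[0][1]                # indexing triggers the query
queries_after_access = len(executed)
row_count = len(posts)                   # answered from the cache
queries_after_reuse = len(executed)
```

Returning a clone from each method, instead of mutating and returning self, also means a partially built query can be reused as the base for several variations.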

Django related_table() with extra()

I'm trying to use the .extra() function together with .select_related():
foo_objects = Foo.objects.all()
result = foo_objects.extra(select={'is_ok':'IF(bar.is_ok,"Yes","No")'}).select_related('bar')
Foo and Bar are connected through their models (Foo has a bar_id),
but I keep getting "Unknown column 'bar.is_ok' in 'field list'" when calling result.values().
Looking at the query that is actually generated (the real query sent to the database, not foo_objects.query), it doesn't
seem to join the two tables. Any ideas on how I can do that?
The following query ought to work, but I can't really test it...
foo_objects = Foo.objects.select_related('bar').extra(select={'is_ok':'IF(bar.is_ok,"Yes","No")'})
It doesn't matter which order you do the select_related() and extra() in, as long as they're both on the same queryset.
Update
If you need it to work with a ValuesQuerySet, you can't use select_related(), so you have to do it slightly differently, by using additional parameters to the extra()...
foo_objects = Foo.objects.extra(tables=('bar',),
                                where=('foo.bar_id=bar.id',),
                                select={'is_ok':'IF(bar.is_ok,"Yes","No")'}).values()
...or if you don't need "Yes" and "No" back, you can just use...
foo_objects = Foo.objects.values('bar__is_ok')
...which will force the join.
See also Django ticket #3358.

Idiomatic/fast Django ORM check for existence on mysql/postgres

If I want to check for the existence and if possible retrieve an object, which of the following methods is faster? More idiomatic? And why? If not either of the two examples I list, how else would one go about doing this?
# Option 1
if Object.objects.filter(**kwargs).exists():
    my_object = Object.objects.get(**kwargs)
# Option 2
my_object = Object.objects.filter(**kwargs)
if my_object:
    my_object = my_object[0]
If relevant, I care about mysql and postgres for this.
Why not do this in a try/except block, to avoid both the multiple queries and the query followed by an if?
try:
    obj = Object.objects.get(**kwargs)
except Object.DoesNotExist:
    pass
Just add your else logic under the except.
Django's documentation provides a pretty good overview of exists().
Using your first example it will do the query two times, according to the documentation:
if some_queryset has not yet been evaluated, but you
know that it will be at some point, then using some_queryset.exists()
will do more overall work (one query for the existence check plus an
extra one to later retrieve the results) than simply using
bool(some_queryset), which retrieves the results and then checks if
any were returned.
So if you're going to use the object after checking for existence, the docs suggest just using it and forcing evaluation once with:
if my_object:
    pass

Duplicate an AppEngine Query object to create variations of a filter without affecting the base query

In my AppEngine project I have a need to use a certain filter as a base then apply various different extra filters to the end, retrieving the different result sets separately. e.g.:
base_query = MyModel.all().filter('mainfilter', 123)
Then I need to use the results of various sub queries separately:
subquery1 = base_query.filter('subfilter1', 'xyz')
#Do something with subquery1 results here
subquery2 = base_query.filter('subfilter2', 'abc')
#Do something with subquery2 results here
Unfortunately 'filter()' affects the state of the base_query Query instance, rather than just returning a modified version. Is there any way to duplicate the Query object and use it as a base? Is there perhaps a standard Python way of duplicating an object that could be used?
The extra filters are actually applied by the results of different forms dynamically within a wizard, and they use the 'running total' of the query in their branch to assess whether to ask further questions.
Obviously I could pass around a rudimentary stack of filter criteria, but I'd rather use the Query itself if possible, as it adds simplicity and elegance to the solution.
There's no officially approved (i.e., not likely to break) way to do this. Simply creating the query afresh from the parameters when you need it is your best option.
As Nick has said, you had better create the query again, but you can still avoid repeating yourself. A good way to do that would be like this:
# inside a request handler
def create_base_query():
    return MyModel.all().filter('mainfilter', 123)
subquery1 = create_base_query().filter('subfilter1', 'xyz')
# Do something with subquery1 results here
subquery2 = create_base_query().filter('subfilter2', 'abc')
# Do something with subquery2 results here
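To see why the factory function helps, here is a small stand-in that mimics the behaviour described in the question: a filter() that mutates the object and returns it. ToyQuery is hypothetical and is not the App Engine Query class. It only reproduces the "filter() changes the base query" effect.

```python
class ToyQuery:
    """Stand-in for a query object whose filter() mutates and returns self."""

    def __init__(self):
        self.filters = []

    def filter(self, name, value):
        self.filters.append((name, value))
        return self


def create_base_query():
    # A fresh object on every call, as in the answer above.
    return ToyQuery().filter("mainfilter", 123)


# Sharing one base object: the second sub-query also carries subfilter1.
shared = create_base_query()
sub1 = shared.filter("subfilter1", "xyz")
sub2 = shared.filter("subfilter2", "abc")

# Building from the factory: each sub-query starts from a clean base.
fresh1 = create_base_query().filter("subfilter1", "xyz")
fresh2 = create_base_query().filter("subfilter2", "abc")

print(sub2.filters)
print(fresh2.filters)
```

With the shared object, sub1 and sub2 are the same instance, so every filter accumulates on it. The factory version keeps the two branches independent at the cost of rebuilding the base filter each time.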
