Django related_table() with extra() - python

I'm trying to use .extra() function with .related_table():
foo_objects = Foo.objects.all()
result = foo.extra(select={'is_ok':'IF(bar.is_ok,"Yes","No")'}).select_related('bar')
Foo and Bar are connected (Foo has bar_id) with models and everything,
but I keep getting "Unknown column 'bar.is_ok' in 'field list'" when calling result.values(),
Looking at the Query generated (the actual query produced, not foo.query), it doesn't
seem to join the two, any ideas on how I do that ?

The following query ought to work, but I can't really test it...
foo_objects = Foo.objects.select_related('bar').extra(select={'is_ok':'IF(bar.is_ok,"Yes","No")'})
It doesn't matter which order you do the select_related() and extra() in, as long as they're both on the same queryset.
Update
If you need it to work with a ValuesQuerySet, you can't use select_related(), so you have to do it slightly differently, by using additional parameters to the extra()...
foo_objects = Foo.objects.extra(tables=('bar',),
where=('foo.bar_id=bar.id',),
select={'is_ok':'IF(bar.is_ok,"Yes","No")'}).values()
...or if you don't need "Yes" and "No" back, you can just use...
foo_objects = Foo.objects.values('bar__is_ok')
...which will force the join.
See also Django ticket #3358.

Related

web2py validating db model for group menbers

I am using web2py, and am trying to build a field for auth_user, that should be validated to be a member of a certain group. So, in the models/db.py I have added a field that tells who is the manager of the user:
auth.settings.extra_fields['auth_user']=[Field('manager', 'reference auth_user')]
Then I have set up to db.auth_user, db.auth_group and db.auth_membership to contain users that belong to group 'managers'
And what I would like now to achieve is to validate user input so, that the 'manager' field of the auth_user could contain only users from the group 'managers'. I have gone through quite a few variations, following is maybe closest to making sense in theory in my mind:
group_id = auth.id_group('managers')
all_users_in_group = db(db.auth_membership.group_id==group_id)._select(db.auth_membership.user_id)
db.auth_user.auditor.requires = IS_IN_DB(db, db(~db.auth_user.id.belongs(all_users_in_group)).select(db.auth_user.id))
But even that is failing with
<type 'exceptions.AttributeError'>('Table' object has no attribute 'managers')
A perfect solution to my problem would show in the drop down menu not auth_user.id, but auth_user.first_name concatenated with auth_user.last_name.
I have used similar code to the following and confirm that is works with web2py 2.17.2-stable+timestamp.2018.10.06.18.54.02 (Running on nginx/1.14.0, Python 3.6.7).The previous answer will not work for this version.
group_id = auth.id_group('managers')
user_rows = db(db.auth_membership.group_id == group_id)._select(db.auth_membership.user_id)
query = db(db.auth_user.id.belongs(user_rows))
db.auth_user.auditor.requires = IS_IN_DB(query, db.auth_user.id, '%(first_name)s %(last_name)s'
You have done it correctly in your answer, but you can improve it.
The first argument of the validator can be a database connection or a DAL Set. So you can pass
directly a db query as first argument, like this
query = db((db.auth_membership.group_id == auth.id_group('managers')) &
(db.auth_membership.user_id == db.auth_user.id))
db.auth_user.auditor.requires = IS_IN_DB(query, db.auth_user.id, '%(first_name)s %(last_name)s')
You can also write query like below:
query = db((db.auth_group.role == 'managers') &
(db.auth_membership.group_id == db.auth_group.id) &
(db.auth_membership.user_id == db.auth_user.id))
I actually managed to solve this, but I would definitely not call this elegant. Would there be more idiomatic way to do this in web2py? Following is what I added to models/db.py
group_id = auth.id_group('managers')
rows=db(db.auth_membership.group_id==group_id).select(db.auth_membership.user_id)
rset=set()
for r in rows:
rset.add(int((r.user_id)))
db.auth_user.auditor.requires = IS_IN_DB(db(db.auth_user.id.belongs(rset)), db.auth_user.id, '%(first_name)s %(last_name)s')

Django - how to filter using QuerySet to get subset of objects?

According to documentation:
filter(**kwargs) Returns a new QuerySet containing objects that match
the given lookup parameters.
The lookup parameters (**kwargs) should be in the format described in
Field lookups below. Multiple parameters are joined via AND in the
underlying SQL statement.
Which to me suggests it will return a subset of items that were in original set.
However I seem to be missing something as below example does not behave as I would expect:
>>> kids = Kid.objects.all()
>>> tuple(k.name for k in kids)
(u'Bob',)
>>> toys = Toy.objects.all()
>>> tuple( (t.name, t.owner.name) for t in toys)
((u'car', u'Bob'), (u'bear', u'Bob'))
>>> subsel = Kid.objects.filter( owns__in = toys )
>>> tuple( k.name for k in subsel )
(u'Bob', u'Bob')
>>> str(subsel.query)
'SELECT "bug_kid"."id", "bug_kid"."name" FROM "bug_kid" INNER JOIN "bug_toy" ON ("bug_kid"."id" = "bug_toy"."owner_id") WHERE "bug_toy"."id" IN (SELECT U0."id" FROM "bug_toy" U0)'
As you can see in above subsel ends up returning duplicate records, which is not what I wanted. My question is what is the proper way to get subset? (note: set by definition will not have multiple occurrences of the same object)
Explanation as to why it behaves like that would be also nice, as to me filter means what you achieve with filter() built-in function in Python. Which is: take elements that fulfill requirement (or in other words discard ones that do not). And this definition doesn't seem to allow introduction/duplication of objects.
I know can aplly distinct() to the whole thing, but that still results in rather ugly (and probably slower than could be) query:
>>> str( subsel.distinct().query )
'SELECT DISTINCT "bug_kid"."id", "bug_kid"."name" FROM "bug_kid" INNER JOIN "bug_toy" ON ("bug_kid"."id" = "bug_toy"."owner_id") WHERE "bug_toy"."id" IN (SELECT U0."id" FROM "bug_toy" U0)'
My models.py for completeness:
from django.db import models
class Kid(models.Model):
name = models.CharField(max_length=200)
class Toy(models.Model):
name = models.CharField(max_length=200)
owner = models.ForeignKey(Kid, related_name='owns')
edit:
After a chat with #limelight the conclusion is that my problem is that I expect filter() to behave according to dictionary definition. And i.e. how it works in Python or any other sane framework/language.
More precisely if I have set A = {x,y,z} and I invoke A.filter( <predicate> ) I don't expect any elements to get duplicated. With Django's QuerySet however it behaves like this:
A = {x,y,z}
A.filter( <predicate> )
# now A i.e. = {x,x}
So first of all the issue is inappropriate method name (something like match() would be much better).
Second thing is that I think it is possible to create more efficient query than what Django allows me to. I might be wrong on that, if I will have a bit of time I will probably try to check if that is true.
This is kind of ugly, but works (without any type safety):
toy_owners = Toy.objects.values("owner_id") # optionally with .distinct()
Kid.objects.filter(id__in=toy_owners)
If performance is not an issue, I think #limelights is right.
PS! I tested your query on Django 1.6b2 and got the same unnecessary complex query.
Instead DISTINCT you can use GROUP BY (annotate in django) to get distinct kids.
toy_owners = Toy.objects.values_list("owner_id", flat=True).distinct()
Kid.objects.only('name').filter(pk__in=toy_owners).annotate(count=Count('owns'))

Django - When is a query made?

I want to minimize the number of database queries my application makes, and I am familiarizing myself more with Django's ORM. I am wondering, what are the cases where a query is executed.
For instance, this format is along the lines of the answer I'm looking for (for example purposes, not accurate to my knowledge):
Model.objects.get()
Always launches a query
Model.objects.filter()
Launches a query if objects is empty only
(...)
I am assuming curried filter operations never make additional requests, but from the docs it looks like filter() does indeed make database requests if it's the first thing called.
If you're using test cases, you can use this custom assertion included in django's TestCase: assertNumQueries().
Example:
with self.assertNumQueries(2):
x = SomeModel.objects.get(pk=1)
y = x.some_foreign_key_in_object
If the expected number of queries was wrong, you'd see an assertion failed message of the form:
Num queries (expected - actual):
2 : 5
In this example, the foreign key would cause an additional query even though there's no explicit query (get, filter, exclude, etc.).
For this reason, I would use a practical approach: Test or logging, instead of trying to learn each of the cases in which django is supposed to query.
If you don't use unit tests, you may use this other method which prints the actual SQL statements sent by django, so you can have an idea of the complexity of the query, and not just the number of queries:
(DEBUG setting must be set to True)
from django.db import connection
x = SomeModel.objects.get(pk=1)
y = x.some_foreign_key_in_object
print connection.queries
The print would show a dictionary of queries:
[
{'sql': 'SELECT a, b, c, d ... FROM app_some_model', 'time': '0.002'},
{'sql': 'SELECT j, k, ... FROM app_referenced_model JOIN ... blabla ',
'time': '0.004'}
]
Docs on connection.queries.
Of course, you can also combine both methods and use the print connection.queries in your test cases.
See Django's documentation on when querysets are evaluated: https://docs.djangoproject.com/en/dev/ref/models/querysets/#when-querysets-are-evaluated
Evaluation in this case means that the query is executed. This mostly happens when you are trying to access the results, eg. when calling list() or len() on it or iterating over the results.
get()in your example doesn't return a queryset but a model objects, therefore it is evaluated immediately.

SQLAlchemy - loading user by username

Just diving into pylons here, and am trying to get my head around the basics of SQLALchemy. I have figured out how to load a record by id:
user_q = session.query(model.User)
user = user_q.get(user_id)
But how do I query by a specific field (i.e. username)? I assume there is a quick way to do it with the model rather than hand-building the query. I think it has something with the add_column() function on the query object, but I can't quite figure out how to use it. I've been trying stuff like this, but obviously it doesn't work:
user_q = meta.Session.query(model.User).add_column('username'=user_name)
user = user_q.get()
I believe you want something like this:
user = meta.Session.query(model.User).filter_by(name=user_name).first()
Which is short for this:
user = meta.Session.query(model.User).filter(model.User.name=user_name).first()
Here's more documentation
If you change first() into one() it will raise an exception if there isn't a user (if you want to assert one exists).
I highly suggest reading through both Object Relational Tutorial as well as the and the SQL Expression Language Tutorial.

Duplicate an AppEngine Query object to create variations of a filter without affecting the base query

In my AppEngine project I have a need to use a certain filter as a base then apply various different extra filters to the end, retrieving the different result sets separately. e.g.:
base_query = MyModel.all().filter('mainfilter', 123)
Then I need to use the results of various sub queries separately:
subquery1 = basequery.filter('subfilter1', 'xyz')
#Do something with subquery1 results here
subquery2 = basequery.filter('subfilter2', 'abc')
#Do something with subquery2 results here
Unfortunately 'filter()' affects the state of the basequery Query instance, rather than just returning a modified version. Is there any way to duplicate the Query object and use it as a base? Is there perhaps a standard Python way of duping an object that could be used?
The extra filters are actually applied by the results of different forms dynamically within a wizard, and they use the 'running total' of the query in their branch to assess whether to ask further questions.
Obviously I could pass around a rudimentary stack of filter criteria, but I'd rather use the Query itself if possible, as it adds simplicity and elegance to the solution.
There's no officially approved (Eg, not likely to break) way to do this. Simply creating the query afresh from the parameters when you need it is your best option.
As Nick has said, you better create the query again, but you can still avoid repeating yourself. A good way to do that would be like this:
#inside a request handler
def create_base_query():
return MyModel.all().filter('mainfilter', 123)
subquery1 = create_base_query().filter('subfilter1', 'xyz')
#Do something with subquery1 results here
subquery2 = create_base_query().filter('subfilter2', 'abc')
#Do something with subquery2 results here

Categories

Resources