Python/SQL Server - Subquery inside of a SELECT problem

I am trying to improve my queries and also reduce the number of them in my code.
So I stumbled over subqueries. What I want to achieve now is to SELECT the BankID from my Entity table and, in the same query, get the Name that belongs to that BankID from my Bank table.
I use Python and SQL Server with a pooled DB. If I missed something, please let me know!
At the moment my code only gets the ID:
import DB_Pool
ms = DB_Pool.Database()
entity_id = 1
entity_data = ms.ExecQuery("SELECT Name,BankID FROM Entity WHERE EntityID = ? AND IsCurrent = 1",(entity_id,))
print(entity_data)
This is working fine.
But I can't work out where to add the SELECT for the bank name, using the BankID I got from the Entity table.

You seem to be looking for a join. Assuming that Entity and Bank relate through column BankID, that would be:
SELECT e.Name entityName, e.BankID, b.Name bankName
FROM Entity e
INNER JOIN Bank b on b.BankID = e.BankID
WHERE e.EntityID = ? AND e.IsCurrent = 1
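As a rough, self-contained sketch of how this join could be run from Python: the snippet below uses the standard-library sqlite3 module with an in-memory database as a stand-in for SQL Server and the asker's DB_Pool wrapper. The schema details, the sample rows, and the helper name fetch_entity_with_bank are assumptions made for illustration; only the query text and the ? placeholder come from the answer above.

```python
import sqlite3

# In-memory stand-in for the real database; schema and rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Bank (BankID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Entity (EntityID INTEGER, Name TEXT, BankID INTEGER, IsCurrent INTEGER);
    INSERT INTO Bank VALUES (10, 'First Bank'), (20, 'Second Bank');
    INSERT INTO Entity VALUES
        (1, 'Acme', 20, 1),
        (1, 'Acme (old)', 10, 0),
        (2, 'Globex', 10, 1);
""")

# Same join as in the answer, with explicit AS aliases.
JOIN_QUERY = """
    SELECT e.Name AS entityName, e.BankID, b.Name AS bankName
    FROM Entity e
    INNER JOIN Bank b ON b.BankID = e.BankID
    WHERE e.EntityID = ? AND e.IsCurrent = 1
"""

def fetch_entity_with_bank(connection, entity_id):
    """Return (entityName, BankID, bankName) rows for the current entity record."""
    return connection.execute(JOIN_QUERY, (entity_id,)).fetchall()

entity_data = fetch_entity_with_bank(conn, 1)
print(entity_data)
```

With the asker's wrapper, the same query string would presumably be passed to ms.ExecQuery with (entity_id,) as the parameter tuple, in place of the original single-table SELECT.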

Related

Django ORM: Get latest record for distinct field

I'm having loads of trouble translating some SQL into Django.
Imagine we have some cars, each with a unique VIN, and we record the dates that they are in the shop with some other data. (Please ignore the reason one might structure the data this way. It's specifically for this question. :-) )
class ShopVisit(models.Model):
    vin = models.CharField(...)
    date_in_shop = models.DateField(...)
    mileage = models.DecimalField(...)
    boolfield = models.BooleanField(...)
We want a single query to return a Queryset with the most recent record for each vin and update it!
special_vins = [...]
# Doesn't work
ShopVisit.objects.filter(vin__in=special_vins).annotate(max_date=Max('date_in_shop')).filter(date_in_shop=F('max_date')).update(boolfield=True)
# Distinct doesn't work with update
ShopVisit.objects.filter(vin__in=special_vins).order_by('vin', '-date_in_shop').distinct('vin').update(boolfield=True)
Yes, I could iterate over a queryset. But that's not very efficient and it takes a long time when I'm dealing with around 2M records. The SQL that could do this is below (I think!):
SELECT *
FROM cars
INNER JOIN (
SELECT MAX(dateInShop) as maxtime, vin
FROM cars
GROUP BY vin
) AS latest_record ON (cars.dateInShop= maxtime)
AND (latest_record.vin = cars.vin)
So how can I make this happen with Django?

This is somewhat untested, and relies on Django 1.11 for Subqueries, but perhaps something like:
latest_visits = Subquery(ShopVisit.objects.filter(id=OuterRef('id')).order_by('-date_in_shop').values('id')[:1])
ShopVisit.objects.filter(id__in=latest_visits)
I had a similar model, so went to test it but got an error of:
"This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery"
The SQL it generated looked reasonably like what you want, so I think the idea is sound. If you use PostgreSQL, perhaps it supports that type of subquery.
Here's the SQL it produced (trimmed up a bit and replaced actual names with fake ones):
SELECT `mymodel_activity`.* FROM `mymodel_activity` WHERE `mymodel_activity`.`id` IN (SELECT U0.`id` FROM `mymodel_activity` U0 WHERE U0.`id` = (`mymodel_activity`.`id`) ORDER BY U0.`date_in_shop` DESC LIMIT 1)

I wonder if you found the solution yourself.
I could only come up with a raw query string (see the Django manual on raw SQL queries):
UPDATE "yourapplabel_shopvisit"
SET boolfield = True WHERE date_in_shop
IN (SELECT MAX(date_in_shop) FROM "yourapplabel_shopvisit" GROUP BY vin);
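One caveat with matching on the date alone is that an older visit of one car can share its date with another car's latest visit and get flagged as well. A minimal sketch of a correlated variant follows, run against an in-memory SQLite database via the standard-library sqlite3 module. The table name shopvisit and the sample rows are assumptions for illustration; the real table name depends on the Django app label, and other databases may need slightly different syntax.

```python
import sqlite3

# In-memory stand-in for the ShopVisit table; rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shopvisit (
        id INTEGER PRIMARY KEY,
        vin TEXT,
        date_in_shop TEXT,
        boolfield INTEGER DEFAULT 0
    );
    INSERT INTO shopvisit (vin, date_in_shop) VALUES
        ('VIN_A', '2017-01-01'),
        ('VIN_A', '2017-03-01'),
        ('VIN_B', '2017-03-01'),
        ('VIN_B', '2017-05-01'),
        ('VIN_C', '2017-02-01');
""")

special_vins = ["VIN_A", "VIN_B"]
# One "?" per VIN so the list is passed as bound parameters.
placeholders = ", ".join("?" for _ in special_vins)

# The subquery is correlated on vin, so each row is compared only
# with the latest date of its own car.
conn.execute(
    f"""
    UPDATE shopvisit
    SET boolfield = 1
    WHERE vin IN ({placeholders})
      AND date_in_shop = (
          SELECT MAX(latest.date_in_shop)
          FROM shopvisit AS latest
          WHERE latest.vin = shopvisit.vin
      )
    """,
    special_vins,
)

flagged = conn.execute(
    "SELECT vin, date_in_shop FROM shopvisit WHERE boolfield = 1 ORDER BY vin"
).fetchall()
print(flagged)
```

In this sample the 2017-03-01 visit of VIN_B stays unflagged even though it shares a date with the latest visit of VIN_A, which is the case the uncorrelated IN clause would get wrong.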

SQLAlchemy - aggregate inner query joining to outer query

Below is an example of a query which I'm trying to write in SQLAlchemy without much luck. I'm quite new to SQLA and am able to convert some queries, but not this one:
select car, min(units)
from (
select car,
(select sum(case when side = 0 then 1 else -1 end * doors)
from p.trades i
where i.car = o.car and i.date = o.date
and i.server_time <= o.server_time) units
from p.trades o
where date = '2016-01-19'
and car in ('Golf')
order by server_time, trade_id
) boff
group by car
Can anyone be of assistance?
Thanks, much appreciated

I know that this is not what you expect, but I'd just use a SQL query.
I have worked with a few different ORMs, and my experience is that it is usually not worth trying to write relatively complex queries using object-like syntax.
Anything simple, like reading or writing a record or running a simple query, is usually obvious and clear, so it's easy to write and to maintain.
For more complex queries you will spend time up front converting them to the ORM language, and again later, when you need to modify them, remembering how they worked and working out how they can be changed.
So I would just do this:
data = session.query(MyModel).from_statement(text(
"""
select * from
....
....
""")).params(x=a, y=b).all()
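To make the raw-SQL route concrete, here is a small sketch that runs a query of the same shape as the one in the question with bound parameters. It uses the standard-library sqlite3 module rather than a SQLAlchemy session, drops the p. schema prefix, and relies on an invented trades table with made-up rows, so treat the schema, the data, and the parameter names as assumptions.

```python
import sqlite3

# In-memory stand-in for p.trades; columns follow the question, rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trades (
        trade_id INTEGER PRIMARY KEY,
        car TEXT,
        date TEXT,
        server_time TEXT,
        side INTEGER,
        doors INTEGER
    );
    INSERT INTO trades (car, date, server_time, side, doors) VALUES
        ('Golf', '2016-01-19', '09:00:00', 0, 4),
        ('Golf', '2016-01-19', '09:05:00', 1, 6),
        ('Golf', '2016-01-19', '09:10:00', 0, 5),
        ('Golf', '2016-01-18', '09:00:00', 1, 50),
        ('Polo', '2016-01-19', '09:01:00', 1, 9);
""")

# The inner correlated subquery is a running sum of signed doors per car and date;
# the outer query takes the minimum of that running sum.
MIN_RUNNING_UNITS = """
    SELECT car, MIN(units)
    FROM (
        SELECT o.car AS car,
               (SELECT SUM(CASE WHEN i.side = 0 THEN 1 ELSE -1 END * i.doors)
                FROM trades i
                WHERE i.car = o.car AND i.date = o.date
                  AND i.server_time <= o.server_time) AS units
        FROM trades o
        WHERE o.date = :trade_date AND o.car = :car
    ) boff
    GROUP BY car
"""

rows = conn.execute(
    MIN_RUNNING_UNITS, {"trade_date": "2016-01-19", "car": "Golf"}
).fetchall()
print(rows)
```

For the sample rows the running total goes 4, -2, 3, so the query reports -2 for Golf. The same SQL string could be handed to text() in the snippet above, with the bound values supplied through params().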

Web2Py DAL, left join and operator precedence

In my DB, I basically have three tables:
usergroup(id, name, deleted)
usergroup_presentation(id, groupid, presentationid)
presentation(id, name)
I'm trying to run this DAL query:
left_join = db.usergroup_presentation.on((db.usergroup_presentation.group_id==db.usergroup.id)
&(db.usergroup_presentation.presentation_id==db.presentation.id))
result = db(db.usergroup.deleted==False).select(
db.usergroup.id,
db.usergroup.name,
db.usergroup_presentation.id,
left=left_join,
orderby=db.usergroup.name)
The database returns this error: Unknown column 'presentation.id' in 'on clause'
The generated SQL looks something like this:
SELECT usergroup.id, usergroup.name, usergroup_presentation.id
FROM presentation, usergroup
LEFT JOIN usergroup_presentation ON ((usergroup_presentation.group_id = usergroup.id) AND (usergroup_presentation.presentation_id = presentation.id))
WHERE (usergroup.deleted = 'F')
ORDER BY usergroup.name;
I did some research on Google and found this:
http://mysqljoin.com/joins/joins-in-mysql-5-1054-unknown-column-in-on-clause/
Then I tried to run this query directly in my DB:
SELECT usergroup.id, usergroup.name, usergroup_presentation.id
FROM (presentation, usergroup)
LEFT JOIN usergroup_presentation ON ((usergroup_presentation.group_id = usergroup.id) AND (usergroup_presentation.presentation_id = presentation.id))
WHERE (usergroup.deleted = 'F')
ORDER BY usergroup.name;
And indeed it works once brackets are added around the FROM tables.
My question is: how can I generate a SQL query like this (with brackets) with the DAL, without falling back to a plain executesql?
Even better, I would like to get a cleaner SQL query using INNER JOIN and LEFT JOIN. I don't know if that is possible with my query, though.

I believe this has now been fixed in trunk. Please help us check it. P.S. next time open a ticket (https://code.google.com/p/web2py/issues/list) and it will be fixed sooner.

Selecting Datastore ID in Google App Engine?

I'm trying to make a query that selects everything where the id is 6. The problem is that I can't seem to get it to work. This is what the code looks like at the moment:
query = db.GqlQuery("SELECT * FROM Users WHERE id = 6")
result = query.get()
for result in query:
    self.response.out.write(result.username)
There are no errors or anything, but it just won't output the username. Has anyone had this problem before, or does anyone know what I did wrong?

If you're using the id value assigned by the datastore, there can only be a single entity with a given id.
How about this instead:
idNum = 6
# handy function the datastore API provides...
user = Users.get_by_id(idNum)
self.response.out.write(user.username)

SQLAlchemy filter query by related object

Using SQLAlchemy, I have a one to many relation with two tables - users and scores. I am trying to query the top 10 users sorted by their aggregate score over the past X amount of days.
users:
id
user_name
score
scores:
user
score_amount
created
My current query is:
top_users = DBSession.query(User).options(eagerload('scores')).filter_by(User.scores.created > somedate).order_by(func.sum(User.scores).desc()).all()
I know this is clearly not correct, it's just my best guess. However, after looking at the documentation and googling I cannot find an answer.
EDIT:
Perhaps it would help if I sketched what the MySQL query would look like:
SELECT user.*, SUM(scores.amount) as score_increase
FROM user LEFT JOIN scores ON scores.user_id = user.user_id
WITH scores.created_at > someday
ORDER BY score_increase DESC

The single-joined-row way, with a group_by added in for all user columns although MySQL will let you group on just the "id" column if you choose:
sess.query(User, func.sum(Score.amount).label('score_increase')).\
join(User.scores).\
filter(Score.created_at > someday).\
group_by(User).\
order_by("score_increase desc")
Or if you just want the users in the result:
sess.query(User).\
join(User.scores).\
filter(Score.created_at > someday).\
group_by(User).\
order_by(func.sum(Score.amount).desc())
The above two have an inefficiency in that you're grouping on all columns of "user" (or you're using MySQL's "group on only a few columns" thing, which is MySQL only). To minimize that, the subquery approach:
subq = sess.query(Score.user_id, func.sum(Score.amount).label('score_increase')).\
filter(Score.created_at > someday).\
group_by(Score.user_id).subquery()
sess.query(User).join((subq, subq.c.user_id==User.user_id)).order_by(subq.c.score_increase.desc())
An example of the identical scenario is in the ORM tutorial at: http://docs.sqlalchemy.org/en/latest/orm/tutorial.html#selecting-entities-from-subqueries
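For readers who want to see the SQL shape behind the subquery approach, here is a minimal sketch using the standard-library sqlite3 module instead of SQLAlchemy. The table layout, the sample rows, the cutoff date, and the helper name top_users are all assumptions for illustration: scores are summed per user in a derived table, which is then joined back to the users, sorted in descending order, and limited to ten rows.

```python
import sqlite3

# In-memory stand-in for the users and scores tables; rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, user_name TEXT);
    CREATE TABLE scores (user_id INTEGER, amount INTEGER, created_at TEXT);
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO scores VALUES
        (1, 5,   '2020-01-10'),
        (1, 7,   '2020-01-12'),
        (1, 100, '2019-12-01'),
        (2, 20,  '2020-01-11'),
        (3, 3,   '2019-11-01');
""")

# Aggregate per user inside the derived table, then join the users onto it.
TOP_USERS = """
    SELECT u.user_id, u.user_name, s.score_increase
    FROM users u
    JOIN (
        SELECT user_id, SUM(amount) AS score_increase
        FROM scores
        WHERE created_at > ?
        GROUP BY user_id
    ) s ON s.user_id = u.user_id
    ORDER BY s.score_increase DESC
    LIMIT 10
"""

def top_users(connection, cutoff):
    """Return (user_id, user_name, score_increase) for scores created after cutoff."""
    return connection.execute(TOP_USERS, (cutoff,)).fetchall()

ranking = top_users(conn, "2020-01-01")
print(ranking)
```

Because this is an inner join, users with no scores after the cutoff (carol in the sample data) do not appear in the result at all; a LEFT JOIN would keep them with a NULL total.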

You will need to use a subquery in order to compute the aggregate score for each user. Subqueries are described here: http://www.sqlalchemy.org/docs/05/ormtutorial.html?highlight=subquery#using-subqueries

I am assuming the column (not the relation) you're using for the join is called Score.user_id, so change it if this is not the case.
You will need to do something like this:
DBSession.query(Score.user_id, func.sum(Score.score_amount).label('total_score')).group_by(Score.user_id).filter(Score.created > somedate).order_by('total_score DESC')[:10]
However this will result in tuples of (user_id, total_score). I'm not sure if the computed score is actually important to you, but if it is, you will probably want to do something like this:
users_scores = []
q = DBSession.query(Score.user_id, func.sum(Score.score_amount).label('total_score')).group_by(Score.user_id).filter(Score.created > somedate).order_by('total_score DESC')[:10]
for user_id, total_score in q:
    # one extra query per user to load the User object
    user = DBSession.query(User).get(user_id)
    users_scores.append((user, total_score))
This will result in 11 queries being executed, however. It is possible to do it all in a single query, but due to various limitations in SQLAlchemy, it will likely create a very ugly multi-join query or subquery (dependent on engine) and it won't be very performant.
If you plan on doing something like this often and you have a large number of scores, consider denormalizing the current score onto the user table. It's more work to maintain, but it will result in a single non-join query like:
DBSession.query(User).order_by(User.computed_score.desc())
Hope that helps.
