SQLAlchemy subquery in from clause without join


I need a little help.
I have the following query and I'm curious how to express it in terms of sqlalchemy.orm. Currently I execute it with session.execute. It's not critical for me, I'm just curious. What I don't know is how to put a subquery in the FROM clause (a nested view) without doing any join.
select g_o.group_ from (
    select distinct regexp_split_to_table(g.group_name, E',') group_
    from (
        select array_to_string(groups, ',') group_name
        from company
        where status='active'
        and array_to_string(groups, ',') like :term
        limit :limit
    ) g
) g_o
where g_o.group_ like :term
order by 1
limit :limit
I need the subquery for speed: without the limit in the innermost query, regexp_split_to_table parses all the data and the limit is applied only afterwards. My table is huge, so I cannot afford that.
If anything is unclear, please ask and I'll do my best to explain.

I presume this is PostgreSQL.
To create a subquery, use the subquery() method. The resulting object can be used as if it were a Table object. Here's how your query would look in SQLAlchemy:
from sqlalchemy import func

subq1 = session.query(
    func.array_to_string(Company.groups, ',').label('group_name')
).filter(
    (Company.status == 'active') &
    (func.array_to_string(Company.groups, ',').like(term))
).limit(limit).subquery()
subq2 = session.query(
    func.regexp_split_to_table(subq1.c.group_name, ',')
        .distinct()
        .label('group')
).subquery()
q = session.query(subq2.c.group).\
    filter(subq2.c.group.like(term)).\
    order_by(subq2.c.group).\
    limit(limit)
However, you could avoid one subquery by using the unnest function instead of converting the array to a string with array_to_string and then splitting it with regexp_split_to_table:
subq = session.query(
    func.unnest(Company.groups).label('group')
).filter(
    (Company.status == 'active') &
    (func.array_to_string(Company.groups, ',').like(term))
).limit(limit).subquery()
q = session.query(subq.c.group.distinct()).\
    filter(subq.c.group.like(term)).\
    order_by(subq.c.group).\
    limit(limit)
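
To see the pattern in isolation, here is a minimal sketch of a subquery used as the only FROM item, with no join. It assumes SQLAlchemy 1.4 or later and an in-memory SQLite database; the simplified Company model (a plain name column instead of the groups array) and the sample rows are hypothetical, chosen only so the snippet runs without PostgreSQL.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Company(Base):
    # Hypothetical, simplified stand-in for the real table
    __tablename__ = "company"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    status = Column(String)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Company(name="alpha", status="active"),
        Company(name="beta", status="inactive"),
        Company(name="alpine", status="active"),
        Company(name="gamma", status="active"),
        Company(name="alto", status="active"),
    ])
    session.commit()

    # Inner query: filtered and limited first, then turned into a subquery
    inner = (
        session.query(Company.name.label("group_"))
        .filter(Company.status == "active")
        .order_by(Company.id)
        .limit(3)
        .subquery("g")
    )
    # Outer query: selects from the subquery alone via its .c collection
    outer = (
        session.query(inner.c.group_)
        .filter(inner.c.group_.like("al%"))
        .order_by(inner.c.group_)
    )
    sql = str(outer)
    names = [row[0] for row in outer]

print(sql)
print(names)
```

The inner LIMIT is applied before the outer LIKE filter, so "alto" never reaches the outer query. That ordering is the property the question relies on for speed.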

Related

Can't convert SQL to django query (having doesn't work)

I have this SQL:
SELECT
stock_id, consignment_id, SUM(qty), SUM(cost)
FROM
warehouse_regсonsignmentproduct
WHERE
product_id = '1'
GROUP BY
stock_id, consignment_id
HAVING
SUM(qty) > 0
I used django ORM to create this query:
regСonsignmentProduct.objects
.filter(product='1')
.order_by('period')
.values('stock', 'consignment')
.annotate(total_qty=Sum('qty'), total_cost=Sum('cost'))
.filter(total_qty__gt=0)
But my Django query returns an incorrect result.
I think the problem is in annotate().
Thanks!

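The order_by('period') call is the problem: Django adds the fields named in order_by() to the GROUP BY clause, so the rows are grouped by period as well. Order by the values() fields instead to force the intended grouping: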
from django.db.models import Sum

regСonsignmentProduct.objects.filter(product='1').values(
    'stock', 'consignment'
).annotate(
    total_qty=Sum('qty'),
    total_cost=Sum('cost')
).order_by('stock', 'consignment').filter(total_qty__gt=0)
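
The effect of the extra grouping column is easiest to see in plain SQL. The sketch below uses the standard-library sqlite3 module with a hypothetical table reg and made-up rows; it is not the SQL Django generates, only an illustration of how grouping by period as well changes what HAVING sees.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data, for illustration only
conn.execute(
    "CREATE TABLE reg (stock_id INT, consignment_id INT, period INT, qty INT, cost INT)"
)
conn.executemany("INSERT INTO reg VALUES (?, ?, ?, ?, ?)", [
    (1, 1, 1, 5, 50),
    (1, 1, 2, -5, -50),   # cancels the row above: total qty is 0
    (1, 2, 1, 3, 30),
    (1, 2, 2, 4, 40),
])

# Intended grouping: one row per (stock, consignment)
intended = conn.execute("""
    SELECT stock_id, consignment_id, SUM(qty), SUM(cost) FROM reg
    GROUP BY stock_id, consignment_id
    HAVING SUM(qty) > 0
    ORDER BY stock_id, consignment_id
""").fetchall()

# Grouping that also includes period, as happens with order_by('period')
with_period = conn.execute("""
    SELECT stock_id, consignment_id, SUM(qty), SUM(cost) FROM reg
    GROUP BY stock_id, consignment_id, period
    HAVING SUM(qty) > 0
    ORDER BY period, stock_id, consignment_id
""").fetchall()

print(intended)
print(with_period)
```

With period in the GROUP BY, the consignment whose quantities cancel out still appears, because HAVING is evaluated per period rather than per (stock, consignment) pair.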

Confusing SQLAlchemy conversion of simple subquery

I've been wrestling with what should be a simple conversion of a straightforward SQL query into an SQLAlchemy expression, and I just cannot get things to line up the way I mean in the subquery. This is a single-table query of a "Comments" table; I want to find which users have made the most first comments:
SELECT user_id, count(*) AS count
FROM comments c
where c.date = (SELECT MIN(c2.date)
FROM comments c2
WHERE c2.post_id = c.post_id
)
GROUP BY user_id
ORDER BY count DESC
LIMIT 20;
I don't know how to write the subquery so that it refers to the outer query, and if I did, I wouldn't know how to assemble this into the outer query itself. (Using MySQL, which shouldn't matter.)

Well, after giving up for a while and then looking back at it, I came up with something that works. I'm sure there's a better way, but:
from sqlalchemy import desc, func, select
from sqlalchemy.orm import aliased

c2 = aliased(Comment)
# Correlated scalar subquery: earliest comment date on the outer row's post
firstdate = select([func.min(c2.date)]).\
    where(c2.post_id == Comment.post_id).\
    as_scalar()  # or scalar_subquery(), in SQLA 1.4
users = session.query(
        Comment.user_id, func.count('*').label('count')).\
    filter(Comment.date == firstdate).\
    group_by(Comment.user_id).\
    order_by(desc('count')).\
    limit(20)
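
For reference, here is a self-contained version of the same correlated subquery, written as a sketch against SQLAlchemy 1.4 or later (select() with positional columns and scalar_subquery()). The Comment model, the integer date column, and the sample rows are assumptions made so the example runs on in-memory SQLite.

```python
from sqlalchemy import Column, Integer, create_engine, desc, func, select
from sqlalchemy.orm import Session, aliased, declarative_base

Base = declarative_base()


class Comment(Base):
    # Hypothetical model; date is an integer timestamp for simplicity
    __tablename__ = "comments"
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer)
    post_id = Column(Integer)
    date = Column(Integer)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Comment(user_id=1, post_id=10, date=1),
        Comment(user_id=2, post_id=10, date=2),
        Comment(user_id=1, post_id=11, date=5),
        Comment(user_id=3, post_id=11, date=6),
        Comment(user_id=2, post_id=12, date=7),
    ])
    session.commit()

    c2 = aliased(Comment)
    # The reference to Comment.post_id correlates the subquery to the outer query
    first_date = (
        select(func.min(c2.date))
        .where(c2.post_id == Comment.post_id)
        .scalar_subquery()
    )
    rows = (
        session.query(Comment.user_id, func.count().label("count"))
        .filter(Comment.date == first_date)
        .group_by(Comment.user_id)
        .order_by(desc("count"), Comment.user_id)
        .limit(20)
        .all()
    )
    result = [tuple(row) for row in rows]

print(result)
```

The extra Comment.user_id in order_by() is only a tie-breaker so the output order is deterministic.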

SQLAlchemy Coalesce and Join

I'm having a lot of trouble converting my sql query to sqlalchemy. I haven't been able to find any resources doing what I am trying to do.
The query I am trying to convert is:
SELECT
COALESCE(d.manager_name, e.name) AS name,
COALESCE(d.department_name, e.department_name) AS department
FROM employee e
LEFT JOIN department d ON e.id = d.id
WHERE e.date = '2018-11-05'
In sqlalchemy I came up with:
query = self.session.query(
func.coalesce(Department.manager_name, Employee.name),
func.coalesce(Department.department_name, Employee.department_name)).join(Department,
Employee.id == Department.id,
).filter(
Employee.date == '2018-11-05',
)
But keep getting the error:
sqlalchemy.exc.InvalidRequestError: Can't join table/selectable 'Department' to itself.
WHY?! The statements are exact!

Since Department is the leftmost item in your query, joins take place against it. To control what is considered the first – or the "left" – entity in the join use Query.select_from():
query = self.session.query(
        func.coalesce(Department.manager_name, Employee.name),
        func.coalesce(Department.department_name, Employee.department_name)).\
    select_from(Employee).\
    outerjoin(Department, Employee.id == Department.id).\
    filter(Employee.date == '2018-11-05')
This behaviour is also explained in the ORM tutorial under "Querying with Joins", and Query.join(): "Controlling what to Join From".
Your query construct was also using Query.join(), though the raw SQL had LEFT JOIN. In that case Query.outerjoin() or join(..., isouter=True) should be used.
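
A minimal sketch that renders the SQL produced by the select_from() and outerjoin() combination, assuming SQLAlchemy 1.4 or later. The two model definitions are guesses at the schema based on the column names in the question; no database connection is needed just to render the statement.

```python
from sqlalchemy import Column, Integer, String, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Employee(Base):
    # Hypothetical schema inferred from the question
    __tablename__ = "employee"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    department_name = Column(String)
    date = Column(String)


class Department(Base):
    # Hypothetical schema inferred from the question
    __tablename__ = "department"
    id = Column(Integer, primary_key=True)
    manager_name = Column(String)
    department_name = Column(String)


session = Session()  # unbound session: enough for building and printing SQL

query = (
    session.query(
        func.coalesce(Department.manager_name, Employee.name).label("name"),
        func.coalesce(
            Department.department_name, Employee.department_name
        ).label("department"),
    )
    .select_from(Employee)  # Employee becomes the left side of the join
    .outerjoin(Department, Employee.id == Department.id)
    .filter(Employee.date == "2018-11-05")
)

sql = str(query.statement)
print(sql)
```

The printed statement should start its FROM clause with employee and use LEFT OUTER JOIN, matching the raw SQL in the question.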

SQLAlchemy select from subquery and order by subquery field

I have a database table with tweets in a jsonb field.
I have a query to get the tweets ordered by the most retweeted, this is what it looks like:
SELECT * FROM (
SELECT DISTINCT ON (raw->'retweeted_status'->'id_str')
raw->'retweeted_status' as status,
raw->'retweeted_status'->'retweet_count' as cnt
FROM tweet
WHERE (raw->'retweeted_status') is not null
ORDER BY raw->'retweeted_status'->'id_str', cnt DESC
) t
ORDER BY cnt DESC
I'm trying to create this query with SQLAlchemy; this is what I have so far:
session.query(Tweet.raw['retweeted_status'],
Tweet.raw['retweeted_status']['retweet_count'].label('cnt'))\
.filter(~Tweet.raw.has_key('retweeted_status'))\
.distinct(Tweet.raw['retweeted_status']['id_str']).order_by(Tweet.raw['retweeted_status']['id_str'].desc()).subquery()
But how to go from that to order by cnt?

It may not produce the exact query you have shown but should point you in the right direction: you can use your label 'cnt' in order_by, like: .order_by('cnt').
Moreover you can use your label as an argument for sqlalchemy.desc function. Summing up:
from sqlalchemy import desc
q = (
    session.query(
        Tweet.raw['retweeted_status'],
        Tweet.raw['retweeted_status']['retweet_count'].label('cnt')
    )
    # no "~" here: the SQL keeps rows where retweeted_status is present
    .filter(Tweet.raw.has_key('retweeted_status'))
    .distinct(
        Tweet.raw['retweeted_status']['id_str']
    )
    .order_by(desc('cnt'))
).subquery()
Additional hint: you can format your query nicely if you put it in parentheses.
You may want to read answers to a general question on python sqlalchemy label usage too.
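
To order by cnt in an outer query, as the original SQL does, one option is to turn the DISTINCT ON query into a subquery and order by its labelled column through the .c collection. PostgreSQL requires the DISTINCT ON expressions to match the leftmost ORDER BY expressions, so the inner query orders by id_str first. The sketch below only renders the statement with the PostgreSQL dialect; it assumes SQLAlchemy 1.4 or later, and the Tweet model is a hypothetical stand-in for the real one.

```python
from sqlalchemy import Column, Integer
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Tweet(Base):
    # Hypothetical model: only the jsonb column matters here
    __tablename__ = "tweet"
    id = Column(Integer, primary_key=True)
    raw = Column(JSONB)


session = Session()  # unbound session: we only render SQL
status = Tweet.raw["retweeted_status"]

inner = (
    session.query(
        status.label("status"),
        status["retweet_count"].label("cnt"),
    )
    # has_key without negation, to match the SQL's IS NOT NULL filter
    .filter(Tweet.raw.has_key("retweeted_status"))
    .distinct(status["id_str"])  # DISTINCT ON (...) on PostgreSQL
    .order_by(status["id_str"], status["retweet_count"].desc())
    .subquery("t")
)

# Outer query: select from the subquery and order by its cnt column
outer = session.query(inner.c.status, inner.c.cnt).order_by(inner.c.cnt.desc())

sql = str(outer.statement.compile(dialect=postgresql.dialect()))
print(sql)
```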

Selecting a Join in sqlalchemy gives too many rows

I am trying to build a compound SQL query that builds a table from a join I have previously performed. (Using SqlAlchemy (Core part) with python3 and Postgresql 9.4)
I include here the relevant part of my python3 code. I first create "in_uuid_set" using a select with a group_by. Then I join "in_uuid_set" with "in_off_messages" to get "jn_in".
Finally, I try to build a new table "incoming" from "jn_in" by selecting and generating the wanted columns:
in_uuid_set = \
    sa.select([in_off_messages.c.src_uuid.label('remote_uuid')])\
    .select_from(in_off_messages)\
    .where(in_off_messages.c.dst_uuid == local_uuid)\
    .group_by(in_off_messages.c.src_uuid)\
    .alias()
jn_in = in_uuid_set.join(in_off_messages,
        and_(
            in_off_messages.c.src_uuid == in_uuid_set.c.remote_uuid,
            in_off_messages.c.dst_uuid == local_uuid,
        ))\
    .alias()
incoming = sa.select([
        in_off_messages.c.msg_uuid.label('msg_uuid'),
        in_uuid_set.c.remote_uuid.label('remote_uuid'),
        in_off_messages.c.msg_type.label('msg_type'),
        in_off_messages.c.date_sent.label('date_sent'),
        in_off_messages.c.content.label('content'),
        in_off_messages.c.was_read.label('was_read'),
        true().label('is_incoming')]
    )\
    .select_from(jn_in)
Surprisingly, "incoming" has more rows than "jn_in": "incoming" has 12 rows, while "jn_in" has only 2. I expected "incoming" to have the same number of rows (2) as "jn_in".
I also include here the SQL output the SqlAlchemy generates for "incoming":
SELECT in_off_messages.msg_uuid AS msg_uuid,
anon_1.remote_uuid AS remote_uuid,
in_off_messages.msg_type AS msg_type,
in_off_messages.date_sent AS date_sent,
in_off_messages.content AS content,
in_off_messages.was_read AS was_read,
1 AS is_incoming
FROM in_off_messages,
(SELECT in_off_messages.src_uuid AS remote_uuid
FROM in_off_messages
WHERE in_off_messages.dst_uuid = :dst_uuid_1
GROUP BY in_off_messages.src_uuid) AS anon_1,
(SELECT anon_1.remote_uuid AS anon_1_remote_uuid,
in_off_messages.msg_uuid AS in_off_messages_msg_uuid,
in_off_messages.orig_src_uuid AS in_off_messages_orig_src_uuid,
in_off_messages.src_uuid AS in_off_messages_src_uuid,
in_off_messages.dst_uuid AS in_off_messages_dst_uuid,
in_off_messages.msg_type AS in_off_messages_msg_type,
in_off_messages.date_sent AS in_off_messages_date_sent,
in_off_messages.content AS in_off_messages_content,
in_off_messages.was_read AS in_off_messages_was_read
FROM (SELECT in_off_messages.src_uuid AS remote_uuid
FROM in_off_messages
WHERE in_off_messages.dst_uuid = :dst_uuid_1
GROUP BY in_off_messages.src_uuid) AS anon_1
JOIN in_off_messages
ON in_off_messages.src_uuid = anon_1.remote_uuid
AND in_off_messages.dst_uuid = :dst_uuid_2) AS anon_2
Something doesn't look right to me in this SQL output, mostly because GROUP BY appears too many times. I would have expected it to show up once, but it shows up twice here.
My guess is that some parentheses ended up in the wrong place in the generated SQL. I also suspect that I did something wrong with alias(), though I'm not sure about it.
What should I do to get the wanted result (Same amount of rows for "jn_in" and "incoming")?

After playing with the code for a while, I found a way to fix it.
The answer was eventually related to the alias().
In order to make this work, the second alias() (Of jn_in) should be omitted, like this:
in_uuid_set = \
    sa.select([in_off_messages.c.src_uuid.label('remote_uuid')])\
    .select_from(in_off_messages)\
    .where(in_off_messages.c.dst_uuid == local_uuid)\
    .group_by(in_off_messages.c.src_uuid)\
    .alias()
jn_in = in_uuid_set.join(in_off_messages,
        and_(
            in_off_messages.c.src_uuid == in_uuid_set.c.remote_uuid,
            in_off_messages.c.dst_uuid == local_uuid,
        ))
# <<< The alias() is gone >>>
incoming = sa.select([
        in_off_messages.c.msg_uuid.label('msg_uuid'),
        in_uuid_set.c.remote_uuid.label('remote_uuid'),
        in_off_messages.c.msg_type.label('msg_type'),
        in_off_messages.c.date_sent.label('date_sent'),
        in_off_messages.c.content.label('content'),
        in_off_messages.c.was_read.label('was_read'),
        true().label('is_incoming')]
    )\
    .select_from(jn_in)
It seems, however, that the first alias() (of in_uuid_set) cannot be omitted. If I try to omit it, I get this error message:
E subquery in FROM must have an alias
E LINE 2: FROM (SELECT in_off_messages.src_uuid AS remote_uuid
E ^
E HINT: For example, FROM (SELECT ...) [AS] foo.
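As a generalization: if you have a select that you want to use as a clause somewhere else, you probably want to alias() it, but if you have a join that you want to use as a clause, you should not alias() it. Aliasing a join wraps it in a new derived table (anon_2 in the question's SQL) whose columns have to be referenced through the alias's own .c collection. The select in the question still referenced in_off_messages.c and in_uuid_set.c directly, so SQLAlchemy added those two as separate FROM items next to the aliased join.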
For the sake of completeness, I include here the resulting SQL of the new code:
SELECT in_off_messages.msg_uuid AS msg_uuid,
anon_1.remote_uuid AS remote_uuid,
in_off_messages.msg_type AS msg_type,
in_off_messages.date_sent AS date_sent,
in_off_messages.content AS content,
in_off_messages.was_read AS was_read,
1 AS is_incoming
FROM (SELECT in_off_messages.src_uuid AS remote_uuid
FROM in_off_messages
WHERE in_off_messages.dst_uuid = :dst_uuid_1
GROUP BY in_off_messages.src_uuid) AS anon_1
JOIN in_off_messages
ON in_off_messages.src_uuid = anon_1.remote_uuid
AND in_off_messages.dst_uuid = :dst_uuid_2
Much shorter than the one in the question.
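
The row count in the question follows from the shape of the first generated statement: it lists in_off_messages, anon_1 and anon_2 as three separate FROM items with no condition linking them, which is a Cartesian product. A small sketch with the standard-library sqlite3 module reproduces the effect; the msgs table and its three rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, cut-down version of the messages table
conn.execute("CREATE TABLE msgs (msg_uuid TEXT, src_uuid TEXT, dst_uuid TEXT)")
conn.executemany("INSERT INTO msgs VALUES (?, ?, ?)", [
    ("m1", "remote", "local"),
    ("m2", "remote", "local"),
    ("m3", "local", "remote"),
])

# Distinct senders of messages addressed to 'local'
senders = (
    "(SELECT src_uuid AS remote_uuid FROM msgs "
    "WHERE dst_uuid = 'local' GROUP BY src_uuid)"
)

# Join used directly as the FROM clause (no alias around the join)
joined = conn.execute(f"""
    SELECT msgs.msg_uuid, s.remote_uuid
    FROM {senders} AS s
    JOIN msgs ON msgs.src_uuid = s.remote_uuid AND msgs.dst_uuid = 'local'
""").fetchall()

# Aliased join listed next to the table and subquery it was built from:
# three unrelated FROM items, so every combination of rows is returned
crossed = conn.execute(f"""
    SELECT msgs.msg_uuid, s.remote_uuid
    FROM msgs,
         {senders} AS s,
         (SELECT s2.remote_uuid, m2.msg_uuid
          FROM {senders} AS s2
          JOIN msgs AS m2
            ON m2.src_uuid = s2.remote_uuid AND m2.dst_uuid = 'local') AS j
""").fetchall()

print(len(joined), len(crossed))
```

Here the join alone yields 2 rows, while the three-way product yields 3 × 1 × 2 = 6.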
