I have a database table with tweets in a jsonb field.
I have a query to get the tweets ordered by the most retweeted; this is what it looks like:
SELECT * FROM (
    SELECT DISTINCT ON (raw->'retweeted_status'->'id_str')
        raw->'retweeted_status' AS status,
        raw->'retweeted_status'->'retweet_count' AS cnt
    FROM tweet
    WHERE (raw->'retweeted_status') IS NOT NULL
    ORDER BY raw->'retweeted_status'->'id_str', cnt DESC
) t
ORDER BY cnt DESC
I'm trying to create this query with SQLAlchemy; this is what I have so far:
session.query(Tweet.raw['retweeted_status'],
              Tweet.raw['retweeted_status']['retweet_count'].label('cnt'))\
    .filter(~Tweet.raw.has_key('retweeted_status'))\
    .distinct(Tweet.raw['retweeted_status']['id_str'])\
    .order_by(Tweet.raw['retweeted_status']['id_str'].desc())\
    .subquery()
But how do I go from that to ordering by cnt?
This may not produce the exact query you have shown, but it should point you in the right direction: you can use your label 'cnt' in order_by, e.g. .order_by('cnt').
You can also pass the label as an argument to the sqlalchemy.desc function. Summing up:
from sqlalchemy import desc
q = (
    session.query(
        Tweet.raw['retweeted_status'].label('status'),
        Tweet.raw['retweeted_status']['retweet_count'].label('cnt')
    )
    .filter(Tweet.raw.has_key('retweeted_status'))  # key present, matching the SQL's IS NOT NULL
    .distinct(
        Tweet.raw['retweeted_status']['id_str']
    )
    # DISTINCT ON requires its expression to lead the ORDER BY
    .order_by(Tweet.raw['retweeted_status']['id_str'], desc('cnt'))
).subquery()
Additional hint: you can format your query nicely if you put it in parentheses.
You may want to read answers to a general question on python sqlalchemy label usage too.
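To get from that subquery to the final ordering by cnt (the original question), one option is to select from the subquery and order the outer query by its cnt column. A short sketch, building on the q subquery above:

top_retweets = (
    session.query(q.c.status, q.c.cnt)
    .order_by(desc(q.c.cnt))
)

This mirrors the outer SELECT ... ORDER BY cnt DESC of the original SQL.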
Related
I have this SQL:
SELECT
    stock_id, consignment_id, SUM(qty), SUM(cost)
FROM
    warehouse_regсonsignmentproduct
WHERE
    product_id = '1'
GROUP BY
    stock_id, consignment_id
HAVING
    SUM(qty) > 0
I used the Django ORM to create this query:
(regСonsignmentProduct.objects
    .filter(product='1')
    .order_by('period')
    .values('stock', 'consignment')
    .annotate(total_qty=Sum('qty'), total_cost=Sum('cost'))
    .filter(total_qty__gt=0))
But my Django query returns an incorrect result; I think the problem is in annotate.
Thanks!
You need to order by the values you are grouping on; ordering by period adds that field to the GROUP BY and breaks the grouping. So:
regСonsignmentProduct.objects.filter(product='1').values(
    'stock', 'consignment'
).annotate(
    total_qty=Sum('qty'),
    total_cost=Sum('cost')
).order_by('stock', 'consignment').filter(total_qty__gt=0)
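Alternatively, clearing the ordering entirely also works: an empty .order_by() removes any default Meta.ordering as well, so the GROUP BY stays limited to the values() fields. A sketch, assuming the same model and imports as above:

from django.db.models import Sum

qs = (regСonsignmentProduct.objects
      .filter(product='1')
      .values('stock', 'consignment')
      .annotate(total_qty=Sum('qty'), total_cost=Sum('cost'))
      .order_by()  # clear any ordering that would widen the GROUP BY
      .filter(total_qty__gt=0))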
I've been wrestling with what should be a simple conversion of a straightforward SQL query into an SQLAlchemy expression, and I just cannot get the subquery to line up the way I intend. This is a single-table query of a Comments table; I want to find which users have made the most first comments:
SELECT user_id, count(*) AS count
FROM comments c
where c.date = (SELECT MIN(c2.date)
FROM comments c2
WHERE c2.post_id = c.post_id
)
GROUP BY user_id
ORDER BY count DESC
LIMIT 20;
I don't know how to write the subquery so that it refers to the outer query, and if I did, I wouldn't know how to assemble this into the outer query itself. (Using MySQL, which shouldn't matter.)
Well, after giving up for a while and then looking back at it, I came up with something that works. I'm sure there's a better way, but:
from sqlalchemy import desc, func, select
from sqlalchemy.orm import aliased

c2 = aliased(Comment)
firstdate = select([func.min(c2.date)]).\
    where(c2.post_id == Comment.post_id).\
    as_scalar()  # or scalar_subquery(), in SQLAlchemy 1.4+

users = session.query(
        Comment.user_id, func.count('*').label('count')).\
    filter(Comment.date == firstdate).\
    group_by(Comment.user_id).\
    order_by(desc('count')).\
    limit(20)
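A sketch of the same idea in newer SQLAlchemy (1.4+) style, with the correlation made explicit, might look like this, assuming the same Comment model and session:

from sqlalchemy import desc, func, select
from sqlalchemy.orm import aliased

c2 = aliased(Comment)

# Correlated scalar subquery: the earliest comment date for the outer row's post.
firstdate = (
    select(func.min(c2.date))
    .where(c2.post_id == Comment.post_id)
    .correlate(Comment)  # explicit here; SQLAlchemy usually infers the correlation
    .scalar_subquery()
)

users = (
    session.query(Comment.user_id, func.count().label('count'))
    .filter(Comment.date == firstdate)
    .group_by(Comment.user_id)
    .order_by(desc('count'))
    .limit(20)
)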
I am trying to query a list of unique field values, along with the count for each unique value, using the peewee ORM. I can get what I want easily from MySQL Workbench, but I can't seem to get a similar result out of peewee. The working MySQL query looks like this:
select Title, Severity, count(*) from qmodel group by Title;
I have tried a few variations in peewee, but nothing has worked. This is about as close as I have gotten:
from peewee import fn
from application.database.models import qmodel as q

_field_select_list = [
    q.Title,
    q.Severity,
    fn.COUNT(q.Title),
]

for record in q.select(*_field_select_list).group_by(q.Title):
    print record
This returns the count, but it replaces the Title field in the result with the count, so there is no title (for example {'Severity': '3', 'Title': '25'}).
I also made my field select look like this:
_field_select_list = [
    q.Title,
    q.Severity,
    fn.COUNT(SQL('*')),
]
But that just gives me a grouped list, no count. I have tried many other combinations with no luck.
You'll need to do something like this:
query = (QModel
         .select(QModel.title, QModel.severity, fn.COUNT(QModel.id).alias('ct'))
         .group_by(QModel.title, QModel.severity))

for obj in query:
    print obj.title, obj.severity, obj.ct
Note that in most databases you need to group by every non-aggregate column you select.
Have you tried just leaving the COUNT function empty? It works fine for me on my data.
query = q.select(q.Title, q.Severity, fn.COUNT()).group_by(q.Title)
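Either way, giving the count an alias (the name ct here is just illustrative) makes it easy to read back from the result rows, for example via dicts():

from peewee import SQL, fn

query = (q
         .select(q.Title, q.Severity, fn.COUNT(SQL('*')).alias('ct'))
         .group_by(q.Title, q.Severity))

for row in query.dicts():
    print row['Title'], row['Severity'], row['ct']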
This is my query:
SELECT kategoriharga, ongkoskirim, diskon, ratingproduk, ratingtoko, label
FROM (
    SELECT *
    FROM pohonkeputusan
    WHERE perdaerah='Kabupaten Toba Samosir'
    ORDER BY label DESC
) AS sub
GROUP BY kategoriharga, ongkoskirim, diskon, ratingproduk, ratingtoko
How do I write this as a queryset in Django?
I don't understand why you want to group by all the fields. Try using distinct:
(Pohonkeputusan.objects
    .filter(perdaerah='Kabupaten Toba Samosir')
    .order_by('-label')
    .values_list('kategoriharga', 'ongkoskirim', 'diskon', 'ratingproduk', 'ratingtoko')
    .distinct())
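One caveat: label appears in order_by() but not in values_list(), and Django includes ordering fields in the underlying SELECT, which can make the DISTINCT less effective than expected. If the intent is to keep a single label per combination (the ORDER BY label DESC in the subquery suggests the highest one), a hedged sketch using annotate could look like this, where max_label is an illustrative name:

from django.db.models import Max

qs = (Pohonkeputusan.objects
      .filter(perdaerah='Kabupaten Toba Samosir')
      .values('kategoriharga', 'ongkoskirim', 'diskon', 'ratingproduk', 'ratingtoko')
      .annotate(max_label=Max('label'))  # one row per combination, keeping the highest label
      .order_by())  # clear ordering so the GROUP BY stays on the values() fields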
I need a little help.
I have the following query, and I'm curious how to represent it in terms of sqlalchemy.orm. Currently I'm executing it with session.execute. It's not critical for me, but I'm just curious. The thing I actually don't know is how to put a subquery in the FROM clause (a nested view) without doing any join.
select g_o.group_ from (
select distinct regexp_split_to_table(g.group_name, E',') group_
from (
select array_to_string(groups, ',') group_name
from company
where status='active'
and array_to_string(groups, ',') like :term
limit :limit
) g
) g_o
where g_o.group_ like :term
order by 1
limit :limit
I need this subquery because of a speed issue: without the limit in the innermost query, regexp_split_to_table parses all the data and the limit is applied only afterwards. My table is huge and I cannot afford that.
If something is not clear, please ask; I'll do my best.
I presume this is PostgreSQL.
To create a subquery, use the subquery() method. The resulting object can be used as if it were a Table object. Here's how your query would look in SQLAlchemy:
from sqlalchemy import func

subq1 = session.query(
    func.array_to_string(Company.groups, ',').label('group_name')
).filter(
    (Company.status == 'active') &
    (func.array_to_string(Company.groups, ',').like(term))
).limit(limit).subquery()
subq2 = session.query(
    func.regexp_split_to_table(subq1.c.group_name, ',')
    .distinct()
    .label('group')
).subquery()

q = session.query(subq2.c.group).\
    filter(subq2.c.group.like(term)).\
    order_by(subq2.c.group).\
    limit(limit)
However, you could avoid one subquery by using the unnest function instead of converting the array to a string with array_to_string and then splitting it with regexp_split_to_table:
subq = session.query(
    func.unnest(Company.groups).label('group')
).filter(
    (Company.status == 'active') &
    (func.array_to_string(Company.groups, ',').like(term))
).limit(limit).subquery()

q = session.query(subq.c.group.distinct()).\
    filter(subq.c.group.like(term)).\
    order_by(subq.c.group).\
    limit(limit)
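For completeness, a small usage sketch; the pattern and limit values below are just hypothetical examples:

term = '%dev%'   # SQL LIKE pattern
limit = 20

for (group,) in q:
    print(group)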