Let's say I have these SQLAlchemy ORM classes:
class Session(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    user_agent = db.Column(db.Text, nullable=False)

class Run(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    session_id = db.Column(db.Integer, db.ForeignKey('session.id'))
    session = db.relationship('Session', backref=db.backref('runs', lazy='dynamic'))
And I want to query for essentially the following:
((session.id, session.user_agent, session.runs.count())
for session in Session.query.order_by(Session.id.desc()))
However, this clearly issues 1+n queries, which is terrible. What is the correct way to do this with a single query? In plain SQL, I would write something along the lines of:
SELECT session.id, session.user_agent, COUNT(run.id) FROM session
LEFT JOIN run ON session.id = run.session_id
GROUP BY session.id ORDER BY session.id DESC
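Construct a subquery that groups and counts session ids from runs, then outer-join to it in your final query so that sessions without any runs are kept (with a count of 0):
from sqlalchemy import func
sq = db.session.query(Run.session_id, func.count(Run.session_id).label('run_count')).group_by(Run.session_id).subquery()
result = db.session.query(Session, func.coalesce(sq.c.run_count, 0)).outerjoin(sq, sq.c.session_id == Session.id).order_by(Session.id.desc()).all()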
The targeted SQL could be produced simply with:
db.session.query(Session, func.count(Run.id)).\
    outerjoin(Run).\
    group_by(Session.id).\
    order_by(Session.id.desc())
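For anyone who wants to try the outer-join version outside of Flask, here is a minimal sketch against an in-memory SQLite database. It assumes plain SQLAlchemy 1.4 or later rather than Flask-SQLAlchemy, so the explicit __tablename__ values, the OrmSession alias, and the sample rows are my own additions and not part of the question.

```python
from sqlalchemy import Column, ForeignKey, Integer, Text, create_engine, func, select
from sqlalchemy.orm import Session as OrmSession  # aliased: the model is also called Session
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Session(Base):
    __tablename__ = "session"  # assumed; Flask-SQLAlchemy would derive this name
    id = Column(Integer, primary_key=True)
    user_agent = Column(Text, nullable=False)


class Run(Base):
    __tablename__ = "run"  # assumed table name
    id = Column(Integer, primary_key=True)
    session_id = Column(Integer, ForeignKey("session.id"))


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)

# One statement: the LEFT OUTER JOIN keeps sessions that have no runs (count 0).
stmt = (
    select(Session.id, Session.user_agent, func.count(Run.id))
    .select_from(Session)
    .outerjoin(Run, Run.session_id == Session.id)
    .group_by(Session.id)
    .order_by(Session.id.desc())
)

with OrmSession(engine) as db_session:
    # Made-up sample data: session 1 has two runs, session 2 has none.
    db_session.add_all([
        Session(id=1, user_agent="firefox"),
        Session(id=2, user_agent="chrome"),
        Run(session_id=1),
        Run(session_id=1),
    ])
    db_session.commit()
    rows = [tuple(row) for row in db_session.execute(stmt)]

print(rows)
```

Because the count is taken over Run.id and not over *, the session without runs reports 0 instead of 1.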
For many-to-many relationships, an association table can be mapped using the Association Object pattern.
I have the following setup of two classes with an M2M relationship through the UserCouncil association table.
class Users(Base):
    name = Column(String, nullable=False)
    email = Column(String, nullable=False, unique=True)
    created_at = Column(DateTime, default=datetime.utcnow)
    password = Column(String, nullable=False)
    salt = Column(String, nullable=False)
    councils = relationship('UserCouncil', back_populates='user')

class Councils(Base):
    name = Column(String, nullable=False)
    created_at = Column(DateTime, default=datetime.utcnow)
    users = relationship('UserCouncil', back_populates='council')

class UserCouncil(Base):
    user_id = Column(UUIDType, ForeignKey(Users.id, ondelete='CASCADE'), primary_key=True)
    council_id = Column(UUIDType, ForeignKey(Councils.id, ondelete='CASCADE'), primary_key=True)
    role = Column(Integer, nullable=False)
    user = relationship('Users', back_populates='councils')
    council = relationship('Councils', back_populates='users')
However, in this situation, suppose I want to search for a council with a specific name cname for a given user user1. I can do the following:
# user1.councils holds UserCouncil association objects, so go through .council
for uc in user1.councils:
    if uc.council.name == cname:
        dosomething(uc.council)
Or, alternatively, this:
session.query(UserCouncil) \
    .join(Councils) \
    .filter((UserCouncil.user_id == user1.id) & (Councils.name == cname)) \
    .first() \
    .council
While the second one is more similar to raw SQL queries and performs better, the first one is simpler. Is there any other, more idiomatic way of expressing this query which is better performing while also utilizing the relationship linkages instead of explicitly writing traditional joins?
First, I think even the query you give as an example may have to go back to the database to load the UserCouncil.council relationship if it is not already loaded in memory.
Since you want to find the Council instance directly from its .name and the User instance, that is exactly what you should query for. Below is such a query, with two options for filtering on user_id (you might be more familiar with the second option, so feel free to use it):
q = (
    select(Councils)
    .filter(Councils.name == councils_name)
    .filter(Councils.users.any(UserCouncil.user_id == user_id))  # v1: this does not require JOIN, but produces the same result as below
    # .join(UserCouncil).filter(UserCouncil.user_id == user_id)  # v2: join, very similar to original SQL
)
council = session.execute(q).scalars().first()
As for making it simpler and more idiomatic, I can only suggest wrapping it in a method or property on the User instance:
from sqlalchemy import select
from sqlalchemy.orm import object_session, with_parent

class Users(...):
    ...
    def get_council_by_name(self, councils_name):
        # Look up this user's council with the given name in a single query.
        q = (
            select(Councils)
            .filter(Councils.name == councils_name)
            .join(UserCouncil).filter(with_parent(self, Users.councils))
        )
        return object_session(self).execute(q).scalars().first()
so that you can later call it as user.get_council_by_name('xxx').
Edit-1: added SQL queries
v1 of the first q query above will generate the following SQL:
SELECT councils.id,
       councils.name
FROM councils
WHERE councils.name = :name_1
  AND (EXISTS
       (SELECT 1
        FROM user_councils
        WHERE councils.id = user_councils.council_id
          AND user_councils.user_id = :user_id_1
       )
  )
while the v2 option will generate:
SELECT councils.id,
       councils.name
FROM councils
JOIN user_councils ON councils.id = user_councils.council_id
WHERE councils.name = :name_1
  AND user_councils.user_id = :user_id_1
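To make the EXISTS variant easy to experiment with, here is a self-contained sketch on in-memory SQLite. Several things in it are assumptions of mine and not taken from the question: integer primary keys instead of UUIDType, the table names users/councils/user_councils (chosen to match the SQL above), the trimmed-down columns, the sample rows, and the hypothetical find_council helper.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, select
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Users(Base):
    __tablename__ = "users"  # assumed table name
    id = Column(Integer, primary_key=True)  # assumed integer key (question uses UUIDType)
    name = Column(String, nullable=False)
    councils = relationship("UserCouncil", back_populates="user")


class Councils(Base):
    __tablename__ = "councils"  # assumed table name
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    users = relationship("UserCouncil", back_populates="council")


class UserCouncil(Base):
    __tablename__ = "user_councils"  # assumed table name
    user_id = Column(Integer, ForeignKey("users.id"), primary_key=True)
    council_id = Column(Integer, ForeignKey("councils.id"), primary_key=True)
    role = Column(Integer, nullable=False)
    user = relationship("Users", back_populates="councils")
    council = relationship("Councils", back_populates="users")


def find_council(session, user_id, council_name):
    """Hypothetical helper: the v1 (EXISTS) form of the query."""
    stmt = (
        select(Councils)
        .where(Councils.name == council_name)
        .where(Councils.users.any(UserCouncil.user_id == user_id))
    )
    return session.execute(stmt).scalars().first()


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)

with Session(engine) as session:
    alice = Users(id=1, name="alice")
    bob = Users(id=2, name="bob")
    budget = Councils(id=1, name="budget")
    parks = Councils(id=2, name="parks")
    session.add_all([
        alice, bob, budget, parks,
        UserCouncil(user=alice, council=budget, role=1),
        UserCouncil(user=bob, council=parks, role=1),
    ])
    session.commit()

    found = find_council(session, 1, "budget")
    found_name = found.name if found is not None else None
    missing = find_council(session, 1, "parks")  # alice is not on this council

print(found_name, missing)
```

Swapping the second where() for .join(UserCouncil).where(UserCouncil.user_id == user_id) should give the v2 form with the same result.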
I need to increment a session_number column per user_name. If a user has created a session before, the new row should continue from that user's last session_number, but if the user is creating a session for the first time, session_number should start from 1. I have tried to illustrate this below. Right now I handle this in application logic, but I am trying to find a more elegant way to do it in SQLAlchemy.
id   user_name   session_number
1    user_1      1
2    user_1      2
3    user_2      1
4    user_1      3
5    user_2      2
Here is the Python code for the table. My database is PostgreSQL and I'm using Alembic to upgrade tables. Right now session_number keeps incrementing regardless of user_name.
class UserSessions(db.Model):
    __tablename__ = 'user_sessions'
    id = db.Column(db.Integer, primary_key=True, unique=True)
    username = db.Column(db.String, nullable=False)
    session_number = db.Column(db.Integer, Sequence('session_number_seq', start=0, increment=1))
    created_at = db.Column(db.DateTime)
    last_edit = db.Column(db.DateTime)
    __table_args__ = (
        db.UniqueConstraint('username', 'session_number', name='_username_session_number_idx_'),
    )
I've searched the internet for this situation, but what I found did not match my problem. Is it possible to achieve this with SQLAlchemy/PostgreSQL features?
First, I do not know of any "pure" solution for this situation using SQLAlchemy, PostgreSQL, or a combination of the two.
Although it might not be exactly the solution you are looking for, I hope it will give you some ideas.
If you wanted to calculate the session_number for the whole table without storing it, I would use the following query or a variation thereof:
def get_user_sessions_with_rank():
    expr = (
        db.func.rank()
        .over(partition_by=UserSessions.username, order_by=[UserSessions.id])
        .label("session_number")
    )
    subq = db.session.query(UserSessions.id, expr).subquery("subq")
    q = (
        db.session.query(UserSessions, subq.c.session_number)
        .join(subq, UserSessions.id == subq.c.id)
        .order_by(UserSessions.id)
    )
    return q.all()
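To check the window-function idea against the sample table from the question, here is a small sketch using plain SQLAlchemy 1.4+ on in-memory SQLite. It assumes an SQLite build with window-function support (3.25 or newer) and a cut-down model with only id and username; the explicit ids are there to reproduce the rows from the question.

```python
from sqlalchemy import Column, Integer, String, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class UserSessions(Base):
    __tablename__ = "user_sessions"
    id = Column(Integer, primary_key=True)
    username = Column(String, nullable=False)


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)

# rank() restarts at 1 for every username and follows insertion order via id.
session_number = (
    func.rank()
    .over(partition_by=UserSessions.username, order_by=UserSessions.id)
    .label("session_number")
)
stmt = (
    select(UserSessions.id, UserSessions.username, session_number)
    .order_by(UserSessions.id)
)

with Session(engine) as session:
    # Same five rows as the table in the question.
    names = ["user_1", "user_1", "user_2", "user_1", "user_2"]
    session.add_all(
        [UserSessions(id=i, username=name) for i, name in enumerate(names, start=1)]
    )
    session.commit()
    numbered = [tuple(row) for row in session.execute(stmt)]

for row in numbered:
    print(row)
```

Since id is unique, rank() and row_number() give the same numbering here; the sketch only computes the number at query time and does not store it.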
Alternatively, I would actually add a column_property to the model to compute it on the fly for each instance of UserSessions. It is not as efficient to calculate, but for queries filtering by a specific user it should be good enough:
class UserSessions(db.Model):
    __tablename__ = "user_sessions"
    id = db.Column(db.Integer, primary_key=True, unique=True)
    username = db.Column(db.String, nullable=False)
    created_at = db.Column(db.DateTime)
    last_edit = db.Column(db.DateTime)

# must define this outside of the model definition because of need for aliased
US2 = db.aliased(UserSessions)
UserSessions.session_number = db.column_property(
    db.select(db.func.count(US2.id))
    .where(US2.username == UserSessions.username)
    .where(US2.id <= UserSessions.id)
    .scalar_subquery()
)
In this case, when you query for UserSessions, the session_number will be fetched from the database, while being None for newly created instances.
I have a situation where I am trying to count the number of rows in a table where a column value appears in a subquery. For example, let's say that I have some SQL like so:
select count(*) from table1
where column1 in (select column2 from table2);
I have my tables defined like so:
class table1(Base):
    __tablename__ = "table1"
    __table_args__ = {'schema': 'myschema'}
    acct_id = Column(DECIMAL(precision=15), primary_key=True)

class table2(Base):
    __tablename__ = "table2"
    __table_args__ = {'schema': 'myschema'}
    ban = Column(String(length=128), primary_key=True)
The tables are reflected from the database so there are other attributes present that aren't explicitly specified in the class definition.
I can try to write my query but here is where I am getting stuck...
qry=self.session.query(func.?(...)) # what to put here?
res = qry.one()
I tried looking through the documentation here but I don't see any comparable implementation to the 'in' keyword which is a feature of many SQL dialects.
I am using Teradata as my backend if that matters.
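Use in_() with a subquery (the column names below are the ones from the question's models):
sub_stmt = session.query(table2.ban)
stmt = session.query(table1).filter(table1.acct_id.in_(sub_stmt))
count = stmt.count()  # or stmt.all() to fetch the matching rows instead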
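If you want the database to return only the number, in the shape of the original SELECT count(*) ... WHERE ... IN (...), a sketch along these lines should work with SQLAlchemy 1.4+. It runs on in-memory SQLite and not on Teradata, drops the schema argument, and uses Integer for both columns so the values are directly comparable; those simplifications, the class names, and the sample data are assumptions of mine.

```python
from sqlalchemy import Column, Integer, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Table1(Base):
    __tablename__ = "table1"
    acct_id = Column(Integer, primary_key=True)  # assumed Integer (question uses DECIMAL)


class Table2(Base):
    __tablename__ = "table2"
    ban = Column(Integer, primary_key=True)  # assumed Integer (question uses String)


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)

# SELECT count(*) FROM table1 WHERE table1.acct_id IN (SELECT table2.ban FROM table2)
stmt = (
    select(func.count())
    .select_from(Table1)
    .where(Table1.acct_id.in_(select(Table2.ban)))
)

with Session(engine) as session:
    # Made-up data: acct_id 2 and 3 also appear as ban values.
    session.add_all([Table1(acct_id=n) for n in (1, 2, 3)])
    session.add_all([Table2(ban=n) for n in (2, 3, 4)])
    session.commit()
    matching = session.execute(stmt).scalar_one()

print(matching)
```

Here in_() is simply the column method that renders the SQL IN operator, and passing it a select() produces the subquery form.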
I am creating a website using Flask and SQLAlchemy. This website keeps track of classes that a student has taken. I would like to find a way to search my database using SQLAlchemy to find all unique classes that have been entered. Here is code from my models.py for Class:
class Class(db.Model):
    __tablename__ = 'classes'
    id = db.Column(db.Integer, primary_key=True)
    title = db.Column(db.String(100))
    body = db.Column(db.Text)
    created = db.Column(db.DateTime, default=datetime.datetime.now)
    user_email = db.Column(db.String(100), db.ForeignKey(User.email))
    user = db.relationship(User)
In other words, I would like to get all unique values from the title column and pass that to my views.py.
Using the model query structure you could do this
Class.query.with_entities(Class.title).distinct()
query = session.query(Class.title.distinct().label("title"))
titles = [row.title for row in query.all()]
titles = [r.title for r in session.query(Class.title).distinct()]
As @van has pointed out, what you are looking for is:
session.query(your_table.column1.distinct()).all()  # SELECT DISTINCT(column1) FROM your_table
but I will add that in most cases you will also want to filter the results, in which case you can do
session.query(your_table.column1.distinct()).filter_by(column2='some_column2_value').all()
which translates to the SQL
SELECT DISTINCT(column1) FROM your_table WHERE column2 = 'some_column2_value';
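For completeness, here is how the distinct-titles query might look in the 2.0-style select() API, as a sketch on in-memory SQLite with plain SQLAlchemy 1.4+. The Class model is cut down to id and title, and the sample titles are invented for the demonstration.

```python
from sqlalchemy import Column, Integer, String, create_engine, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Class(Base):
    __tablename__ = "classes"
    id = Column(Integer, primary_key=True)
    title = Column(String(100))


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)

# SELECT DISTINCT classes.title FROM classes ORDER BY classes.title
stmt = select(Class.title).distinct().order_by(Class.title)

with Session(engine) as session:
    # Made-up data with one duplicated title.
    session.add_all([
        Class(title="Algebra"),
        Class(title="Biology"),
        Class(title="Algebra"),
    ])
    session.commit()
    # scalars() unwraps the single-column rows into plain strings.
    titles = session.execute(stmt).scalars().all()

print(titles)
```

The resulting list of strings can be passed straight to a view or template.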
I am developing a simple blog with tagging support. I would like to add tag cloud functionality, and for that I need to get the usage count of each tag in the blog.
My Blog and Tag models look like this:
class Blog(db.Model, ObservableModel):
    __tablename__ = "blogs"
    id = db.Column(db.Integer, db.Sequence('blog_id_seq'), primary_key=True)
    title = db.Column(db.String(200), unique=True, nullable=True)
    tags = relationship('Tag', secondary=tags_to_blogs_association_table)

class Post(db.Model, ObservableModel):
    __tablename__ = "posts"
    ......................
    blog = relationship('Blog', backref = db.backref('blogs', lazy='dynamic'))
    tags = relationship('Tag', secondary=tags_to_posts_association_table)

class Tag(db.Model):
    __tablename__ = "tags"
    id = db.Column(db.Integer, db.Sequence('post_id_seq'), primary_key=True)
    title = db.Column(db.String(30), unique=False, nullable=True)
I want to collect a dictionary of tag_name : count pairs, and the only way I can see is to iterate over the Blog.tags collection, retrieving the posts that contain each tag.
I am not sure that this is the best solution from a performance point of view; maybe Flask-SQLAlchemy provides a join function?
Question: how do I implement a query like the following in Python using Flask-SQLAlchemy?
select
t.id,
t.title,
count(post_id)
from tags t
join tags_to_blogs b on t.id=b.tag_id
join tags_to_posts p on t.id=p.tag_id
group by (t.id)
having b.blog_id=1
Try this:
query = db.session.query(Tag, db.func.count(Post.id))
query = query.filter(
    (tags_to_posts_association_table.c.tag_id == Tag.id) &
    (tags_to_posts_association_table.c.post_id == Post.id)
)
query = query.group_by(Tag.id)
This generates this query:
SELECT tags.id AS tags_id, tags.title AS tags_title, count(posts.id) AS count_1
FROM tags, posts, tags_to_posts
WHERE tags_to_posts.tag_id = tags.id AND tags_to_posts.post_id = posts.id GROUP BY tags.id
A cleaner way could be something like this:
query = db.session.query(Tag, db.func.count(Post.id))
# This works but the preferred way is what's below it
#query = query.join(tags_to_posts_association_table, Post)
query = query.join(Post.tags)
query = query.group_by(Tag.id)
This generates this query:
SELECT tags.id AS tags_id, tags.title AS tags_title, count(posts.id) AS count_1
FROM tags INNER JOIN tags_to_posts ON tags.id = tags_to_posts.tag_id INNER JOIN posts ON posts.id = tags_to_posts.post_id GROUP BY tags.id
All these produce the same result, and you can chain the calls like this:
query = db.session.query(Tag.title, db.func.count(Post.id)).join(Post.tags).group_by(Tag.id)
# Iterating over the query yields (title, count) rows, so dict() turns them into
# a dictionary with the tag titles as keys and the count of each as values.
# Or you can use query.all() and process the list as you prefer.
results = dict(query)
Also, it should be db.func.count rather than db.count. In any case you can always from sqlalchemy import func and use func.count.
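Here is a runnable sketch of the grouped count, restricted to the posts of one blog, using plain SQLAlchemy 1.4+ on in-memory SQLite. The Post.blog_id column is hypothetical (the question elides the Post columns), the association table is shortened to tags_to_posts, and the sample posts and tags are invented.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Table, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

# Association table between tags and posts (name shortened for the sketch).
tags_to_posts = Table(
    "tags_to_posts",
    Base.metadata,
    Column("tag_id", Integer, ForeignKey("tags.id"), primary_key=True),
    Column("post_id", Integer, ForeignKey("posts.id"), primary_key=True),
)


class Tag(Base):
    __tablename__ = "tags"
    id = Column(Integer, primary_key=True)
    title = Column(String(30))


class Post(Base):
    __tablename__ = "posts"
    id = Column(Integer, primary_key=True)
    blog_id = Column(Integer)  # hypothetical column, not shown in the question
    tags = relationship("Tag", secondary=tags_to_posts)


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)

# Count posts per tag with explicit joins through the association table.
stmt = (
    select(Tag.title, func.count(Post.id))
    .select_from(Tag)
    .join(tags_to_posts, tags_to_posts.c.tag_id == Tag.id)
    .join(Post, Post.id == tags_to_posts.c.post_id)
    .where(Post.blog_id == 1)
    .group_by(Tag.id)
)

with Session(engine) as session:
    python_tag, sql_tag = Tag(title="python"), Tag(title="sql")
    session.add_all([
        Post(blog_id=1, tags=[python_tag, sql_tag]),
        Post(blog_id=1, tags=[python_tag]),
        Post(blog_id=2, tags=[sql_tag]),  # other blog, filtered out
    ])
    session.commit()
    counts = {title: n for title, n in session.execute(stmt)}

print(counts)
```

Note that the blog restriction goes into WHERE and not HAVING, because it filters rows before grouping; because the joins are inner joins, tags that are unused in the selected blog do not appear in the dictionary at all.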
I would do it this way (a plain loop that issues one count query per tag, so expect it to be slower than a single grouped query when there are many tags):
myDict = {}
for tag in Tag.query.all():
    myDict[tag.title] = Post.query.filter(Post.tags.contains(tag)).count()
At the end you should have all tags in myDict with their counts.