How to aggregate distinct values in a joined table in SQLAlchemy? - python

Here is the schema:
post_tag = Table("post_tag", Base.metadata,
    Column("post_id", Integer, ForeignKey("post.id")),
    Column("tag_id ", Integer, ForeignKey("tag.id")))
class Post(Base):
    id = Column(Integer, primary_key=True)
    tags = relationship("Tag", secondary=post_tag, backref="post", cascade="all")
    collection_id = Column(Integer, ForeignKey("collection.id"))
class Tag(Base):
    id = Column(Integer, primary_key=True)
    description = Column("description", UnicodeText, nullable=False, default="")
    post_id = Column(Integer, ForeignKey("post.id"))
class Collection(Base):
    id = Column(Integer, primary_key=True)
    title = Column(Unicode(128), nullable=False)
    posts = relationship("Post", backref="collection", cascade="all,delete-orphan")
    tags = column_property(select([Tag])
        .where(and_(Post.collection_id == id, Tag.post_id == Post.id))
        .correlate_except(Tag))
Basically, Post to Tag is many-to-many and Collection to Post is one-to-many.
I want Collection.tags to return the distinct set of tags of the posts in the collection.
However, I get the following error when I access Collection.tags:
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) only a single result allowed for a SELECT that is part of an expression
EDIT
The SQL it generates:
SELECT (SELECT tag.id, tag.description, tag.post_id
FROM tag, post
WHERE post.collection_id = collection.id AND tag.post_id = post.id) AS anon_1, collection.id AS collection_id, collection.title AS collection_title
FROM collection
WHERE collection.id = 1
I believe that post_id = Column(Integer, ForeignKey("post.id")) is wrong as post_id is in post_tag. However, if I change it to post_tag.post_id, it throws AttributeError: 'Table' object has no attribute 'post_id'
EDIT2
I changed it to
tags = column_property(select([Tag])
    .where(and_(Post.collection_id == id, post_tag.c.post_id == Post.id,
                post_tag.c.tag_id == Tag.id)))
This query works when I run it by hand:
SELECT tag.id, tag.description, tag.category_id, tag.post_id
FROM tag, post, post_tag
WHERE post.collection_id = 1 AND post_tag.post_id = post.id AND post_tag.tag_id = tag.id
but the query generated by SQLAlchemy does not:
SELECT (SELECT tag.id, tag.description, tag.category_id, tag.post_id
FROM tag, post, post_tag
WHERE post.collection_id = collection.id AND post_tag.post_id = post.id AND post_tag.tag_id = tag.id) AS anon_1
FROM collection
WHERE collection.id = 1

Instead of a column_property() you need a relationship() with a composite "secondary". A column property is handy for mapping some (scalar) SQL expression as a "column" that is loaded along with the other attributes, which is why a SELECT returning several Tag columns fails there. You, on the other hand, seem to want to map a collection of related Tag objects:
class Collection(Base):
    id = Column(Integer, primary_key=True)
    title = Column(Unicode(128), nullable=False)
    posts = relationship("Post", backref="collection", cascade="all,delete-orphan")
    tags = relationship(
        "Tag", viewonly=True,
        primaryjoin="Collection.id == Post.collection_id",
        secondary="join(Post, post_tag)",
        secondaryjoin="Tag.id == post_tag.c.tag_id")
If you want to eager load the relationship, a bit like the column property would have, you could default to lazy="joined". It's also possible to define the eager load strategy on a per-query basis using Query.options():
session.query(Collection).\
    options(joinedload(Collection.tags)).\
    all()
Please note that your example has a typo(?) in the definition of the secondary table post_tag: the column name "tag_id " has trailing whitespace.
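To see why the column_property() version fails while a join through post and post_tag returns the tags, both SQL shapes can be run directly against SQLite. The following is only a sketch using the stdlib sqlite3 module: the sample rows and the helper name distinct_tags are invented for illustration, and the second query approximates the join that the relationship is meant to express rather than the exact SQL SQLAlchemy emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE collection (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE post (id INTEGER PRIMARY KEY,
                       collection_id INTEGER REFERENCES collection(id));
    CREATE TABLE tag (id INTEGER PRIMARY KEY, description TEXT NOT NULL);
    CREATE TABLE post_tag (post_id INTEGER REFERENCES post(id),
                           tag_id INTEGER REFERENCES tag(id));
    INSERT INTO collection VALUES (1, 'first');
    INSERT INTO post VALUES (1, 1), (2, 1);
    INSERT INTO tag VALUES (1, 'python'), (2, 'sql');
    -- Tag 1 is attached to both posts, so a plain join returns it twice.
    INSERT INTO post_tag VALUES (1, 1), (2, 1), (2, 2);
""")

# Shape of the failing query: a multi-column SELECT used as one scalar column.
try:
    conn.execute("""
        SELECT (SELECT tag.id, tag.description FROM tag) AS anon_1
        FROM collection
    """).fetchall()
    scalar_subquery_failed = False
except sqlite3.OperationalError:
    scalar_subquery_failed = True


def distinct_tags(collection_id):
    """Return the distinct tags of all posts in one collection."""
    return conn.execute("""
        SELECT DISTINCT tag.id, tag.description
        FROM tag
        JOIN post_tag ON post_tag.tag_id = tag.id
        JOIN post ON post.id = post_tag.post_id
        WHERE post.collection_id = ?
        ORDER BY tag.id
    """, (collection_id,)).fetchall()


print(scalar_subquery_failed)
print(distinct_tags(1))
```

The exact wording of the SQLite error differs between SQLite versions, so the sketch only checks that an OperationalError is raised.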

Related

AttributeError: 'Query' object has no attribute 'is_clause_element' when joining table with query

I have a query that counts the number of keywords each company has and sorts the companies by that count.
query_company_ids = Session.query(
    enjordplatformCompanyToKeywords.company_id.label("company_id"),
    func.count(enjordplatformCompanyToKeywords.keyword_id)
).group_by(
    enjordplatformCompanyToKeywords.company_id
).order_by(
    desc(func.count(enjordplatformCompanyToKeywords.keyword_id))
).limit(20)
I then want to get information about these companies (image, title, info, etc.) and send it to the frontend (this is done later by looping through companies_query).
However, I am having trouble connecting the query_company_ids query to the enjordplatformCompanies table.
I have tried two ways of doing this:
companies_query = Session.query(enjordplatformCompanies, query_company_ids).filter(enjordplatformCompanies.id == query_company_ids.company_id).all()
companies_query = Session.query(enjordplatformCompanies, query_company_ids).join( query_company_ids, query_company_ids.c.company_id == enjordplatformCompanies.id).all()
But both of them result in the error: AttributeError: 'Query' object has no attribute 'is_clause_element'
Question
How can I join the query_company_ids query and enjordplatformCompanies table?
Thanks
Here are the table definitions
class enjordplatformCompanies(Base):
    __tablename__ = "enjordplatform_companies"
    id = Column(Integer, primary_key=True, unique=True)
    name = Column(String)
    about = Column(String)
    image = Column(String)
    website = Column(String)
    week_added = Column(Integer)
    year_added = Column(Integer)
    datetime_added = Column(DateTime)
    created_by_userid = Column(Integer)
    company_type = Column(String)
    contact_email = Column(String)
    adress = Column(String)
    city_code = Column(String)
    city = Column(String)
class enjordplatformCompanyToKeywords(Base):
    __tablename__ = "enjordplatform_company_to_keywords"
    id = Column(Integer, primary_key=True, unique=True)
    company_id = Column(Integer, ForeignKey("enjordplatform_companies.id"))
    keyword_id = Column(Integer, ForeignKey("enjordplatform_keywords.id"))
I copied your example query above and was getting a lot of weird errors until I realized you use Session instead of session. I guess you should make sure you are calling query() on a session instance rather than on the Session class or a sessionmaker.
Below I create an explicit subquery() to get the company id paired with its keyword count and then I join the companies class against that, applying the order and limit to the final query.
with Session(engine) as session, session.begin():
    subq = session.query(
        enjordplatformCompanyToKeywords.company_id,
        func.count(enjordplatformCompanyToKeywords.keyword_id).label('keyword_count')
    ).group_by(
        enjordplatformCompanyToKeywords.company_id
    ).subquery()
    q = session.query(
        enjordplatformCompanies,
        subq.c.keyword_count
    ).join(
        subq,
        enjordplatformCompanies.id == subq.c.company_id
    ).order_by(
        desc(subq.c.keyword_count)
    )
    for company, keyword_count in q.limit(20).all():
        print(company.name, keyword_count)
This isn't the exact method but explains the intention of calling .subquery() above:
subquery
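For reference, joining against a subquery() corresponds roughly to joining against a derived table in SQL. The sketch below runs that SQL shape with the stdlib sqlite3 module. The trimmed-down tables, the sample rows, and the alias names c and kw are assumptions made for illustration; this is not the SQL that SQLAlchemy generates verbatim.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE enjordplatform_companies (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE enjordplatform_company_to_keywords (
        id INTEGER PRIMARY KEY,
        company_id INTEGER REFERENCES enjordplatform_companies(id),
        keyword_id INTEGER
    );
    INSERT INTO enjordplatform_companies
        VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
    INSERT INTO enjordplatform_company_to_keywords (company_id, keyword_id)
        VALUES (1, 10), (2, 10), (2, 11), (2, 12), (3, 10), (3, 11);
""")

# The inner SELECT plays the role of subq; the outer query joins the
# companies table to it and orders by the aggregated count.
rows = conn.execute("""
    SELECT c.name, kw.keyword_count
    FROM enjordplatform_companies AS c
    JOIN (
        SELECT company_id, count(keyword_id) AS keyword_count
        FROM enjordplatform_company_to_keywords
        GROUP BY company_id
    ) AS kw ON c.id = kw.company_id
    ORDER BY kw.keyword_count DESC
    LIMIT 20
""").fetchall()

for name, keyword_count in rows:
    print(name, keyword_count)
```

Because the aggregate is given a name inside the derived table, the outer query can both select it and sort by it, which is what labelling keyword_count and reading it from subq.c achieves in the ORM version.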

SQLAlchemy many-to-many association querying specific child

In the case of many-to-many relationships, an association table can be used in the form of Association Object pattern.
I have the following setup of two classes having a M2M relationship through UserCouncil association table.
class Users(Base):
    name = Column(String, nullable=False)
    email = Column(String, nullable=False, unique=True)
    created_at = Column(DateTime, default=datetime.utcnow)
    password = Column(String, nullable=False)
    salt = Column(String, nullable=False)
    councils = relationship('UserCouncil', back_populates='user')
class Councils(Base):
    name = Column(String, nullable=False)
    created_at = Column(DateTime, default=datetime.utcnow)
    users = relationship('UserCouncil', back_populates='council')
class UserCouncil(Base):
    user_id = Column(UUIDType, ForeignKey(Users.id, ondelete='CASCADE'), primary_key=True)
    council_id = Column(UUIDType, ForeignKey(Councils.id, ondelete='CASCADE'), primary_key=True)
    role = Column(Integer, nullable=False)
    user = relationship('Users', back_populates='councils')
    council = relationship('Councils', back_populates='users')
However, in this situation, suppose I want to search for a council with a specific name cname for a given user user1. I can do the following:
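for user_council in user1.councils:
    # Users.councils holds UserCouncil association objects, not Councils
    if user_council.council.name == cname:
        dosomething(user_council.council)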
Or, alternatively, this:
session.query(UserCouncil) \
    .join(Councils) \
    .filter((UserCouncil.user_id == user1.id) & (Councils.name == cname)) \
    .first() \
    .council
While the second one is more similar to raw SQL queries and performs better, the first one is simpler. Is there any other, more idiomatic way of expressing this query which is better performing while also utilizing the relationship linkages instead of explicitly writing traditional joins?
First, I think even the query you give as an example might need another round trip to the database to load the UserCouncil.council relationship if it is not already in memory.
Since you want to find the Councils instance directly from its .name and the Users instance, that is exactly what you should query for. Below is such a query, with two options for filtering on user_id (you might be more familiar with the second option, in which case use it):
q = (
    select(Councils)
    .filter(Councils.name == councils_name)
    # v1: this does not require a JOIN, but produces the same result as v2 below
    .filter(Councils.users.any(UserCouncil.user_id == user_id))
    # v2: join, very similar to the original SQL
    # .join(UserCouncil).filter(UserCouncil.user_id == user_id)
)
council = session.execute(q).scalars().first()
As to making it more simple and idiomatic, I can only suggest to wrap it in a method or property on the User instance:
class Users(...):
    ...
    def get_council_by_name(self, councils_name):
        q = (
            select(Councils)
            .filter(Councils.name == councils_name)
            .join(UserCouncil).filter(with_parent(self, Users.councils))
        )
        return object_session(self).execute(q).scalars().first()
so that you can later call it user.get_council_by_name('xxx')
Edit-1: added SQL queries
v1 of the first q query above will generate following SQL:
SELECT councils.id,
councils.name
FROM councils
WHERE councils.name = :name_1
AND (EXISTS
(SELECT 1
FROM user_councils
WHERE councils.id = user_councils.council_id
AND user_councils.user_id = :user_id_1
)
)
while v2 option will generate:
SELECT councils.id,
councils.name
FROM councils
JOIN user_councils ON councils.id = user_councils.council_id
WHERE councils.name = :name_1
AND user_councils.user_id = :user_id_1
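To check that the two generated statements return the same council, they can be run side by side. The sketch below uses the stdlib sqlite3 module with a cut-down schema; the integer keys (the original uses UUIDType), the users table name, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE councils (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE user_councils (
        user_id INTEGER REFERENCES users(id),
        council_id INTEGER REFERENCES councils(id),
        role INTEGER NOT NULL,
        PRIMARY KEY (user_id, council_id)
    );
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO councils VALUES (1, 'finance'), (2, 'safety');
    INSERT INTO user_councils VALUES (1, 1, 0), (1, 2, 0), (2, 2, 1);
""")

EXISTS_SQL = """
    SELECT councils.id, councils.name
    FROM councils
    WHERE councils.name = :name_1
      AND EXISTS (SELECT 1
                  FROM user_councils
                  WHERE councils.id = user_councils.council_id
                    AND user_councils.user_id = :user_id_1)
"""

JOIN_SQL = """
    SELECT councils.id, councils.name
    FROM councils
    JOIN user_councils ON councils.id = user_councils.council_id
    WHERE councils.name = :name_1
      AND user_councils.user_id = :user_id_1
"""


def find_council(sql, name, user_id):
    """Run one of the two statements with the same bound parameters."""
    return conn.execute(sql, {"name_1": name, "user_id_1": user_id}).fetchall()


print(find_council(EXISTS_SQL, "finance", 1))
print(find_council(JOIN_SQL, "finance", 1))
```

With a composite primary key on (user_id, council_id), a user can be linked to a council at most once, so the JOIN form cannot produce duplicate council rows here.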

Exclude something from where clause by using SQLAlchemy core

I have the following model:
class Vote(BaseModel):
    __tablename__ = 'vote'
    id = sa.Column(sa.Integer, autoincrement=True, index=True, primary_key=True)
    value = sa.Column(sa.Integer, nullable=False)
    rated_user_id = sa.Column(
        sa.Integer, sa.ForeignKey('user.id', ondelete='cascade'))
    rating_user_id = sa.Column(
        sa.Integer, sa.ForeignKey('user.id', ondelete='cascade'))
I want to write a query that gives me the joined data, but I don't know how to build it. This is my approach:
query = sa.select(
    [votes, users.alias('u1'), users.alias('u2')],
    use_labels=True
).select_from(
    votes.join(users.alias('u1'), votes.c.rated_user_id == users.alias('u1').c.id)
         .join(users.alias('u2'), votes.c.rating_user_id == users.alias('u2').c.id)
)
But it doesn't work, because it includes "user" AS "u1" in the FROM clause.
Thanks!
Each invocation of alias() produces a unique alias object, even if you give them the same label. Instead, assign each alias to a variable and use that same object in every part of your query:
u1 = users.alias('u1')
u2 = users.alias('u2')
query = sa.select([votes, u1, u2], use_labels=True).\
    select_from(votes.
                join(u1, votes.c.rated_user_id == u1.c.id).
                join(u2, votes.c.rating_user_id == u2.c.id))
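The point about alias identity can be checked in isolation. This is a minimal sketch assuming SQLAlchemy 1.4 or newer, where select() takes columns positionally instead of a list. The table name users (chosen to avoid the reserved word user), the name column, and the variable names first, second, and same_object are assumptions made for illustration.

```python
import sqlalchemy as sa

metadata = sa.MetaData()
users = sa.Table(
    "users", metadata,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("name", sa.String),
)
votes = sa.Table(
    "vote", metadata,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("value", sa.Integer, nullable=False),
    sa.Column("rated_user_id", sa.Integer, sa.ForeignKey("users.id")),
    sa.Column("rating_user_id", sa.Integer, sa.ForeignKey("users.id")),
)

# Two calls with the same label still return two different objects.
first = users.alias("u1")
second = users.alias("u1")
same_object = first is second

# Reusing one object per alias keeps each alias in the FROM clause once.
u1 = users.alias("u1")
u2 = users.alias("u2")
joined = (
    votes
    .join(u1, votes.c.rated_user_id == u1.c.id)
    .join(u2, votes.c.rating_user_id == u2.c.id)
)
query = sa.select(votes.c.value, u1.c.name, u2.c.name).select_from(joined)
sql = str(query)

print(same_object)
print(sql)
```

Printing the compiled statement is a quick way to confirm that each alias appears exactly once after FROM.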

SQLAlchemy adding column with aggregate function to a dynamic loader list (AppenderQuery)

I get an incorrect record set, while adding an aggregate function like func.sum on a dynamic relationship. I have listed out a sample code below to demonstrate this.
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import (
    relationship,
    scoped_session,
    sessionmaker,
    backref
)
from sqlalchemy import (
    create_engine,
    Table,
    Column,
    Integer,
    String,
    ForeignKey,
    func
)
from zope.sqlalchemy import ZopeTransactionExtension
import transaction
Base = declarative_base()
DBSession = scoped_session(sessionmaker(extension=ZopeTransactionExtension()))
class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    userid = Column(String(15), unique=True, nullable=False)
    article_list = relationship("Article", backref="user", lazy="dynamic")
class Tag(Base):
    __tablename__ = 'tags'
    id = Column(Integer, primary_key=True)
    name = Column(String(25), nullable=False, unique=True)
class Article(Base):
    __tablename__ = 'articles'
    id = Column(Integer, primary_key=True)
    title = Column(String(25), nullable=False)
    duration = Column(Integer)
    user_id = Column(Integer, ForeignKey('users.id'), nullable=False)
    tags = relationship('Tag', secondary="tag_map",
                        backref=backref("article_list", lazy="dynamic"))
tag_map_table = Table(
    'tag_map', Base.metadata,
    Column('tag_id', Integer, ForeignKey('tags.id'), nullable=False),
    Column('article_id', Integer, ForeignKey('articles.id'), nullable=False))
engine = create_engine('sqlite:///tag_test.sqlite')
DBSession.configure(bind=engine)
Base.metadata.create_all(engine)
with transaction.manager:
    t1 = Tag(name='software')
    t2 = Tag(name='hardware')
    john = User(userid='john')
    a1 = Article(title='First article', duration=300)
    a1.user = john
    a1.tags.append(t1)
    a1.tags.append(t2)
    DBSession.add(a1)
    a2 = Article(title='Second article', duration=50)
    a2.user = john
    a2.tags.append(t1)
    a2.tags.append(t2)
    DBSession.add(a2)
As the code above shows, I have added both tags to both articles. Now I want to query the articles written by the user 'john', grouped by tag, and also get the sum of the article durations for each tag.
john = DBSession.query(User).filter(User.userid=='john').first()
res = john.article_list.join(Article.tags).add_column(
    func.sum(Article.duration)).group_by(Tag.id)
for article, tsum in res:
    print("Article : %s, Sum duration : %d" % (article.title, tsum))
The query generated for res is
SELECT articles.id AS articles_id, articles.title AS articles_title, articles.duration AS articles_duration, articles.user_id AS articles_user_id, sum(articles.duration) AS sum_1
FROM articles JOIN tag_map AS tag_map_1 ON articles.id = tag_map_1.article_id JOIN tags ON tags.id = tag_map_1.tag_id
WHERE :param_1 = articles.user_id GROUP BY tags.id
which when executed directly on the sqlite database yields two rows corresponding to the two tags
2|Second article|50|1|350
2|Second article|50|1|350
Whereas, the results returned by SQLAlchemy reflect only one row
Article : Second article, Sum duration : 350
But, if I add an extra column to contain tag-name in the AppenderQuery object
res = john.article_list.join(Article.tags).add_column(Tag.name).add_column(
    func.sum(Article.duration)).group_by(Tag.id)
for article, tag_name, tsum in res:
    print("Article : %s, Tag : %s, Sum duration : %d" % (
        article.title, tag_name, tsum))
I get proper results
Article : Second article, Tag : software, Sum duration : 350
Article : Second article, Tag : hardware, Sum duration : 350
So, what is the right way of using aggregate functions on AppenderQuery object in order to get categorized results?

Flask how to calculate tags count

I am developing a simple blog with tagging support. I would like to add tag cloud functionality, so I need the count of each tag used in the blog.
My Blog and Tag models look like this:
class Blog(db.Model, ObservableModel):
    __tablename__ = "blogs"
    id = db.Column(db.Integer, db.Sequence('blog_id_seq'), primary_key=True)
    title = db.Column(db.String(200), unique=True, nullable=True)
    tags = relationship('Tag', secondary=tags_to_blogs_association_table)
class Post(db.Model, ObservableModel):
    __tablename__ = "posts"
    ......................
    blog = relationship('Blog', backref = db.backref('blogs', lazy='dynamic'))
    tags = relationship('Tag', secondary=tags_to_posts_association_table)
class Tag(db.Model):
    __tablename__ = "tags"
    id = db.Column(db.Integer, db.Sequence('post_id_seq'), primary_key=True)
    title = db.Column(db.String(30), unique=False, nullable=True)
I want to collect a dictionary of pairs like tag_name: count, and the only way I see is to iterate over the Blog.tags collection and retrieve the posts that contain each tag.
I am not sure that this is the best solution from a performance point of view; maybe Flask-SQLAlchemy provides a join function?
Question: how can I implement a query like the following in Python using Flask-SQLAlchemy?
select
t.id,
t.title,
count(post_id)
from tags t
join tags_to_blogs b on t.id=b.tag_id
join tags_to_posts p on t.id=p.tag_id
group by (t.id)
having b.blog_id=1
Try this:
query = db.session.query(Tag, db.func.count(Post.id))
query = query.filter(
    (tags_to_posts_association_table.c.tag_id == Tag.id) &
    (tags_to_posts_association_table.c.post_id == Post.id)
)
query = query.group_by(Tag.id)
This generates this query:
SELECT tags.id AS tags_id, tags.title AS tags_title, count(posts.id) AS count_1
FROM tags, posts, tags_to_posts
WHERE tags_to_posts.tag_id = tags.id AND tags_to_posts.post_id = posts.id GROUP BY tags.id
A cleaner way could be something like this:
query = db.session.query(Tag, db.func.count(Post.id))
# This works but the preferred way is what's below it
#query = query.join(tags_to_posts_association_table, Post)
query = query.join(Post.tags)
query = query.group_by(Tag.id)
This generates this query:
SELECT tags.id AS tags_id, tags.title AS tags_title, count(posts.id) AS count_1
FROM tags INNER JOIN tags_to_posts ON tags.id = tags_to_posts.tag_id INNER JOIN posts ON posts.id = tags_to_posts.post_id GROUP BY tags.id
All these produce the same result, and you can chain them just like this:
query = db.session.query(Tag.title, db.func.count(Post.id)).join(Post.tags).group_by(Tag.id)
# This will give you a dictionary with keys the tag titles, and values the count of each
# Because you can iterate over the query, which will give you the results
# Or you can use query.all() and use it as you prefer.
results = dict(query)
Also, the function is db.func.count, not db.count. In any case, you can always from sqlalchemy import func and use func.count.
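The dict(query) idiom works because each result row is a (title, count) pair. The sketch below shows the same idea against the joined SQL using the stdlib sqlite3 module, whose cursor also yields 2-tuples. The cut-down tables and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tags (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY);
    CREATE TABLE tags_to_posts (tag_id INTEGER REFERENCES tags(id),
                                post_id INTEGER REFERENCES posts(id));
    INSERT INTO tags VALUES (1, 'flask'), (2, 'sql'), (3, 'unused');
    INSERT INTO posts VALUES (1), (2), (3);
    INSERT INTO tags_to_posts VALUES (1, 1), (1, 2), (1, 3), (2, 3);
""")

cursor = conn.execute("""
    SELECT tags.title, count(posts.id) AS count_1
    FROM tags
    INNER JOIN tags_to_posts ON tags.id = tags_to_posts.tag_id
    INNER JOIN posts ON posts.id = tags_to_posts.post_id
    GROUP BY tags.id
""")

# Each row is a (title, count) pair, so dict() consumes the rows directly.
tag_counts = dict(cursor)
print(tag_counts)
```

Note that with an inner join a tag that is attached to no post does not appear in the result at all; a LEFT JOIN from tags would keep it with a count of 0.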
I would do it this way (a simple sketch; note that it issues one COUNT query per tag):
counts = {}
for tag in Tag.query.all():
    counts[tag.title] = Post.query.filter(Post.tags.contains(tag)).count()
At the end, counts holds every tag title together with its post count.
