SQLAlchemy counting all rows, I want specific rows - python

In words, I'm trying to achieve this goal:
"Get 5 comments where comment.post_id == self.context.id and sort those by the highest number of Comment_Vote.vote_type == 'like' "
Currently the models are:
vote_enum = ENUM('like', 'dislike', name='vote_enum', create_type=False)

class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True, autoincrement=True)
    username = Column(String(65), nullable=False)
    comments = relationship('Comment', backref='user')
    comment_vote = relationship('Comment_Vote', backref='user')
    posts = relationship('Post', backref='user')

class Post(Base):
    __tablename__ = 'post'
    id = Column(Integer, primary_key=True, autoincrement=True)
    body = Column(String(1500))
    comments = relationship('Comment', backref='post', order_by='desc(Comment.date_created)', lazy='dynamic')
    owner_id = Column(Integer, ForeignKey('users.id'))

class Comment(Base):
    __tablename__ = 'comment'
    id = Column(Integer, primary_key=True, autoincrement=True)
    body = Column(String(500))
    parent_id = Column(Integer, ForeignKey('comment.id'))
    post_id = Column(Integer, ForeignKey('post.id'))
    user_id = Column(Integer, ForeignKey('users.id'))
    children = relationship("Comment",
        backref=backref('parent', remote_side=[id]),
        lazy='dynamic'
    )
    del_flag = Column(Boolean, default=False)
    date_created = Column(DateTime(), default=datetime.datetime.utcnow())
    last_edited = Column(DateTime(), default=datetime.datetime.utcnow())
    comment_vote = relationship("Comment_Vote", backref="comment", lazy='dynamic')

class Comment_Vote(Base):
    __tablename__ = 'comment_vote'
    id = Column(Integer, primary_key=True, autoincrement=True)
    user_id = Column(Integer, ForeignKey('users.id'))
    comment_id = Column(Integer, ForeignKey('comment.id'))
    vote_type = Column('vote_enum', vote_enum)

    @classmethod
    def total_likes(cls, comment_id, session):
        return session.query(cls).filter(cls.id == comment_id).first().comment_vote.filter(Comment_Vote.vote_type=='like').count()
My functioning query is:
f = session.query(Comment_Vote.comment_id, funcfilter(func.count(1), Comment_Vote.vote_type == 'like').label('total_likes')).group_by(Comment_Vote.comment_id).subquery()
comments = session.query(Comment, f.c.total_likes).join(f, Comment.id==f.c.comment_id).filter(Comment.post_id == self.context.id).order_by('total_likes DESC').limit(5)
This has the nasty side effect of counting ALL comment_vote 'likes', even for comments that aren't relevant to that post.
I'd be really grateful for a bit of advice on how to rearrange this so it didn't have to count everything first. What I want may not be possible, and I'm working mostly within the ORM.
DB behind the SQLAlchemy is Postgresql.

This could be a nice place to use a lateral subquery. It is the "foreach" of SQL: a lateral subquery can reference columns of preceding FROM items. PostgreSQL supports LATERAL from version 9.3 and up, SQLAlchemy from version 1.1 and up:
from sqlalchemy import func, true

f = session.query(func.count(1).label('total_likes')).\
    filter(Comment_Vote.comment_id == Comment.id,  # References Comment
           Comment_Vote.vote_type == 'like').\
    subquery().\
    lateral()

comments = session.query(Comment, f.c.total_likes).\
    join(f, true()).\
    filter(Comment.post_id == self.context.id).\
    order_by(f.c.total_likes.desc()).\
    limit(5)
I moved the filtering on vote_type to the WHERE clause of the subquery, as it's unnecessary in this case to first fetch all rows and then filter in the aggregate function (which also cannot use indexes).
Of course, in this case you could also use a scalar subquery in the SELECT output for the same effect:
f = session.query(func.count(1)).\
    filter(Comment_Vote.comment_id == Comment.id,  # References Comment
           Comment_Vote.vote_type == 'like').\
    label('total_likes')

comments = session.query(Comment, f).\
    filter(Comment.post_id == self.context.id).\
    order_by(f.desc()).\
    limit(5)
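To see what the scalar-subquery variant roughly boils down to in SQL, here is a minimal sketch using the standard-library sqlite3 module instead of SQLAlchemy and PostgreSQL. The trimmed table definitions, the sample rows and the top_comments helper are assumptions made up for illustration; only the shape of the correlated subquery mirrors the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comment (id INTEGER PRIMARY KEY, post_id INTEGER, body TEXT);
    CREATE TABLE comment_vote (
        id INTEGER PRIMARY KEY,
        comment_id INTEGER REFERENCES comment(id),
        vote_type TEXT
    );
    INSERT INTO comment (id, post_id, body) VALUES
        (1, 10, 'first'), (2, 10, 'second'), (3, 99, 'other post');
    INSERT INTO comment_vote (comment_id, vote_type) VALUES
        (1, 'like'), (2, 'like'), (2, 'like'), (2, 'dislike'),
        (3, 'like'), (3, 'like'), (3, 'like');
""")


def top_comments(conn, post_id, limit=5):
    """Return (comment_id, total_likes) for one post, most liked first."""
    sql = """
        SELECT c.id,
               (SELECT count(*)
                  FROM comment_vote AS v
                 WHERE v.comment_id = c.id   -- correlated with the outer row
                   AND v.vote_type = 'like') AS total_likes
          FROM comment AS c
         WHERE c.post_id = ?
         ORDER BY total_likes DESC
         LIMIT ?
    """
    return conn.execute(sql, (post_id, limit)).fetchall()


print(top_comments(conn, 10))
```

The count is evaluated per selected comment, so votes on comments of other posts (comment 3 here) never enter the result for post 10.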

Related

How to access column values in SQLAlchemy result list after a join a query

I need to access the columns of a query result. I have these models:
class Order(Base):
    __tablename__ = "orders"
    internal_id = Column(Integer, primary_key=True)
    total_cost = Column(Float, nullable=False)
    created_at = Column(TIMESTAMP(timezone=True), nullable=False, server_default=text("now()"))
    customer_id = Column(Integer, ForeignKey("customers.id", ondelete="CASCADE"), nullable=False)
    customer = relationship("Customer")

class Item(Base):
    __tablename__ = "items"
    id = Column(Integer, primary_key=True, nullable=False)
    internal_id = Column(Integer, nullable=False)
    price = Column(Float, nullable=False)
    description = Column(String, nullable=False)
    order_id = Column(Integer, ForeignKey("orders.internal_id", ondelete="CASCADE"), nullable=False)
    order = relationship("Order")
Now I run this left join query that gives me all the columns from both tables
result = db.query(Order, Item).join(Item, Item.order_id == Order.internal_id, isouter=True).filter(Item.order_id == order_id).all()
I get back a list of tuples. How do I access a particular column of the result list? Doing something like this
for i in result:
    print(i.???) # NOW WHAT?
I get "AttributeError: Could not locate column in row for column" any time I try to fetch a value by the name I declared.
This is the full function where I need to use it:
@router.get("/{order_id}")
def get_orders(order_id: int, db: Session = Depends(get_db)):
    """ Get one order by id. """
    # select * from orders left join items on orders.internal_id = items.order_id where orders.internal_id = {order_id};
    result = db.query(Order, Item).join(Item, Item.order_id == Order.internal_id, isouter=True).filter(Item.order_id == order_id).all()
    for i in result:
        print(i.description) # whatever value i put here it errors out
This is the traceback
...
print(i.description) # whatever value i put here it errors out
AttributeError: Could not locate column in row for column 'description'
If I could at least get the column names, that would help, but I can't. I have tried keys(), _metadata.keys, etc. Nothing has worked so far.
If additional implicit queries are not an issue for you, you can do something like this:
class Order(Base):
    __tablename__ = "orders"
    internal_id = Column(Integer, primary_key=True)
    total_cost = Column(Float, nullable=False)
    created_at = Column(TIMESTAMP(timezone=True), nullable=False, server_default=text("now()"))
    customer_id = Column(Integer, ForeignKey("customers.id", ondelete="CASCADE"), nullable=False)
    customer = relationship("Customer")
    items = relationship("Item", lazy="dynamic")

order = session.query(Order).join(Item, Order.internal_id == Item.order_id, isouter=True).filter(Order.internal_id == order_id).first()
if order:
    for i in order.items:
        print(i.description)
    print(order.total_cost)
However, to avoid the additional query when accessing items, you can use the contains_eager option (eager loading is not available for a lazy="dynamic" relationship, so items has to be declared as a regular relationship for this variant):
from sqlalchemy.orm import contains_eager
order = session.query(Order).join(Item, Order.internal_id == Item.order_id, isouter=True).options(contains_eager(Order.items)).filter(Order.internal_id == order_id).all()
Here you have some examples: https://jorzel.hashnode.dev/an-orm-can-bite-you
OK, so actually the answer is quite simple. Each row holds the two entities under their class names, so you use dot notation such as i.Order.total_cost, or any other field of the Order model:
result = db.query(Order, Item).join(Item, Item.order_id == Order.internal_id, isouter=True).filter(Item.order_id == order_id).all()
for i in result:
    print(i.Order.total_cost)
    print(i.Item.description)
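A side note on the query itself, sketched with the standard-library sqlite3 module rather than SQLAlchemy: the SQL comment in the question filters on orders.internal_id, while the ORM query filters on Item.order_id. With a LEFT JOIN these are not equivalent, because an order without items yields NULL item columns and is dropped by the second filter. The trimmed tables, the sample rows and the fetch helper below are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (internal_id INTEGER PRIMARY KEY, total_cost REAL);
    CREATE TABLE items (
        id INTEGER PRIMARY KEY,
        description TEXT,
        order_id INTEGER REFERENCES orders(internal_id)
    );
    INSERT INTO orders VALUES (1, 30.0), (2, 0.0);   -- order 2 has no items
    INSERT INTO items (description, order_id) VALUES ('pen', 1), ('ink', 1);
""")

BASE = """
    SELECT o.internal_id, o.total_cost, i.description
      FROM orders AS o
      LEFT JOIN items AS i ON i.order_id = o.internal_id
     WHERE {column} = ?
     ORDER BY i.id
"""


def fetch(column, order_id):
    # `column` is one of two fixed strings used below, never user input.
    return conn.execute(BASE.format(column=column), (order_id,)).fetchall()


by_item = fetch("i.order_id", 2)      # filter on the joined (right) table
by_order = fetch("o.internal_id", 2)  # filter on the left table
print(by_item, by_order)
```

Filtering on the left table keeps the item-less order with a NULL description, which in the ORM query would show up as a row whose Item entry is None.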

Query specific columns of type relationships in sqlalchemy

I have a table with some text fields and a lot of many to many relationships. I'm having trouble querying only some of the fields. Here's how I query the table:
with db.session(raise_err=True) as session:
    result = session.query(
        ClientKnowledge
    ).options(
        *[defaultload(getattr(ClientKnowledge, table.table.name))
          .load_only(KqFactory().get_tables([table.table.name])[0].get_pk().name)
          for table in ClientKnowledge.__mapper__.relationships]
    ).filter(
        ClientKnowledge.id == 1
    )
This works fine: SQLAlchemy does all the joins for me and honors the lazy='joined' setting on the relationships.
When I specify fields in the query parameters I have to do all the joins myself, but I don't want that; also tried to use with_entities() and to make the query from the other side of the relationship, but got the same result.
Is there a way I can query only specific fields without losing SQLAlchemy's ability to make the joins for me?
PS: I can't really give a complete working example since all my tables are dynamically generated. Please tell me if more context is needed.
EDIT:
Here is an example of what my tables look like:
@as_declarative()
class RightTable1:
    id = Column(Integer, primary_key=True)
    desc = Text()

@as_declarative()
class RightTable2:
    id = Column(Integer, primary_key=True)
    desc = Text()

@as_declarative()
class Association1:
    __tablename__ = 'association1'
    left_id = Column(Integer, ForeignKey(ClientKnowledge.rel_field1), primary_key=True)
    right_id = Column(Integer, ForeignKey(RightTable1.id), primary_key=True)

@as_declarative()
class Association2:
    __tablename__ = 'association2'
    left_id = Column(Integer, ForeignKey(ClientKnowledge.rel_field2), primary_key=True)
    right_id = Column(Integer, ForeignKey(RightTable2.id), primary_key=True)

@as_declarative()
class ClientKnowledge:
    id = Column(Integer, primary_key=True)
    text_field = Text()
    rel_field1 = relationship(
        kq_table, secondary=Association1.__table__, lazy='joined',
        backref=backref(RightTable1.__table__.name, lazy='joined')
    )
    rel_field2 = relationship(
        kq_table, secondary=Association2.__table__, lazy='joined',
        backref=backref(RightTable2.__table__.name, lazy='joined')
    )
What I want is to be able to query ClientKnowledge.rel_field1.
I have tried:
session.query(ClientKnowledge.rel_field1)
But then I have to do all the joins myself.

SQLAlchemy different tables for relationship?

Suppose I have the following tables/relationships defined:
class Post(Base):
    __tablename__ = "post"
    id = Column(Integer, primary_key=True)
    text = Column(String(100))

class Comment(Base):
    __tablename__ = "comment"
    id = Column(Integer, primary_key=True)
    post_id = Column(Integer, ForeignKey("post.id"), nullable=False)
    text = Column(String(100))
Now, I want to have notifications of events, like "You were tagged in a comment" or "you were tagged in a post". Is there some way to have a foreign key relationship in SQLAlchemy that can point to either a comment or a post (or several other tables in reality)? Something like:
class Notification(Base):
    __tablename__ = "notification"
    id = Column(Integer, primary_key=True)
    target = relationship(??either post or comment??)
    user_id = Column(Integer, ForeignKey("users.id"))
    created_date = Column(DateTime, default=datetime.utcnow)
I suppose you could just put foreign keys to all the different types and make all but one null, but that seems ugly. I'd also rather not have multiple tables for each type of notification; as in comment_notification and post_notification. Any ideas?

sqlalchemy constraint in models inheritance

I have two simple models:
class Message(Backend.instance().get_base()):
    __tablename__ = 'messages'
    id = Column(Integer, primary_key=True, autoincrement=True)
    sender_id = Column(Integer, ForeignKey('users.id'))
    content = Column(String, nullable=False)

class ChatMessage(Message):
    __tablename__ = 'chat_messages'
    id = Column(Integer, ForeignKey('messages.id'), primary_key=True)
    receiver_id = Column(Integer, ForeignKey('users.id'))
How can I define the constraint sender_id != receiver_id?
This doesn't seem to work with joined table inheritance, I've tried and it complains that the column sender_id from Message doesn't exist when creating the constraint in ChatMessage.
This complaint makes sense, since sender_id wouldn't be in the same table as receiver_id when the tables are created, so the foreign key relationship would need to be followed to check the constraint.
One option is to make ChatMessage a single table, so that both columns live in the same table.
Then use a CheckConstraint, placed in __table_args__:
class ChatMessage(Base):
    __tablename__ = 'chat_messages'
    id = sa.Column(sa.Integer, primary_key=True)
    sender_id = sa.Column(sa.Integer, sa.ForeignKey(User.id))
    receiver_id = sa.Column(sa.Integer, sa.ForeignKey(User.id))
    content = sa.Column(sa.String, nullable=False)
    __table_args__ = (
        sa.CheckConstraint(receiver_id != sender_id),
    )
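What such a constraint does at the database level can be sketched with the standard-library sqlite3 module. This is an illustration under assumptions, not the schema SQLAlchemy would emit: the DDL is handwritten to resemble the single-table ChatMessage above, and try_insert is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE chat_messages (
        id INTEGER PRIMARY KEY,
        sender_id INTEGER,
        receiver_id INTEGER,
        content TEXT NOT NULL,
        CHECK (receiver_id != sender_id)
    )
""")


def try_insert(sender_id, receiver_id):
    """Return True if the row was accepted, False if the CHECK rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO chat_messages (sender_id, receiver_id, content) "
                "VALUES (?, ?, ?)",
                (sender_id, receiver_id, "hi"),
            )
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert(1, 2))     # different users
print(try_insert(1, 1))     # same user on both sides
print(try_insert(None, 1))  # NULL sender
```

A CHECK only rejects rows for which the expression is false. A comparison with NULL is neither true nor false, so the row with a NULL sender is accepted; if that matters, the columns would also need nullable=False.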

sqlalchemy foreign keys / query joins

Hi, I'm having some trouble with a foreign key in SQLAlchemy: a primary key ID that is also a foreign key is not auto-incrementing.
I'm using Python 2.7, Pyramid 1.3 and SQLAlchemy 0.7.
Here are my models:
class Page(Base):
    __tablename__ = 'page'
    id = Column(Integer, ForeignKey('mapper.object_id'), autoincrement=True, primary_key=True)
    title = Column(String(30), unique=True)
    title_slug = Column(String(75), unique=True)
    text = Column(Text)
    date_added = Column(DateTime)

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String(100), unique=True)
    email = Column(String(100), unique=True)
    password = Column(String(100))

class Group(Base):
    __tablename__ = 'groups'
    id = Column(Integer, primary_key=True)
    name = Column(String(100), unique=True)

class Member(Base):
    __tablename__ = 'members'
    user_id = Column(Integer, ForeignKey('user.id'), primary_key=True)
    group_id = Column(Integer, ForeignKey('groups.id'), primary_key=True)

class Resource(Base):
    __tablename__ = 'resource'
    id = Column(Integer, primary_key=True)
    tablename = Column(Text)
    action = Column(Text)

class Mapper(Base):
    __tablename__ = 'mapper'
    resource_id = Column(Integer, ForeignKey('resource.id'), primary_key=True)
    group_id = Column(Integer, ForeignKey('groups.id'), primary_key=True)
    object_id = Column(Integer, primary_key=True)
And here is my raw SQL query, followed by the same query written with SQLAlchemy's ORM:
'''
SELECT g.name, r.action
FROM groups AS g
INNER JOIN resource AS r
    ON m.resource_id = r.id
INNER JOIN page AS p
    ON p.id = m.object_id
INNER JOIN mapper AS m
    ON m.group_id = g.id
WHERE p.id = ? AND
      r.tablename = ?;
'''
obj = Page
query = DBSession().query(Group.name, Resource.action)\
    .join(Mapper)\
    .join(obj)\
    .join(Resource)\
    .filter(obj.id == obj_id, Resource.tablename == obj.__tablename__).all()
The raw SQL query works fine without any relation between Page and Mapper, but SQLAlchemy's ORM seems to require a ForeignKey link to be able to join them. So I decided to put the ForeignKey on Page.id, since Mapper.object_id will link to several different tables.
This makes the ORM query with the joins work as expected, but adding new data to the Page table results in an exception:
FlushError: Instance <Page at 0x3377c90> has a NULL identity key.
If this is an auto- generated value, check that the database
table allows generation of new primary key values, and that the mapped
Column object is configured to expect these generated values.
Ensure also that this flush() is not occurring at an inappropriate time,
such as within a load() event.
here is my view code:
try:
    session = DBSession()
    with transaction.manager:
        page = Page(title, text)
        session.add(page)
    return HTTPFound(location=request.route_url('home'))
except Exception as e:
    print e
    pass
finally:
    session.close()
I really don't know why this happens, but I'd rather have a solution in SQLAlchemy than use raw SQL, since I'm doing this project for learning purposes :)
I do not think autoincrement=True and ForeignKey(...) play together well.
In any case, for the join to work without any ForeignKey, you can specify the join condition as the second parameter of join(...):
obj = Page
query = DBSession().query(Group.name, Resource.action)\
    .join(Mapper)\
    .join(Resource)\
    .join(obj, Resource.tablename == obj.__tablename__)\
    .filter(obj.id == obj_id)\
    .all()
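The underlying point, that a join only needs an explicit ON condition and not a declared foreign key, can be checked at the SQL level with the standard-library sqlite3 module. This is a rough sketch under assumptions: the trimmed tables and sample rows are made up, no foreign key is declared anywhere, and the joins are ordered so that each ON clause refers only to tables already introduced (my choice, as the more portable form of the question's raw SQL).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "groups" (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE resource (id INTEGER PRIMARY KEY, tablename TEXT, action TEXT);
    CREATE TABLE page (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE mapper (
        resource_id INTEGER,
        group_id INTEGER,
        object_id INTEGER,   -- no foreign key to page or any other table
        PRIMARY KEY (resource_id, group_id, object_id)
    );
    INSERT INTO "groups" VALUES (1, 'editors'), (2, 'viewers');
    INSERT INTO resource VALUES (1, 'page', 'edit'), (2, 'page', 'view');
    INSERT INTO page VALUES (7, 'Home');
    INSERT INTO mapper VALUES (1, 1, 7), (2, 2, 7);
""")


def permissions(object_id, tablename):
    """Return (group name, action) pairs for one object of one table."""
    sql = """
        SELECT g.name, r.action
          FROM "groups" AS g
          JOIN mapper AS m ON m.group_id = g.id
          JOIN resource AS r ON r.id = m.resource_id
          JOIN page AS p ON p.id = m.object_id
         WHERE p.id = ? AND r.tablename = ?
         ORDER BY g.name
    """
    return conn.execute(sql, (object_id, tablename)).fetchall()


print(permissions(7, "page"))
```

With the join conditions spelled out like this, Page.id can stay a plain autoincrementing primary key, which avoids the NULL identity key error from the question.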
