In the case of many-to-many relationships, the association table can be modeled with the Association Object pattern.
I have the following setup of two classes with an M2M relationship through the UserCouncil association table.
class Users(Base):
    name = Column(String, nullable=False)
    email = Column(String, nullable=False, unique=True)
    created_at = Column(DateTime, default=datetime.utcnow)
    password = Column(String, nullable=False)
    salt = Column(String, nullable=False)
    councils = relationship('UserCouncil', back_populates='user')

class Councils(Base):
    name = Column(String, nullable=False)
    created_at = Column(DateTime, default=datetime.utcnow)
    users = relationship('UserCouncil', back_populates='council')

class UserCouncil(Base):
    user_id = Column(UUIDType, ForeignKey(Users.id, ondelete='CASCADE'), primary_key=True)
    council_id = Column(UUIDType, ForeignKey(Councils.id, ondelete='CASCADE'), primary_key=True)
    role = Column(Integer, nullable=False)
    user = relationship('Users', back_populates='councils')
    council = relationship('Councils', back_populates='users')
However, in this situation, suppose I want to search for a council with a specific name cname for a given user user1. I can do the following:
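for user_council in user1.councils:
    # user1.councils holds UserCouncil association objects, so go through .council
    if user_council.council.name == cname:
        dosomething(user_council.council)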
Or, alternatively, this:
session.query(UserCouncil) \
    .join(Councils) \
    .filter((UserCouncil.user_id == user1.id) & (Councils.name == cname)) \
    .first() \
    .council
While the second one is more similar to raw SQL queries and performs better, the first one is simpler. Is there any other, more idiomatic way of expressing this query which is better performing while also utilizing the relationship linkages instead of explicitly writing traditional joins?
First, I think even the query you give as an example might need another round trip to the database to load the UserCouncil.council relationship if it is not already loaded in memory.
Since you want to find the Council instance directly, given its .name and the User instance, that is exactly what you should ask for. Below is the query for that, with 2 options for filtering on user_id (you might be more familiar with the second option, so feel free to use it):
q = (
    select(Councils)
    .filter(Councils.name == councils_name)
    .filter(Councils.users.any(UserCouncil.user_id == user_id))  # v1: this does not require JOIN, but produces the same result as below
    # .join(UserCouncil).filter(UserCouncil.user_id == user_id)  # v2: join, very similar to original SQL
)
council = session.execute(q).scalars().first()
As to making it simpler and more idiomatic, I can only suggest wrapping it in a method or property on the Users class:
from sqlalchemy import select
from sqlalchemy.orm import object_session, with_parent

class Users(...):
    ...

    def get_council_by_name(self, councils_name):
        q = (
            select(Councils)
            .filter(Councils.name == councils_name)
            # restrict the joined UserCouncil rows to those belonging to this user
            .join(UserCouncil).filter(with_parent(self, Users.councils))
        )
        return object_session(self).execute(q).scalars().first()
so that you can later call it as user.get_council_by_name('xxx').
Edit-1: added SQL queries
v1 of the first q query above will generate the following SQL:
SELECT councils.id,
       councils.name
FROM councils
WHERE councils.name = :name_1
  AND (EXISTS
         (SELECT 1
          FROM user_councils
          WHERE councils.id = user_councils.council_id
            AND user_councils.user_id = :user_id_1
         )
      )
while the v2 option will generate:
SELECT councils.id,
       councils.name
FROM councils
JOIN user_councils ON councils.id = user_councils.council_id
WHERE councils.name = :name_1
  AND user_councils.user_id = :user_id_1
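To see that the two SQL shapes return the same row, here is a minimal sketch that runs both against an in-memory SQLite database using Python's standard sqlite3 module. The sample data, the integer ids (the models above use UUIDType) and the helper find_council are assumptions made for illustration only.

```python
import sqlite3

# Throwaway schema mirroring the councils / user_councils tables in the SQL above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE councils (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE user_councils (
        user_id INTEGER REFERENCES users(id),
        council_id INTEGER REFERENCES councils(id),
        role INTEGER NOT NULL,
        PRIMARY KEY (user_id, council_id)
    );
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO councils VALUES (10, 'budget'), (11, 'safety');
    INSERT INTO user_councils VALUES (1, 10, 0), (1, 11, 1), (2, 11, 0);
""")

# v1: EXISTS subquery, the shape produced by Councils.users.any(...)
EXISTS_SQL = """
    SELECT councils.id, councils.name
    FROM councils
    WHERE councils.name = :name
      AND EXISTS (SELECT 1
                  FROM user_councils
                  WHERE councils.id = user_councils.council_id
                    AND user_councils.user_id = :user_id)
"""

# v2: explicit JOIN on the association table
JOIN_SQL = """
    SELECT councils.id, councils.name
    FROM councils
    JOIN user_councils ON councils.id = user_councils.council_id
    WHERE councils.name = :name
      AND user_councils.user_id = :user_id
"""

def find_council(sql, user_id, name):
    """Hypothetical helper: run one of the two statements with named parameters."""
    return conn.execute(sql, {"user_id": user_id, "name": name}).fetchall()

via_exists = find_council(EXISTS_SQL, 1, "budget")
via_join = find_council(JOIN_SQL, 1, "budget")
print(via_exists, via_join)
```

Because (user_id, council_id) is the primary key of the association table, the JOIN cannot duplicate a council for a single user, which is why both forms agree here.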
I have a scenario where I need to increment the session_number column per user_name. If a user has created a session before, I increment their last session_number; if a user creates a session for the first time, session_number should start from 1. I tried to illustrate this below. Right now I handle this in application logic, but I'm trying to find a more elegant way to do it in SQLAlchemy.
id   user_name   session_number
1    user_1      1
2    user_1      2
3    user_2      1
4    user_1      3
5    user_2      2
Here is the Python code for the table. My database is PostgreSQL and I'm using Alembic to upgrade tables. Right now session_number keeps incrementing regardless of user_name.
class UserSessions(db.Model):
    __tablename__ = 'user_sessions'
    id = db.Column(db.Integer, primary_key=True, unique=True)
    username = db.Column(db.String, nullable=False)
    session_number = db.Column(db.Integer, Sequence('session_number_seq', start=0, increment=1))
    created_at = db.Column(db.DateTime)
    last_edit = db.Column(db.DateTime)
    __table_args__ = (
        db.UniqueConstraint('username', 'session_number', name='_username_session_number_idx_'),
    )
I've searched the internet for this situation, but what I found did not match my problem. Is it possible to achieve this with SQLAlchemy/PostgreSQL?
First, I do not know of any "pure" solution for this situation using SQLAlchemy, PostgreSQL, or a combination of the two.
Although it might not be exactly the solution you are looking for, I hope it will give you some ideas.
If you wanted to calculate the session_number for the whole table without storing it, I would use the following query or a variation thereof:
def get_user_sessions_with_rank():
    expr = (
        db.func.rank()
        .over(partition_by=UserSessions.username, order_by=[UserSessions.id])
        .label("session_number")
    )
    subq = db.session.query(UserSessions.id, expr).subquery("subq")
    q = (
        db.session.query(UserSessions, subq.c.session_number)
        .join(subq, UserSessions.id == subq.c.id)
        .order_by(UserSessions.id)
    )
    return q.all()
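The window function at the heart of that query can be tried in isolation. The sketch below is an illustration, not the Flask-SQLAlchemy code itself: it uses Python's standard sqlite3 module (assuming a SQLite build with window-function support, 3.25 or later) and the five sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_sessions (id INTEGER PRIMARY KEY, username TEXT NOT NULL)"
)
# The five rows from the table in the question, without a stored session_number.
conn.executemany(
    "INSERT INTO user_sessions (id, username) VALUES (?, ?)",
    [(1, "user_1"), (2, "user_1"), (3, "user_2"), (4, "user_1"), (5, "user_2")],
)

# RANK() restarts at 1 for every username and counts up in id order.
rows = conn.execute("""
    SELECT id,
           username,
           RANK() OVER (PARTITION BY username ORDER BY id) AS session_number
    FROM user_sessions
    ORDER BY id
""").fetchall()

for row in rows:
    print(row)
```

Since id is unique, RANK() never produces ties here, so the computed numbers match the session_number column shown in the question.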
Alternatively, I would add a column_property to the model to compute it on the fly for each instance of UserSessions. It is not as efficient to calculate, but for queries filtering by a specific user it should be good enough:
class UserSessions(db.Model):
    __tablename__ = "user_sessions"
    id = db.Column(db.Integer, primary_key=True, unique=True)
    username = db.Column(db.String, nullable=False)
    created_at = db.Column(db.DateTime)
    last_edit = db.Column(db.DateTime)

# must define this outside of the model definition because of need for aliased
US2 = db.aliased(UserSessions)
UserSessions.session_number = db.column_property(
    db.select(db.func.count(US2.id))
    .where(US2.username == UserSessions.username)
    .where(US2.id <= UserSessions.id)
    .scalar_subquery()
)
In this case, when you query for UserSessions, the session_number will be fetched from the database, while being None for newly created instances.
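The column_property above boils down to a correlated COUNT subquery. As a rough illustration of what gets computed, assuming the same five sample rows as in the question, the equivalent SQL can be run with Python's standard sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_sessions (id INTEGER PRIMARY KEY, username TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO user_sessions (id, username) VALUES (?, ?)",
    [(1, "user_1"), (2, "user_1"), (3, "user_2"), (4, "user_1"), (5, "user_2")],
)

# For each row, count the rows of the same user whose id is not larger:
# that count is the row's position among that user's sessions.
COUNT_SQL = """
    SELECT s.id,
           s.username,
           (SELECT COUNT(s2.id)
            FROM user_sessions AS s2
            WHERE s2.username = s.username
              AND s2.id <= s.id) AS session_number
    FROM user_sessions AS s
    WHERE s.username = ?
    ORDER BY s.id
"""

user_1_sessions = conn.execute(COUNT_SQL, ("user_1",)).fetchall()
print(user_1_sessions)
```

The subquery runs once per returned row, which is why this form suits queries that are already narrowed to one user.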
In a project using Flask-SQLAlchemy, I get some intermittent errors and I think they might be due to not explicitly using transactions.
I have these two model classes, one for locations and another for closures:
class Location(db.Model):
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String)
    code = sa.Column(sa.String, unique=True)

class LocationPath(db.Model):
    ancestor_id = sa.Column(sa.Integer, sa.ForeignKey('location.id'), nullable=False, primary_key=True)
    descendant_id = sa.Column(sa.Integer, sa.ForeignKey('location.id'), nullable=False, primary_key=True)
    depth = sa.Column(sa.Integer, default=0, nullable=False)
In a background process, I'm doing a lot of inserts, so I'm bypassing the ORM to use Core:
location_table = Location.__table__
location_path_table = LocationPath.__table__

# compare the table's code column with the value; a bare `code == code` is always true
statement = select([location_table.c.id]).where(location_table.c.code == code)
result = db.session.get_bind().execute(statement)
row = result.first()
if row is None:
    statement = location_table.insert().values(**kwargs)
    result = db.session.get_bind().execute(statement)
    new_id = result.inserted_primary_key[0]
    result.close()
else:
    new_id = row[0]  # first() returns a row, so unpack the id

# save new_id as an ancestor_id or a descendant_id
path = LocationPath.query.filter_by(
    ancestor_id=ancestor_id,
    descendant_id=descendant_id
).first()
if path is None:
    statement = location_path_table.insert().values(
        ancestor_id=ancestor_id,
        descendant_id=descendant_id,
        depth=depth)
    # the line below intermittently generates either of two errors:
    # - the inserted primary key (ancestor/descendant) does not exist
    # - a duplicate key error where the path already exists
    result = db.session.get_bind().execute(statement)
This has resulted in quite a bit of head-scratching on my part, since I get the ancestor_id or descendant_id either from a select or an insert, and I also query the database to see if the path exists before attempting to insert it.
Edit: the code above runs in a loop.
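For reference, the snippet above checks for the path through the session (LocationPath.query) but inserts through the engine (db.session.get_bind().execute). The sketch below is not a diagnosis of those errors; it only illustrates, with Python's standard sqlite3 module, what running every check and insert on one connection inside one explicit transaction looks like. The helper names and sample codes are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE location (id INTEGER PRIMARY KEY, name TEXT, code TEXT UNIQUE);
    CREATE TABLE location_path (
        ancestor_id INTEGER NOT NULL REFERENCES location(id),
        descendant_id INTEGER NOT NULL REFERENCES location(id),
        depth INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (ancestor_id, descendant_id)
    );
""")

def get_or_create_location(conn, code, name):
    """Return the id of the location with this code, inserting it if missing."""
    row = conn.execute("SELECT id FROM location WHERE code = ?", (code,)).fetchone()
    if row is not None:
        return row[0]
    cur = conn.execute("INSERT INTO location (name, code) VALUES (?, ?)", (name, code))
    return cur.lastrowid

def ensure_path(conn, ancestor_id, descendant_id, depth):
    """Insert the closure row unless it is already present."""
    exists = conn.execute(
        "SELECT 1 FROM location_path WHERE ancestor_id = ? AND descendant_id = ?",
        (ancestor_id, descendant_id),
    ).fetchone()
    if exists is None:
        conn.execute(
            "INSERT INTO location_path (ancestor_id, descendant_id, depth) VALUES (?, ?, ?)",
            (ancestor_id, descendant_id, depth),
        )

def link(conn, parent, child):
    """Run all lookups and inserts on the same connection, in one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        parent_id = get_or_create_location(conn, *parent)
        child_id = get_or_create_location(conn, *child)
        ensure_path(conn, parent_id, child_id, 1)
    return parent_id, child_id

first = link(conn, ("KE", "Kenya"), ("KE-30", "Nairobi"))
second = link(conn, ("KE", "Kenya"), ("KE-30", "Nairobi"))  # repeated, as in a loop
path_count = conn.execute("SELECT COUNT(*) FROM location_path").fetchone()[0]
print(first, second, path_count)
```

With a single connection, each existence check sees the rows inserted earlier in the same loop, so repeating the call neither duplicates the path nor references a location that is not yet visible.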
I have a somewhat unusual join and filter problem.
Here are my models:
class Order(Base):
    id = Column(Integer, primary_key=True)
    order_id = Column(String(19), nullable=False)
    ... (other fields)

class Discard(Base):
    id = Column(Integer, primary_key=True)
    order_id = Column(String(19), nullable=False)
I want to query all and full instances of Order but just exclude those that have a match in Discard.order_id based on Order.order_id field. As you can see there is no relationship between order_id fields.
I've tried a left outer join and notin_, but had no success.
With this answer I've achieved the desired results.
Here is my code:
orders = (
    session.query(Order)
    .outerjoin(Discard, Order.order_id == Discard.order_id)
    .filter(Discard.order_id == None)  # noqa: E711
    .all()
)
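I was paying too much attention to flake8's warning about Discard.order_id == None and had been writing Discard.order_id is None instead. It turned out SQLAlchemy does not treat the two the same way: == None is rendered as IS NULL in the query, while is None is evaluated by Python itself before SQLAlchemy ever sees it.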
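The difference comes from Python's operator rules: a class can overload ==, but it cannot overload is. The toy class below is only a stand-in written for this illustration (it is not SQLAlchemy's implementation) showing why the two spellings cannot behave alike.

```python
class ToyColumn:
    """Toy stand-in for a column object; NOT SQLAlchemy's real implementation."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # An expression library can intercept == and build SQL text from it.
        if other is None:
            return f"{self.name} IS NULL"
        return f"{self.name} = {other!r}"

    # Defining __eq__ removes the default hash, so restore it explicitly.
    __hash__ = object.__hash__


order_id = ToyColumn("discard.order_id")

overloaded = (order_id == None)  # noqa: E711 -- goes through __eq__
identity = (order_id is None)    # plain identity test, cannot be intercepted

print(overloaded)  # the SQL fragment built by __eq__
print(identity)    # a plain Python bool
```

In real SQLAlchemy code, Discard.order_id.is_(None) expresses the same IS NULL test without triggering the flake8 E711 warning.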
I have two tables, Products and Orders, inside my Flask-SqlAlchemy setup, and they are linked so an order can have several products:
class Products(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    ....

class Orders(db.Model):
    guid = db.Column(db.String(36), default=generate_uuid, primary_key=True)
    products = db.relationship(
        "Products", secondary=order_products_table, backref="orders")
    ....
linked via:
order_products_table = db.Table("order_products_table",
    db.Column('orders_guid', db.String(36), db.ForeignKey('orders.guid')),
    db.Column('products_id', db.Integer, db.ForeignKey('products.id'))
    # db.Column('license', db.String(36))
)
For my purposes, each product in an order will receive a unique license string, which logically should be added to the order_products_table rows of each product in an order.
How do I declare this third license column on the join table order_products_table so that it gets populated as I insert an Order?
I've since found the documentation for the Association Object from the SQLAlchemy docs, which allows for exactly this expansion to the join table.
Updated setup:
# Instead of a table, provide a model for the JOIN table with additional fields
# and explicit keys and back_populates:
class OrderProducts(db.Model):
    __tablename__ = 'order_products_table'
    orders_guid = db.Column(db.String(36), db.ForeignKey(
        'orders.guid'), primary_key=True)
    products_id = db.Column(db.Integer, db.ForeignKey(
        'products.id'), primary_key=True)
    order = db.relationship("Orders", back_populates="products")
    products = db.relationship("Products", back_populates="order")
    licenses = db.Column(db.String(36), nullable=False)

class Products(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    # back_populates names the attribute on OrderProducts that points back here
    order = db.relationship(OrderProducts, back_populates="products")
    ....

class Orders(db.Model):
    guid = db.Column(db.String(36), default=generate_uuid, primary_key=True)
    # back_populates names the attribute on OrderProducts that points back here
    products = db.relationship(OrderProducts, back_populates="order")
    ....
What is really tricky (but also shown on the documentation page), is how you insert the data. In my case it goes something like this:
o = Orders(...)  # insert other data
for id in products:
    # Create OrderProducts join rows with the extra data, e.g. licenses
    join = OrderProducts(licenses="Foo")
    # To the JOIN add the products
    join.products = Products.query.get(id)
    # Add the populated JOIN as the Order products
    o.products.append(join)
# Finally commit to database
db.session.add(o)
db.session.commit()
At first I was trying to populate Order.products (or o.products in the example code) directly, which gives you an error about using a Products class where an OrderProducts class is expected.
I also struggled with the whole field naming and referencing of the back_populates. Again, the example above and on the docs show this. Note the pluralization is entirely to do with how you want your fields named.
First, the database overview:
competitors - people who compete
competitions - things that people compete at
competition_registrations - Competitors registered for a particular competition
events - An "event" at a competition.
events_couples - A couple (2 competitors) competing in an event.
EventCouple, a Python class corresponding to events_couples, is:
class EventCouple(Base):
    __tablename__ = 'events_couples'
    competition_id = Column(Integer, ForeignKey('competitions.id'), primary_key=True)
    event_id = Column(Integer, ForeignKey('events.id'), primary_key=True)
    leader_id = Column(Integer)
    follower_id = Column(Integer)
    __table_args__ = (
        ForeignKeyConstraint(['competition_id', 'leader_id'], ['competition_registrations.competition_id', 'competition_registrations.competitor_id']),
        ForeignKeyConstraint(['competition_id', 'follower_id'], ['competition_registrations.competition_id', 'competition_registrations.competitor_id']),
        {}
    )
I have a Python class, CompetitorRegistration, that corresponds to a record/row in competition_registrations. A competitor, who is registered, can compete in multiple events, but either as a "leader", or a "follower". I'd like to add to CompetitorRegistration an attribute leading, that is a list of EventCouple where the competition_id and leader_id match. This is my CompetitorRegistration class, complete with attempt:
class CompetitorRegistration(Base):
    __tablename__ = 'competition_registrations'
    competition_id = Column(Integer, ForeignKey('competitions.id'), primary_key=True)
    competitor_id = Column(Integer, ForeignKey('competitors.id'), primary_key=True)
    email = Column(String(255))
    affiliation_id = Column(Integer, ForeignKey('affiliation.id'))
    is_student = Column(Boolean)
    registered_time = Column(DateTime)
    leader_number = Column(Integer)
    leading = relationship('EventCouple', primaryjoin=and_('CompetitorRegistration.competition_id == EventCouple.competition_id', 'CompetitorRegistration.competitor_id == EventCouple.leader_id'))
    following = relationship('EventCouple', primaryjoin='CompetitorRegistration.competition_id == EventCouple.competition_id and CompetitorRegistration.competitor_id == EventCouple.follower_id')
However, I get:
ArgumentError: Could not determine relationship direction for primaryjoin
condition 'CompetitorRegistration.competition_id == EventCouple.competition_id
AND CompetitorRegistration.competitor_id == EventCouple.leader_id', on
relationship CompetitorRegistration.leading. Ensure that the referencing Column
objects have a ForeignKey present, or are otherwise part of a
ForeignKeyConstraint on their parent Table, or specify the foreign_keys parameter
to this relationship.
Thanks for any help, & let me know if more info is needed on the schema.
Also, another attempt of mine is visible in following — this did not error, but didn't give correct results either. (It only joined on the competition_id, and completely ignored the follower_id)
The condition in leading mixes a Python expression, and_(), with strings that are meant to be eval()ed. The condition in following mixes Python and SQL operators: Python's and does not build a SQL AND here. Below are corrected examples using both variants:
leading = relationship('EventCouple', primaryjoin=(
    (competition_id == EventCouple.competition_id) &
    (competitor_id == EventCouple.leader_id)))

leading = relationship('EventCouple', primaryjoin=and_(
    competition_id == EventCouple.competition_id,
    competitor_id == EventCouple.leader_id))

following = relationship('EventCouple', primaryjoin=(
    '(CompetitorRegistration.competition_id==EventCouple.competition_id) '
    '& (CompetitorRegistration.competitor_id==EventCouple.follower_id)'))
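Why and silently drops a condition can be shown with a toy expression class. This is a sketch and not SQLAlchemy's implementation; in particular, making the expression falsy in __bool__ is an assumption chosen to match the behaviour reported in the question, where only the competition_id condition survived.

```python
class ToyExpr:
    """Toy stand-in for a SQL expression object; NOT SQLAlchemy's implementation."""

    def __init__(self, sql):
        self.sql = sql

    def __and__(self, other):
        # & can be overloaded, so it is able to build a combined expression.
        return ToyExpr(f"({self.sql}) AND ({other.sql})")

    def __bool__(self):
        # Assumption for this demo: the expression evaluates as falsy.
        return False


same_competition = ToyExpr("reg.competition_id = ec.competition_id")
same_follower = ToyExpr("reg.competitor_id = ec.follower_id")

# `and` cannot be overloaded: Python evaluates bool(left) and returns one operand.
with_and = same_competition and same_follower
# `&` calls __and__ and keeps both conditions.
with_ampersand = same_competition & same_follower

print(with_and.sql)
print(with_ampersand.sql)
```

Whichever operand and returns, the result is always a single one of the two conditions, never their conjunction, so a join condition has to be combined with & or and_().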