First, the database overview:
competitors - people who compete
competitions - things that people compete at
competition_registrations - Competitors registered for a particular competition
event - An "event" at a competition.
events_couples - A couple (2 competitors) competing in an event.
First, EventCouple, a Python class corresponding to events_couples, is:
class EventCouple(Base):
__tablename__ = 'events_couples'
competition_id = Column(Integer, ForeignKey('competitions.id'), primary_key=True)
event_id = Column(Integer, ForeignKey('events.id'), primary_key=True)
leader_id = Column(Integer)
follower_id = Column(Integer)
__table_args__ = (
ForeignKeyConstraint(['competition_id', 'leader_id'], ['competition_registrations.competition_id', 'competition_registrations.competitor_id']),
ForeignKeyConstraint(['competition_id', 'follower_id'], ['competition_registrations.competition_id', 'competition_registrations.competitor_id']),
{}
)
I have a Python class, CompetitorRegistration, that corresponds to a record/row in competition_registrations. A competitor, who is registered, can compete in multiple events, but either as a "leader", or a "follower". I'd like to add to CompetitorRegistration an attribute leading, that is a list of EventCouple where the competition_id and leader_id match. This is my CompetitorRegistration class, complete with attempt:
class CompetitorRegistration(Base):
__tablename__ = 'competition_registrations'
competition_id = Column(Integer, ForeignKey('competitions.id'), primary_key=True)
competitor_id = Column(Integer, ForeignKey('competitors.id'), primary_key=True)
email = Column(String(255))
affiliation_id = Column(Integer, ForeignKey('affiliation.id'))
is_student = Column(Boolean)
registered_time = Column(DateTime)
leader_number = Column(Integer)
leading = relationship('EventCouple', primaryjoin=and_('CompetitorRegistration.competition_id == EventCouple.competition_id', 'CompetitorRegistration.competitor_id == EventCouple.leader_id'))
following = relationship('EventCouple', primaryjoin='CompetitorRegistration.competition_id == EventCouple.competition_id and CompetitorRegistration.competitor_id == EventCouple.follower_id')
However, I get:
ArgumentError: Could not determine relationship direction for primaryjoin
condition 'CompetitorRegistration.competition_id == EventCouple.competition_id
AND CompetitorRegistration.competitor_id == EventCouple.leader_id', on
relationship CompetitorRegistration.leading. Ensure that the referencing Column
objects have a ForeignKey present, or are otherwise part of a
ForeignKeyConstraint on their parent Table, or specify the foreign_keys parameter
to this relationship.
Thanks for any help, & let me know if more info is needed on the schema.
Also, another attempt of mine is visible in following — this did not error, but didn't give correct results either. (It only joined on the competition_id, and completely ignored the follower_id)
Your leading's condition mixes expression and string to be eval()ed. And following's condition mixes Python and SQL operators: and in Python is not what you expected here. Below are corrected examples using both variants:
leading = relationship('EventCouple', primaryjoin=(
(competition_id==EventCouple.competition_id) & \
(competitor_id==EventCouple.leader_id)))
leading = relationship('EventCouple', primaryjoin=and_(
competition_id==EventCouple.competition_id,
competitor_id==EventCouple.leader_id))
following = relationship('EventCouple', primaryjoin=\
'(CompetitorRegistration.competition_id==EventCouple.competition_id) '\
'& (CompetitorRegistration.competitor_id==EventCouple.follower_id)')
Related
I have 2 SqlAlchemy entities: study and series with 1 to many relation (study has many series). I crearted a relationship that way:
class SeriesDBModel(Base):
__tablename__ = "series"
series_uid = Column(String, primary_key=True)
study_uid = Column(String, ForeignKey(StudyDBModel.study_uid))
study = relationship("StudyDBModel", lazy="subquery", cascade="all")
class StudyDBModel(Base):
__tablename__ = "study"
study_uid = Column(String, primary_key=True)
My question is:
Do I have to create the entities like this:
series = SeriesDBModel()
study = StudyDBModel()
series.study = study
Or
Can I just create both entities independently but with the same study_uid?
I'm asking because the second way i how I save my entities but using series.study returns None.
BTW, it's not a "sessio.commit()" issue. I'm verifing both entities exist in the DB before trying series.study..
I tried to fetch series.study as above.
In the case of many-to-many relationships, an association table can be used in the form of Association Object pattern.
I have the following setup of two classes having a M2M relationship through UserCouncil association table.
class Users(Base):
name = Column(String, nullable=False)
email = Column(String, nullable=False, unique=True)
created_at = Column(DateTime, default=datetime.utcnow)
password = Column(String, nullable=False)
salt = Column(String, nullable=False)
councils = relationship('UserCouncil', back_populates='user')
class Councils(Base):
name = Column(String, nullable=False)
created_at = Column(DateTime, default=datetime.utcnow)
users = relationship('UserCouncil', back_populates='council')
class UserCouncil(Base):
user_id = Column(UUIDType, ForeignKey(Users.id, ondelete='CASCADE'), primary_key=True)
council_id = Column(UUIDType, ForeignKey(Councils.id, ondelete='CASCADE'), primary_key=True)
role = Column(Integer, nullable=False)
user = relationship('Users', back_populates='councils')
council = relationship('Councils', back_populates='users')
However, in this situation, suppose I want to search for a council with a specific name cname for a given user user1. I can do the following:
for council in user1.councils:
if council.name == cname:
dosomething(council)
Or, alternatively, this:
session.query(UserCouncil) \
.join(Councils) \
.filter((UserCouncil.user_id == user1.id) & (Councils.name == cname)) \
.first() \
.council
While the second one is more similar to raw SQL queries and performs better, the first one is simpler. Is there any other, more idiomatic way of expressing this query which is better performing while also utilizing the relationship linkages instead of explicitly writing traditional joins?
First, I think even the SQL query you bring as an example might need to go to fetch the UserCouncil.council relationship again to the DB if it is not loaded in the memory already.
I think that given you want to search directly for the Council instance given its .name and the User instance, this is exactly what you should ask for. Below is the query for that with 2 options on how to filter on user_id (you might be more familiar with the second option, so please use it):
q = (
select(Councils)
.filter(Councils.name == councils_name)
.filter(Councils.users.any(UserCouncil.user_id == user_id)) # v1: this does not require JOIN, but produces the same result as below
# .join(UserCouncil).filter(UserCouncil.user_id == user_id) # v2: join, very similar to original SQL
)
council = session.execute(q).scalars().first()
As to making it more simple and idiomatic, I can only suggest to wrap it in a method or property on the User instance:
class Users(...):
...
def get_council_by_name(self, councils_name):
q = (
select(Councils)
.filter(Councils.name == councils_name)
.join(UserCouncil).filter(with_parent(self, Users.councils))
)
return object_session(self).execute(q).scalars().first()
so that you can later call it user.get_council_by_name('xxx')
Edit-1: added SQL queries
v1 of the first q query above will generate following SQL:
SELECT councils.id,
councils.name
FROM councils
WHERE councils.name = :name_1
AND (EXISTS
(SELECT 1
FROM user_councils
WHERE councils.id = user_councils.council_id
AND user_councils.user_id = :user_id_1
)
)
while v2 option will generate:
SELECT councils.id,
councils.name
FROM councils
JOIN user_councils ON councils.id = user_councils.council_id
WHERE councils.name = :name_1
AND user_councils.user_id = :user_id_1
In a project using Flask-SQLAlchemy, i get some intermittent errors and i think it might be due to not explicitly using transactions.
I have these two model classes, one for locations and another for closures:
class Location(db.Model):
id = sa.Column(sa.Integer, primary_key=True)
name = sa.Column(sa.String)
code = sa.Column(sa.String, unique=True)
class LocationPath(db.Model):
ancestor_id = sa.Column(sa.Integer, sa.ForeignKey('location.id'), nullable=False, primary_key=True)
descendant_id = sa.Column(sa.Integer, sa.ForeignKey('location.id'), nullable=False, primary_key=True)
depth = sa.Column(sa.Integer, default=0, nullable=False)
In a background process, i'm doing a lot of inserts, so i'm bypassing the ORM to use Core:
location_table = Location.__table__
location_path_table = LocationPath.__table__
statement = select([location_table.c.id]).where(code == code)
result = db.session.get_bind().execute(statement)
location_id = result.first()
if location_id is None:
statement = location_table.insert().values(**kwargs)
result = db.session.get_bind().execute(statement)
new_id = result.inserted_primary_key[0]
result.close()
else:
new_id = location_id
# save new_id as an ancestor_id or a descendant_id
path = LocationPath.query.filter_by(
ancestor_id=ancestor_id,
descendant_id=descendant_id
).first()
if path is None:
statement = location_path_table.insert().values(
ancestor_id=ancestor_id,
descendant_id=descendant_id,
depth=depth)
# the line below intermittently generates either of two errors:
# - the inserted primary key (ancestor/descendant) does not exist
# - a duplicate key error where the path already exists
result = db.session.get_bind().execute(statement)
this has resulted in quite a bit of head-scratching on my part, since i get the ancestor_id or descendant_id either from a select or an insert, and i also query the database to see if the path exists before attempting to insert it.
Edit: the code above runs in a loop.
I have got a not very common join and filter problem.
Here are my models;
class Order(Base):
id = Column(Integer, primary_key=True)
order_id = Column(String(19), nullable=False)
... (other fields)
class Discard(Base):
id = Column(Integer, primary_key=True)
order_id = Column(String(19), nullable=False)
I want to query all and full instances of Order but just exclude those that have a match in Discard.order_id based on Order.order_id field. As you can see there is no relationship between order_id fields.
I've tried outer left join, notin_ but ended up with no success.
With this answer I've achieved desired results.
Here is my code;
orders = (
session.query(Order)
.outerjoin(Discard, Order.order_id == Discard.order_id)
.filter(Discard.order_id == None) # noqa: E711
.all()
)
I was paying too much attention to flake8 wrong syntax message at Discard.order_id == None and was using Discard.order_id is None. It appeared out they were rendered differently by sqlalchemy.
So I'm quite new to SQLAlchemy.
I have a model Showing which has about 10,000 rows in the table. Here is the class:
class Showing(Base):
__tablename__ = "showings"
id = Column(Integer, primary_key=True)
time = Column(DateTime)
link = Column(String)
film_id = Column(Integer, ForeignKey('films.id'))
cinema_id = Column(Integer, ForeignKey('cinemas.id'))
def __eq__(self, other):
if self.time == other.time and self.cinema == other.cinema and self.film == other.film:
return True
else:
return False
Could anyone give me some guidance on the fastest way to insert a new showing if it doesn't exist already. I think it is slightly more complicated because a showing is only unique if the time, cinmea, and film are unique on a showing.
I currently have this code:
def AddShowings(self, showing_times, cinema, film):
all_showings = self.session.query(Showing).options(joinedload(Showing.cinema), joinedload(Showing.film)).all()
for showing_time in showing_times:
tmp_showing = Showing(time=showing_time[0], film=film, cinema=cinema, link=showing_time[1])
if tmp_showing not in all_showings:
self.session.add(tmp_showing)
self.session.commit()
all_showings.append(tmp_showing)
which works, but seems to be very slow. Any help is much appreciated.
If any such object is unique based on a combination of columns, you need to mark these as a composite primary key. Add the primary_key=True keyword parameter to each of these columns, dropping your id column altogether:
class Showing(Base):
__tablename__ = "showings"
time = Column(DateTime, primary_key=True)
link = Column(String)
film_id = Column(Integer, ForeignKey('films.id'), primary_key=True)
cinema_id = Column(Integer, ForeignKey('cinemas.id'), primary_key=True)
That way your database can handle these rows more efficiently (no need for an incrementing column), and SQLAlchemy now automatically knows if two instances of Showing are the same thing.
I believe you can then just merge your new Showing back into the session:
def AddShowings(self, showing_times, cinema, film):
for showing_time in showing_times:
self.session.merge(
Showing(time=showing_time[0], link=showing_time[1],
film=film, cinema=cinema)
)