Overview
I'm working on a project that stores Artists in a database. Each time one or more artists are added to the database, a "transaction" is created, which the user can later remove to undo that particular batch. Finally, each user can define one or more "profiles", each storing a separate list of Artists (the currently selected profile is held in the variable active_profile).
Using FKs with ondelete="CASCADE", when a Profile is deleted, all of its transactions are deleted, and subsequently any artists belonging to those transactions are deleted as well.
active_profile = 2
class Profile(Base):
__tablename__ = 'profile'
id = Column(Integer, primary_key=True)
name = Column(String)
def __init__(self, name):
self.name = name
class Transaction(Base):
__tablename__ = 'transaction'
id = Column(Integer, primary_key=True)
timestamp = Column(Integer, nullable=False)
profile_id = Column(
Integer,
ForeignKey('profile.id', ondelete="CASCADE"),
nullable=False,
)
artist = relationship(
"Artist",
cascade="delete",
back_populates='transaction',
)
def __init__(self):
self.timestamp = int(time.time())
self.profile_id = active_profile
class Artist(Base):
__tablename__ = 'artist'
id = Column(Integer, primary_key=True)
art_id = Column(Integer)
art_name = Column(String)
txn_id = Column(
Integer,
ForeignKey('transaction.id', ondelete="CASCADE"),
nullable=False,
)
transaction = relationship("Transaction", back_populates="artist")
def __init__(self, art_id, art_name, txn_id):
self.art_id = art_id
self.art_name = art_name
self.txn_id = txn_id
The Goal
Most queries to the database will select where the profile_id is equal to the active_profile so I'm trying to figure out if there is a way to integrate this requirement into a relationship without having to specify it each time I query the database.
I have tried specifying primaryjoin in the relationship for the Transaction class but it seems to set the primaryjoin condition only once and doesn't change if active_profile is changed. I incorrectly assumed the relationship would be re-evaluated each time the class was called:
artist = relationship(
"Artist",
cascade="delete",
back_populates="transaction",
primaryjoin=f"and_(Transaction.id == Artist.txn_id, Transaction.profile_id == {active_profile})",
)
If this was re-evaluated each time, this would be exactly what I need.
Question
Is there a way to force the relationship to be re-evaluated and if not, is there any other option without the need to append .where(Transaction.profile_id == active_profile) to each query?
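One possible direction, sketched here under assumptions rather than offered as the definitive answer: instead of baking the value into primaryjoin, a Session-level do_orm_execute listener can attach with_loader_criteria() to every ORM SELECT, so active_profile is read each time a statement runs. This needs SQLAlchemy 1.4 or later. The helper names set_active_profile and visible_profile_ids are hypothetical and exist only for the demonstration; the Artist model is left out for brevity.

```python
import time

from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, event, select
from sqlalchemy.orm import Session, declarative_base, with_loader_criteria

Base = declarative_base()
active_profile = 2  # module-level setting, as in the question


def set_active_profile(profile_id):
    """Hypothetical helper: switch the profile used by later queries."""
    global active_profile
    active_profile = profile_id


class Profile(Base):
    __tablename__ = "profile"
    id = Column(Integer, primary_key=True)
    name = Column(String)


class Transaction(Base):
    __tablename__ = "transaction"
    id = Column(Integer, primary_key=True)
    timestamp = Column(Integer, nullable=False, default=lambda: int(time.time()))
    profile_id = Column(
        Integer, ForeignKey("profile.id", ondelete="CASCADE"), nullable=False
    )


@event.listens_for(Session, "do_orm_execute")
def _limit_to_active_profile(execute_state):
    # Called for each ORM statement, so the criteria are rebuilt with the
    # current value of active_profile on every execution.
    if (
        execute_state.is_select
        and not execute_state.is_column_load
        and not execute_state.is_relationship_load
    ):
        execute_state.statement = execute_state.statement.options(
            with_loader_criteria(
                Transaction,
                Transaction.profile_id == active_profile,
                include_aliases=True,
            )
        )


def visible_profile_ids(session):
    """Hypothetical helper: profile ids seen by a plain select(Transaction)."""
    rows = session.execute(select(Transaction)).scalars()
    return sorted({txn.profile_id for txn in rows})


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = Session(engine)
session.add_all([Profile(id=1, name="one"), Profile(id=2, name="two")])
session.flush()
session.add_all(
    [Transaction(profile_id=1), Transaction(profile_id=2), Transaction(profile_id=2)]
)
session.commit()

print(visible_profile_ids(session))
```

The query itself carries no .where() clause; the filter comes from the listener. The trade-off is that the filter is now implicit and applies to every SELECT of Transaction in sessions that trigger this listener, so code that needs to see all profiles would have to opt out somehow (for example via an execution option checked inside the listener).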
Related
Context: I'm making an auctioning website for which I am using Flask-SQLAlchemy. My tables will need to have a many-to-many relationship (as one artpiece can have many user bids and a user can bid on many artpieces)
My question is: is it possible to add another column to my joining table, so that it contains the id of the user bidding, the id of the artpiece they are bidding on, and also how much they bid? And if so, how would I include this bid when I add a record to said table?
bid_table = db.Table("bid_table",
db.Column("user_id", db.Integer, db.ForeignKey("user.user_id")),
db.Column("item_id", db.Integer, db.ForeignKey("artpiece.item_id"))
)
class User(db.Model):
user_id = db.Column(db.Integer, unique=True, primary_key=True, nullable=False)
username = db.Column(db.Integer, unique=True, nullable=False)
email = db.Column(db.String(50), unique =True, nullable=False)
password = db.Column(db.String(60), nullable=False)
creation_date = db.Column(db.DateTime, default=str(datetime.datetime.now()))
bids = db.relationship("Artpiece", secondary=bid_table, backref=db.backref("bids", lazy="dynamic"))
class Artpiece(db.Model):
item_id = db.Column(db.Integer, unique=True, primary_key=True, nullable=False)
artist = db.Column(db.String(40), nullable=False)
buyer = db.Column(db.String(40), nullable=False)
end_date = db.Column(db.String(40))
highest_bid = db.Column(db.String(40))
It is possible to do this with SQLAlchemy, but it's very cumbersome in my opinion.
SQLAlchemy calls this pattern an association object: the association table is mapped to a class of its own, so it can carry whatever data fields you want, but you have to tell SQLAlchemy yourself which columns are foreign keys to the other two tables in question. An association proxy can then be layered on top to hide the intermediate object when you only care about the far side of the relationship.
This is a good example from the documentation.
In your case, the UserKeyword class plays the role of the association object that you want to build for your user/bid scenario.
The special_key column is the arbitrary data you would store, like the bid amount.
from sqlalchemy import Column, Integer, String, ForeignKey
from sqlalchemy.ext.associationproxy import association_proxy
from sqlalchemy.orm import backref, declarative_base, relationship
Base = declarative_base()
class User(Base):
__tablename__ = 'user'
id = Column(Integer, primary_key=True)
name = Column(String(64))
# association proxy of "user_keywords" collection
# to "keyword" attribute
keywords = association_proxy('user_keywords', 'keyword')
def __init__(self, name):
self.name = name
class UserKeyword(Base):
__tablename__ = 'user_keyword'
user_id = Column(Integer, ForeignKey('user.id'), primary_key=True)
keyword_id = Column(Integer, ForeignKey('keyword.id'), primary_key=True)
special_key = Column(String(50))
# bidirectional attribute/collection of "user"/"user_keywords"
user = relationship(User,
backref=backref("user_keywords",
cascade="all, delete-orphan")
)
# reference to the "Keyword" object
keyword = relationship("Keyword")
def __init__(self, keyword=None, user=None, special_key=None):
self.user = user
self.keyword = keyword
self.special_key = special_key
class Keyword(Base):
__tablename__ = 'keyword'
id = Column(Integer, primary_key=True)
keyword = Column('keyword', String(64))
def __init__(self, keyword):
self.keyword = keyword
def __repr__(self):
return 'Keyword(%s)' % repr(self.keyword)
Check out the full documentation for instructions on how to access and create this kind of model.
Having used this in a real project, it's not particularly fun, and if you can avoid it, I would recommend doing so.
https://docs.sqlalchemy.org/en/14/orm/extensions/associationproxy.html
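Applied to the auction scenario, the same idea might look roughly like the sketch below. It uses plain SQLAlchemy declarative classes rather than Flask-SQLAlchemy's db.Model, and the Bid class, its amount column (stored as an integer number of cents) and the highest_bid helper are assumptions made for illustration, not something taken from the question's schema.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class User(Base):
    __tablename__ = "user"
    user_id = Column(Integer, primary_key=True)
    username = Column(String(40), unique=True, nullable=False)
    bids = relationship("Bid", back_populates="user")


class Artpiece(Base):
    __tablename__ = "artpiece"
    item_id = Column(Integer, primary_key=True)
    artist = Column(String(40), nullable=False)
    bids = relationship("Bid", back_populates="artpiece")


class Bid(Base):
    """Association object: one row per bid, carrying the extra amount column."""

    __tablename__ = "bid"
    id = Column(Integer, primary_key=True)
    user_id = Column(ForeignKey("user.user_id"), nullable=False)
    item_id = Column(ForeignKey("artpiece.item_id"), nullable=False)
    amount = Column(Integer, nullable=False)  # assumption: amount in cents
    user = relationship("User", back_populates="bids")
    artpiece = relationship("Artpiece", back_populates="bids")


def highest_bid(artpiece):
    """Hypothetical helper: largest amount bid on an artpiece, or None."""
    return max((bid.amount for bid in artpiece.bids), default=None)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = Session(engine)

alice = User(username="alice")
bob = User(username="bob")
painting = Artpiece(artist="Some Artist")
# Adding a record means creating a Bid that points at both sides.
session.add_all(
    [
        Bid(user=alice, artpiece=painting, amount=15000),
        Bid(user=bob, artpiece=painting, amount=17500),
    ]
)
session.commit()

print(highest_bid(painting))
```

Because Bid has its own surrogate primary key, the same user can bid on the same artpiece more than once. If that should not be allowed, a composite primary key on (user_id, item_id), as in the documentation example, would be the alternative.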
I have two objects, User and Room, both of which inherit from a base object.
class BaseModel:
__metaclass__ = Serializable
created = Column(DateTime, default=func.now())
modified = Column(DateTime, default=func.now(), onupdate=func.now())
@declared_attr
def __tablename__(self):
return self.__name__.lower()
Base = declarative_base(cls=BaseModel)
This is my User model with the many-to-many association with Room declared on top.
association_table = Table('users_rooms', Base.metadata,
Column('user_id', Integer, ForeignKey('user.id')),
Column('room_id', Integer, ForeignKey('room.id'))
)
class User(Base):
__table_args__ = {'extend_existing': True}
id = Column(Integer, primary_key=True)
mobile = Column(String(20), index=True, unique=True)
rooms = relationship("Room", secondary=association_table,
back_populates="users")
And this is the Room model.
association_table = Table('users_rooms', Base.metadata,
Column('user_id', Integer, ForeignKey('user.id')),
Column('room_id', Integer, ForeignKey('room.id'))
)
class Room(Base):
__table_args__ = {'extend_existing': True}
id = Column(Integer, primary_key=True)
room_type = Column(String(50), default=RoomType.PRIVATE)
hex_code = Column(String(100), unique=True)
users = relationship("User", secondary=association_table, back_populates="rooms")
When I try to compile this, I get the following error.
sqlalchemy.exc.InvalidRequestError: Table 'users_rooms' is already defined for this MetaData instance. Specify 'extend_existing=True' to redefine options and columns on an existing Table object.
The error is trying to tell you that you do not need to – and shouldn't – define the association table in both modules. Define it in one of them, or in a module of its own, and either import it, or refer to it lazily in relationship() using secondary="users_rooms":
# Room model. Note the absence of `association_table`
class Room(Base):
id = Column(Integer, primary_key=True)
room_type = Column(String(50), default=RoomType.PRIVATE)
hex_code = Column(String(100), unique=True)
users = relationship("User", secondary="users_rooms", back_populates="rooms")
The table name as a string value passed in secondary= is looked up from the MetaData collection associated with the Room model.
You also should not need to sprinkle
__table_args__ = {'extend_existing': True}
in your models. If you get errors similar to the one in this question without it, the tables have already been created and included in the MetaData collection before your models are constructed. You may have used reflection, for example.
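For reference, a single-file sketch of the layout described above: the association table is defined exactly once, and both models refer to it by name. The explicit __tablename__ values and the composite primary key on users_rooms are assumptions added so the snippet runs on its own; the RoomType default from the question is left out.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Table, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

# Defined once. In a multi-module project this could live in its own module
# that is imported before the mappers are configured.
users_rooms = Table(
    "users_rooms",
    Base.metadata,
    Column("user_id", ForeignKey("user.id"), primary_key=True),
    Column("room_id", ForeignKey("room.id"), primary_key=True),
)


class User(Base):
    __tablename__ = "user"
    id = Column(Integer, primary_key=True)
    mobile = Column(String(20), index=True, unique=True)
    # The string is resolved against Base.metadata when mappers are configured.
    rooms = relationship("Room", secondary="users_rooms", back_populates="users")


class Room(Base):
    __tablename__ = "room"
    id = Column(Integer, primary_key=True)
    hex_code = Column(String(100), unique=True)
    users = relationship("User", secondary="users_rooms", back_populates="rooms")


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = Session(engine)

user = User(mobile="555-0100")
room = Room(hex_code="abc123")
user.rooms.append(room)
session.add(user)
session.commit()

print([u.mobile for u in room.users])
```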
How can objects be added to a relationship in the constructor? The id is not yet available when the constructor is evaluated. In simpler cases it is possible to just provide a list calculated beforehand. In the example below, complex_cls_method stands in for something more involved; treat it as a black box.
from sqlalchemy import create_engine, MetaData, Column, Integer, String, ForeignKey
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship
from sqlalchemy.orm import sessionmaker
DB_URL = "mysql://user:password@localhost/exampledb?charset=utf8"
engine = create_engine(DB_URL, encoding='utf-8', convert_unicode=True, pool_recycle=3600, pool_size=10)
session = sessionmaker(autocommit=False, autoflush=False, bind=engine)()
Model = declarative_base()
class User(Model):
__tablename__ = 'user'
id = Column(Integer, primary_key=True)
simple = Column(String(255))
main_address = Column(String(255))
addresses = relationship("Address",
cascade="all, delete-orphan")
def __init__(self, addresses, simple):
self.simple = simple
self.main_address = addresses[0]
return # because the following does not work
self.addresses = Address.complex_cls_method(
user_id_=self.id, # <-- this does not work of course
key_="address",
value_=addresses
)
class Address(Model):
__tablename__ = 'address'
id = Column(Integer, primary_key=True)
keyword = Column(String(255))
value = Column(String(255))
user_id = Column(Integer, ForeignKey('user.id'), nullable=False)
parent_id = Column(Integer, ForeignKey('address.id'), nullable=True)
@classmethod
def complex_cls_method(cls, user_id_, key_, value_):
main = Address(keyword=key_, value="", user_id=user_id_, parent_id=None)
session.add_all([main])
session.flush()
addrs = [Address(keyword=key_, value=item, user_id=user_id_, parent_id=main.id) for item in value_]
session.add_all(addrs)
return [main] + addrs
if __name__ == "__main__":
# Model.metadata.create_all(engine)
user = User([u"address1", u"address2"], "simple")
session.add(user)
session.flush()
# as it can't be done in constructor, these additional statements needed
user.addresses = Address.complex_cls_method(
user_id_=user.id,
key_="address",
value_=[u"address1", u"address2"]
)
session.commit()
The question is: is there a syntactically elegant (and technically sound) way to do this in User's constructor, or is it safer to call a separate method of the User class after session.flush() to add the desired objects to the relationship (as in the example code)?
Giving up on the constructor altogether is still possible, but it is a less desirable option, as the resulting signature change would require significant refactoring.
Instead of manually flushing and setting ids etc. you could let SQLAlchemy handle persisting your object graph. You'll just need one more adjacency list relationship in Address and you're all set:
class User(Model):
__tablename__ = 'user'
id = Column(Integer, primary_key=True)
simple = Column(String(255))
main_address = Column(String(255))
addresses = relationship("Address",
cascade="all, delete-orphan")
def __init__(self, addresses, simple):
self.simple = simple
self.main_address = addresses[0]
self.addresses = Address.complex_cls_method(
key="address",
values=addresses
)
class Address(Model):
__tablename__ = 'address'
id = Column(Integer, primary_key=True)
keyword = Column(String(255))
value = Column(String(255))
user_id = Column(Integer, ForeignKey('user.id'), nullable=False)
parent_id = Column(Integer, ForeignKey('address.id'), nullable=True)
# For handling parent/child relationships in factory method
parent = relationship("Address", remote_side=[id])
@classmethod
def complex_cls_method(cls, key, values):
main = cls(keyword=key, value="")
addrs = [cls(keyword=key, value=item, parent=main) for item in values]
return [main] + addrs
if __name__ == "__main__":
user = User([u"address1", u"address2"], "simple")
session.add(user)
session.commit()
print(user.addresses)
Note the absence of manual flushes etc. SQLAlchemy automatically figures out the required order of insertions based on the object relationships, so that dependencies between rows can be honoured. This is a part of the Unit of Work pattern.
I have a situation where a user can belong to many courses, and a course can contain many users. I have it modeled in SQLAlchemy like so:
class User(Base):
__tablename__ = 'users'
id = Column(Integer, primary_key=True)
class Course(Base):
__tablename__ = 'courses'
id = Column(Integer, primary_key=True)
archived = Column(DateTime)
class CourseJoin(Base):
__tablename__ = 'course_joins'
id = Column(Integer, primary_key=True)
# Foreign keys
user_id = Column(Integer, ForeignKey('users.id'))
course_id = Column(Integer, ForeignKey('courses.id'))
In the system, we have the ability to "archive" a course. This is marked by a datetime field on the course model. I would like to give the User model a relationship called course_joins that only contains CourseJoins where the respective Course hasn't been archived. I'm trying to use the secondary kwarg to accomplish this like so:
class User(Base):
__tablename__ = 'users'
id = Column(Integer, primary_key=True)
course_joins = relationship('CourseJoin',
secondary='join(Course, CourseJoin.course_id == Course.id)',
primaryjoin='and_(CourseJoin.user_id == User.id,'
'Course.archived == None)',
order_by='CourseJoin.created')
However I'm getting this error:
InvalidRequestError: One or more mappers failed to initialize - can't proceed with initialization of other mappers. Original exception was: FROM expression expected
I believe this is the exact use case for the secondary kwarg of relationship(), but I'm not sure what's going on.
If you really just have a many-to-many relationship (plus the created column), I think the right way to define the relationship is:
courses = relationship(
'Course',
secondary='course_joins',
primaryjoin='users.c.id == course_joins.c.user_id',
secondaryjoin='and_(courses.c.id == course_joins.c.course_id, courses.c.archived == None)',
order_by='course_joins.c.created',
viewonly=True,
)
and use it like:
u1 = User(courses=[Course()])
session.add(u1)
u2 = User(courses=[Course(archived=datetime.date(2013, 1, 1))])
session.add(u2)
Otherwise, just drop the secondary completely and add your other condition to primaryjoin:
courses = relationship(
'CourseJoin',
primaryjoin=\
'and_(users.c.id == course_joins.c.user_id, '
'courses.c.id == course_joins.c.course_id, '
'courses.c.archived == None)',
order_by='course_joins.c.created',
)
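To make the first variant concrete, here is a self-contained sketch run against an in-memory SQLite database. It makes a few assumptions that are not in the question: course_joins is declared as a plain Table, the created column and order_by are omitted, and a second, writable courses relationship is kept next to the filtered one (called active_courses here, a hypothetical name), since a viewonly relationship is meant for reading only.

```python
import datetime

from sqlalchemy import Column, DateTime, ForeignKey, Integer, Table, and_, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

course_joins = Table(
    "course_joins",
    Base.metadata,
    Column("user_id", ForeignKey("users.id"), primary_key=True),
    Column("course_id", ForeignKey("courses.id"), primary_key=True),
)


class Course(Base):
    __tablename__ = "courses"
    id = Column(Integer, primary_key=True)
    archived = Column(DateTime)


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    # Writable many-to-many relationship covering every course.
    courses = relationship("Course", secondary=course_joins)
    # Read-only view restricted to courses that have not been archived.
    # primaryjoin is left for SQLAlchemy to derive from the foreign keys.
    active_courses = relationship(
        "Course",
        secondary=course_joins,
        secondaryjoin=and_(
            Course.id == course_joins.c.course_id,
            Course.archived.is_(None),
        ),
        viewonly=True,
    )


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = Session(engine)

user = User(courses=[Course(), Course(archived=datetime.datetime(2013, 1, 1))])
session.add(user)
session.commit()

print(len(user.courses), len(user.active_courses))
```

Rows are written through courses and read back through active_courses, which keeps the archived filter out of every individual query.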
I have rather simple models like these:
TableA2TableB = Table('TableA2TableB', Base.metadata,
Column('tablea_id', BigInteger, ForeignKey('TableA.id')),
Column('tableb_id', Integer, ForeignKey('TableB.id')))
class TableA(Base):
__tablename__ = 'TableA'
id = Column(BigInteger, primary_key=True)
infohash = Column(String, unique=True)
url = Column(String)
tablebs = relationship('TableB', secondary=TableA2TableB, backref='tableas')
class TableB(Base):
__tablename__ = 'TableB'
id = Column(Integer, primary_key=True)
url = Column(String, unique=True)
However, SQLAlchemy generates queries like
SELECT "TableB".id, "TableB".url AS "TableB_url" FROM "TableB", "TableA2TableB"
WHERE "TableA2TableB".tableb_id = "TableB".id AND "TableA2TableB".tablea_id = 408997;
But why is there a cartesian product in the query when the attributes selected are those in TableB? TableA2TableB shouldn't be needed.
Thanks
As it is right now, there is a backref relationship in TableB (tableas) and it's loaded because the default loading mode is set to select.
You may want to change TableA.tablebs to
tablebs = relationship('TableB', secondary=TableA2TableB, backref='tableas', lazy="dynamic")
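Separately from the loading strategy, it may help to look at the SQL on its own. A comma-separated FROM list combined with WHERE conditions that equate the key columns is an implicit inner join, not an unrestricted cartesian product, and the association table has to take part because it is the only place where the link between TableA and TableB rows is stored. The sketch below uses only the standard-library sqlite3 module and made-up sample rows to compare the generated form with an explicit JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE "TableA" (id INTEGER PRIMARY KEY, url TEXT);
    CREATE TABLE "TableB" (id INTEGER PRIMARY KEY, url TEXT UNIQUE);
    CREATE TABLE "TableA2TableB" (
        tablea_id INTEGER REFERENCES "TableA"(id),
        tableb_id INTEGER REFERENCES "TableB"(id)
    );
    INSERT INTO "TableA" VALUES (1, 'a1'), (2, 'a2');
    INSERT INTO "TableB" VALUES (10, 'b10'), (20, 'b20'), (30, 'b30');
    INSERT INTO "TableA2TableB" VALUES (1, 10), (1, 20), (2, 30);
    """
)

# Same shape as the query in the question: two tables in FROM, join in WHERE.
implicit_sql = """
    SELECT "TableB".id, "TableB".url
    FROM "TableB", "TableA2TableB"
    WHERE "TableA2TableB".tableb_id = "TableB".id
      AND "TableA2TableB".tablea_id = ?
    ORDER BY "TableB".id
"""

# Equivalent query written with an explicit JOIN ... ON.
explicit_sql = """
    SELECT "TableB".id, "TableB".url
    FROM "TableB"
    JOIN "TableA2TableB" ON "TableA2TableB".tableb_id = "TableB".id
    WHERE "TableA2TableB".tablea_id = ?
    ORDER BY "TableB".id
"""

implicit_rows = conn.execute(implicit_sql, (1,)).fetchall()
explicit_rows = conn.execute(explicit_sql, (1,)).fetchall()
print(implicit_rows == explicit_rows)
```

Only the TableB rows linked to the given TableA id come back in both cases, so the extra table in FROM does not multiply the result.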