many-many model join with SQLAlchemy - python

I'm having the worst time trying to get a many-many join to work using models in SQLAlchemy. I've found lots of examples online, but I can't ever seem to figure out if their strings represent the column names they'd like versus what the database tables actually have, or they're using Table instead of a declarative model, or something else is different and their example just doesn't work. I currently have the following setup:
Database tables TAG_TEST, TAG, and TEST
TAG_TEST has TS_TEST_ID, TG_TAG_ID, and TG_TYPE (foreign keys)
TAG has TG_TAG_ID and TG_TYPE
TEST has TS_TEST_ID
I have the following models:
class Test(Base):
    from .tag import Tag
    from .tag_test import TagTest
    __tablename__ = u'TEST'
    id = Column(u'TS_TEST_ID', INTEGER(), primary_key=True, nullable=False)
    ...
    tags = relationship(Tag, secondary='TAG_TEST')
class Tag(Base):
    from .tag_test import TagTest
    __tablename__ = "TAG"
    id = Column(u'TG_TAG_ID', INTEGER(), primary_key=True, nullable=False)
    type = Column(u'TG_TYPE', VARCHAR(25))
    ...
    tests = relationship("Test", secondary='TAG_TEST')
class TagTest(Base):
    __tablename__ = u'TAG_TEST'
    tagID = Column(u'TG_TAG_ID', INTEGER(), ForeignKey("TAG.TG_TAG_ID"), primary_key=True, nullable=False)
    testID = Column(u'TS_TEST_ID', INTEGER(), ForeignKey("TEST.TS_TEST_ID"), primary_key=True, nullable=False)
    tagType = Column(u'TG_TYPE', VARCHAR(50), ForeignKey("TAG.TG_TYPE"), primary_key=True, nullable=False)
    ...
    tag = relationship("Tag", backref="testLinks")
    test = relationship("Test", backref="tagLinks")
Currently I'm getting the following error:
ArgumentError: Could not determine join condition between parent/child tables on relationship Tag.tests. Specify a 'primaryjoin' expression. If 'secondary' is present, 'secondaryjoin' is needed as well.
What am I missing/doing wrong?

The tricky part with mine was the composite foreign key to the Tag table. Here's my setup:
class TagTest(Base):
    __table_args__ = (ForeignKeyConstraint(['TG_TAG_ID', 'TG_TYPE'],
                                           ['TAG.TG_TAG_ID', 'TAG.TG_TYPE']), {})
    tagID = Column(u'TG_TAG_ID', INTEGER(), primary_key=True, nullable=False)
    testID = Column(u'TS_TEST_ID', INTEGER(), ForeignKey("TEST.TS_TEST_ID"), primary_key=True, nullable=False)
    tagType = Column(u'TG_TYPE', VARCHAR(50), primary_key=True, nullable=False)
    tag = relationship(Tag, backref="testLinks")
    test = relationship(Test, backref="tagLinks")
class Tag(Base):
    tests = relationship("Test", secondary="TAG_TEST")
Then to access the tests a tag has, I can do myTag.tests. To access the tags a test has, I can do myTest.tagLinks and then access .tag on each object in the .tagLinks property. Not as neat as I'd like, but it works.
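For anyone who wants to try the composite-key setup end to end, here is a minimal sketch against an in-memory SQLite database. It is not the exact code from above: it assumes SQLAlchemy 1.4 or newer, assumes TAG uses (TG_TAG_ID, TG_TYPE) as a composite primary key, uses snake_case attribute names of my own choosing, and marks the two secondary relationships viewonly=True so they do not compete with the TagTest association object when writing.

```python
from sqlalchemy import (Column, ForeignKey, ForeignKeyConstraint, Integer,
                        String, create_engine)
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Tag(Base):
    __tablename__ = "TAG"
    # Assumption: the pair (TG_TAG_ID, TG_TYPE) identifies a tag.
    id = Column("TG_TAG_ID", Integer, primary_key=True)
    type = Column("TG_TYPE", String(25), primary_key=True)
    # Read-only shortcut across the association table.
    tests = relationship("Test", secondary="TAG_TEST", viewonly=True)


class Test(Base):
    __tablename__ = "TEST"
    id = Column("TS_TEST_ID", Integer, primary_key=True)
    tags = relationship("Tag", secondary="TAG_TEST", viewonly=True)


class TagTest(Base):
    __tablename__ = "TAG_TEST"
    # One composite foreign key to TAG instead of two separate ones.
    __table_args__ = (
        ForeignKeyConstraint(["TG_TAG_ID", "TG_TYPE"],
                             ["TAG.TG_TAG_ID", "TAG.TG_TYPE"]),
    )
    tag_id = Column("TG_TAG_ID", Integer, primary_key=True)
    tag_type = Column("TG_TYPE", String(25), primary_key=True)
    test_id = Column("TS_TEST_ID", Integer,
                     ForeignKey("TEST.TS_TEST_ID"), primary_key=True)
    tag = relationship("Tag", backref="test_links")
    test = relationship("Test", backref="tag_links")


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    tag = Tag(id=1, type="smoke")
    test = Test(id=10)
    # Rows are written through the association object.
    session.add_all([tag, test, TagTest(tag=tag, test=test)])
    session.commit()

    # Both directions can then be read through the secondary table.
    tag_types = [t.type for t in session.get(Test, 10).tags]
    test_ids = [t.id for t in session.get(Tag, (1, "smoke")).tests]
```

With this layout both myTag.tests and myTest.tags are readable directly, while new links are still created through TagTest.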

Related

How to order query results for parent by related field for one-to-many relationship in SQLAlchemy

I have following DB models
class User(Base):
    __tablename__ = "user"
    user_id = Column("id", Integer(), primary_key=True)
    groups = relationship(
        "Group", back_populates="user", lazy="selectin", cascade="all, delete-orphan",
    )
class Group(Base):
    __tablename__ = "group"
    group_id = Column("id", Integer(), primary_key=True)
    user_id = Column(
        Integer,
        ForeignKey("user.id", ondelete="CASCADE"),
        index=True,
    )
    division_id = Column(
        String(96),
        ForeignKey("division.id", onupdate="CASCADE"),
        nullable=False,
    )
    name = Column(String(64), nullable=False)
    user = relationship("User", back_populates="groups", lazy="selectin")
    group = relationship("Division", back_populates="groups", lazy="selectin")
class Division(Base):
    __tablename__ = "division"
    division_id = Column("id", Integer, primary_key=True)
    name = Column(String(64), nullable=False)
    groups = relationship("Group", back_populates="group", lazy="selectin")
I want to fetch all the users ordered by their groups (the sort field can be something else as well; it needs to come from the end user), which at first glance I can achieve with the following query:
session.query(User).join(Group).join(Division).order_by(Group.name).all()
This looks like it works, but it doesn't: since a user might have multiple groups, to get a correct result I first need to sort the groups for each user object, i.e. something like sorted(User.groups.order_by(Group.name)), and then apply the order_by on the User model based on these sorted groups.
The same applies to division names. I know that we can provide a default order_by while defining the relationship, as below, but that's not what I want, since the order_by field needs to come from the end user and can be any other field as well.
groups = relationship("Group", back_populates="user", lazy="selectin", cascade="all, delete-orphan", order_by="Group.name")
I can do this in the data layer in Python, but that would not be ideal since some ordering is already being done at the DB layer.
So how can I achieve this at the DB layer using SQLAlchemy, or even with raw SQL? Or is it even possible with SQL?
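One possible reading of the requirement is "order users by the alphabetically first group each user belongs to". Under that assumption it can be done in plain SQL by aggregating per user and ordering on the aggregate. The sketch below uses Python's built-in sqlite3 module with a simplified schema (only the columns needed here) and made-up sample rows; it is an illustration of the idea, not a drop-in for the models above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (id INTEGER PRIMARY KEY);
CREATE TABLE "group" (
    id INTEGER PRIMARY KEY,
    user_id INTEGER REFERENCES user(id),
    name TEXT NOT NULL
);
-- Hypothetical sample data: user 1 has two groups.
INSERT INTO user (id) VALUES (1), (2), (3);
INSERT INTO "group" (user_id, name) VALUES
    (1, 'zeta'), (1, 'beta'), (2, 'alpha'), (3, 'gamma');
""")

# One row per user; MIN(g.name) is the user's alphabetically first group,
# and the users are ordered by that value.
rows = conn.execute("""
SELECT u.id, MIN(g.name) AS first_group
FROM user AS u
LEFT JOIN "group" AS g ON g.user_id = u.id
GROUP BY u.id
ORDER BY first_group, u.id
""").fetchall()

user_ids_in_order = [row[0] for row in rows]
```

Because of the GROUP BY, each user appears once even when they have several groups, which is what the plain join plus order_by(Group.name) does not guarantee. Swapping MIN for MAX, or name for another column chosen by the end user, follows the same pattern.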

SAWarning: Multiple rows returned with uselist=False for lazily-loaded attribute 'Similarity.tag1'

I get the following warning from SQLAlchemy and I wonder what the problem is:
\venv\lib\site-packages\sqlalchemy\orm\strategies.py:911: SAWarning: Multiple rows returned with uselist=False for lazily-loaded attribute 'Similarity.tag1'
util.warn(
\venv\lib\site-packages\sqlalchemy\orm\strategies.py:911: SAWarning: Multiple rows returned with uselist=False for lazily-loaded attribute 'Similarity.tag'
util.warn(
My ORM classes look like this:
class Similarity(db.Model):
    __tablename__ = 'similarities'
    tag_id_1 = db.Column(db.ForeignKey('tags.id'), primary_key=True, nullable=False)
    tag_id_2 = db.Column(db.ForeignKey('tags.id'), primary_key=True, nullable=False, index=True)
    value = db.Column(DOUBLE, nullable=False)
    tag = db.relationship('Tag', primaryjoin='Similarity.tag_id_1 == Tag.id', backref='tag_similarities')
    tag1 = db.relationship('Tag', primaryjoin='Similarity.tag_id_2 == Tag.id', backref='tag_similarities_0')
class Tag(db.Model):
    __tablename__ = 'tags'
    id = db.Column(db.Integer, primary_key=True, nullable=False, autoincrement=True)
    language_id = db.Column(db.ForeignKey('languages.id'), primary_key=True, nullable=False, index=True)
    name = db.Column(db.String(45), nullable=False, index=True)
    language = db.relationship('Language', primaryjoin='Tag.language_id == Language.id', backref='tags')
I use SQLAlchemy 1.3.22 and Python 3.8. Can you please explain what the message is about and how I can fix this?
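As @zvone said, the relationships tag and tag1 can each match multiple Tag rows, because Tag has a composite primary key (id and language_id) while the relationships join on Tag.id alone. That's what the warning message is about: the relationships are scalar (uselist=False), but their join condition does not identify a unique row.
Therefore, the solution is either to remove these relationships or to make them unique.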
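As a rough illustration of "making them unique", the sketch below assumes that tags.id on its own can serve as the primary key, so each foreign key in similarities points at exactly one row. It uses plain SQLAlchemy (1.4 or newer) rather than Flask-SQLAlchemy, replaces DOUBLE with the generic Float, and keeps language_id as an ordinary column with no Language model; all of that is simplification on my part.

```python
from sqlalchemy import Column, Float, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Tag(Base):
    __tablename__ = "tags"
    # Assumption: id alone is unique, so it is the only primary key column.
    id = Column(Integer, primary_key=True)
    language_id = Column(Integer, nullable=False, index=True)
    name = Column(String(45), nullable=False, index=True)


class Similarity(Base):
    __tablename__ = "similarities"
    tag_id_1 = Column(ForeignKey("tags.id"), primary_key=True)
    tag_id_2 = Column(ForeignKey("tags.id"), primary_key=True)
    value = Column(Float, nullable=False)
    # foreign_keys tells SQLAlchemy which of the two FKs each relationship uses.
    tag = relationship("Tag", foreign_keys=[tag_id_1],
                       backref="tag_similarities")
    tag1 = relationship("Tag", foreign_keys=[tag_id_2],
                        backref="tag_similarities_0")


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    python = Tag(id=1, language_id=1, name="python")
    sql = Tag(id=2, language_id=1, name="sql")
    session.add_all([python, sql,
                     Similarity(tag=python, tag1=sql, value=0.5)])
    session.commit()

    sim = session.query(Similarity).one()
    # Each scalar relationship now resolves to exactly one Tag row.
    pair = (sim.tag.name, sim.tag1.name)
```

If the composite key on tags has to stay, the alternative would be for similarities to carry the language columns as well and reference tags through a composite foreign key.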

SQLAlchemy AssertionError: Dependency rule tried to blank-out primary key column

Because of the following line:
contact_person.selections_to_persons[selection_id].post_address = address
I get the following error at the next commit:
AssertionError: Dependency rule tried to blank-out primary key column 'selections_to_persons.t_person_to_department_id' on instance ''
The important parts of the involved models are:
class SelectionToPerson(OrmModelBase, TableModelBase):
    __tablename__ = 'selections_to_persons'
    __table_args__ = (
        ForeignKeyConstraint(
            ["address_tablename",
             "address_id",
             "t_person_to_department_id"],
            ["address_collection.tablename",
             "address_collection.id",
             "address_collection.t_person_to_department_id"],
            name="fk_post_address_selection_to_person", use_alter=True
        ),
    )
    selection_id = Column(Integer,
                          ForeignKey('selections.selection_id',
                                     onupdate=NO_ACTION,
                                     ondelete=CASCADE),
                          primary_key=True, nullable=False)
    t_person_to_department_id = Column(
        Integer,
        ForeignKey('t_persons_to_departments.t_person_to_department_id',
                   onupdate=NO_ACTION,
                   ondelete=CASCADE),
        primary_key=True,
        nullable=False)
    address_tablename = Column(String, nullable=False)
    address_id = Column(Integer, nullable=False)
    post_address = relationship(AddressCollection)
class AddressCollection(OrmModelBase, ViewModelBase):
    __tablename__ = 'address_collection'
    tablename = Column(String, primary_key=True)
    id = Column(Integer, primary_key=True)
    t_person_to_department_id = Column(
        Integer,
        ForeignKey('t_persons_to_departments.t_person_to_department_id'),
        primary_key=True)
Does anyone know why this error occurs?
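One of the cases in which this error occurs is an attempt to assign null to a column that is part of a primary key.
You have several primary key columns that are also foreign keys. In particular, selections_to_persons.t_person_to_department_id is part of the primary key and also part of the composite foreign key behind post_address.
I don't know for sure, but it is possible that the expression contact_person.selections_to_persons[selection_id].post_address = address left an object with a null reference. That is, after the assignment, some object remains whose foreign key columns (including that primary key column) would have to be set to null.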
I am leaving a few links that describe how to use cascades in different cases. This might help those who get this error.
This is how cascades work:
https://docs.sqlalchemy.org/en/13/orm/cascades.html#unitofwork-cascades
Here's how you can configure cascades using the example of deleting: https://docs.sqlalchemy.org/en/13/orm/tutorial.html#configuring-delete-delete-orphan-cascade

How to relate APScheduler JobStore with SQLAlchemy Models (foreign key) - Python Flask

I've got a Python Flask app using flask.ext.sqlalchemy and apscheduler.schedulers.background. I've created a JobStore, which gave me a table called apscheduler_jobs with the following fields:
|id  |next_run_time|job_state|
------------------------------
|TEXT|REAL         |TEXT     |
I want to relate an SQLAlchemy Model object to that table using something like this:
from apscheduler.schedulers.background import BackgroundScheduler
scheduler = BackgroundScheduler()
scheduler.add_jobstore('sqlalchemy', url=app.config['SQLALCHEMY_DATABASE_URI'])
class Event(db.Model):
    __tablename__ = "event"
    id = db.Column(db.Integer, primary_key=True, autoincrement=True)
    name = db.Column(db.String(255), nullable=False)
    jobs = db.relationship('scheduler', backref='apscheduler_jobs')
So I want to use the apscheduler_jobs table from APScheduler and associate it with my Event object through a foreign key. That last line breaks, because "scheduler" isn't a defined SQLAlchemy model:
sqlalchemy.exc.InvalidRequestError: When initializing mapper Mapper|Event|event, expression 'scheduler' failed to locate a name ("name 'scheduler' is not defined"). If this is a class name, consider adding this relationship() to the <class 'project.models.Event'> class after both dependent classes have been defined.
So I think I need an in-between Model class called "job" or something, and then relate that to apscheduler_jobs. But something here still feels bad: because APScheduler creates this table itself, I've got no control over what's going on there. Should I be concerned about that?
EDIT1:
So I created 2 models, one "Event" and one "Job"; the "Job" then relates to the table apscheduler_jobs:
class Job(db.Model):
    __tablename__ = "job"
    id = db.Column(db.Integer, primary_key=True, autoincrement=True)
    name = db.Column(db.String(255), nullable=False)
    apscheduler_job_id = db.Column(db.Integer, db.ForeignKey('apscheduler_jobs.id'))
    event_id = db.Column(db.Integer, db.ForeignKey('event.id'))
The problem there is that when I dropped the DB and recreated it, it threw this error:
sqlalchemy.exc.NoReferencedTableError: Foreign key associated with column 'job.apscheduler_job_id' could not find table 'apscheduler_jobs' with which to generate a foreign key to target column 'id'
Now I could get around that in my database creation script, but again it still feels like I'm doing this the wrong way.
EDIT2:
I managed to get it to work, though this feels pretty wrong. I've now got 3 models: Event, Job, and APSchedulerJobsTable. The final model basically mirrors what APScheduler's apscheduler_jobs table looks like. There must be a better way to do this though.
from project import db
class Event(db.Model):
    __tablename__ = "event"
    id = db.Column(db.Integer, primary_key=True, autoincrement=True)
    name = db.Column(db.String(255), nullable=False)
    jobs = db.relationship('Job', backref='job_event')
class Job(db.Model):
    __tablename__ = "job"
    id = db.Column(db.Integer, primary_key=True, autoincrement=True)
    apscheduler_job_id = db.Column(db.TEXT, db.ForeignKey('apscheduler_jobs.id'))
    event_id = db.Column(db.Integer, db.ForeignKey('event.id'))
class APSchedulerJobsTable(db.Model):
    # TODO: This feels bad man
    __tablename__ = "apscheduler_jobs"
    id = db.Column(db.TEXT, primary_key=True, autoincrement=True)
    next_run_time = db.Column(db.REAL)
    job_state = db.Column(db.TEXT)
OK, two solutions, neither really perfect IMO:
Solution One, probably cleaner: simply have a Text field in the job table that holds the apscheduler job ID. This is not a foreign key, but once the apscheduler job ID is known it can be stored in the job table for later reference.
class Event(db.Model):
    __tablename__ = "event"
    id = db.Column(db.Integer, primary_key=True, autoincrement=True)
    name = db.Column(db.String(255), nullable=False)
    jobs = db.relationship('Job', backref='job_event')
class Job(db.Model):
    __tablename__ = "job"
    id = db.Column(db.Integer, primary_key=True, autoincrement=True)
    event_id = db.Column(db.Integer, db.ForeignKey('event.id'))
    apscheduler_job_id = db.Column(db.TEXT)
The catch with this one is that, to drop the full DB, you need to reflect first so that the unmanaged table apscheduler_jobs is dropped too:
db.reflect()
db.drop_all()
Solution Two, add the apscheduler table to the model itself, and then set up the foreign key:
class Event(db.Model):
    __tablename__ = "event"
    id = db.Column(db.Integer, primary_key=True, autoincrement=True)
    name = db.Column(db.String(255), nullable=False)
    jobs = db.relationship('Job', backref='job_event')
class Job(db.Model):
    __tablename__ = "job"
    id = db.Column(db.Integer, primary_key=True, autoincrement=True)
    event_id = db.Column(db.Integer, db.ForeignKey('event.id'))
    apscheduler_job_id = db.Column(db.TEXT, db.ForeignKey('apscheduler_jobs.id'))
class APSchedulerJobsTable(db.Model):
    # TODO: This feels bad man
    __tablename__ = "apscheduler_jobs"
    id = db.Column(db.TEXT, primary_key=True, autoincrement=True)
    next_run_time = db.Column(db.REAL)
    job_state = db.Column(db.TEXT)
    # The backref needs its own name: 'job_event' is already created on Job by Event.jobs
    job = db.relationship('Job', backref='apscheduler_job')
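A third possibility, sketched here with plain SQLAlchemy Core rather than Flask-SQLAlchemy, is to reflect the table that APScheduler created instead of re-declaring its columns by hand. The sketch assumes SQLAlchemy 1.4 or newer (for autoload_with) and fakes the APScheduler table with a raw CREATE TABLE whose columns mirror the ones listed in the question, so it runs without APScheduler installed.

```python
from sqlalchemy import (Column, ForeignKey, Integer, MetaData, Table, Text,
                        create_engine, inspect, text)

engine = create_engine("sqlite://")

# Stand-in for the table APScheduler would normally create on its own.
with engine.begin() as conn:
    conn.execute(text(
        "CREATE TABLE apscheduler_jobs ("
        "id TEXT PRIMARY KEY, next_run_time REAL, job_state TEXT)"
    ))

metadata = MetaData()

# Reflection loads the column definitions from the database, so the
# foreign key below has a known target table in this MetaData.
apscheduler_jobs = Table("apscheduler_jobs", metadata, autoload_with=engine)

job = Table(
    "job", metadata,
    Column("id", Integer, primary_key=True),
    Column("apscheduler_job_id", Text, ForeignKey("apscheduler_jobs.id")),
)

# Creates only the missing "job" table; the reflected one already exists.
metadata.create_all(engine)

reflected_columns = [c.name for c in apscheduler_jobs.columns]
table_names = sorted(inspect(engine).get_table_names())
```

The reflected table has to exist before the metadata is built, so in a real app the scheduler's job store would need to be set up first. That ordering constraint is the trade-off compared with Solution Two.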

sqlalchemy: referencing label()'d column in a filter or clauselement

I'm trying to perform a query that works across a many-to-many relationship between bmarks and tags, with a secondary table of bmarks_tags. The query involves several subqueries, and I need to apply DISTINCT to a column. I later want to join that to another table via the DISTINCT'd ids.
I've tried it a few ways and this seems closest:
tagid = alias(Tag.tid.distinct())
test = select([bmarks_tags.c.bmark_id],
from_obj=[bmarks_tags.join(DBSession.query(tagid.label('tagid'))),
bmarks_tags.c.tag_id == tagid])
return DBSession.execute(qry)
But I get an error:
AttributeError: '_UnaryExpression' object has no attribute 'named_with_column'
Does anyone know how I can perform the join across the bmarks_tags.tag_id and the result of the Tag.tid.distinct()?
Thanks
Schema:
# this is the secondary table that ties bmarks to tags
bmarks_tags = Table('bmark_tags', Base.metadata,
    Column('bmark_id', Integer, ForeignKey('bmarks.bid'), primary_key=True),
    Column('tag_id', Integer, ForeignKey('tags.tid'), primary_key=True)
)
class Tag(Base):
    """Bookmarks can have many many tags"""
    __tablename__ = "tags"
    tid = Column(Integer, autoincrement=True, primary_key=True)
    name = Column(Unicode(255), unique=True)
Something like this should work:
t = DBSession.query(Tag.tid.distinct().label('tid')).subquery('t')
test = select([bmarks_tags.c.bmark_id], bmarks_tags.c.tag_id == t.c.tid)
return DBSession.execute(test)
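The snippet above uses the older select([...], whereclause) calling style. For reference, here is a rough equivalent in the SQLAlchemy 1.4/2.0 style, written as a self-contained sketch with Core tables and an in-memory SQLite database. The sample rows are made up, and tags is declared as a plain Table here rather than the Tag class.

```python
from sqlalchemy import (Column, ForeignKey, Integer, MetaData, String, Table,
                        create_engine, select)

metadata = MetaData()
bmarks = Table("bmarks", metadata, Column("bid", Integer, primary_key=True))
tags = Table(
    "tags", metadata,
    Column("tid", Integer, primary_key=True),
    Column("name", String(255), unique=True),
)
bmarks_tags = Table(
    "bmark_tags", metadata,
    Column("bmark_id", Integer, ForeignKey("bmarks.bid"), primary_key=True),
    Column("tag_id", Integer, ForeignKey("tags.tid"), primary_key=True),
)

engine = create_engine("sqlite://")
metadata.create_all(engine)

# Hypothetical sample data: bookmark 2 carries both tags.
with engine.begin() as conn:
    conn.execute(bmarks.insert(), [{"bid": 1}, {"bid": 2}])
    conn.execute(tags.insert(), [{"tid": 1, "name": "python"},
                                 {"tid": 2, "name": "sql"}])
    conn.execute(bmarks_tags.insert(), [{"bmark_id": 1, "tag_id": 1},
                                        {"bmark_id": 2, "tag_id": 2},
                                        {"bmark_id": 2, "tag_id": 1}])

# The DISTINCT ids become a named subquery that can be joined like a table.
distinct_tags = select(tags.c.tid).distinct().subquery("t")
stmt = (
    select(bmarks_tags.c.bmark_id)
    .join(distinct_tags, bmarks_tags.c.tag_id == distinct_tags.c.tid)
    .order_by(bmarks_tags.c.bmark_id)
)

with engine.connect() as conn:
    bmark_ids = [row[0] for row in conn.execute(stmt)]
```

Wrapping the DISTINCT select in .subquery() is what gives it a .c collection to join on, which is the piece the alias(Tag.tid.distinct()) attempt was missing.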
It is hard to tell what you are trying to accomplish, but since you are using the ORM anyway (and there is not much reason to go with bare selects in SQLAlchemy these days), you should probably start by establishing a many-to-many relationship:
bmarks_tags = Table('bmark_tags', Base.metadata,
    Column('bmark_id', Integer, ForeignKey('bmarks.bid'), primary_key=True),
    Column('tag_id', Integer, ForeignKey('tags.tid'), primary_key=True)
)
class Tag(Base):
    """Bookmarks can have many many tags"""
    __tablename__ = "tags"
    tid = Column(Integer, primary_key=True)
    name = Column(Unicode(255), unique=True)
class BMark(Base):
    __tablename__ = 'bmarks'
    bid = Column(Integer, primary_key=True)
    # relationship() is the current name for the older relation() alias
    tags = relationship(Tag, secondary=bmarks_tags, backref="bmarks")
Then get your query and go from there:
query = DBSession.query(BMark).join(BMark.tags)
If not, give us the actual sql you are trying to make sqlalchemy emit.
