SQLAlchemy Many to Many Understanding - python

I just cannot wrap my head around many-to-many relationships in (Flask-)SQLAlchemy, or how backrefs apply to my problem.
Here's what I want to achieve:
n Users each have n (predefined) Assignments to do.
Each User can submit their work (a Submission, belonging to one of 8 Assignments) multiple times.
Quick example: a dummy User has 2 Assignments (e.g. program a for loop). He/she submitted 2 code snippets (each graded individually) for the first assignment and none yet for the second.
Here is what I have so far in terms of class definitions in SQLAlchemy:
from flask_sqlalchemy import SQLAlchemy

db = SQLAlchemy()

class User(db.Model):
    __tablename__ = "users"
    id = db.Column(db.Integer, primary_key=True)
    username = db.Column(db.String(80), unique=True, nullable=False)
    assignments = db.relationship(
        "Assignment", secondary="submissions", backref=db.backref("users", lazy=True)
    )

class Assignment(db.Model):
    __tablename__ = "assignments"
    id = db.Column(db.Integer, primary_key=True)
    title = db.Column(db.String(40), nullable=False)

class Submission(db.Model):
    __tablename__ = "submissions"
    id = db.Column(db.Integer, primary_key=True)
    user_id = db.Column(db.Integer, db.ForeignKey("users.id"))
    assignment_id = db.Column(db.Integer, db.ForeignKey("assignments.id"))
    user = db.relationship(User, backref=db.backref("submissions"))
    assignment = db.relationship(
        Assignment,
        backref=db.backref("submissions"),
    )
What I get are these warnings, so I think I am missing or misunderstanding something here:
relationship 'Assignment.submissions' will copy column assignments.id to column submissions.assignment_id, which conflicts with relationship(s): 'Assignment.users' (copies assignments.id to submissions.assignment_id), 'User.assignments' (copies assignments.id to submissions.assignment_id).
relationship 'Submission.user' will copy column users.id to column submissions.user_id, which conflicts with relationship(s): 'Assignment.users' (copies users.id to submissions.user_id), 'User.assignments' (copies users.id to submissions.user_id) etc.
Thanks in Advance for your help!

Please read the warning section under the Association Object documentation, which describes your case: you are building separate relationships (one directly to/from the association table and one via secondary).
Based on your model, I assume the many-to-many relationship will be read-only, since it does not let you access attributes on "Submission" anyway, so I would mark it as such:
class User(db.Model):
    __tablename__ = "users"
    ...
    assignments = db.relationship(
        "Assignment",
        secondary="submissions",
        backref=db.backref("users", lazy=True, viewonly=True),
        viewonly=True,
    )
Also Association Proxy might be useful.
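To make the suggestion concrete, here is a minimal sketch of the association-object layout with a read-only many-to-many shortcut and an association proxy. It assumes plain SQLAlchemy 1.4 or newer (not Flask-SQLAlchemy) so that it runs standalone; the `grade` column and the `submitted_assignments` attribute name are hypothetical and not part of the original model.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.ext.associationproxy import association_proxy
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    username = Column(String(80), unique=True, nullable=False)
    # Writable path: User -> Submission (the association object).
    submissions = relationship("Submission", back_populates="user")
    # Read-only shortcut across the association table.
    assignments = relationship("Assignment", secondary="submissions", viewonly=True)
    # Proxy to the Assignment behind each Submission (hypothetical name).
    submitted_assignments = association_proxy("submissions", "assignment")


class Assignment(Base):
    __tablename__ = "assignments"
    id = Column(Integer, primary_key=True)
    title = Column(String(40), nullable=False)
    submissions = relationship("Submission", back_populates="assignment")


class Submission(Base):
    __tablename__ = "submissions"
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey("users.id"), nullable=False)
    assignment_id = Column(Integer, ForeignKey("assignments.id"), nullable=False)
    grade = Column(Integer)  # hypothetical per-submission attribute
    user = relationship("User", back_populates="submissions")
    assignment = relationship("Assignment", back_populates="submissions")


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    dummy = User(username="dummy")
    loops = Assignment(title="Program a for loop")
    session.add_all([
        dummy,
        Assignment(title="Second assignment"),  # no submissions yet
        Submission(user=dummy, assignment=loops, grade=2),
        Submission(user=dummy, assignment=loops, grade=5),
    ])
    session.commit()

    submission_count = len(dummy.submissions)
    assignment_titles = [a.title for a in dummy.assignments]
    proxied_titles = [a.title for a in dummy.submitted_assignments]
```

New submissions are always created through `Submission`, so the grade travels with each row, while `assignments` stays a convenience view that SQLAlchemy never tries to write to.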

Related

SqlAlchemy AmbiguousForeign Keys Error - foreign key attribute is specified [duplicate]

I am trying to set up a PostgreSQL table that has two foreign keys pointing to the same primary key in another table.
When I run the script I get the error
sqlalchemy.exc.AmbiguousForeignKeysError: Could not determine join condition between parent/child tables on relationship Company.stakeholder - there are multiple foreign key paths linking the tables. Specify the 'foreign_keys' argument, providing a list of those columns which should be counted as containing a foreign key reference to the parent table.
That is the exact error described in the SQLAlchemy documentation, yet when I replicate the solution offered there, the error doesn't go away. What could I be doing wrong?
# The business case here is that a company can be a stakeholder in another company.
class Company(Base):
    __tablename__ = 'company'
    id = Column(Integer, primary_key=True)
    name = Column(String(50), nullable=False)

class Stakeholder(Base):
    __tablename__ = 'stakeholder'
    id = Column(Integer, primary_key=True)
    company_id = Column(Integer, ForeignKey('company.id'), nullable=False)
    stakeholder_id = Column(Integer, ForeignKey('company.id'), nullable=False)
    company = relationship("Company", foreign_keys='company_id')
    stakeholder = relationship("Company", foreign_keys='stakeholder_id')
I have seen similar questions here, but some of the answers recommend using a primaryjoin, while the documentation states that you don't need primaryjoin in this situation.
Try removing the quotes from foreign_keys and making it a list. From the official documentation on Relationship Configuration: Handling Multiple Join Paths:
Changed in version 0.8: relationship() can resolve ambiguity between
foreign key targets on the basis of the foreign_keys argument alone;
the primaryjoin argument is no longer needed in this situation.
Self-contained code below works with sqlalchemy>=0.9:
from sqlalchemy import create_engine, Column, Integer, String, ForeignKey
from sqlalchemy.orm import relationship, scoped_session, sessionmaker
from sqlalchemy.ext.declarative import declarative_base

engine = create_engine(u'sqlite:///:memory:', echo=True)
session = scoped_session(sessionmaker(bind=engine))
Base = declarative_base()

# The business case here is that a company can be a stakeholder in another company.
class Company(Base):
    __tablename__ = 'company'
    id = Column(Integer, primary_key=True)
    name = Column(String(50), nullable=False)

class Stakeholder(Base):
    __tablename__ = 'stakeholder'
    id = Column(Integer, primary_key=True)
    company_id = Column(Integer, ForeignKey('company.id'), nullable=False)
    stakeholder_id = Column(Integer, ForeignKey('company.id'), nullable=False)
    company = relationship("Company", foreign_keys=[company_id])
    stakeholder = relationship("Company", foreign_keys=[stakeholder_id])

Base.metadata.create_all(engine)

# simple query test
q1 = session.query(Company).all()
q2 = session.query(Stakeholder).all()
The latest documentation:
http://docs.sqlalchemy.org/en/latest/orm/join_conditions.html#handling-multiple-join-paths
The form of foreign_keys= in the documentation produced a NameError for me; I am not sure how it is expected to work when the class hasn't been created yet. With some hacking I was able to succeed with this:
company_id = Column(Integer, ForeignKey('company.id'), nullable=False)
company = relationship("Company", foreign_keys='Stakeholder.company_id')
stakeholder_id = Column(Integer, ForeignKey('company.id'), nullable=False)
stakeholder = relationship("Company",
foreign_keys='Stakeholder.stakeholder_id')
In other words:
… foreign_keys='CurrentClass.thing_id')
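As a runnable illustration of the class-qualified string form, here is a small sketch assuming SQLAlchemy 1.4 or newer and an in-memory SQLite database; the company names are made up for the demonstration. The strings are evaluated later, when the mappers are configured, which is why they have to be qualified with the class name rather than refer to a bare column name.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Company(Base):
    __tablename__ = "company"
    id = Column(Integer, primary_key=True)
    name = Column(String(50), nullable=False)


class Stakeholder(Base):
    __tablename__ = "stakeholder"
    id = Column(Integer, primary_key=True)
    company_id = Column(Integer, ForeignKey("company.id"), nullable=False)
    stakeholder_id = Column(Integer, ForeignKey("company.id"), nullable=False)
    # Each relationship names the one foreign key column it should follow.
    company = relationship("Company", foreign_keys="Stakeholder.company_id")
    stakeholder = relationship("Company", foreign_keys="Stakeholder.stakeholder_id")


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    acme, holding = Company(name="Acme"), Company(name="Holding")
    session.add(Stakeholder(company=acme, stakeholder=holding))
    session.commit()

    link = session.query(Stakeholder).one()
    names = (link.company.name, link.stakeholder.name)
```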

Two-way foreign keys on different attributes, one-to-one and one-to-many

I have two models: User and ReferralKey, with these requirements:
On creation of User, a ReferralKey is automatically created and added to the DB
ReferralKey keeps track of which users were referred by it
ReferralKey keeps track of which user owns it
As per the answer to this question, the best solution seems to be to create the ReferralKey within the constructor of User. The solutions to the other two require foreign keys and seem really messy, entangling the tables in such a way that I might as well put them in the same table.
The solution to the first looks like this:
class User(model):
    id = Column(BigInteger(), autoincrement=True, primary_key=True)
    referral_key = relationship('ReferralKey', uselist=False)
    ...

    def __init__(self):
        self.referral_key = ReferralKey()

class ReferralKey(model):
    id = Column(BigInteger(), autoincrement=True, primary_key=True)
    user_id = Column(BigInteger(), ForeignKey('user.id', ondelete='SET NULL'), nullable=True)
This works as intended, and solves the first and third points. The problem arises when trying to solve the 2nd. This (for some reason) necessitates a new foreign key in User, which necessitates the declaration of a relationship in both User and ReferralKey to (I guess) disambiguate the foreign keys:
class User(model):
    id = Column(BigInteger(), autoincrement=True, primary_key=True)
    referral_key = relationship('ReferralKey', uselist=False)
    referrer_id = Column(BigInteger(), ForeignKey('referral_key.id', ondelete='SET NULL'))
    referrer = relationship('ReferralKey', foreign_keys=['referrer_id'], backref='used_by')
    ...

    def __init__(self):
        self.referral_key = ReferralKey()

class ReferralKey(model):
    __tablename__ = 'referral_key'
    id = Column(BigInteger(), autoincrement=True, primary_key=True)
    user_id = Column(BigInteger(), ForeignKey('user.id', ondelete='SET NULL'), nullable=True)
    user = relationship('User', foreign_keys=['user_id'])
I've tried all different permutations of relationship and ForeignKey, and always get the same error:
sqlalchemy.exc.CircularDependencyError: Can't sort tables for DROP; an unresolvable foreign key dependency exists between tables: referral_key, users. Please ensure that the ForeignKey and ForeignKeyConstraint objects involved in the cycle have names so that they can be dropped using DROP CONSTRAINT.
Ultimately, my problem is that I just don't understand what I'm doing. Why do I need to change the User table at all in order to keep track of things on the ReferralKey table? What purpose does the relationship declaration serve—why is it ambiguous without this declaration? If User has a foreign key referencing ReferralKey and ReferralKey has a foreign key referencing User—and either of these should be set to NULL in case of deletion, why does SQL need more information than that?
Why can't I just have:
class User(model):
    id = Column(BigInteger(), autoincrement=True, primary_key=True)

    def __init__(self):
        ReferralKey(user_id=self.id)

class ReferralKey(model):
    __tablename__ = 'referral_key'
    id = Column(BigInteger(), autoincrement=True, primary_key=True)
    user_id = Column(BigInteger(), ForeignKey('user.id', ondelete='SET NULL'), nullable=True)
    used_by = [list of user IDs]

    def __init__(self, user_id):
        if user_id:
            self.user_id = user_id
This feels to me so much cleaner and more intuitive. If I want to add (or remove!) referral keys, I hardly have to worry about adding things to User because it's mostly independent of the functioning of the referral keys. Why do I need to add a column in the user table to keep track of something that I want the ReferralKey to keep track of?
I'm totally ignorant of this, basically. Would anyone mind helping me out?
Try the following.
It relies on the information:
A user creates his own master key on registration.
A user may register a slave key when he registers, but maybe not.
This way the foreign keys remain in one table, but you need to differentiate between them.
from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class User(Base):
    __tablename__ = "user"
    id = Column(Integer, primary_key=True)
    own_referral_id = Column(Integer, ForeignKey('referral_key.id'))
    own_referral_key = relationship('ReferralKey', foreign_keys=[own_referral_id], back_populates='owner')
    signup_referral_id = Column(Integer, ForeignKey('referral_key.id'))
    signup_referral_key = relationship('ReferralKey', foreign_keys=[signup_referral_id], back_populates='signups')

    def __init__(self, **kwargs):
        self.own_referral_key = ReferralKey()
        super().__init__(**kwargs)

class ReferralKey(Base):
    __tablename__ = "referral_key"
    id = Column(Integer, primary_key=True)
    owner = relationship('User', foreign_keys=[User.own_referral_id], back_populates='own_referral_key', uselist=False)
    signups = relationship('User', foreign_keys=[User.signup_referral_id], back_populates='signup_referral_key', uselist=True)

e = create_engine("sqlite://")
Base.metadata.create_all(e)
s = Session(e)

u1 = User()
s.add(u1)
s.commit()

u2 = User(signup_referral_id=u1.own_referral_id)
u3 = User(signup_referral_id=u1.own_referral_id)
s.add(u2)
s.add(u3)
s.commit()

print(u1.own_referral_key.signups)

SQLAlchemy Handling Multiple Paths In One Relationship

Please note: this question is related but separate from my other currently open question SQLAlchemy secondary join relationship on multiple foreign keys.
The SQLAlchemy documentation describes handling multiple join paths in a single class for multiple relationships:
from sqlalchemy import Integer, ForeignKey, String, Column
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

class Customer(Base):
    __tablename__ = 'customer'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    billing_address_id = Column(Integer, ForeignKey("address.id"))
    shipping_address_id = Column(Integer, ForeignKey("address.id"))
    billing_address = relationship("Address")
    shipping_address = relationship("Address")

class Address(Base):
    __tablename__ = 'address'
    id = Column(Integer, primary_key=True)
    street = Column(String)
    city = Column(String)
    state = Column(String)
    zip = Column(String)
Within the same section the documentation shows three separate ways to define the relationship:
(1) billing_address = relationship("Address", foreign_keys=[billing_address_id])
(2) billing_address = relationship("Address", foreign_keys="[Customer.billing_address_id]")
(3) billing_address = relationship("Address", foreign_keys="Customer.billing_address_id")
As you can see in (1) and (2) SQLAlchemy allows you to define a list of foreign_keys. In fact, the documentation explicitly states:
In this specific example, the list is not necessary in any case as there’s only one Column we need: billing_address = relationship("Address", foreign_keys="Customer.billing_address_id")
But I cannot determine how to use the list notation to specify multiple foreign keys in a single relationship.
For the classes
class PostVersion(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    ...
    tag_1_id = db.Column(db.Integer, db.ForeignKey("tag.id"))
    tag_2_id = db.Column(db.Integer, db.ForeignKey("tag.id"))
    tag_3_id = db.Column(db.Integer, db.ForeignKey("tag.id"))
    tag_4_id = db.Column(db.Integer, db.ForeignKey("tag.id"))
    tag_5_id = db.Column(db.Integer, db.ForeignKey("tag.id"))

class Tag(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    tag = db.Column(db.String(127))
I have tried all of the following:
tags = db.relationship("Tag", foreign_keys=[tag_1_id, tag_2_id, tag_3_id, tag_4_id, tag_5_id])
resulting in:
sqlalchemy.exc.AmbiguousForeignKeysError: Could not determine join condition between parent/child tables on relationship AnnotationVersion.tags - there are multiple foreign key paths linking the tables. Specify the 'foreign_keys' argument, providing a list of those columns which should be counted as containing a foreign key reference to the parent table.
tags = db.relationship("Tag", foreign_keys="[tag_1_id, tag_2_id, tag_3_id, tag_4_id, tag_5_id]")
resulting in:
sqlalchemy.exc.InvalidRequestError: When initializing mapper Mapper|AnnotationVersion|annotation_version, expression '[tag_1_id, tag_2_id, tag_3_id, tag_4_id, tag_5_id]' failed to locate a name ("name 'tag_1_id' is not defined"). If this is a class name, consider adding this relationship() to the class after both dependent classes have been defined.
And many other variations on the list style, using quotes inside and outside, and using table names and class names.
I've actually solved the problem in the course of this question. Since there seems to be no direct documentation, I'll answer it myself instead of deleting this question.
The key is to define the relationship on a primary join and specify the uselist parameter.
tags = db.relationship(
    "Tag",
    primaryjoin="or_(PostVersion.tag_1_id==Tag.id,"
                "PostVersion.tag_2_id==Tag.id, PostVersion.tag_3_id==Tag.id,"
                "PostVersion.tag_4_id==Tag.id, PostVersion.tag_5_id==Tag.id)",
    uselist=True,
)
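A self-contained sketch of this primaryjoin approach follows, assuming plain SQLAlchemy 1.4 or newer and only three tag columns for brevity. Marking the relationship `viewonly=True` is my own addition, not part of the answer above: since the collection is assembled from several columns, it seems safer to treat it as read-only and write to the individual `tag_N_id` columns instead.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Tag(Base):
    __tablename__ = "tag"
    id = Column(Integer, primary_key=True)
    tag = Column(String(127))


class PostVersion(Base):
    __tablename__ = "post_version"
    id = Column(Integer, primary_key=True)
    tag_1_id = Column(Integer, ForeignKey("tag.id"))
    tag_2_id = Column(Integer, ForeignKey("tag.id"))
    tag_3_id = Column(Integer, ForeignKey("tag.id"))
    # One collection that matches a Tag through any of the tag columns.
    tags = relationship(
        "Tag",
        primaryjoin="or_(PostVersion.tag_1_id == Tag.id, "
        "PostVersion.tag_2_id == Tag.id, "
        "PostVersion.tag_3_id == Tag.id)",
        uselist=True,
        viewonly=True,  # assumption: read-only is sufficient here
    )


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    tags = [Tag(tag="python"), Tag(tag="orm"), Tag(tag="flask")]
    session.add_all(tags)
    session.flush()  # assigns primary keys to the tags

    post = PostVersion(
        tag_1_id=tags[0].id, tag_2_id=tags[1].id, tag_3_id=tags[2].id
    )
    session.add(post)
    session.commit()

    tag_names = sorted(t.tag for t in post.tags)
```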

Sort by Count of Many to Many Relationship - SQLAlchemy

I am using Flask-SQLAlchemy to query my Postgres database.
I am currently trying to query for suggestions of titles with the following query:
res = Title.query.filter(Title.name.ilike(searchstring)).limit(20)
So far so good.
Now I would like to order the results by the number of "subscribers" each Title object has.
I am aware of the following SO question, SQLAlchemy ordering by count on a many to many relationship; however, its solution did not work for me.
I am receiving the following error:
ProgrammingError: (ProgrammingError) column "publishers_1.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: ...itles_name, count(titles_users.user_id) AS total, publishers...
(Publisher has a one-to-many relationship with Title models)
I understand that the answer has something to do with subqueries.
Below is a simplified example of my two models.
# Many-to-Many relationship for user and titles
titles_users = db.Table('titles_users',
    db.Column('user_id', db.Integer, db.ForeignKey('users.id')),
    db.Column('title_id', db.Integer, db.ForeignKey('titles.id'))
)

class User(db.Model, UserMixin):
    __tablename__ = 'users'
    # ids
    id = db.Column(db.Integer, primary_key=True)
    # Attributes
    email = db.Column(db.String(255), unique=True)
    full_name = db.Column(db.String(255))
    pull_list = db.relationship(
        'Title',
        secondary=titles_users,
        backref=db.backref('users', lazy='dynamic'),
        lazy='joined'
    )

class Title(db.Model):
    __tablename__ = 'titles'
    #: IDs
    id = db.Column(db.Integer(), primary_key=True)
    #: Attributes
    name = db.Column(db.String(255))
The code below would work for the model you describe:
# requires: from sqlalchemy import desc, func
q = (
    db.session.query(Title, func.count(titles_users.c.user_id).label("total"))
    .filter(Title.name.ilike(searchstring))
    .outerjoin(titles_users)
    .group_by(Title)
    # desc("total") orders by the labelled count; recent SQLAlchemy
    # versions reject the plain string 'total DESC'
    .order_by(desc("total"))
)
for x in q:
    print(x)
Your error, however, involves other data, such as publisher. So if the above is not helpful, you should add more information to your question.
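For reference, here is a standalone sketch of the count-and-sort query, assuming plain SQLAlchemy 1.4 or newer with an in-memory SQLite database and a trimmed-down version of the models; the sample titles are invented. It selects only the name and the count and groups by the primary key. That sidesteps the "must appear in the GROUP BY clause" error here only because no eagerly joined columns enter the SELECT, and whether it fixes the original query depends on how the Publisher relationship is loaded.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Table, create_engine, func
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

titles_users = Table(
    "titles_users",
    Base.metadata,
    Column("user_id", Integer, ForeignKey("users.id")),
    Column("title_id", Integer, ForeignKey("titles.id")),
)


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    pull_list = relationship("Title", secondary=titles_users, backref="users")


class Title(Base):
    __tablename__ = "titles"
    id = Column(Integer, primary_key=True)
    name = Column(String(255))


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    saga, sandman, sin_city = (
        Title(name="Saga"), Title(name="Sandman"), Title(name="Sin City"),
    )
    session.add_all([
        User(pull_list=[saga, sandman]),
        User(pull_list=[sandman]),
        sin_city,  # no subscribers
    ])
    session.commit()

    subscribers = func.count(titles_users.c.user_id).label("total")
    query = (
        session.query(Title.name, subscribers)
        # Outer join so that titles without subscribers still appear (count 0).
        .outerjoin(titles_users, Title.id == titles_users.c.title_id)
        .filter(Title.name.ilike("s%"))
        .group_by(Title.id)
        .order_by(subscribers.desc())
        .limit(20)
    )
    ranking = [(name, total) for name, total in query]
```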

Sqlalchemy: One to Many relationship combined with Many to Many relationship

I've got a User and Group table with a many to many relationship
_usergroup_table = db.Table('usergroup_table', db.metadata,
    db.Column('user_id', db.Integer, db.ForeignKey('user.id')),
    db.Column('group_id', db.Integer, db.ForeignKey('group.id')))

class User(db.Model):
    """Handles the usernames, passwords and the login status"""
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(60), nullable=False, unique=True)

class Group(db.Model):
    """Used for unix-style access control."""
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(60), nullable=False)
    users = db.relationship('User', secondary=_usergroup_table,
                            backref='groups')
Now I'd like to add a primary group to the User class. Of course I could just add a group_id column and a relationship to the Group class, but this has drawbacks. I'd like to get all groups when calling User.groups, including primary_group. The primary group should always be part of the groups relationship.
Edit:
It seems the way to go is the association object
class User(db.Model, UserMixin):
    """Handles the usernames, passwords and the login status"""
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(60), nullable=False, unique=True)
    primary_group = db.relationship(UserGroup,
        primaryjoin="and_(User.id==UserGroup.user_id,UserGroup.primary==True)")

class Group(db.Model):
    """Used for unix-style access control."""
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(60), nullable=False)

class UserGroup(db.Model):
    user_id = db.Column(db.Integer, db.ForeignKey('user.id'))
    group_id = db.Column(db.Integer, db.ForeignKey('group.id'))
    active = db.Column(db.Boolean, default=False)
    user = db.relationship(User, backref='groups', primaryjoin=(user_id==User.id))
    group = db.relationship(Group, backref='users', primaryjoin=(group_id==Group.id))
I could simplify this with the AssociationProxy, but how do I force only a single primary group per user?
The group_id approach you originally thought of has several advantages here over the "boolean flag" approach.
For one thing, it is naturally constrained so that there is only one primary group per user. For another, loading user.primary_group means the ORM can identify the related row by its primary key, and can look locally in the identity map for it or emit a simple SELECT by primary key, instead of emitting a query with a hard-to-index WHERE clause containing a boolean. Yet another is that there's no need to get into the association object pattern, which simplifies the usage of the association table and allows SQLAlchemy to handle loads and updates from/to this table more efficiently.
Below we use events, including a new version (as of 0.7.7) of @validates that catches "remove" events, to ensure object-level modifications to User.groups and User.primary_group are kept in sync. (If you are on an older version of 0.7, you can use the attribute "remove" event, or the "AttributeExtension.remove" extension method if you're still on 0.6 or earlier.) If you wanted to enforce this at the DB level, you could possibly use triggers to verify the integrity you're looking for:
from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

_usergroup_table = Table('usergroup_table', Base.metadata,
    Column('user_id', Integer, ForeignKey('user.id')),
    Column('group_id', Integer, ForeignKey('group.id')))

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String(60), nullable=False, unique=True)
    group_id = Column(Integer, ForeignKey('group.id'), nullable=False)
    primary_group = relationship("Group")

    @validates('primary_group')
    def _add_pg(self, key, target):
        self.groups.add(target)
        return target

    @validates('groups', include_removes=True)
    def _modify_groups(self, key, target, is_remove):
        if is_remove and target is self.primary_group:
            del self.primary_group
        return target

class Group(Base):
    __tablename__ = 'group'
    id = Column(Integer, primary_key=True)
    name = Column(String(60), nullable=False)
    users = relationship('User', secondary=_usergroup_table,
                         backref=backref('groups', collection_class=set))

e = create_engine("sqlite://", echo=True)
Base.metadata.create_all(e)
s = Session(e)

g1, g2, g3 = Group(name='g1'), Group(name='g2'), Group(name='g3')
u1 = User(name='u1', primary_group=g1)
u1.groups.update([g2, g3])
s.add_all([
    g1, g2, g3, u1
])
s.commit()

u1.groups.remove(g1)
assert u1.primary_group is None
u1.primary_group = g2
s.commit()
How about a GroupMemberships model to hold the association, instead of _usergroup_table? A user could have many groups through Group Memberships, and a group membership can hold additional attributes, such as whether a given Group is the associated User's primary group.
EDIT
In order to enforce a limit of one primary group per user, I would use a validation in the User model, such that any attempt to assign more (or fewer) than one primary group would result in an error when the record is saved. I am not aware of a way of achieving the same result relying purely on the database's integrity system. There are any number of ways of coding the validation check - the documentation shows a nice approach using the validates() decorator.
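If the association-model route is taken, one database-level option worth considering is a partial unique index, which both PostgreSQL and SQLite support. The sketch below is an assumption-laden illustration rather than part of either answer: it uses plain SQLAlchemy 1.4 or newer, and the `GroupMembership` class, the `is_primary` column and the index name are hypothetical. It only guarantees at most one primary group per user; requiring at least one would still need application-level validation.

```python
from sqlalchemy import (
    Boolean, Column, ForeignKey, Index, Integer, String, create_engine, true,
)
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class User(Base):
    __tablename__ = "user"
    id = Column(Integer, primary_key=True)
    name = Column(String(60), nullable=False, unique=True)


class Group(Base):
    __tablename__ = "group"
    id = Column(Integer, primary_key=True)
    name = Column(String(60), nullable=False)


class GroupMembership(Base):  # hypothetical association model
    __tablename__ = "group_membership"
    user_id = Column(Integer, ForeignKey("user.id"), primary_key=True)
    group_id = Column(Integer, ForeignKey("group.id"), primary_key=True)
    is_primary = Column(Boolean, nullable=False, default=False)


membership = GroupMembership.__table__

# Unique on user_id, but only among rows flagged as primary.
Index(
    "uq_one_primary_group_per_user",
    membership.c.user_id,
    unique=True,
    sqlite_where=membership.c.is_primary == true(),
    postgresql_where=membership.c.is_primary == true(),
)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        User(id=1, name="u1"),
        Group(id=1, name="g1"), Group(id=2, name="g2"), Group(id=3, name="g3"),
    ])
    session.commit()

    session.add_all([
        GroupMembership(user_id=1, group_id=1, is_primary=True),
        GroupMembership(user_id=1, group_id=2, is_primary=False),
    ])
    session.commit()

    # A second primary membership for the same user violates the index.
    session.add(GroupMembership(user_id=1, group_id=3, is_primary=True))
    try:
        session.commit()
        second_primary_rejected = False
    except IntegrityError:
        session.rollback()
        second_primary_rejected = True
```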
