I have two declarative classes with a parent-child relationship. The youngest child is the most important child, so a youngest_child_id column on the parent would be useful.
There are two relationships here: a one-to-one from parent to youngest child and a one-to-many from parent to children. However, this creates multiple join paths.
Something like the below:
class Parent(Base):
    __tablename__ = 'parents'
    id = Column(Integer, primary_key=True)
    youngest_child_id = Column(Integer, ForeignKey('children.id'))
    youngest_child = relationship("Child", uselist=False, foreign_keys=[youngest_child_id])
    children = relationship("Child", back_populates='parent')

class Child(Base):
    __tablename__ = 'children'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parents.id'))
    parent = relationship("Parent", back_populates='children')
This and a few other variations that I have created raise AmbiguousForeignKeysError:
Exception has occurred: sqlalchemy.exc.AmbiguousForeignKeysError
Could not determine join condition between parent/child tables on relationship Parent.children
Where is this going wrong and can this be achieved via the ORM?
You've defined foreign_keys for the youngest_child relationship, but you also have to define it for the children and parent relationships:
class Parent(Base):
    __tablename__ = 'parents'
    id = Column(Integer, primary_key=True)
    youngest_child_id = Column(Integer, ForeignKey('children.id'))
    youngest_child = relationship("Child", uselist=False, post_update=True,
                                  foreign_keys=[youngest_child_id])
    # Pass foreign_keys= as a Python executable string for lazy evaluation
    children = relationship("Child", back_populates='parent',
                            foreign_keys='[Child.parent_id]')

class Child(Base):
    __tablename__ = 'children'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parents.id'))
    parent = relationship("Parent", back_populates='children',
                          foreign_keys=[parent_id])
In addition, you must set post_update=True on one of the relationships, for example youngest_child, in order to break the circular dependency between the models. Without it, SQLAlchemy would have to insert both the parent and the child at the same time if you do something like this:
p = Parent()
c1, c2 = Child(), Child()
p.children = [c1, c2]
p.youngest_child = c1
session.add(p)
session.commit()
With the post update in place SQLAlchemy first inserts to parents, then to children, and then updates the parent with the youngest child.
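For reference, the pieces above can be assembled into one runnable script. This is a minimal sketch, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database; the engine URL and the variables c1_id, youngest_id and child_count are illustrative and not part of the original answer.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parents"
    id = Column(Integer, primary_key=True)
    youngest_child_id = Column(Integer, ForeignKey("children.id"))
    # post_update=True: the parent row is inserted first and this
    # column is filled in by a later UPDATE
    youngest_child = relationship("Child", uselist=False, post_update=True,
                                  foreign_keys=[youngest_child_id])
    children = relationship("Child", back_populates="parent",
                            foreign_keys="[Child.parent_id]")


class Child(Base):
    __tablename__ = "children"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parents.id"))
    parent = relationship("Parent", back_populates="children",
                          foreign_keys=[parent_id])


# Assumption: an in-memory SQLite database, used only for illustration
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    p = Parent()
    c1, c2 = Child(), Child()
    p.children = [c1, c2]
    p.youngest_child = c1
    session.add(p)
    session.commit()

    # Read the stored values back while the session is still open
    c1_id = c1.id
    youngest_id = p.youngest_child_id
    child_count = len(p.children)
```

Creating the engine with echo=True should show the INSERT into parents, the INSERTs into children, and the final UPDATE of parents.youngest_child_id.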
Related
I have a 3-level database, tables such as:
class GrandParent(Base):
    __tablename__ = 'grandparent'
    id = Column(Integer, primary_key=True)
    field0 = Column(Float)
    parents = relationship("Parent", back_populates="parent", cascade="all, delete-orphan")

class Parent(Base):
    __tablename__ = 'parents'
    id = Column(Integer, primary_key=True)
    field0 = Column(Float)
    parent_id = Column("parent", Integer, ForeignKey("grandparent.id"))
    parent = relationship("GrandParent", back_populates="parents")
    children = relationship("Child", back_populates="parent", cascade="all, delete-orphan")

class Child(Base):
    __tablename__ = 'children'
    id = Column(Integer, primary_key=True)
    field0 = Column(Float)
    parent_id = Column("parent", Integer, ForeignKey("parents.id"))
    parent = relationship("Parent", back_populates="children")
I stored a record for GrandParent that holds N records of Parent, each handling M records of Child. I need to modify GrandParent instance structure such that it is convenient to erase all its children and grand-children and rebuild the tree. I have to perform these modifications in memory, before letting the user save (persist) changes to the database.
I tried:
for parent in grand_parent.parents:
    for child in parent.children:
        session.delete(child)
    session.delete(parent)
# Other code here to add children/grandchildren from scratch to the grand_parent instance
This gives strange behaviour: the new instances are replicated along with the old ones attached to grand_parent.
So I tried forcing deletion on the in-memory objects:
for parent in grand_parent.parents:
    parent.children.clear()
grand_parent.parents.clear()
This does not work either and gives the same result (even if I combine the two methods and delete both from the in-memory objects and from the session). How should I handle this?
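One possible approach, sketched here under assumptions rather than taken from an answer in this thread: with cascade="all, delete-orphan" configured as in the models above, assigning a new list to the collection should be enough, because the removed objects become orphans and are deleted on the next flush, together with their own children. The field0 columns are omitted, and the engine URL and counter variables are illustrative.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class GrandParent(Base):
    __tablename__ = "grandparent"
    id = Column(Integer, primary_key=True)
    parents = relationship("Parent", back_populates="parent",
                           cascade="all, delete-orphan")


class Parent(Base):
    __tablename__ = "parents"
    id = Column(Integer, primary_key=True)
    parent_id = Column("parent", Integer, ForeignKey("grandparent.id"))
    parent = relationship("GrandParent", back_populates="parents")
    children = relationship("Child", back_populates="parent",
                            cascade="all, delete-orphan")


class Child(Base):
    __tablename__ = "children"
    id = Column(Integer, primary_key=True)
    parent_id = Column("parent", Integer, ForeignKey("parents.id"))
    parent = relationship("Parent", back_populates="children")


# Assumption: an in-memory SQLite database, used only for illustration
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Initial tree: two parents holding three children in total
    gp = GrandParent(parents=[Parent(children=[Child(), Child()]),
                              Parent(children=[Child()])])
    session.add(gp)
    session.commit()

    # Rebuild the tree in memory; nothing is written until flush/commit.
    # The Parent objects dropped from the collection become orphans.
    gp.parents = [Parent(children=[Child()])]
    in_memory_count = len(gp.parents)

    # Persist: orphans are deleted and the delete cascades to their children
    session.commit()
    parent_rows = session.query(Parent).count()
    child_rows = session.query(Child).count()
```

Calling session.rollback() instead of the second commit() should discard the in-memory rebuild and leave the stored tree unchanged.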
I have a many-to-many relationship similar to the one described here. Notice that my Association table includes an extra_data field.
class Association(Base):
    __tablename__ = 'association'
    left_id = Column(ForeignKey('left.id'), primary_key=True)
    right_id = Column(ForeignKey('right.id'), primary_key=True)
    extra_data = Column(String(50))

class Parent(Base):
    __tablename__ = 'left'
    id = Column(Integer, primary_key=True)
    children = relationship("Child", secondary="association", back_populates="parents")

class Child(Base):
    __tablename__ = 'right'
    id = Column(Integer, primary_key=True)
    parents = relationship("Parent", secondary="association", back_populates="children")
If I want to fetch a particular parent object with its children, I can do
db_parent = db.query(Parent).where(Parent.id == 1).first()
print(db_parent.children[0].id) # works fine
BUT, the extra_data field is not included as an attribute of the children.
print(db_parent.children[0].extra_data)
AttributeError: 'Child' object has no attribute 'extra_data'
How can I fetch the children of a parent such that extra_data is included as an attribute?
Fully Working Example
from sqlalchemy import create_engine, Column, Integer, String, ForeignKey
from sqlalchemy.orm import declarative_base, relationship, Session

# Make the engine
engine = create_engine("sqlite+pysqlite:///:memory:", future=True, echo=False)

# Make the DeclarativeMeta
Base = declarative_base()

class Association(Base):
    __tablename__ = 'association'
    left_id = Column(ForeignKey('left.id'), primary_key=True)
    right_id = Column(ForeignKey('right.id'), primary_key=True)
    extra_data = Column(String(50))

class Parent(Base):
    __tablename__ = 'left'
    id = Column(Integer, primary_key=True)
    children = relationship("Child", secondary="association", back_populates="parents")

class Child(Base):
    __tablename__ = 'right'
    id = Column(Integer, primary_key=True)
    parents = relationship("Parent", secondary="association", back_populates="children")

# Create the tables in the database
Base.metadata.create_all(engine)

# Test it
with Session(bind=engine) as session:
    # add parents
    p1 = Parent()
    session.add(p1)
    p2 = Parent()
    session.add(p2)
    session.commit()

    # add children
    c1 = Child()
    session.add(c1)
    c2 = Child()
    session.add(c2)
    session.commit()

    # map children to parents
    a1 = Association(left_id=p1.id, right_id=c1.id, extra_data='foo')
    a2 = Association(left_id=p1.id, right_id=c2.id, extra_data='bar')
    a3 = Association(left_id=p2.id, right_id=c2.id, extra_data='baz')
    session.add(a1)
    session.add(a2)
    session.add(a3)
    session.commit()

with Session(bind=engine) as session:
    db_parent = session.query(Parent).where(Parent.id == 1).first()
    print(db_parent.children[0].id)
    print(db_parent.children[0].extra_data)
What you are asking can't be done exactly the way you want using SQLAlchemy. Items in Parent.children would be instances of the Child class. If your Child class had an extra_data property loaded from the association table, which of its parents would it refer to?
What I'm trying to explain is that the implicit reference to "extra_data" that you would like to have in Child only makes sense if the Child object is referenced from a parent object.
As an example, imagine the following scenario:
session.add_all([
    Association(left_id=parent_a.id, right_id=child.id, extra_data="hello"),
    Association(left_id=parent_b.id, right_id=child.id, extra_data="world"),
])
Which parent's metadata would you expect in child.extra_data?
Moreover, most of the time, if you need an object as an association table, it means that this object makes sense by itself, so you should not try to hide it. Have a look at the following concrete example:
class Account(Base):
    __tablename__ = "accounts"
    id = Column(Integer, primary_key=True)
    username = Column(String(10), nullable=False)
    groups = relationship("Membership", back_populates="account")

class Group(Base):
    __tablename__ = "groups"
    id = Column(Integer, primary_key=True)
    name = Column(String(10), nullable=False)
    members = relationship("Membership", back_populates="group")

class Membership(Base):
    """Membership is our association table here"""
    __tablename__ = "memberships"
    id = Column(Integer, primary_key=True)
    account_id = Column(Integer, ForeignKey("accounts.id"))
    account = relationship("Account", back_populates="groups")
    group_id = Column(Integer, ForeignKey("groups.id"))
    group = relationship("Group", back_populates="members")
    # extra data embedded in the association table
    role = Column(String(10), nullable=False)

# assumes an engine and a session have been created elsewhere
Base.metadata.create_all(engine)

# create user "toto" that belongs to group "Funny people" with role "joker"
toto = Account(username="toto")
funny_people = Group(name="Funny people")
session.add(Membership(account=toto, group=funny_people, role="joker"))
session.commit()
Notice the difference between the two approaches. Here, Account.groups contains Membership objects and not Group objects directly. You can then use it this way:
toto = session.query(Account).first()
toto.username
toto.groups[0].group.name
toto.groups[0].role
I know this is not exactly what you asked, but this is probably the closest you can get without introducing weird logic that will interfere with the proper functioning of your application.
Thanks to @van for introducing me to SQLAlchemy's AssociationProxy. With AssociationProxy, I can almost get what I want, but it's still not ideal.
The idea here is to create three tables / classes as usual:
left (Parent)
right (Child)
association (Association)
Then I give Parent a children relationship attribute. I also give Association a parent and a child relationship attribute.
Lastly, I set up association proxies inside Association so that it "carries" the attributes of its related child object that I want. Here's a working example:
from sqlalchemy import create_engine, Column, Integer, String, Float, ForeignKey
from sqlalchemy.orm import declarative_base, relationship, Session
from sqlalchemy.ext.associationproxy import association_proxy

# Make the engine
engine = create_engine("sqlite+pysqlite:///:memory:", future=True, echo=True)

# Make the DeclarativeMeta
Base = declarative_base()

class Association(Base):
    __tablename__ = 'association'
    left_id = Column(ForeignKey('left.id'), primary_key=True)
    right_id = Column(ForeignKey('right.id'), primary_key=True)
    parent = relationship("Parent", back_populates="children")
    child = relationship("Child")
    extra_data = Column(String(50))
    # Association proxies
    child_name = association_proxy("child", "name")
    child_weight = association_proxy("child", "weight")

class Parent(Base):
    __tablename__ = 'left'
    id = Column(Integer, primary_key=True)
    children = relationship("Association", back_populates="parent")

class Child(Base):
    __tablename__ = 'right'
    id = Column(Integer, primary_key=True)
    name = Column(String(100), nullable=False)
    weight = Column(Float, nullable=False)

# Create the tables in the database
Base.metadata.create_all(engine)

# Test it
with Session(bind=engine) as session:
    # add parents
    p1 = Parent()
    session.add(p1)
    p2 = Parent()
    session.add(p2)
    session.commit()

    # add children
    c1 = Child(name="A", weight=5)
    session.add(c1)
    c2 = Child(name="B", weight=3)
    session.add(c2)
    session.commit()

    # map children to parents
    a1 = Association(left_id=p1.id, right_id=c1.id, extra_data='foo')
    a2 = Association(left_id=p1.id, right_id=c2.id, extra_data='bar')
    a3 = Association(left_id=p2.id, right_id=c2.id, extra_data='baz')
    session.add(a1)
    session.add(a2)
    session.add(a3)
    session.commit()
Now if I fetch a parent instance, I can reference parent.children which returns a list of children with all the attributes I need.
with Session(bind=engine) as session:
    db_parent = session.query(Parent).where(Parent.id == 1).first()
    print(db_parent.children[0].extra_data)
    print(db_parent.children[0].child_name)
    print(db_parent.children[0].child_weight)
Technically though, parent.children is returning a list of Associations where each association is acquiring attributes from its related Child instance via my association proxies. A drawback to this is that I have to label these attributes child_name and child_weight as opposed to simply name and weight, otherwise if I decided to set up the reverse relationship, it won't be obvious that name and weight are attributes of the child and not the parent.
Another solution I came up with is to define a read-only property of Parent called children which merely executes the SQL query required to fetch the exact data I need.
from sqlalchemy import create_engine, Column, Integer, String, Float, ForeignKey, text
from sqlalchemy.orm import declarative_base, Session, object_session

# Make the engine
engine = create_engine("sqlite+pysqlite:///:memory:", future=True, echo=True)

# Make the DeclarativeMeta
Base = declarative_base()

class Association(Base):
    __tablename__ = 'association'
    left_id = Column(ForeignKey('left.id'), primary_key=True)
    right_id = Column(ForeignKey('right.id'), primary_key=True)
    extra_data = Column(String(50))

class Parent(Base):
    __tablename__ = 'left'
    id = Column(Integer, primary_key=True)

    @property
    def children(self):
        s = """
            SELECT foo.* FROM (
                SELECT
                    right.*,
                    association.extra_data,
                    association.left_id
                FROM right INNER JOIN association ON right.id = association.right_id
            ) AS foo
            INNER JOIN left ON foo.left_id = left.id
            WHERE left.id = :leftid
        """
        # text() marks the string as executable SQL (plain strings are rejected in SQLAlchemy 2.0)
        result = object_session(self).execute(text(s), params={'leftid': self.id}).fetchall()
        return result

class Child(Base):
    __tablename__ = 'right'
    id = Column(Integer, primary_key=True)
    name = Column(String(100), nullable=False)
    weight = Column(Float, nullable=False)

# Create the tables in the database
Base.metadata.create_all(engine)

# Test it
with Session(bind=engine) as session:
    # add parents
    p1 = Parent()
    session.add(p1)
    p2 = Parent()
    session.add(p2)
    session.commit()

    # add children
    c1 = Child(name="A", weight=5)
    session.add(c1)
    c2 = Child(name="B", weight=3)
    session.add(c2)
    session.commit()

    # map children to parents
    a1 = Association(left_id=p1.id, right_id=c1.id, extra_data='foo')
    a2 = Association(left_id=p1.id, right_id=c2.id, extra_data='bar')
    a3 = Association(left_id=p2.id, right_id=c2.id, extra_data='baz')
    session.add(a1)
    session.add(a2)
    session.add(a3)
    session.commit()
Usage
with Session(bind=engine) as session:
    db_parent = session.query(Parent).where(Parent.id == 1).first()
    print(db_parent.children[0].extra_data)  # foo
    print(db_parent.children[0].name)  # A
    print(db_parent.children[0].weight)  # 5.0
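As a possible alternative to the hand-written SQL, the same rows could probably be fetched with an explicit ORM join that returns (Child, extra_data) pairs. This is only a sketch under assumptions: the helper name children_with_extra_data is hypothetical, the results are ordered by child name for determinism, and an in-memory SQLite database is used.

```python
from sqlalchemy import Column, Float, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Association(Base):
    __tablename__ = "association"
    left_id = Column(ForeignKey("left.id"), primary_key=True)
    right_id = Column(ForeignKey("right.id"), primary_key=True)
    extra_data = Column(String(50))


class Parent(Base):
    __tablename__ = "left"
    id = Column(Integer, primary_key=True)


class Child(Base):
    __tablename__ = "right"
    id = Column(Integer, primary_key=True)
    name = Column(String(100), nullable=False)
    weight = Column(Float, nullable=False)


def children_with_extra_data(session, parent_id):
    """Hypothetical helper: (Child, extra_data) rows for one parent."""
    return (
        session.query(Child, Association.extra_data)
        .join(Association, Association.right_id == Child.id)
        .filter(Association.left_id == parent_id)
        .order_by(Child.name)
        .all()
    )


# Assumption: an in-memory SQLite database, used only for illustration
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    p1, p2 = Parent(), Parent()
    c1 = Child(name="A", weight=5)
    c2 = Child(name="B", weight=3)
    session.add_all([p1, p2, c1, c2])
    session.flush()  # assigns primary keys without committing

    session.add_all([
        Association(left_id=p1.id, right_id=c1.id, extra_data="foo"),
        Association(left_id=p1.id, right_id=c2.id, extra_data="bar"),
        Association(left_id=p2.id, right_id=c2.id, extra_data="baz"),
    ])
    session.commit()

    # Each row unpacks into the Child entity and its per-parent extra_data
    rows = [(child.name, extra)
            for child, extra in children_with_extra_data(session, p1.id)]
```

Because extra_data travels next to the Child rather than on it, the same Child can appear under several parents with different values.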
I have multi-level one-to-many tables linked by foreign keys.
I frequently query a specific child by family_id, grandparent's name, and parent's name.
The query should return a single row.
If the child does not exist, I create a new child record and link it to the given parent.
The child table has far more rows than the parent table:
Childs >>>> Parents >> Grandparents > Families
People from different families can have the same name.
(Names in the child table can repeat because the children may come from different families and different parents.)
Here are the model definitions
class Families(Base):
    __tablename__ = 'families'
    id = Column(Integer, primary_key=True)
    family_name = Column(String, nullable=False)
    # Class names are passed as strings because the classes are defined further down
    grand_parents = relationship("GrandParents", backref="family")

class GrandParents(Base):
    __tablename__ = 'grand_parents'
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    family_id = Column(Integer, ForeignKey("families.id"))
    parents = relationship("Parents", backref="grand_parent")

class Parents(Base):
    __tablename__ = 'parents'
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    grand_parent_id = Column(Integer, ForeignKey("grand_parents.id"))
    childs = relationship("Childs", backref="parent")

class Childs(Base):
    __tablename__ = 'childs'
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    parent_id = Column(Integer, ForeignKey("parents.id"))
For now, I use join to query the targeted child row by given family_id, grandparent's name, and parent's name
def get_child(family_id, grand_parent_name, parent_name, child_name):
    child = session.query(Childs)\
        .join(Parents)\
        .join(GrandParents)\
        .filter(GrandParents.family_id == family_id,
                GrandParents.name == grand_parent_name,
                Parents.name == parent_name,
                Childs.name == child_name).one_or_none()
    return child
But I have to run this kind of query every time (going through all children of a specific family and updating their values depending on business logic).
Is there a better or more idiomatic approach or design for this kind of query?
Your query looks good if you only update the child. You may consider using with_entities to avoid fetching too much unnecessary data:
def get_child(family_id, grand_parent_name, parent_name, child_name):
    child = session.query(Childs)\
        .join(Parents)\
        .join(GrandParents)\
        .filter(GrandParents.family_id == family_id,
                GrandParents.name == grand_parent_name,
                Parents.name == parent_name,
                Childs.name == child_name)\
        .with_entities(Childs).one_or_none()
    return child
If you would also like to update the parent or grandparent, you should use joinedload to avoid additional implicit queries.
If you encounter performance issues with this query, you should add indexes to the foreign keys to speed up the joins, e.g.:
parent_id = Column(Integer, ForeignKey("parents.id"), index=True)
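The joinedload and index suggestions above could be combined roughly as follows. This is a sketch under assumptions: the Families table is left out, the relationships use back_populates instead of backref so the attributes are available on the classes directly, and the function name get_child_with_ancestors is hypothetical.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, joinedload, relationship

Base = declarative_base()


class GrandParents(Base):
    __tablename__ = "grand_parents"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    # Assumption: the families table is omitted, so this is a plain indexed column
    family_id = Column(Integer, index=True)
    parents = relationship("Parents", back_populates="grand_parent")


class Parents(Base):
    __tablename__ = "parents"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    grand_parent_id = Column(Integer, ForeignKey("grand_parents.id"), index=True)
    grand_parent = relationship("GrandParents", back_populates="parents")
    childs = relationship("Childs", back_populates="parent")


class Childs(Base):
    __tablename__ = "childs"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    parent_id = Column(Integer, ForeignKey("parents.id"), index=True)
    parent = relationship("Parents", back_populates="childs")


def get_child_with_ancestors(session, family_id, grand_parent_name,
                             parent_name, child_name):
    """Hypothetical variant of get_child that eagerly loads parent and grandparent."""
    return (
        session.query(Childs)
        .join(Childs.parent)
        .join(Parents.grand_parent)
        .filter(GrandParents.family_id == family_id,
                GrandParents.name == grand_parent_name,
                Parents.name == parent_name,
                Childs.name == child_name)
        .options(joinedload(Childs.parent).joinedload(Parents.grand_parent))
        .one_or_none()
    )


# Assumption: an in-memory SQLite database, used only for illustration
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    gp = GrandParents(name="Ann", family_id=1)
    gp.parents = [Parents(name="Bob", childs=[Childs(name="Cid")])]
    session.add(gp)
    session.commit()

with Session(engine) as session:
    found = get_child_with_ancestors(session, 1, "Ann", "Bob", "Cid")

# The session is closed here; these attributes are readable only because
# they were loaded eagerly by the query above
ancestor_names = (found.parent.name, found.parent.grand_parent.name)
```

Without the joinedload options, reading found.parent after the session is closed would be expected to raise a DetachedInstanceError, since the attribute would need a lazy load.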
I am trying to update a simple 3 layer relational set of tables.
They are
Parent
Child
GrandChild
The SQLAlchemy code for the model looks like
class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    children = relationship("Child", back_populates="parents")

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    parent_id = Column(Integer, ForeignKey('parent.id'))
    parents = relationship("Parent", back_populates="children")
    grandchildren = relationship("GrandChild",
                                 back_populates="grandparent",
                                 )

class GrandChild(Base):
    __tablename__ = 'grandchild'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    parent_id = Column(Integer, ForeignKey('parent.id'))
    child_id = Column(Integer, ForeignKey('child.id'))
    grandparent = relationship("Child", back_populates="grandchildren")
And the Insert code looks like this....
p3 = Parent(name="P3")
c5 = Child(name="C5")
c6 = Child(name="C6")
gc1 = GrandChild(name="gc1")
gc2 = GrandChild(name="gc2")
gc3 = GrandChild(name="gc3")
gc4 = GrandChild(name="gc4")
p3.children = [c5, c6]
c5.grandchildren = [gc1]
c6.grandchildren = [gc2, gc3, gc4]
session.add_all([p2, p3])
session.commit()
The record is added - and Parent/Child are correctly linked - but GrandChildren are missing the Parent foreign key.
I have struggled in finding the correct mechanism to add this - can anyone point me in the right direction ?
You never create the relation between grandchildren and parents. The relationship between a grandchild and a parent isn't implicit in your data model; the parent of a child doesn't automatically become the parent of all of that child's grandchildren.
You have to define that relationship explicitly, i.e. add it to GrandChild:
class GrandChild(Base):
    [...]
    parent = relationship("Parent")
and then create the relation on the instances:
gc1.parent = p3
gc2.parent = p3
gc3.parent = p3
gc4.parent = p3
This will add the records accordingly:
sqlalchemy.engine.base.Engine INSERT INTO grandchild (name, parent_id, child_id) VALUES (?, ?, ?)
sqlalchemy.engine.base.Engine ('gc1', 1, 1)
[...]
However, since the parent-child relationship in your data model doesn't imply any grandchild-parent relationship, you can create a parent that has grandchildren but no children.
sink = Parent(name="SINK")
gc1.parent = sink
print("Name: {}, Parent: {}, Parent.children: {}, Child.parent: {}"
      .format(gc1.name, gc1.parent.name, gc1.parent.children, gc1.grandparent.parents.name))
# Name: gc1, Parent: SINK, Parent.children: [], Child.parent: P3
Based on my understanding of a three-tier relation, I can't think of a use case where something like this would be useful.
If you want an implicit and consistent relationship between a parent and a grandchild through a child, drop the direct relationship between parent and grandchild:
class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    children = relationship("Child", back_populates="parent")

    def __repr__(self):
        return "{}(name={})".format(self.__class__.__name__, self.name)

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    parent_id = Column(Integer, ForeignKey('parent.id'))
    parent = relationship("Parent", back_populates="children")
    children = relationship("GrandChild", back_populates="parent")
    # same __repr__()

class GrandChild(Base):
    __tablename__ = 'grandchild'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    child_id = Column(Integer, ForeignKey('child.id'))
    parent = relationship("Child", back_populates="children")
    # same __repr__()

p3 = Parent(name="P3")
c5 = Child(name="C5")
gc1 = GrandChild(name="gc1")
p3.children = [c5]
c5.children = [gc1]
You can access the grandchild's grandparent through:
print(gc1.parent.parent)
# Parent(name=P3)
The other way around is a bit more tedious though, due to the two one-to-many relationships in the hierarchy:
for child in p3.children:
    for gc in child.children:
        print(p3, child, gc)
# Parent(name=P3) Child(name=C5) GrandChild(name=gc1)
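If the nested loop becomes unwieldy, the grandchildren of a parent could instead be fetched with a single join query. The following is a sketch under assumptions: it reuses the simplified model above without __repr__, the helper name grandchildren_of is hypothetical, the results are ordered by name for determinism, and the extra objects (c6, p4, c7, gc9) exist only to show the filtering.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    children = relationship("Child", back_populates="parent")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    parent_id = Column(Integer, ForeignKey("parent.id"))
    parent = relationship("Parent", back_populates="children")
    children = relationship("GrandChild", back_populates="parent")


class GrandChild(Base):
    __tablename__ = "grandchild"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    child_id = Column(Integer, ForeignKey("child.id"))
    parent = relationship("Child", back_populates="children")


def grandchildren_of(session, parent):
    """Hypothetical helper: all GrandChild rows reachable through parent's children."""
    return (
        session.query(GrandChild)
        .join(GrandChild.parent)               # GrandChild -> Child
        .filter(Child.parent_id == parent.id)  # Child -> Parent
        .order_by(GrandChild.name)
        .all()
    )


# Assumption: an in-memory SQLite database, used only for illustration
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    p3 = Parent(name="P3")
    c5, c6 = Child(name="C5"), Child(name="C6")
    p3.children = [c5, c6]
    c5.children = [GrandChild(name="gc1")]
    c6.children = [GrandChild(name="gc2"), GrandChild(name="gc3")]

    # A second, unrelated family to show that the filter excludes it
    p4 = Parent(name="P4")
    c7 = Child(name="C7")
    p4.children = [c7]
    c7.children = [GrandChild(name="gc9")]

    session.add_all([p3, p4])
    session.commit()

    p3_grandchildren = [gc.name for gc in grandchildren_of(session, p3)]
```

The parent passed in has to be persisted already, since the filter relies on its primary key.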
Is there any way to create a relationship like this (example data) between Parent and Child based on parent_id and id respectively:
Parent
parent_id: "A1234"
name: "Parent Name"
Child
id: 1234
How can I add the foreign key to the Child? The parent_id is a String. Is there a way to slice it and then cast it to Integer?
Edit:
Also, what if the situation is the other way round:
Child:
child_id: "A1234"
Parent:
parent_letter: "A"
parent_id: 1234
Would that be something like:
primaryjoin=(child_id == (Parent.parent_letter + str(Parent.parent_id)))
What would the remote_side look like? Or the entire relationship?
See the Creating Custom Foreign Conditions section of the documentation. Using cast, the relationship can be set up for the model below:
class Parent(Base):
    __tablename__ = 'parent'
    parent_id = Column(String, primary_key=True)
    name = Column(String, nullable=False)

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    parent = relationship(
        Parent,
        primaryjoin=("A" + cast(id, String) == Parent.parent_id),
        foreign_keys=id,
        remote_side=Parent.parent_id,
        backref="children",
        # uselist=False,  # use in case of one-to-one relationship
    )
In this case you can query for Parent.children or Child.parent:
p1 = session.query(Parent).get('A1234')
print(p1)
print(p1.children)
c1 = session.query(Child).get(1234)
print(c1)
print(c1.parent)
However, you would still not be able to create related items through the relationship, as below:
p = Parent(
    parent_id='A3333', name='with a child',
    children=[Child(name='will not work')]
)
session.add(p)
session.commit()  # this will fail
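The read side of the cast-based relationship described above can be sketched as a self-contained script. It assumes SQLAlchemy 1.4 or later and an in-memory SQLite database, and it inserts the two rows independently instead of through the relationship, since writing through it is what fails; the variables parent_name and child_names are illustrative.

```python
from sqlalchemy import Column, Integer, String, cast, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    parent_id = Column(String, primary_key=True)
    name = Column(String, nullable=False)


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    parent = relationship(
        Parent,
        # Join condition: "A" concatenated with the child's id, cast to a string
        primaryjoin=("A" + cast(id, String) == Parent.parent_id),
        foreign_keys=id,
        remote_side=Parent.parent_id,
        backref="children",
    )


# Assumption: an in-memory SQLite database, used only for illustration
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Rows are added separately; the relationship is only used for reading
    session.add(Parent(parent_id="A1234", name="Parent Name"))
    session.add(Child(id=1234, name="Child Name"))
    session.commit()

with Session(engine) as session:
    c1 = session.get(Child, 1234)
    parent_name = c1.parent.name
    p1 = session.get(Parent, "A1234")
    child_names = [c.name for c in p1.children]
```

The prefix "A" is hard-coded in the join condition, so parents whose keys start with another letter would not match any child in this sketch.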
Edit-1: For the alternative case you mention in your comments and edit, the following relationship definition should work (obviously, the model is defined differently as well):
parent = relationship(
    Parent,
    primaryjoin=(
        foreign(child_id) ==
        remote(Parent.parent_letter + cast(Parent.parent_id, String))
    ),
    backref="children",
    uselist=False,
)