SQLAlchemy Many to One Join - python

I have a many to one relationship between two SQL tables using SQLAlchemy. For example:
class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    child_id = Column(Integer, ForeignKey('child.id'))
    child = relationship("Child")

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    name = Column(String(100))
What I would like to do is add information from the Child class to the parent. I tried a join query:
result = session.query(Parent).join(Child).all()
While this query adds the appropriate Child object to the Parent object at parent.child, it only returns the first parent for each child. That is, I have four parents and two children in my database, and this query only returns parents 1 and 3. How do I fix the query to return all four parents? My second question: if I wanted to add just the child's name to the parent as parent.child_name, rather than the entire child object, how would I go about doing that?

How to get all parents when joining to children
The issue is that some parents do not have children, so using a normal join will exclude them. Use an outer join instead. Also, just adding a join won't actually load the children. You should specify contains_eager or joinedload to load the child with the parent.
# use contains_eager when you are joining and filtering on the relationship already
session.query(Parent).join(Parent.child).filter(Child.name == 'Max').options(contains_eager(Parent.child))
# use joinedload when you do not need to join and filter, but still want to load the relationship
session.query(Parent).options(joinedload(Parent.child))
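To see the difference between the inner and the outer join, here is a minimal, self-contained sketch. It assumes SQLAlchemy 1.4 or newer and an in-memory SQLite database; the sample rows (two parents sharing a child named "Max" and one parent without a child) are made up for illustration.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, contains_eager, declarative_base, relationship

Base = declarative_base()


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    name = Column(String(100))


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    child_id = Column(Integer, ForeignKey("child.id"))
    child = relationship("Child")


engine = create_engine("sqlite://")  # in-memory database, illustration only
Base.metadata.create_all(engine)

with Session(engine) as session:
    shared_child = Child(name="Max")
    # Two parents point at the same child; the third parent has no child.
    session.add_all([Parent(child=shared_child), Parent(child=shared_child), Parent()])
    session.commit()

    # Inner join: the parent without a child is dropped.
    inner = session.query(Parent).join(Parent.child).all()

    # Outer join: every parent is returned, and contains_eager() populates
    # parent.child from the joined columns instead of a second query.
    outer = (
        session.query(Parent)
        .outerjoin(Parent.child)
        .options(contains_eager(Parent.child))
        .order_by(Parent.id)
        .all()
    )

    inner_count = len(inner)
    outer_names = [p.child.name if p.child else None for p in outer]
```

With this data the inner join yields two parents while the outer join yields all three, with None standing in for the missing child.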
How to add child_name to the parent
You want to use an association proxy.
from sqlalchemy.ext.associationproxy import association_proxy
class Parent(Base):
    child = relationship('Child')
    child_name = association_proxy('child', 'name')
# you can filter queries with proxies:
session.query(Parent).filter(Parent.child_name == 'Min')
There are some cool things you can do with association proxies, so be sure to read the docs for more information.
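As a rough end-to-end illustration of the proxy, the following sketch defines both models in full and reads and filters on child_name. It assumes SQLAlchemy 1.4 or newer and an in-memory SQLite database; the names "Min" and "Max" are sample data only.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.ext.associationproxy import association_proxy
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    name = Column(String(100))


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    child_id = Column(Integer, ForeignKey("child.id"))
    child = relationship("Child")
    # Exposes parent.child.name as parent.child_name.
    child_name = association_proxy("child", "name")


engine = create_engine("sqlite://")  # in-memory database, illustration only
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([Parent(child=Child(name="Min")), Parent(child=Child(name="Max"))])
    session.commit()

    # Reading the proxy on instances.
    names = sorted(p.child_name for p in session.query(Parent))

    # Filtering on the proxy at the class level.
    matches = session.query(Parent).filter(Parent.child_name == "Min").all()
    match_count = len(matches)
    matched_name = matches[0].child_name
```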

Related

What does relationship back_populates do in SQLAlchemy? [duplicate]

Can anyone explain the concepts of these two ideas and how they relate to making relationships between tables? I can't really seem to find anything that explains it clearly and the documentation feels like there's too much jargon to understand in easy concepts. For instance, in this example of a one to many relationship in the documentation:
class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child", back_populates="parent")

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))
    parent = relationship("Parent", back_populates="children")
Why does the relationship() go inside the parent class while ForeignKey goes inside the child class? And what does having back_populates exactly do to one another? Does having the placement of which class the relationship() function exist in matter?
backref is a shortcut for configuring both the parent.children and child.parent relationships in one place, on either the parent or the child class (not both). That is, instead of having
children = relationship("Child", back_populates="parent") # on the parent class
and
parent = relationship("Parent", back_populates="children") # on the child class
you only need one of these:
children = relationship("Child", backref="parent") # only on the parent class
or
parent = relationship("Parent", backref="children") # only on the child class
children = relationship("Child", backref="parent") will create the .parent relationship on the child class automatically. On the other hand, if you use back_populates you must explicitly create the relationships in both parent and child classes.
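A small sketch of the backref variant, assuming SQLAlchemy 1.4 or newer: only Parent declares a relationship, yet Child gains a parent attribute once the mappers are configured. No database is needed, since the synchronization happens in Python.

```python
from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import configure_mappers, declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    # The only relationship() in this example; backref creates Child.parent.
    children = relationship("Child", backref="parent")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))


# Backrefs are set up when mappers are configured; force that step here
# so the attribute can be inspected before any instance is created.
configure_mappers()
child_has_parent_attr = hasattr(Child, "parent")

c1 = Child()
p1 = Parent(children=[c1])
backref_in_sync = c1.parent is p1
```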
Why does the relationship() go inside the parent class while ForeignKey goes inside the child class?
As I said above, if you use back_populates, it needs to go on both the parent and child classes. If you use backref, it needs to go on only one of them. ForeignKey needs to go on the child class, no matter where the relationship is placed. This is a fundamental concept of relational databases: the row on the "many" side stores the reference to the row on the "one" side.
And what does having back_populates exactly do to one another?
back_populates informs each relationship about the other, so that they are kept in sync. For example if you do
p1 = Parent()
c1 = Child()
p1.children.append(c1)
print(p1.children) # will print a list of Child instances with one element: c1
print(c1.parent) # will print Parent instance: p1
As you can see, p1 was set as parent of c1 even when you didn't set it explicitly.
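The snippet above can be made runnable as follows. This is a sketch assuming SQLAlchemy 1.4 or newer; it also shows that the synchronization works from the child side and on removal. No database connection is involved.

```python
from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship("Child", back_populates="parent")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))
    parent = relationship("Parent", back_populates="children")


p1 = Parent()
c1 = Child()
c2 = Child()

# Appending on the parent side sets c1.parent.
p1.children.append(c1)
# Assigning on the child side adds c2 to p1.children.
c2.parent = p1
after_add = (c1.parent is p1, c2 in p1.children)

# Removing from the collection resets the child's parent to None.
p1.children.remove(c1)
after_remove = c1.parent is None
```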
Does having the placement of which class the relationship() function exist in matter?
This only applies to backref, and no, you can place the relationship on the parent class (children = relationship("Child", backref="parent")) or on the child class (parent = relationship("Parent", backref="children")) and have the exact same effect.
Update: July, 2022
Apparently, use of back_populates with explicit relationship() constructs should be preferred, as explained at: https://docs.sqlalchemy.org/en/14/orm/backref.html

FastAPI + SqlAlchemy: Left join

I am new to using both FastAPI and SqlAlchemy with PostgreSQL. I've been working on creating some models, which started out fine.
class Parent(Base):
    __tablename__ = "parents"
    uid = Column(Integer, primary_key=True)
    children = relationship("Child", back_populates="parent")

class Child(Base):
    __tablename__ = "children"
    uid = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parents.uid"))
    parent = relationship("Parent", back_populates="children")
This part works as I would expect, and I can create Parent and Child objects, with Child models having parent_id fields as ForeignKeys that reference the Parent.uid fields.
My issue is when I now try to obtain a parent and its children in a query. For this I use the SqlAlchemy query function:
session.query(Parent).outerjoin(Child).all()
In my mind this should give me a parent object that looks something like this: { uid: 1, children: [{ uid: 111 }] }. However, all I get is: { uid: 1 }. While it does not throw an error, it doesn't show me the child data. When I look at the query used by SqlAlchemy (using query.statement.compile(compile_kwargs={"literal_binds": True})) I get:
SELECT parent.uid, child.uid as uid_1 FROM parents LEFT OUTER JOIN children ON parent.uid = child.parent_id;
Which is about what I would expect and when I run this in the psql shell I get the expected result:
uid | uid_1
-----+-------
1 | 111
I've tried various ways to define the relationship, both in the joins and in the model declarations (backrefs, declaring explicit joins such as .outerjoin(Child, Child.parent_id == Parent.uid), etc.), but nothing I do gives me the output I am looking for from the SqlAlchemy query. Any help is very much appreciated.
When using the ORM, you usually reference the relationships off the parent like this:
for parent in session.query(Parent).all():
    for child in parent.children:
        # do something here
        pass
With lazy loading (the default), this executes an extra query each time the children of a parent are fetched, which is not what you want.
There are a lot of loading strategies to "load" the children. One that emulates what you describe is a joined load, like this:
from sqlalchemy.orm import joinedload

q = session.query(Parent).options(joinedload(Parent.children))
for parent in q.all():
    for child in parent.children:
        # these children should already be loaded
        pass
You would use a regular join like in your example if you needed to filter the parent based on a child column.
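A self-contained sketch of the joined load, assuming SQLAlchemy 1.4 or newer and an in-memory SQLite database (the uid values are taken from the question). The collections are read after the session is closed, which only works because joinedload already populated them.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base, joinedload, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parents"
    uid = Column(Integer, primary_key=True)
    children = relationship("Child", back_populates="parent")


class Child(Base):
    __tablename__ = "children"
    uid = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parents.uid"))
    parent = relationship("Parent", back_populates="children")


engine = create_engine("sqlite://")  # in-memory database, illustration only
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Parent(uid=1, children=[Child(uid=111), Child(uid=112)]))
    session.commit()

with Session(engine) as session:
    # One SELECT with a LEFT OUTER JOIN loads parents and children together.
    parents = session.query(Parent).options(joinedload(Parent.children)).all()

# The session is closed here; the children are available because they were
# loaded eagerly rather than on first access.
loaded = {p.uid: sorted(c.uid for c in p.children) for p in parents}
```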
You can load different relationships different ways both dynamically like above and beforehand by setting the loading setting on the relationship itself. You can read about these things below:
loading_relationships
joinedload

Performance one-to-many relationship in SQLAlchemy

I'm trying to define a one-to-many relationship with SqlAlchemy where a Parent has many Child objects:
class Parent(Base):
    __tablename__ = "parent"
    id = Column(String, primary_key=True)
    children = relationship("Child")

class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    feed_type_id = Column(String, ForeignKey("parent.id"))
According to the business rules, a Parent does not have many Child objects (between 10 and 30), and most of the time I will need access to all of them. So I think it's a good idea for relationship() to retrieve all children into memory in order to improve performance (first question: am I right?). Occasionally, though, I need to get one particular child, and I don't want to do something like:
def search_bar_attr(some_value):
    for bar in foo.bars:
        if bar.attr == some_value:
            return bar
lazy="dynamic" returns a list that allows queries but I think it's slow against "eagerly" loaded because dynamic relationship always queries the database.
Second question: Is there some configuration that covers all my needs?
You can construct the same query that lazy="dynamic" does by using .with_parent.
from sqlalchemy.orm import object_session

class Parent(Base):
    ...
    @property
    def children_dynamic(self):
        return object_session(self).query(Child).with_parent(self, Parent.children)
You can even add a function to reduce boilerplate if you have to write a lot of these:
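def dynamicize(rel):
    @property
    def _getter(self):
        # rel.mapper is the mapper of the relationship's target class (Child here);
        # rel.parent would be the mapper that owns the relationship (Parent)
        return object_session(self).query(rel.mapper).with_parent(self, rel)
    return _getter

class Parent(Base):
    ...
    children = relationship("Child")
    children_dynamic = dynamicize(children)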
You don't need to use a function like that one. You don't even need to load all of the child objects into memory.
When you want to search for a child with a certain attribute, you can do:
# get a session object, usually with sessionmaker() configured to bind to your engine instance
c = session.query(Child).filter_by(some_attribute="some value here").all() # returns a list of all child objects that match the filter
# or: to get a child that belongs to a certain parent and has a certain attribute:
# get the parent object (p)
c = session.query(Child).filter_by(feed_type_id=p.id).filter_by(some_attr="some attribute that belongs to children of the p parent object")
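A runnable version of this idea might look like the sketch below. It assumes SQLAlchemy 1.4 or newer and an in-memory SQLite database; the attr column and the sample rows are hypothetical, added only so there is something to filter on.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(String, primary_key=True)
    children = relationship("Child")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    feed_type_id = Column(String, ForeignKey("parent.id"))
    attr = Column(String)  # hypothetical column, not part of the question


engine = create_engine("sqlite://")  # in-memory database, illustration only
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all(
        [
            Parent(id="a", children=[Child(attr="x"), Child(attr="y")]),
            Parent(id="b", children=[Child(attr="x")]),
        ]
    )
    session.commit()

    p = session.get(Parent, "a")
    # The database does the filtering; p.children is never loaded.
    matches = session.query(Child).filter_by(feed_type_id=p.id, attr="x").all()
    match_count = len(matches)
    match_parent_ids = {c.feed_type_id for c in matches}
```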
No one strategy will give you everything. However, you can choose a default strategy and then override it.
My recommendation would be to:
Add lazy="joined" to your relationship so that, by default, all the children are loaded together with their parent.
In cases where you want to query for a set of children depending on properties of their parents but don't need the parent objects, use the join function on the query, with filters referring to both the parent and the child.
In cases where you need to construct a query similar to what lazy="dynamic" would do, use a loader option such as sqlalchemy.orm.lazyload to override the lazy="joined" eager loading (sqlalchemy.orm.defer applies to column attributes rather than relationships), and then use with_parent to construct a query like the one you would have gotten with lazy="dynamic".
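The "pick a default, override per query" idea can be sketched as follows. This assumes SQLAlchemy 1.4 or newer and an in-memory SQLite database, and it uses the lazyload option as the override, which is my choice for the illustration. A small event listener counts SELECT statements to make the effect visible.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine, event
from sqlalchemy.orm import Session, declarative_base, lazyload, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    # Default strategy: children come back with the parent in one SELECT.
    children = relationship("Child", lazy="joined")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))


engine = create_engine("sqlite://")  # in-memory database, illustration only
Base.metadata.create_all(engine)

statements = []


@event.listens_for(engine, "before_cursor_execute")
def _record(conn, cursor, statement, parameters, context, executemany):
    statements.append(statement)


with Session(engine) as session:
    session.add(Parent(children=[Child(), Child()]))
    session.commit()


def selects_issued(make_query):
    """Run the query, touch every children collection, count SELECTs."""
    statements.clear()
    with Session(engine) as session:
        for parent in make_query(session).all():
            list(parent.children)
    return sum(1 for s in statements if s.lstrip().upper().startswith("SELECT"))


# Default lazy="joined": a single SELECT covers parent and children.
eager_selects = selects_issued(lambda s: s.query(Parent))

# Per-query override: one SELECT for the parent, one more on first access.
lazy_selects = selects_issued(
    lambda s: s.query(Parent).options(lazyload(Parent.children))
)
```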

SQLAlchemy one-to-many relationship (Single table with join table)

I have a db that I cannot modify; it has two tables, 'people' and 'relation'. The table 'people' has names, ids and the column parent (yes/no). The table 'relation' contains a foreign key 'people.id' for the parent and a 'people.id' for its child. I want to join columns in the people table so I can
People.query.filter_by(id='id of the parent')
to get the name of the parent and its children. This is my code:
class People(db.Model):
    __tablename__ = 'people'
    id = db.Column(db.Integer(), primary_key=True)
    name = db.Column(db.String())
    parent = db.Column(db.Integer())  # 0 for no, 1 for yes
    parent_id = db.relationship('Link', backref=db.backref('Link.parent_id'))

class Link(db.Model):
    __tablename__ = 'link'
    parent_id = db.Column(db.Integer(), db.ForeignKey('people.id'), primary_key=True)
    id = db.Column(db.Integer(), db.ForeignKey('people.id'), primary_key=True)
    dateofbirth = db.Column(db.Integer())
SQLAlchemy tells me:
ArgumentError: relationship 'parent_id' expects a class or a mapper argument (received: <class 'sqlalchemy.sql.schema.Table'>)
Excuse me if I messed up, but it's my first question here (and also the first steps with SQLAlchemy)
Typically you would want to set up the foreign key and backref in the same table, like this:
class Link(db.Model):
    __tablename__ = 'link'
    parent_id = db.Column(db.Integer(), db.ForeignKey('people.id'), primary_key=True)
    parent = db.relationship('People', backref='links')
Now you can access each Link entry's parent via Link.parent, and you can get a list of each People entry's links via People.links (assuming this is a one-to-many relationship).
Also, if People.parent is supposed to represent a boolean value then:
1.) you should follow the standard naming convention and call it something like is_parent
2.) you should declare People.parent as a db.Boolean type, not a db.Integer. In most (probably all) database implementations, using booleans instead of integers (when appropriate) is more memory efficient.
I hope this helped.
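Because the Link table in the question carries two foreign keys to people.id, each relationship has to say which column it uses; otherwise SQLAlchemy cannot tell them apart. The sketch below shows one way to do that with foreign_keys. It uses plain SQLAlchemy (1.4 or newer) rather than Flask-SQLAlchemy, an in-memory SQLite database, and made-up names; the child relationship is an addition for illustration and is not part of the answer above.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class People(Base):
    __tablename__ = "people"
    id = Column(Integer, primary_key=True)
    name = Column(String)


class Link(Base):
    __tablename__ = "link"
    parent_id = Column(Integer, ForeignKey("people.id"), primary_key=True)
    id = Column(Integer, ForeignKey("people.id"), primary_key=True)

    # Two foreign keys point at people.id, so each relationship names its own.
    parent = relationship("People", foreign_keys=[parent_id], backref="links")
    child = relationship("People", foreign_keys=[id])  # illustrative addition


engine = create_engine("sqlite://")  # in-memory database, illustration only
Base.metadata.create_all(engine)

with Session(engine) as session:
    ann = People(name="Ann")
    session.add_all(
        [
            Link(parent=ann, child=People(name="Bob")),
            Link(parent=ann, child=People(name="Cy")),
        ]
    )
    session.commit()

    parent = session.query(People).filter_by(name="Ann").one()
    # People.links was created by the backref on Link.parent.
    child_names = sorted(link.child.name for link in parent.links)
```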
