sqlalchemy: how to join several tables by one query? - python

I have the following SQLAlchemy mapped classes:
class User(Base):
    __tablename__ = 'users'
    email = Column(String, primary_key=True)
    name = Column(String)
class Document(Base):
    __tablename__ = "documents"
    name = Column(String, primary_key=True)
    author = Column(String, ForeignKey("users.email"))
class DocumentsPermissions(Base):
    __tablename__ = "documents_permissions"
    readAllowed = Column(Boolean)
    writeAllowed = Column(Boolean)
    document = Column(String, ForeignKey("documents.name"))
I need to get a table like this for user.email = "user@email.com":
email | name | document_name | document_readAllowed | document_writeAllowed
How can it be made using one query request for SQLAlchemy? The code below does not work for me:
result = session.query(User, Document, DocumentPermission).filter_by(email = "user@email.com").all()
Thanks,

Try this
q = Session.query(
    User, Document, DocumentsPermissions,
).filter(
    User.email == Document.author,
).filter(
    Document.name == DocumentsPermissions.document,
).filter(
    User.email == 'someemail',
).all()

As @letitbee said, it's best practice to assign primary keys to tables and properly define the relationships to allow for proper ORM querying. That being said...
If you're interested in writing a query along the lines of:
SELECT
    users.email,
    users.name,
    documents.name,
    documents_permissions.readAllowed,
    documents_permissions.writeAllowed
FROM
    users, documents, documents_permissions
WHERE
    users.email = documents.author
    AND documents.name = documents_permissions.document
    AND users.email = 'user@email.com';
Then you should go for something like:
session.query(
    User,
    Document,
    DocumentsPermissions
).filter(
    User.email == Document.author
).filter(
    Document.name == DocumentsPermissions.document
).filter(
    User.email == "user@email.com"
).all()
If instead, you want to do something like:
SELECT 'all the columns'
FROM users
JOIN documents ON documents.author = users.email
JOIN documents_permissions ON documents_permissions.document = documents.name
Then you should do something along the lines of:
session.query(
    User
).join(
    Document
).join(
    DocumentsPermissions
).filter(
    User.email == "user@email.com"
).all()
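One note about that: join() can infer the ON clause from the foreign keys, but you can also pass the condition or the relationship yourself:
query.join(Address, User.id == Address.user_id)  # explicit condition
query.join(User.addresses)                       # specify relationship from left to right
query.join(Address, User.addresses)              # same, with explicit target
query.join('addresses')                          # same, using a string (removed in SQLAlchemy 2.0)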
For more information, visit the docs.
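To make the join forms above concrete, here is a minimal sketch that runs on SQLAlchemy 1.4 or later. The User and Address models, the in-memory SQLite engine, and the sample row are illustrative assumptions modeled on the docs snippet, not part of the original answer. The three non-string forms should return the same rows:

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class User(Base):  # hypothetical model, mirrors the docs example
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    addresses = relationship("Address", back_populates="user")


class Address(Base):  # hypothetical model
    __tablename__ = "addresses"
    id = Column(Integer, primary_key=True)
    email = Column(String)
    user_id = Column(Integer, ForeignKey("users.id"))
    user = relationship("User", back_populates="addresses")


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(User(name="ann", addresses=[Address(email="ann@example.com")]))
    session.commit()

    base = session.query(User.name, Address.email).select_from(User)

    # Explicit ON clause.
    explicit = [tuple(r) for r in base.join(Address, User.id == Address.user_id)]

    # ON clause taken from the relationship attribute.
    via_relationship = [tuple(r) for r in base.join(User.addresses)]

    # ON clause inferred from the single foreign key between the tables.
    inferred = [tuple(r) for r in base.join(Address)]
```

The inferred form only works when there is exactly one foreign key path between the two tables; otherwise an explicit condition or relationship is needed.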

A good style would be to set up some relations and a primary key for permissions (actually, it is usually good style to set up integer primary keys for everything, but whatever):
class User(Base):
    __tablename__ = 'users'
    email = Column(String, primary_key=True)
    name = Column(String)
class Document(Base):
    __tablename__ = "documents"
    name = Column(String, primary_key=True)
    author_email = Column(String, ForeignKey("users.email"))
    author = relationship(User, backref='documents')
class DocumentsPermissions(Base):
    __tablename__ = "documents_permissions"
    id = Column(Integer, primary_key=True)
    readAllowed = Column(Boolean)
    writeAllowed = Column(Boolean)
    document_name = Column(String, ForeignKey("documents.name"))
    document = relationship(Document, backref='permissions')
Then do a simple query with joins:
query = session.query(User, Document, DocumentsPermissions).join(Document).join(DocumentsPermissions)

Expanding on Abdul's answer, you can obtain a KeyedTuple instead of a discrete collection of rows by joining the columns:
q = Session.query(*User.__table__.columns + Document.__table__.columns).\
    select_from(User).\
    join(Document, User.email == Document.author).\
    filter(User.email == 'someemail').all()

This function will produce the required table as a list of tuples.
def get_documents_by_user_email(email):
    query = session.query(
        User.email,
        User.name,
        Document.name,
        DocumentsPermissions.readAllowed,
        DocumentsPermissions.writeAllowed,
    )
    join_query = query.join(Document).join(DocumentsPermissions)
    return join_query.filter(User.email == email).all()
user_docs = get_documents_by_user_email(email)
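For reference, here is an end-to-end sketch of the function above. It assumes the column names from the earlier answer (author_email, document_name, and the added integer primary key on documents_permissions), an in-memory SQLite database, and made-up sample data. Passing the session as a parameter and spelling out the ON clauses are choices of this sketch, not requirements:

```python
from sqlalchemy import Boolean, Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class User(Base):
    __tablename__ = "users"
    email = Column(String, primary_key=True)
    name = Column(String)


class Document(Base):
    __tablename__ = "documents"
    name = Column(String, primary_key=True)
    author_email = Column(String, ForeignKey("users.email"))


class DocumentsPermissions(Base):
    __tablename__ = "documents_permissions"
    id = Column(Integer, primary_key=True)  # the ORM needs a primary key to map the class
    readAllowed = Column(Boolean)
    writeAllowed = Column(Boolean)
    document_name = Column(String, ForeignKey("documents.name"))


def get_documents_by_user_email(session, email):
    """Return (email, name, document_name, readAllowed, writeAllowed) tuples."""
    query = (
        session.query(
            User.email,
            User.name,
            Document.name.label("document_name"),
            DocumentsPermissions.readAllowed,
            DocumentsPermissions.writeAllowed,
        )
        .select_from(User)
        .join(Document, Document.author_email == User.email)
        .join(DocumentsPermissions, DocumentsPermissions.document_name == Document.name)
        .filter(User.email == email)
    )
    return [tuple(row) for row in query]


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Sample data, invented for the sketch.
    session.add_all([
        User(email="user@email.com", name="Alice"),
        User(email="other@email.com", name="Bob"),
        Document(name="report", author_email="user@email.com"),
        Document(name="notes", author_email="other@email.com"),
        DocumentsPermissions(readAllowed=True, writeAllowed=False, document_name="report"),
        DocumentsPermissions(readAllowed=True, writeAllowed=True, document_name="notes"),
    ])
    session.commit()
    user_docs = get_documents_by_user_email(session, "user@email.com")
```

Selecting individual columns instead of whole entities is what gives the flat one-row-per-permission table the question asks for.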

Related

Filtering by nested object fields when lazy='joined' is set in relationship SQLAlchemy 1.4

class ContactType(Base):
    __tablename__ = 'contact_type'
    name = Column(String(255), nullable=False)
class Contact(Base):
    __tablename__ = 'contact'
    first_name = Column(String(255), nullable=False)
    last_name = Column(String(255), nullable=False)
    contact_type_id = Column(ForeignKey('contact_type.id'), nullable=False)
    contact_type = relationship('ContactType', lazy='joined', innerjoin=True)
Ideally, I would filter by Contact.contact_type.name, but it doesn't work that way:
query = select(Contact).where(ContactType.name == 'some_type')  # doesn't work
query = select(Contact).join(ContactType).where(ContactType.name == 'some_type')  # works
But since contact_type = relationship('ContactType', lazy='joined', innerjoin=True) already makes a join (a feature that is used in other cases), the additional join of the ContactType table duplicates it (visible with echo=True).
If I use contains_eager:
query = select(Contact).options(contains_eager(Contact.contact_type)).where(ContactType.name == 'some_type')  # works, but with a warning
(SAWarning: SELECT statement has a cartesian product between FROM element(s) 'contact_type. Apply join condition(s) between each element to resolve.)
Please tell me how I can do this.
I asked the same question in a SQLAlchemy GitHub discussion and they suggested this:
query = (
    select(Contact)
    .join(Contact.contact_type)
    .options(contains_eager(Contact.contact_type))
    .where(ContactType.name == "some_type")
)
I checked it, and it works fine.
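A runnable sketch of that suggestion, for anyone who wants to try it. The integer id columns are assumptions (the question omits them), as are the in-memory SQLite engine and the sample rows. The explicit join supplies the ON clause, and contains_eager tells the ORM to populate Contact.contact_type from that same join instead of adding its own:

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, select
from sqlalchemy.orm import Session, contains_eager, declarative_base, relationship

Base = declarative_base()


class ContactType(Base):
    __tablename__ = "contact_type"
    id = Column(Integer, primary_key=True)  # assumed, not shown in the question
    name = Column(String(255), nullable=False)


class Contact(Base):
    __tablename__ = "contact"
    id = Column(Integer, primary_key=True)  # assumed, not shown in the question
    first_name = Column(String(255), nullable=False)
    last_name = Column(String(255), nullable=False)
    contact_type_id = Column(ForeignKey("contact_type.id"), nullable=False)
    contact_type = relationship("ContactType", lazy="joined", innerjoin=True)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Contact(first_name="Ann", last_name="Lee",
                contact_type=ContactType(name="some_type")),
        Contact(first_name="Bob", last_name="Ray",
                contact_type=ContactType(name="other_type")),
    ])
    session.commit()

    stmt = (
        select(Contact)
        .join(Contact.contact_type)                     # join used for filtering
        .options(contains_eager(Contact.contact_type))  # reuse it for loading
        .where(ContactType.name == "some_type")
    )
    contacts = session.execute(stmt).scalars().all()
    names = [(c.first_name, c.contact_type.name) for c in contacts]
```

Creating the engine with echo=True is a simple way to confirm how many times contact_type appears in the emitted SQL.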

SQLAlchemy - Complex Sub-querying with join

I need to convert this query to SQLAlchemy. I'm a bit new to it, but it is required. The SQL query is this:
SELECT name,
    (
        SELECT value FROM account_settings
        WHERE name = "max_allowed_records" AND accounts.id = account_settings.account_id
    ) AS max_allowed_records,
    (
        SELECT count(*) FROM employees
        JOIN employeelists ON employees.employeelist_id = employeelists.id
        WHERE employeelists.account_id = accounts.id
        GROUP BY employeelists.account_id
    ) AS RECORD_COUNT
FROM accounts
HAVING coalesce(max_allowed_records,0) < coalesce(RECORD_COUNT,0);
In joined form, I think this produces the same result (I wrote it this way to better visualize it as a SQLAlchemy query):
SELECT accounts.name, account_settings.value AS max_allowed_records, count(employees.id) as RECORD_COUNT
FROM accounts
left join account_settings on account_settings.account_id = accounts.id and account_settings.name = "max_allowed_records"
left join employeelists on employeelists.account_id = accounts.id
left join employees on employees.employeelist_id = employeelists.id
group by accounts.id
having coalesce(account_settings.value,0) < coalesce(RECORD_COUNT,0)
So what I want is to get the value of the account setting named max_allowed_records for each account, then compare it to how many employees that account has. Unfortunately, the only link between accounts and employees is the employee list. Could you guide me on what I need to do here? SQLAlchemy joins, or subqueries?
Table Definitions:
class Account(Base):
    __tablename__ = "accounts"
    id = synonym("raw_id")
    name = Column(String(255), nullable=False)
    settings = relationship("AccountSetting")
class AccountSetting(Base):
    __tablename__ = "account_settings"
    id = Column(Integer, primary_key=True)
    account_id = Column(Integer, ForeignKey("accounts.id"))
    name = Column(String(30))
    value = Column(String(1000))
class EmployeeList(Base):
    __tablename__ = "employeelists"
    id = Column(Integer, primary_key=True)
    raw_id = Column("id", Integer, primary_key=True)
    name = Column(String(255))
    account_id = Column(Integer, ForeignKey("accounts.id"))
class Employee(Base):
    __tablename__ = "employees"
    id = synonym("raw_id")
    raw_id = Column("id", Integer, primary_key=True)
    employeelist_id = Column(Integer, ForeignKey("employeelists.id"), nullable=False)
After resting my mind over the whole weekend, I was able to convert the SQL query to a SQLAlchemy query:
# Assumed from context (not shown in the post): query starts out as a query on
# Account, and max_records_settings is an alias of AccountSetting.
query = query.outerjoin(max_records_settings,
                        and_(max_records_settings.name == "max_allowed_records",
                             Account.id == max_records_settings.account_id))\
    .outerjoin(EmployeeList, Account.id == EmployeeList.account_id)\
    .add_columns(max_records_settings.value)\
    .join(Employee, EmployeeList.id == Employee.employeelist_id)\
    .group_by(Account.id)\
    .having(coalesce(func.count(Employee.id), 0) > coalesce(max_records_settings.value, 0))
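A self-contained sketch of the joined form of the SQL, under several assumptions: the models are simplified to plain integer ids (no synonyms), the database is in-memory SQLite with invented rows, and the string setting value is cast to an integer before comparison. Unlike the query above, it uses outer joins all the way down, following the joined SQL in the question:

```python
from sqlalchemy import (Column, ForeignKey, Integer, String, and_, cast,
                        create_engine, func)
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Account(Base):
    __tablename__ = "accounts"
    id = Column(Integer, primary_key=True)
    name = Column(String(255), nullable=False)


class AccountSetting(Base):
    __tablename__ = "account_settings"
    id = Column(Integer, primary_key=True)
    account_id = Column(Integer, ForeignKey("accounts.id"))
    name = Column(String(30))
    value = Column(String(1000))


class EmployeeList(Base):
    __tablename__ = "employeelists"
    id = Column(Integer, primary_key=True)
    account_id = Column(Integer, ForeignKey("accounts.id"))


class Employee(Base):
    __tablename__ = "employees"
    id = Column(Integer, primary_key=True)
    employeelist_id = Column(Integer, ForeignKey("employeelists.id"), nullable=False)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Account(id=1, name="A"), Account(id=2, name="B"), Account(id=3, name="C"),
        AccountSetting(account_id=1, name="max_allowed_records", value="1"),
        AccountSetting(account_id=2, name="max_allowed_records", value="5"),
        EmployeeList(id=1, account_id=1),
        EmployeeList(id=2, account_id=2),
        EmployeeList(id=3, account_id=3),
        Employee(employeelist_id=1), Employee(employeelist_id=1),  # A has 2
        Employee(employeelist_id=2),                               # B has 1
        Employee(employeelist_id=3),                               # C has 1, no setting
    ])
    session.commit()

    record_count = func.count(Employee.id)
    max_allowed = func.coalesce(cast(AccountSetting.value, Integer), 0)

    query = (
        session.query(Account.name, AccountSetting.value, record_count)
        .select_from(Account)
        .outerjoin(AccountSetting,
                   and_(AccountSetting.account_id == Account.id,
                        AccountSetting.name == "max_allowed_records"))
        .outerjoin(EmployeeList, EmployeeList.account_id == Account.id)
        .outerjoin(Employee, Employee.employeelist_id == EmployeeList.id)
        .group_by(Account.id, AccountSetting.value)
        .having(max_allowed < record_count)
        .order_by(Account.id)
    )
    over_limit = [tuple(row) for row in query]
```

Accounts without a max_allowed_records setting are treated as having a limit of 0, matching the coalesce in the original SQL.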

How can I construct a count aggregation over a join with SqlAlchemy?

I have a table of users, a table of groups that those users may belong to, and a join table between users and groups.
This is represented in SQLAlchemy as follows:
class User(Base):
    __tablename__ = 'user'
    user_id = Column(Integer, primary_key=True)
    name = Column(String(250), nullable=False)
    email = Column(String(250), nullable=False)
    groups = relationship('Group', secondary='user_group_pair')
class Group(Base):
    __tablename__ = 'group'
    group_id = Column(Integer, primary_key=True)
    name = Column(String(250), nullable=False)
    date_created = Column(String(250), nullable=False)
    members = relationship('User', secondary='user_group_pair')
class User_Group_Pair(Base):
    __tablename__ = 'user_group_pair'
    user_group_pair_id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey('user.user_id'))
    group_id = Column(Integer, ForeignKey('group.group_id'))
    user = relationship(User, backref=backref("group_assoc"))
    group = relationship(Group, backref=backref("user_assoc"))
I'm trying to solve the following simple problem:
I want to write a query that will return a list of users along with the number of groups that each of them belongs to.
This requires data from both User and User_Group_Pair (thus why the title of my question refers to a join), and a count aggregation grouped by user_id.
I'm not sure why this won't work:
subq = session.query(User_Group_Pair.user_id.label('user_id'), func.count(User_Group_Pair.user_group_pair_id).label('count')).\
    group_by(User_Group_Pair.user_id).order_by('count ASC').subquery()
result = session.query(User).join(subq, User.user_id == subq.user_id).all()
I get this error:
'Alias' object has no attribute 'user_id'
However, note that I have labelled User_Group_Pair.user_id with the label 'user_id'... Any thoughts?
Thank you
Just change subq.user_id to subq.c.user_id (c stands for columns) to make it work:
result = session.query(User).join(subq, User.user_id == subq.c.user_id).all()
However, you will still get only those users who belong to at least one group, and the number of groups is not actually returned in the result of the query. The query below is one way to solve this:
q = (session.query(User, func.count(Group.group_id).label("num_groups"))
     .outerjoin(Group, User.groups)
     .group_by(User.user_id)
     )
for b, num_groups in q:
    print(b, num_groups)
http://docs.sqlalchemy.org/en/rel_1_0/orm/tutorial.html#using-subqueries
subquery() method on Query produces a SQL expression construct representing a SELECT statement embedded within an alias. The columns on the statement are accessible through an attribute called c.
You can use column names with .c.column_name in your query
result = session.query(User).join(subq, User.user_id == subq.c.user_id).all()
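Putting both answers together, here is a sketch that keeps the subquery but also returns the count and includes users with no groups. It assumes trimmed-down models (no relationships, fewer columns, a renamed UserGroupPair class), an in-memory SQLite database, and invented rows. The subquery columns are reached through .c, and the outer join plus coalesce turns a missing count into 0:

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class User(Base):
    __tablename__ = "user"
    user_id = Column(Integer, primary_key=True)
    name = Column(String(250), nullable=False)


class Group(Base):
    __tablename__ = "group"
    group_id = Column(Integer, primary_key=True)
    name = Column(String(250), nullable=False)


class UserGroupPair(Base):  # renamed from User_Group_Pair for the sketch
    __tablename__ = "user_group_pair"
    user_group_pair_id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey("user.user_id"))
    group_id = Column(Integer, ForeignKey("group.group_id"))


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        User(user_id=1, name="ann"), User(user_id=2, name="bob"),
        Group(group_id=1, name="admins"), Group(group_id=2, name="editors"),
        UserGroupPair(user_id=1, group_id=1),
        UserGroupPair(user_id=1, group_id=2),
    ])
    session.commit()

    # Aggregate per user; the labels become attributes of subq.c.
    subq = (
        session.query(
            UserGroupPair.user_id.label("user_id"),
            func.count(UserGroupPair.user_group_pair_id).label("num_groups"),
        )
        .group_by(UserGroupPair.user_id)
        .subquery()
    )

    # Outer join so that users without any group still appear, with a count of 0.
    query = (
        session.query(User.name, func.coalesce(subq.c.num_groups, 0))
        .outerjoin(subq, User.user_id == subq.c.user_id)
        .order_by(User.user_id)
    )
    group_counts = [tuple(row) for row in query]
```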

sqlalchemy foreign keys / query joins

Hi, I'm having some trouble with a foreign key in SQLAlchemy: the primary key ID it is placed on does not auto-increment.
I'm using Python 2.7, Pyramid 1.3 and SQLAlchemy 0.7.
Here are my models:
class Page(Base):
    __tablename__ = 'page'
    id = Column(Integer, ForeignKey('mapper.object_id'), autoincrement=True, primary_key=True)
    title = Column(String(30), unique=True)
    title_slug = Column(String(75), unique=True)
    text = Column(Text)
    date_added = Column(DateTime)
class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String(100), unique=True)
    email = Column(String(100), unique=True)
    password = Column(String(100))
class Group(Base):
    __tablename__ = 'groups'
    id = Column(Integer, primary_key=True)
    name = Column(String(100), unique=True)
class Member(Base):
    __tablename__ = 'members'
    user_id = Column(Integer, ForeignKey('user.id'), primary_key=True)
    group_id = Column(Integer, ForeignKey('groups.id'), primary_key=True)
class Resource(Base):
    __tablename__ = 'resource'
    id = Column(Integer, primary_key=True)
    tablename = Column(Text)
    action = Column(Text)
class Mapper(Base):
    __tablename__ = 'mapper'
    resource_id = Column(Integer, ForeignKey('resource.id'), primary_key=True)
    group_id = Column(Integer, ForeignKey('groups.id'), primary_key=True)
    object_id = Column(Integer, primary_key=True)
And here is my raw SQL query, followed by the version I've written with SQLAlchemy's ORM:
'''
SELECT g.name, r.action
FROM groups AS g
INNER JOIN resource AS r
ON m.resource_id = r.id
INNER JOIN page AS p
ON p.id = m.object_id
INNER JOIN mapper AS m
ON m.group_id = g.id
WHERE p.id = ? AND
r.tablename = ?;
'''
obj = Page
query = DBSession().query(Group.name, Resource.action)\
    .join(Mapper)\
    .join(obj)\
    .join(Resource)\
    .filter(obj.id == obj_id, Resource.tablename == obj.__tablename__).all()
The raw SQL query works fine without any relations between Page and Mapper, but SQLAlchemy's ORM seems to require a ForeignKey link to be able to join them. So I decided to put the ForeignKey on Page.id, since Mapper.object_id will link to several different tables.
This makes the ORM query with the joins work as expected, but adding new data to the Page table results in an exception:
FlushError: Instance <Page at 0x3377c90> has a NULL identity key.
If this is an auto- generated value, check that the database
table allows generation of new primary key values, and that the mapped
Column object is configured to expect these generated values.
Ensure also that this flush() is not occurring at an inappropriate time,
such as within a load() event.
Here is my view code:
try:
    session = DBSession()
    with transaction.manager:
        page = Page(title, text)
        session.add(page)
    return HTTPFound(location=request.route_url('home'))
except Exception as e:
    print e
    pass
finally:
    session.close()
I really don't know why, but I'd rather have the solution in SQLAlchemy than use raw SQL, since I'm making this project for learning purposes :)
I do not think autoincrement=True and ForeignKey(...) play together well.
In any case, for join to work without any ForeignKey, you can just specify the join condition in the second parameter of the join(...):
obj = Page
query = DBSession().query(Group.name, Resource.action)\
    .join(Mapper)\
    .join(Resource)\
    .join(obj, Resource.tablename == obj.__tablename__)\
    .filter(obj.id == obj_id)\
    .all()
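As an illustration of joining without a ForeignKey, here is a sketch in which Page.id is a plain auto-incrementing primary key and every join states its ON clause. It is a variation on the answer rather than a copy of it: the Page join is written against Mapper.object_id, mirroring the raw SQL in the question. The trimmed models, the in-memory SQLite database, and the rows are assumptions:

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Text, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Page(Base):
    __tablename__ = "page"
    id = Column(Integer, primary_key=True)  # no ForeignKey, so autoincrement works
    title = Column(String(30), unique=True)


class Group(Base):
    __tablename__ = "groups"
    id = Column(Integer, primary_key=True)
    name = Column(String(100), unique=True)


class Resource(Base):
    __tablename__ = "resource"
    id = Column(Integer, primary_key=True)
    tablename = Column(Text)
    action = Column(Text)


class Mapper(Base):
    __tablename__ = "mapper"
    resource_id = Column(Integer, ForeignKey("resource.id"), primary_key=True)
    group_id = Column(Integer, ForeignKey("groups.id"), primary_key=True)
    object_id = Column(Integer, primary_key=True)  # may point at several tables


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    home = Page(title="Home")  # id is generated on flush
    about = Page(title="About")
    session.add_all([home, about])
    session.flush()
    session.add_all([
        Group(id=1, name="admins"), Group(id=2, name="guests"),
        Resource(id=1, tablename="page", action="edit"),
        Resource(id=2, tablename="page", action="view"),
        Mapper(resource_id=1, group_id=1, object_id=home.id),
        Mapper(resource_id=2, group_id=2, object_id=about.id),
    ])
    session.commit()

    obj, obj_id = Page, home.id
    query = (
        session.query(Group.name, Resource.action)
        .select_from(Group)
        .join(Mapper, Mapper.group_id == Group.id)
        .join(Resource, Resource.id == Mapper.resource_id)
        .join(obj, obj.id == Mapper.object_id)  # explicit ON clause, no FK needed
        .filter(obj.id == obj_id, Resource.tablename == obj.__tablename__)
    )
    permissions = [tuple(row) for row in query]
```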

Having issues creating a query with an inner join and an outer join

I have the following model (simplified):
class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
class Thing(Base):
    __tablename__ = 'thing'
    id = Column(Integer, primary_key=True)
class Relationship(Base):
    __tablename__ = 'relationship'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('thing.id'))
    parent = relationship('Thing', backref='parentrelationships', primaryjoin = "Relationship.parent_id == Thing.id")
    child_id = Column(Integer, ForeignKey('thing.id'))
    child = relationship('Thing', backref='childrelationships', primaryjoin = "Relationship.child_id == Thing.id")
class Vote(Base):
    __tablename__ = 'vote'
    id = Column(Integer, primary_key=True)
    rel_id = Column(Integer, ForeignKey('relationship.id'))
    rel = relationship('Relationship', backref='votes')
    voter_id = Column(Integer, ForeignKey('user.id'))
    voter = relationship('User', backref='votes')
I wanted to query all Relationships with a certain parent, and I also want to query votes made by a certain user on those Relationships. What I've tried:
def get_relationships(thisthing, thisuser):
    return DBSession.query(Relationship, Vote).\
        filter(Relationship.parent_id == thisthing.id).\
        outerjoin(Vote, Relationship.id == Vote.rel_id).\
        filter(Vote.voter_id == thisuser.id).\
        filter(Vote.rel_id == Relationship.id).\
        all()
as well as:
def get_relationships(thisthing, thisuser):
    session = DBSession()
    rels = session.query(Relationship).\
        filter(Relationship.parent_id == thisthing.id).\
        subquery()
    return session.query(rels, Vote).\
        outerjoin(Vote, rels.c.id == Vote.rel_id).\
        filter(Vote.voter_id == thisuser.id).\
        all()
I get nulls when I do either of these queries. What am I doing wrong?
Just turn on SQL logging (echo=True) and you will see that the resulting SQL query for the first option is something like:
SELECT relationship.id AS relationship_id, relationship.parent_id AS relationship_parent_id, relationship.child_id AS relationship_child_id, vote.id AS vote_id, vote.rel_id AS vote_rel_id, vote.voter_id AS vote_voter_id
FROM relationship LEFT OUTER JOIN vote ON relationship.id = vote.rel_id
WHERE relationship.parent_id = ? AND vote.voter_id = ? AND vote.rel_id = relationship.id
If you examine it, you will notice that the clause vote.rel_id = relationship.id is part of both the JOIN clause and the WHERE clause, which makes the query filter out those Relationship rows that do not have any votes by the requested user.
Solution:
Remove the redundant filter(Vote.rel_id == Relationship.id) part from the query.
Edit-1: Also move the user filter, filter(Vote.voter_id == thisuser.id), out of the WHERE clause and into the LEFT JOIN clause: outerjoin(Vote, and_(Relationship.id == Vote.rel_id, Vote.voter_id == thisuser.id)).
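The effect of moving the user condition into the ON clause can be checked with a small sketch. It assumes stripped-down models (columns only, no relationship() attributes), an in-memory SQLite database, and invented ids; the function takes the session and plain ids as parameters, which differs from the question's signature:

```python
from sqlalchemy import Column, ForeignKey, Integer, and_, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class User(Base):
    __tablename__ = "user"
    id = Column(Integer, primary_key=True)


class Thing(Base):
    __tablename__ = "thing"
    id = Column(Integer, primary_key=True)


class Relationship(Base):
    __tablename__ = "relationship"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("thing.id"))
    child_id = Column(Integer, ForeignKey("thing.id"))


class Vote(Base):
    __tablename__ = "vote"
    id = Column(Integer, primary_key=True)
    rel_id = Column(Integer, ForeignKey("relationship.id"))
    voter_id = Column(Integer, ForeignKey("user.id"))


def get_relationships(session, parent_id, user_id):
    """Return (relationship id, vote id or None) for every child of parent_id."""
    query = (
        session.query(Relationship.id, Vote.id)
        .select_from(Relationship)
        # The user condition lives in the ON clause, so relationships without
        # a vote by this user are kept and simply get NULL for the vote.
        .outerjoin(Vote, and_(Vote.rel_id == Relationship.id,
                              Vote.voter_id == user_id))
        .filter(Relationship.parent_id == parent_id)
        .order_by(Relationship.id)
    )
    return [tuple(row) for row in query]


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        User(id=7), User(id=8),
        Thing(id=10), Thing(id=11), Thing(id=12), Thing(id=20),
        Relationship(id=1, parent_id=10, child_id=11),
        Relationship(id=2, parent_id=10, child_id=12),
        Relationship(id=3, parent_id=20, child_id=11),
        Vote(id=1, rel_id=1, voter_id=7),
        Vote(id=2, rel_id=2, voter_id=8),
    ])
    session.commit()
    rels_with_votes = get_relationships(session, parent_id=10, user_id=7)
```

With the user condition left in WHERE instead, relationship 2 would drop out of the result, because its only vote belongs to another user.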
