I have three tables: UserTypeMapper, User, and SystemAdmin. In my get_user method, depending on the UserTypeMapper.is_admin row, I then query either the User or SystemAdmin table. The user_id row correlates to the primary key id in the User and SystemAdmin tables.
class UserTypeMapper(Base):
__tablename__ = 'user_type_mapper'
id = Column(BigInteger, primary_key=True)
is_admin = Column(Boolean, default=False)
user_id = Column(BigInteger, nullable=False)
class SystemAdmin(Base):
__tablename__ = 'system_admin'
id = Column(BigInteger, primary_key=True)
name = Column(Unicode)
email = Column(Unicode)
class User(Base):
__tablename__ = 'user'
id = Column(BigInteger, primary_key=True)
name = Column(Unicode)
email = Column(Unicode)
I want to be able to get any user – system admin or regular user – from one query, so I do a join, on either User or SystemAdmin depending on the is_admin row. For example:
DBSession.query(UserTypeMapper, SystemAdmin).join(SystemAdmin, UserTypeMapper.user_id==SystemAdmin.id).first()
and
DBSession.query(UserTypeMapper, User).join(User, UserTypeMapper.user_id==User.id).first()
This works fine; however, I then would like to be access these, like so:
>>> my_admin_obj.is_admin
True
>>> my_admin_obj.name
Bob Smith
versus
>>> my_user_obj.is_admin
False
>>> my_user_obj.name
Bob Stevens
Currently, I have to specify: my_user_obj.UserTypeMapper.is_admin and my_user_obj.User.name. From what I've been reading, I need to map the tables so that I don't need to specify which table the attribute belongs to. My problem is that I do not understand how I can specify this given that I have two potential tables that the name attribute, for example, may come from.
This is the example I am referring to: Mapping a Class against Multiple Tables
How can I achieve this? Thank you.
You have discovered why "dual purpose foreign key", is an antipattern.
There is a related problem to this that you haven't quite pointed out; there's no way to use a foreign key constraint to enforce the data be in a valid state. You want to be sure that there's exactly one of something for each row in UserTypeMapper, but that 'something' is not any one table. formally you want a functional dependance on
user_type_mapper → (system_admin× 1) ∪ (user× 0)
But most sql databses won't allow you to write a foreign key constraint expressing that.
It looks complicated because it is complicated.
instead, lets consider what we really want to say; "every system_admin should be a user; or
system_admin → user
In sql, that would be written:
CREATE TABLE user (
id INTEGER PRIMARY KEY,
name VARCHAR,
email VARCHAR
);
CREATE TABLE system_admin (
user_id INTEGER PRIMARY KEY REFERENCES user(id)
);
Or, in sqlalchemy declarative style
class User(Base):
__tablename__ = 'user'
id = Column(Integer, primary_key=True)
name = Column(String)
email = Column(String)
class SystemAdmin(Base):
__tablename__ = 'system_admin'
user_id = Column(ForeignKey(User.id), primary_key=True)
What sort of questions does this schema allow us to ask?
"Is there a SystemAdmin by the name of 'john doe'"?
>>> print session.query(User).join(SystemAdmin).filter(User.name == 'john doe').exists()
EXISTS (SELECT 1
FROM "user" JOIN system_admin ON "user".id = system_admin.user_id
WHERE "user".name = :name_1)
"How many users are there? How many sysadmins?"
>>> print session.query(func.count(User.id), func.count(SystemAdmin.user_id)).outerjoin(SystemAdmin)
SELECT count("user".id) AS count_1, count(system_admin.user_id) AS count_2
FROM "user" LEFT OUTER JOIN system_admin ON "user".id = system_admin.user_id
I hope you can see why the above is prefereable to the design you describe in your question; but in the off chance you don't have a choice (and only in that case, if you still feel what you've got is better, please refine your question), you can still cram that data into a single python object, which will be very difficult to work with, by providing an alternate mapping to the tables; specifically one which follows the rough structure in the first equation.
We need to mention UserTypeMapper twice, once for each side of the union, for that, we need to give aliases.
>>> from sqlalchemy.orm import aliased
>>> utm1 = aliased(UserTypeMapper)
>>> utm2 = aliased(UserTypeMapper)
For the union bodies join each alias to the appropriate table: Since SystemAdmin and User have the same columns in the same order, we don't need to describe them in detail, but if they are at all different, we need to make them "union compatible", by mentioning each column explicitly; this is left as an exercise.
>>> utm_sa = Query([utm1, SystemAdmin]).join(SystemAdmin, (utm1.user_id == SystemAdmin.id) & (utm1.is_admin == True))
>>> utm_u = Query([utm2, User]).join(User, (utm2.user_id == User.id) & (utm2.is_admin == False))
And then we join them together...
>>> print utm_sa.union(utm_u)
SELECT anon_1.user_type_mapper_1_id AS anon_1_user_type_mapper_1_id, anon_1.user_type_mapper_1_is_admin AS anon_1_user_type_mapper_1_is_admin, anon_1.user_type_mapper_1_user_id AS anon_1_user_type_mapper_1_user_id, anon_1.system_admin_id AS anon_1_system_admin_id, anon_1.system_admin_name AS anon_1_system_admin_name, anon_1.system_admin_email AS anon_1_system_admin_email
FROM (SELECT user_type_mapper_1.id AS user_type_mapper_1_id, user_type_mapper_1.is_admin AS user_type_mapper_1_is_admin, user_type_mapper_1.user_id AS user_type_mapper_1_user_id, system_admin.id AS system_admin_id, system_admin.name AS system_admin_name, system_admin.email AS system_admin_email
FROM user_type_mapper AS user_type_mapper_1 JOIN system_admin ON user_type_mapper_1.user_id = system_admin.id AND user_type_mapper_1.is_admin = 1 UNION SELECT user_type_mapper_2.id AS user_type_mapper_2_id, user_type_mapper_2.is_admin AS user_type_mapper_2_is_admin, user_type_mapper_2.user_id AS user_type_mapper_2_user_id, "user".id AS user_id, "user".name AS user_name, "user".email AS user_email
FROM user_type_mapper AS user_type_mapper_2 JOIN "user" ON user_type_mapper_2.user_id = "user".id AND user_type_mapper_2.is_admin = 0) AS anon_1
While it's theoretically possible to wrap this all up into a python class that looks a bit like standard sqlalchemy orm stuff, I would certainly not do that. working with non-table mappings, especially when they are more than simple joins (this is a union), is lots of work for zero payoff.
Related
I am trying to build an ORM mapped SQLite database. The conception of the DB seems to work as intended but I can't seem to be able to query it properly for more complex cases. I have spent the day trying to find an existing answer to my question but nothing works. I am not sure if the issue is with my mapping, my query or both. Or if maybe querying with attributes from a many to many association table with extra data works differently.
This the DB setup:
engine = create_engine('sqlite:///')
Base = declarative_base(bind=engine)
Session = sessionmaker(bind=engine)
class User(Base):
__tablename__ = 'users'
# Columns
id = Column('id', Integer, primary_key=True)
first = Column('first_name', String(100))
last = Column('last_name', String(100))
age = Column('age', Integer)
quality = Column('quality', String(100))
unit = Column('unit', String(100))
# Relationships
cases = relationship('UserCaseLink', back_populates='user_data')
def __repr__(self):
return f"<User(first='{self.first}', last='{self.last}', quality='{self.quality}', unit='{self.unit}')>"
class Case(Base):
__tablename__ = 'cases'
# Columns
id = Column('id', Integer, primary_key=True)
num = Column('case_number', String(100))
type = Column('case_type', String(100))
# Relationships
users = relationship('UserCaseLink', back_populates='case_data')
def __repr__(self):
return f"<Case(num='{self.num}', type='{self.type}')>"
class UserCaseLink(Base):
__tablename__ = 'users_cases'
# Columns
user_id = Column('user_id', Integer, ForeignKey('users.id'), primary_key=True)
case_id = Column('case_id', Integer, ForeignKey('cases.id'), primary_key=True)
role = Column('role', String(100))
# Relationships
user_data = relationship('User', back_populates='cases')
case_data = relationship('Case', back_populates='users')
if __name__ == '__main__':
Base.metadata.create_all()
session = Session()
and I would like to retrieve all the cases on which a particular person is working under a certain role.
So for example I want a list of all the cases a person named 'Alex' is working on as an 'Administrator'.
In other words I would like the result of this query:
SELECT [cases].*,
[main].[users_cases].role
FROM [main].[cases]
INNER JOIN [main].[users_cases] ON [main].[cases].[id] = [main].[users_cases].[case_id]
INNER JOIN [main].[users] ON [main].[users].[id] = [main].[users_cases].[user_id]
WHERE [main].[users].[first_name] = 'Alex'
AND [main].[users_cases].[role] = 'Administrator';
So far I have tried many things along the lines of:
cases = session.query(Case).filter(User.first == 'Alex', UserCaseLink.role == 'Administrator')
but it is not working as I would like it to.
How can I modify the ORM mapping so that it does the joining for me and allows me to query easily (something like the query I tried)?
According to your calsses, the quivalent query for:
SELECT [cases].*,
[main].[users_cases].role
FROM [main].[cases]
INNER JOIN [main].[users_cases] ON [main].[cases].[id] = [main].[users_cases].[case_id]
INNER JOIN [main].[users] ON [main].[users].[id] = [main].[users_cases].[user_id]
WHERE [main].[users].[first_name] = 'Alex'
AND [main].[users_cases].[role] = 'Administrator';
is
cases = session.query(
Case.id, Case.num,Cas.type,
UserCaseLink.role
).filter(
(Case.id==UserCaseLink.case_id)
&(User.id==UserCaseLink.user_id)
&(User.first=='Alex')
&(UserCaseLink.role=='Administrator'
).all()
also, you can:
cases = Case.query\
.join(UserCaseLink,Case.id==UserCaseLink.case_id)\
.join(User,User.id==UserCaseLink.user_id)\
.filter( (User.first=='Alex') & (User.first=='Alex') )\
.all()
Good Luck
After comment
based in your comment, I think you want something like:
cases = Case.query\
.filter( (Case.case_data.cases.first=='Alex') & (Case.case_data.cases.first=='Alex') )\
.all()
where case_data connect between Case an UserCaseLink and cases connect between UserCaseLink and User as in your relations.
But,that case causes error:
AttributeError: Neither 'InstrumentedAttribute' object nor 'Comparator' object associated with dimpor.org_type has an attribute 'org_type_id'
The missage shows that the attributes combined in filter should belong to the table class
So I ended up having to compromise.
It seems the query cannot be aware of all the relationships present in the ORM mapping at all times. Instead I had to manually give it the path between the different classes for it to find all the data I wanted:
cases = session.query(Case)\
.join(Case.users)\
.join(UserCaseLink.user_data)\
.filter(User.first == 'Alex', UserCaseLink.role == 'Administrator')\
.all()
However, as it does not meet all the criteria for my original question (ie I still have to specify the joins), I will not mark this answer as the accepted one.
I have three models (note that this is done in Flask-SQLAlchemy, but if you can only write an answer for vanilla SQLAlchemy, that is fine with me.) Irrelevant fields are removed for clarity.
class KPI(db.Model):
__tablename__ = 'kpis'
id = db.Column(db.Integer, primary_key=True)
identifier = db.Column(db.String(length=50))
class Report(db.Model):
__tablename__ = 'reports'
id = db.Column(db.Integer, primary_key=True)
class ReportKPI(db.Model):
report_id = db.Column(db.Integer, db.ForeignKey('reports.id'), primary_key=True)
kpi_id = db.Column(db.Integer, db.ForeignKey('kpis.id'), primary_key=True)
report = db.relationship('Report', backref=db.backref('values'))
kpi = db.relationship('KPI')
My goal is to find all Report objects that don’t measure a specific KPI (ie. there is no ReportKPI object whose KPI relationship has identifier set to a specific value).
One of my tries look like
Report.query \
.join(ReportKPI) \
.join(KPI) \
.filter(KPI.identifier != 'reflection')
but this gives back more Report objects that actually exist (I guess I get one for every ReportKPI that has a KPI with anything but “reflection”.)
Is the thing I want to achieve actually possible with SQLAlchemy? If so, what is the magic word (pleas doesn’t seem to work…)
An EXISTS subquery expression is a good fit for your goal. A shorthand way to write such a query would be:
Report.query.\
filter(db.not_(Report.values.any(
ReportKPI.kpi.has(identifier='reflection'))))
but this produces 2 nested EXISTS expressions, though a join in an EXISTS would do as well:
Report.query.\
filter(db.not_(
ReportKPI.query.
filter_by(report_id=Report.id).
join(KPI).
filter_by(identifier='reflection').
exists()))
Finally, a LEFT JOIN with an IS NULL is an option as well:
Report.query.\
outerjoin(db.join(ReportKPI, KPI),
db.and_(ReportKPI.report_id == Report.id,
KPI.identifier == 'reflection')).\
filter(KPI.id.is_(None))
I have a, somewhat odd, query that gets me all the items in a parent table that have no matches in its corresponding child table.
If possible, id like to turn it into an SQLAlchemy query. But I have no idea how. I can do basic gets and filters, but this one is beyond my experience so far. Any help you folks might give would be greatly appreciated.
class customerTranslations(Base):
"""parent table. holds customer names"""
__tablename__ = 'customer_translation'
id = Column(Integer, primary_key=True)
class customerEmails(Base):
"""child table. hold emails for customers in translation table"""
__tablename__ = 'customer_emails'
id = Column(Integer, primary_key=True)
parent_id = Column(Integer, ForeignKey('customer_translation.id'))
I want to build:
SELECT * FROM customer_translation
WHERE id NOT IN (SELECT parent_id FROM customer_emails)
You have a subquery, so create one first:
all_emails_stmnt = session.query(customerEmails.parent_id).subquery()
and then you can use that to filter your other table:
translations_with_no_email = session.query(customerTranslations).filter(
~customerTranslations.id.in_(all_emails_stmnt))
This produces the same SQL (but with all the column names expanded, rather than using *, the ORM then can create your objects):
>>> all_emails_stmnt = session.query(customerEmails.parent_id).subquery()
>>> print(all_emails_stmnt)
SELECT customer_emails.parent_id
FROM customer_emails
>>> translations_with_no_email = session.query(customerTranslations).filter(
... ~customerTranslations.id.in_(all_emails_stmnt))
>>> print(translations_with_no_email)
SELECT customer_translation.id AS customer_translation_id
FROM customer_translation
WHERE customer_translation.id NOT IN (SELECT customer_emails.parent_id
FROM customer_emails)
You could also use NOT EXISTS:
from sqlalchemy.sql import exists
has_no_email_stmnt = ~exists().where(customerTranslations.id == customerEmails.parent_id)
translations_with_no_email = session.query(customerTranslations).filter(has_no_email_stmnt)
or, if you have a a backreference on the customerTranslations class pointing to emails, named emails, use .any() on the relationship and invert:
session.query(customerTranslations).filter(
~customerTranslations.emails.any())
Back in 2010 NOT EXISTS was a little slower on MySQL but you may want to re-assess if that is still the case.
I have a table that does not have a primary key. And I really do not want to apply this constraint to this table.
In SQLAlchemy, I defined the table class by:
class SomeTable(Base):
__table__ = Table('SomeTable', meta, autoload=True, autoload_with=engine)
When I try to query this table, I got:
ArgumentError: Mapper Mapper|SomeTable|SomeTable could not assemble any primary key columns for mapped table 'SomeTable'.
How to loss the constraint that every table must have a primary key?
There is only one way that I know of to circumvent the primary key constraint in SQL Alchemy - it's to map specific column or columns to your table as a primary keys, even if they aren't primary key themselves.
http://docs.sqlalchemy.org/en/latest/faq/ormconfiguration.html#how-do-i-map-a-table-that-has-no-primary-key.
There is no proper solution for this but there are workarounds for it:
Workaround 1
Adding parameter primary_key to the existing column that is not having a primary key will work.
class SomeTable(Base):
__table__ = 'some_table'
some_other_already_existing_column = Column(..., primary_key=True) # just add primary key to it whether or not this column is having primary key or not
Workaround 2
Just declare a new dummy column on the ORM layer, not in actual DB. Just define in SQLalchemy model
class SomeTable(Base):
__table__ = 'some_table'
column_not_exist_in_db = Column(Integer, primary_key=True) # just add for sake of this error, dont add in db
Disclaimer: Oracle only
Oracle databases secretly store something called rowid to uniquely define each record in a table, even if the table doesn't have a primary key. I solved my lack of primary key problem (which I did not cause!) by constructing my ORM object like:
class MyTable(Base)
__tablename__ = 'stupid_poorly_designed_table'
rowid = Column(String, primary_key=True)
column_a = Column(String)
column_b = Column(String)
...
You can see what rowid actually looks like (it's a hex value I believe) by running
SELECT rowid FROM stupid_poorly_designed_table
GO
Here is an example using __mapper_args__ and a synthetic primary_key. Because the table is time-series oriented data, there is no need for a primary key. All rows can be unique addresses with a (timestamp, pair) tuple.
class Candle(Base):
__tablename__ = "ohlvc_candle"
__table_args__ = (
sa.UniqueConstraint('pair_id', 'timestamp'),
)
#: Start second of the candle
timestamp = sa.Column(sa.TIMESTAMP(timezone=True), nullable=False)
open = sa.Column(sa.Float, nullable=False)
close = sa.Column(sa.Float, nullable=False)
high = sa.Column(sa.Float, nullable=False)
low = sa.Column(sa.Float, nullable=False)
volume = sa.Column(sa.Float, nullable=False)
pair_id = sa.Column(sa.ForeignKey("pair.id"), nullable=False)
pair = orm.relationship(Pair,
backref=orm.backref("candles",
lazy="dynamic",
cascade="all, delete-orphan",
single_parent=True, ), )
__mapper_args__ = {
"primary_key": [pair_id, timestamp]
}
MSSQL Tested
I know this thread is ancient but I spent way too long getting this to work to not share it :)
from sqlalchemy import Table, event
from sqlalchemy.ext.compiler import compiles
from sqlalchemy import Column
from sqlalchemy import Integer
class RowID(Column):
pass
#compiles(RowID)
def compile_mycolumn(element, compiler, **kw):
return "row_number() OVER (ORDER BY (SELECT NULL))"
#event.listens_for(Table, "after_parent_attach")
def after_parent_attach(target, parent):
if not target.primary_key:
# if no pkey create our own one based on returned rowid
# this is untested for writing stuff - likely wont work
logging.info("No pkey defined for table, using rownumber %s", target)
target.append_column(RowID('row_id', Integer, primary_key=True))
https://docs-sqlalchemy-org.translate.goog/en/14/faq/ormconfiguration.html?_x_tr_sl=auto&_x_tr_tl=ru&_x_tr_hl=ru#how-do-i-map-a-table-that-has-no-primary-key
One way from there:
In SQLAlchemy ORM, to map to a specific table, there must be at least one column designated as the primary key column; multi-column composite primary keys are of course also perfectly possible. These columns do not need to be known to the database as primary key columns. The columns only need to behave like a primary key, such as a non-nullable unique identifier for a row.
my code:
from ..meta import Base, Column, Integer, Date
class ActiveMinutesByDate(Base):
__tablename__ = "user_computer_active_minutes_by_date"
user_computer_id = Column(Integer(), nullable=False, primary_key=True)
user_computer_date_check = Column(Date(), default=None, primary_key=True)
user_computer_active_minutes = Column(Integer(), nullable=True)
The solution I found is to add an auto-incrementing primary key column to the table, then use that as your primary key. The database should deal with everything else beyond that.
Here's my ORM entity class. The primary key is composite cause 'id_string' may be the same for different users (identified by uid). One thing I understood from Postgres SQL error when creating a table based on this class (
ProgrammingError: (ProgrammingError) there is no unique constraint matching given keys for referenced table "sync_entities"
) is that I need to add something to parent_id_string's ForeignKey() argument. And that something is, I think, the current record's uid.
Do you suggest to try using different primary key (autoincrementing integer) or there is some other way?
class SyncEntity(Base):
__tablename__ = 'sync_entities'
__table_args__ = (ForeignKeyConstraint(['uid'], ['users.uid'], ondelete='CASCADE'), {})
uid = Column(BigInteger, primary_key=True)
id_string = Column(String, primary_key=True)
parent_id_string = Column(String, ForeignKey('sync_entities.id_string'))
children = relation('SyncEntity',
primaryjoin=('sync_entities.c.id_string==sync_entities.c.parent_id_string'),
backref=backref('parent', \
remote_side=[id_string]))
# old_parent_id = ...
version = Column(BigInteger)
mtime = Column(BigInteger)
ctime = Column(BigInteger)
name = Column(String)
non_unique_name = Column(String)
sync_timestamp = Column(BigInteger)
server_defined_unique_tag = Column(String)
position_in_parent = Column(BigInteger)
insert_after_item_id = Column(String, ForeignKey('sync_entities.id_string'))
insert_after = relation('SyncEntity',
primaryjoin=('sync_entities.c.id_string==sync_entities.c.insert_after_item_id'),
remote_side=[id_string])
deleted = Column(Boolean)
originator_cache_guid = Column(String)
originator_client_item_id = Column(String)
specifics = Column(LargeBinary)
folder = Column(Boolean)
client_defined_unique_tag = Column(String)
ordinal_in_parent = Column(LargeBinary)
You know, primary key being an auto-incremented integer is usually the best approach. Any values that seem to be unique in system, may turn out to be duplicated in future. If you relied on their uniqueness you're in deep trouble.
However, if there is a reason to require certain pair (or triple) of values in each row to be unique, just add constraint to your table, but use auto-increment integer as primary key. Then if requirements change, you can alter/remove/relax your unique constraint without making changes elsewhere.
Also - if you're using simple integer keys, your joins are simpler and can be performed faster by DBMS.
I think I came up with a good idea. Just need to create complex foreign key constructs in the __tableargs__ member like (parent_id_string, uid) and (insert_after_item_id, uid), modifying the primaryjoin statements accordingly.