How to set up complex condition composite foreign key in SQLAlchemy - python

Here's my ORM entity class. The primary key is composite because 'id_string' may be the same for different users (identified by uid). When I create a table based on this class, Postgres raises this error:
ProgrammingError: (ProgrammingError) there is no unique constraint matching given keys for referenced table "sync_entities"
From that error I understand that I need to add something to the ForeignKey() argument of parent_id_string, and I think that something is the current record's uid.
Would you suggest switching to a different primary key (an auto-incrementing integer), or is there some other way?
class SyncEntity(Base):
    __tablename__ = 'sync_entities'
    __table_args__ = (ForeignKeyConstraint(['uid'], ['users.uid'], ondelete='CASCADE'), {})
    uid = Column(BigInteger, primary_key=True)
    id_string = Column(String, primary_key=True)
    parent_id_string = Column(String, ForeignKey('sync_entities.id_string'))
    children = relation('SyncEntity',
                        primaryjoin=('sync_entities.c.id_string==sync_entities.c.parent_id_string'),
                        backref=backref('parent',
                                        remote_side=[id_string]))
    # old_parent_id = ...
    version = Column(BigInteger)
    mtime = Column(BigInteger)
    ctime = Column(BigInteger)
    name = Column(String)
    non_unique_name = Column(String)
    sync_timestamp = Column(BigInteger)
    server_defined_unique_tag = Column(String)
    position_in_parent = Column(BigInteger)
    insert_after_item_id = Column(String, ForeignKey('sync_entities.id_string'))
    insert_after = relation('SyncEntity',
                            primaryjoin=('sync_entities.c.id_string==sync_entities.c.insert_after_item_id'),
                            remote_side=[id_string])
    deleted = Column(Boolean)
    originator_cache_guid = Column(String)
    originator_client_item_id = Column(String)
    specifics = Column(LargeBinary)
    folder = Column(Boolean)
    client_defined_unique_tag = Column(String)
    ordinal_in_parent = Column(LargeBinary)

Using an auto-incremented integer as the primary key is usually the best approach. Any values that seem to be unique in the system today may turn out to be duplicated in the future, and if you relied on their uniqueness you are in deep trouble.
However, if there is a reason to require a certain pair (or triple) of values to be unique in each row, just add a unique constraint to your table, but keep the auto-increment integer as the primary key. Then, if requirements change, you can alter, remove or relax the unique constraint without making changes elsewhere.
Also, if you're using simple integer keys, your joins are simpler and can be performed faster by the DBMS.
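As a rough illustration of that advice, here is a minimal sketch that uses the standard-library sqlite3 module rather than SQLAlchemy and Postgres, so it runs without extra dependencies. The table is reduced to the columns that matter, and the add_entity helper and the sample values are hypothetical names introduced only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE sync_entities (
        id INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        uid INTEGER NOT NULL,
        id_string TEXT NOT NULL,
        UNIQUE (uid, id_string)                -- business rule, can be relaxed later
    )
    """
)


def add_entity(uid, id_string):
    """Insert a row and return its surrogate id, or None if the pair already exists."""
    try:
        cur = conn.execute(
            "INSERT INTO sync_entities (uid, id_string) VALUES (?, ?)",
            (uid, id_string),
        )
    except sqlite3.IntegrityError:
        # The UNIQUE (uid, id_string) constraint rejected a duplicate pair.
        return None
    return cur.lastrowid


first = add_entity(1, "bookmark_bar")
same_string_other_user = add_entity(2, "bookmark_bar")
duplicate = add_entity(1, "bookmark_bar")
```

The same id_string is accepted for a different uid, while repeating an existing (uid, id_string) pair is rejected by the unique constraint rather than by the primary key.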

I think I came up with a good idea: create composite foreign key constraints in the __table_args__ member, such as (parent_id_string, uid) and (insert_after_item_id, uid), each referencing (id_string, uid), and modify the primaryjoin statements accordingly.
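To show what such a composite self-referencing foreign key enforces at the database level, here is a small sketch with the standard-library sqlite3 module. It is an assumption-laden stand-in for the Postgres table: only three columns are kept, and the insert_cross_user_child helper is a hypothetical name used for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute(
    """
    CREATE TABLE sync_entities (
        uid INTEGER NOT NULL,
        id_string TEXT NOT NULL,
        parent_id_string TEXT,
        PRIMARY KEY (uid, id_string),
        -- the parent must belong to the same uid as the child
        FOREIGN KEY (uid, parent_id_string)
            REFERENCES sync_entities (uid, id_string)
    )
    """
)
conn.execute("INSERT INTO sync_entities VALUES (1, 'root', NULL)")
conn.execute("INSERT INTO sync_entities VALUES (1, 'child', 'root')")


def insert_cross_user_child():
    """Try to attach a row of user 2 to a parent that only exists for user 1."""
    try:
        conn.execute("INSERT INTO sync_entities VALUES (2, 'child', 'root')")
    except sqlite3.IntegrityError:
        return False
    return True


cross_user_allowed = insert_cross_user_child()
```

Because the foreign key now references the full (uid, id_string) primary key, the "no unique constraint matching given keys" situation does not arise, and a child cannot point at another user's row.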

Related

SQLAlchemy many-to-many association querying specific child

In the case of many-to-many relationships, an association table can be used in the form of Association Object pattern.
I have the following setup of two classes having a M2M relationship through UserCouncil association table.
class Users(Base):
    name = Column(String, nullable=False)
    email = Column(String, nullable=False, unique=True)
    created_at = Column(DateTime, default=datetime.utcnow)
    password = Column(String, nullable=False)
    salt = Column(String, nullable=False)
    councils = relationship('UserCouncil', back_populates='user')
class Councils(Base):
    name = Column(String, nullable=False)
    created_at = Column(DateTime, default=datetime.utcnow)
    users = relationship('UserCouncil', back_populates='council')
class UserCouncil(Base):
    user_id = Column(UUIDType, ForeignKey(Users.id, ondelete='CASCADE'), primary_key=True)
    council_id = Column(UUIDType, ForeignKey(Councils.id, ondelete='CASCADE'), primary_key=True)
    role = Column(Integer, nullable=False)
    user = relationship('Users', back_populates='councils')
    council = relationship('Councils', back_populates='users')
However, in this situation, suppose I want to search for a council with a specific name cname for a given user user1. I can do the following:
for council in user1.councils:
    if council.name == cname:
        dosomething(council)
Or, alternatively, this:
session.query(UserCouncil) \
    .join(Councils) \
    .filter((UserCouncil.user_id == user1.id) & (Councils.name == cname)) \
    .first() \
    .council
While the second one is more similar to raw SQL queries and performs better, the first one is simpler. Is there any other, more idiomatic way of expressing this query which is better performing while also utilizing the relationship linkages instead of explicitly writing traditional joins?
First, I think even the query you give as an example might need another round trip to the database to load the UserCouncil.council relationship if it is not already loaded in memory.
Given that you want to find the Council instance directly from its .name and the User instance, that is exactly what you should ask for. Below is the query for that, with two options for filtering on user_id (you might be more familiar with the second option, so feel free to use it):
q = (
    select(Councils)
    .filter(Councils.name == councils_name)
    .filter(Councils.users.any(UserCouncil.user_id == user_id))  # v1: this does not require JOIN, but produces the same result as below
    # .join(UserCouncil).filter(UserCouncil.user_id == user_id)  # v2: join, very similar to original SQL
)
council = session.execute(q).scalars().first()
As to making it more simple and idiomatic, I can only suggest to wrap it in a method or property on the User instance:
class Users(...):
    ...
    def get_council_by_name(self, councils_name):
        q = (
            select(Councils)
            .filter(Councils.name == councils_name)
            .join(UserCouncil).filter(with_parent(self, Users.councils))
        )
        return object_session(self).execute(q).scalars().first()
so that you can later call it user.get_council_by_name('xxx')
Edit-1: added SQL queries
v1 of the first q query above will generate following SQL:
SELECT councils.id,
       councils.name
FROM councils
WHERE councils.name = :name_1
  AND (EXISTS
         (SELECT 1
          FROM user_councils
          WHERE councils.id = user_councils.council_id
            AND user_councils.user_id = :user_id_1
         )
      )
while v2 option will generate:
SELECT councils.id,
       councils.name
FROM councils
JOIN user_councils ON councils.id = user_councils.council_id
WHERE councils.name = :name_1
  AND user_councils.user_id = :user_id_1
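To check that the EXISTS form and the JOIN form select the same council, the two statements can be run side by side. The sketch below uses the standard-library sqlite3 module with integer ids instead of UUIDs; the table names follow the generated SQL, while the sample rows and the EXISTS_SQL / JOIN_SQL names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE councils (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE user_councils (
        user_id INTEGER NOT NULL REFERENCES users (id),
        council_id INTEGER NOT NULL REFERENCES councils (id),
        role INTEGER NOT NULL,
        PRIMARY KEY (user_id, council_id)
    );
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO councils VALUES (10, 'budget'), (11, 'budget'), (12, 'parks');
    INSERT INTO user_councils VALUES (1, 10, 0), (2, 11, 0), (1, 12, 1);
    """
)

# v1: correlated EXISTS subquery, no join in the outer query
EXISTS_SQL = """
SELECT councils.id, councils.name
FROM councils
WHERE councils.name = :name
  AND EXISTS (SELECT 1
              FROM user_councils
              WHERE councils.id = user_councils.council_id
                AND user_councils.user_id = :user_id)
"""

# v2: explicit join against the association table
JOIN_SQL = """
SELECT councils.id, councils.name
FROM councils
JOIN user_councils ON councils.id = user_councils.council_id
WHERE councils.name = :name
  AND user_councils.user_id = :user_id
"""

params = {"name": "budget", "user_id": 1}
exists_rows = conn.execute(EXISTS_SQL, params).fetchall()
join_rows = conn.execute(JOIN_SQL, params).fetchall()
```

Both statements return only council 10: council 11 has the same name but belongs to another user, and council 12 belongs to the user but has a different name.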

(Flask-)SQLAlchemy primary key issues probably due to implicit transactions

In a project using Flask-SQLAlchemy, I get some intermittent errors, and I think they might be due to not explicitly using transactions.
I have these two model classes, one for locations and another for closures:
class Location(db.Model):
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String)
    code = sa.Column(sa.String, unique=True)
class LocationPath(db.Model):
    ancestor_id = sa.Column(sa.Integer, sa.ForeignKey('location.id'), nullable=False, primary_key=True)
    descendant_id = sa.Column(sa.Integer, sa.ForeignKey('location.id'), nullable=False, primary_key=True)
    depth = sa.Column(sa.Integer, default=0, nullable=False)
In a background process, I'm doing a lot of inserts, so I'm bypassing the ORM to use Core:
location_table = Location.__table__
location_path_table = LocationPath.__table__
statement = select([location_table.c.id]).where(code == code)
result = db.session.get_bind().execute(statement)
location_id = result.first()
if location_id is None:
    statement = location_table.insert().values(**kwargs)
    result = db.session.get_bind().execute(statement)
    new_id = result.inserted_primary_key[0]
    result.close()
else:
    new_id = location_id
# save new_id as an ancestor_id or a descendant_id
path = LocationPath.query.filter_by(
    ancestor_id=ancestor_id,
    descendant_id=descendant_id
).first()
if path is None:
    statement = location_path_table.insert().values(
        ancestor_id=ancestor_id,
        descendant_id=descendant_id,
        depth=depth)
    # the line below intermittently generates either of two errors:
    # - the inserted primary key (ancestor/descendant) does not exist
    # - a duplicate key error where the path already exists
    result = db.session.get_bind().execute(statement)
This has resulted in quite a bit of head-scratching on my part, since I get the ancestor_id or descendant_id either from a select or an insert, and I also query the database to see if the path exists before attempting to insert it.
Edit: the code above runs in a loop.
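Two details in the snippet are worth checking before blaming transactions. First, where(code == code) compares a Python name with itself, so unless code is a column object the filter is simply True. Second, result.first() returns a row rather than a bare integer, so new_id = location_id stores a row object. Whether either of these causes the intermittent errors is an assumption, not something the question confirms. The sketch below shows the same get-or-create flow done over one connection inside one transaction, using the standard-library sqlite3 module. The get_or_create_location and link helpers and the sample codes are hypothetical, and INSERT OR IGNORE is SQLite syntax (other databases spell it differently).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE location (
        id INTEGER PRIMARY KEY,
        name TEXT,
        code TEXT UNIQUE
    );
    CREATE TABLE location_path (
        ancestor_id INTEGER NOT NULL REFERENCES location (id),
        descendant_id INTEGER NOT NULL REFERENCES location (id),
        depth INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (ancestor_id, descendant_id)
    );
    """
)


def get_or_create_location(code, name):
    """Return the integer id for `code`, inserting the row if it is missing."""
    row = conn.execute("SELECT id FROM location WHERE code = ?", (code,)).fetchone()
    if row is not None:
        return row[0]  # unpack the scalar; the row itself is a tuple
    cur = conn.execute(
        "INSERT INTO location (name, code) VALUES (?, ?)", (name, code)
    )
    return cur.lastrowid


def link(ancestor_code, descendant_code, depth):
    """Create both locations and the path between them in one transaction."""
    with conn:  # commits on success, rolls back if anything raises
        ancestor_id = get_or_create_location(ancestor_code, ancestor_code)
        descendant_id = get_or_create_location(descendant_code, descendant_code)
        conn.execute(
            "INSERT OR IGNORE INTO location_path (ancestor_id, descendant_id, depth) "
            "VALUES (?, ?, ?)",
            (ancestor_id, descendant_id, depth),
        )
    return ancestor_id, descendant_id


first = link("KE", "KE-NBO", 1)
second = link("KE", "KE-NBO", 1)  # repeating the call must not fail or duplicate
path_count = conn.execute("SELECT COUNT(*) FROM location_path").fetchone()[0]
```

The point of the design is that the lookup, the inserts and the existence check all see the same transaction state, so a row inserted a moment earlier cannot be invisible to the next statement.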

How to define a table without primary key with SQLAlchemy?

I have a table that does not have a primary key. And I really do not want to apply this constraint to this table.
In SQLAlchemy, I defined the table class by:
class SomeTable(Base):
    __table__ = Table('SomeTable', meta, autoload=True, autoload_with=engine)
When I try to query this table, I got:
ArgumentError: Mapper Mapper|SomeTable|SomeTable could not assemble any primary key columns for mapped table 'SomeTable'.
How can I lift the requirement that every table must have a primary key?
There is only one way that I know of to circumvent the primary key requirement in SQLAlchemy: map a specific column or columns of your table as the primary key, even if they aren't declared as a primary key in the database.
http://docs.sqlalchemy.org/en/latest/faq/ormconfiguration.html#how-do-i-map-a-table-that-has-no-primary-key
There is no proper solution for this, but there are workarounds:
Workaround 1
Add the primary_key parameter to an existing column, even though that column is not a primary key in the database.
class SomeTable(Base):
    __tablename__ = 'some_table'
    some_other_already_existing_column = Column(..., primary_key=True)  # mark it as primary key whether or not the database column is one
Workaround 2
Just declare a new dummy column on the ORM layer, not in the actual DB. Define it only in the SQLAlchemy model:
class SomeTable(Base):
    __tablename__ = 'some_table'
    column_not_exist_in_db = Column(Integer, primary_key=True)  # added only to avoid this error; don't add it to the DB
Disclaimer: Oracle only
Oracle databases secretly store something called rowid to uniquely identify each record in a table, even if the table doesn't have a primary key. I solved my lack-of-primary-key problem (which I did not cause!) by constructing my ORM object like this:
class MyTable(Base):
    __tablename__ = 'stupid_poorly_designed_table'
    rowid = Column(String, primary_key=True)
    column_a = Column(String)
    column_b = Column(String)
    ...
You can see what rowid actually looks like (it's a hex value I believe) by running
SELECT rowid FROM stupid_poorly_designed_table
GO
Here is an example using __mapper_args__ and a synthetic primary_key. Because the table holds time-series data, there is no need for a primary key in the database: every row can be uniquely addressed by a (pair, timestamp) tuple.
class Candle(Base):
    __tablename__ = "ohlvc_candle"
    __table_args__ = (
        sa.UniqueConstraint('pair_id', 'timestamp'),
    )
    #: Start second of the candle
    timestamp = sa.Column(sa.TIMESTAMP(timezone=True), nullable=False)
    open = sa.Column(sa.Float, nullable=False)
    close = sa.Column(sa.Float, nullable=False)
    high = sa.Column(sa.Float, nullable=False)
    low = sa.Column(sa.Float, nullable=False)
    volume = sa.Column(sa.Float, nullable=False)
    pair_id = sa.Column(sa.ForeignKey("pair.id"), nullable=False)
    pair = orm.relationship(Pair,
                            backref=orm.backref("candles",
                                                lazy="dynamic",
                                                cascade="all, delete-orphan",
                                                single_parent=True))
    __mapper_args__ = {
        "primary_key": [pair_id, timestamp]
    }
MSSQL Tested
I know this thread is ancient, but I spent way too long getting this to work not to share it :)
import logging
from sqlalchemy import Table, event
from sqlalchemy.ext.compiler import compiles
from sqlalchemy import Column
from sqlalchemy import Integer
class RowID(Column):
    pass
@compiles(RowID)
def compile_mycolumn(element, compiler, **kw):
    return "row_number() OVER (ORDER BY (SELECT NULL))"
@event.listens_for(Table, "after_parent_attach")
def after_parent_attach(target, parent):
    if not target.primary_key:
        # if no pkey create our own one based on returned rowid
        # this is untested for writing stuff - likely wont work
        logging.info("No pkey defined for table, using rownumber %s", target)
        target.append_column(RowID('row_id', Integer, primary_key=True))
https://docs-sqlalchemy-org.translate.goog/en/14/faq/ormconfiguration.html?_x_tr_sl=auto&_x_tr_tl=ru&_x_tr_hl=ru#how-do-i-map-a-table-that-has-no-primary-key
One way from there:
In SQLAlchemy ORM, to map to a specific table, there must be at least one column designated as the primary key column; multi-column composite primary keys are of course also perfectly possible. These columns do not need to be known to the database as primary key columns. The columns only need to behave like a primary key, such as a non-nullable unique identifier for a row.
My code:
from ..meta import Base, Column, Integer, Date
class ActiveMinutesByDate(Base):
    __tablename__ = "user_computer_active_minutes_by_date"
    user_computer_id = Column(Integer(), nullable=False, primary_key=True)
    user_computer_date_check = Column(Date(), default=None, primary_key=True)
    user_computer_active_minutes = Column(Integer(), nullable=True)
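Since the mapped columns only need to behave like a primary key, it can be worth verifying that the chosen columns really are non-null and unique in the existing data before mapping them. The following is a minimal sketch of such a check with the standard-library sqlite3 module. The table, its rows and the behaves_like_primary_key helper are hypothetical, and the identifiers are interpolated directly into SQL, so they must come from trusted code, never from user input.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE active_minutes (   -- note: no PRIMARY KEY declared
        user_computer_id INTEGER,
        date_check TEXT,
        active_minutes INTEGER
    );
    INSERT INTO active_minutes VALUES
        (1, '2023-01-01', 30),
        (1, '2023-01-02', 45),
        (2, '2023-01-01', 10);
    """
)


def behaves_like_primary_key(table, columns):
    """True if `columns` are never NULL and no combination of values repeats."""
    null_check = " OR ".join(f"{col} IS NULL" for col in columns)
    nulls = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {null_check}"
    ).fetchone()[0]
    # Count the groups of identical key values that occur more than once.
    duplicates = conn.execute(
        f"SELECT COUNT(*) FROM (SELECT 1 FROM {table} "
        f"GROUP BY {', '.join(columns)} HAVING COUNT(*) > 1)"
    ).fetchone()[0]
    return nulls == 0 and duplicates == 0


pair_ok = behaves_like_primary_key("active_minutes", ["user_computer_id", "date_check"])
single_ok = behaves_like_primary_key("active_minutes", ["user_computer_id"])
```

Here the (user_computer_id, date_check) pair qualifies, while user_computer_id alone does not, because the same computer appears on two dates.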
The solution I found is to add an auto-incrementing primary key column to the table, then use that as your primary key. The database should deal with everything else beyond that.

SQLAlchemy mapping joined tables' columns to one object

I have three tables: UserTypeMapper, User, and SystemAdmin. In my get_user method, depending on the UserTypeMapper.is_admin row, I then query either the User or SystemAdmin table. The user_id row correlates to the primary key id in the User and SystemAdmin tables.
class UserTypeMapper(Base):
    __tablename__ = 'user_type_mapper'
    id = Column(BigInteger, primary_key=True)
    is_admin = Column(Boolean, default=False)
    user_id = Column(BigInteger, nullable=False)
class SystemAdmin(Base):
    __tablename__ = 'system_admin'
    id = Column(BigInteger, primary_key=True)
    name = Column(Unicode)
    email = Column(Unicode)
class User(Base):
    __tablename__ = 'user'
    id = Column(BigInteger, primary_key=True)
    name = Column(Unicode)
    email = Column(Unicode)
I want to be able to get any user – system admin or regular user – from one query, so I do a join, on either User or SystemAdmin depending on the is_admin row. For example:
DBSession.query(UserTypeMapper, SystemAdmin).join(SystemAdmin, UserTypeMapper.user_id==SystemAdmin.id).first()
and
DBSession.query(UserTypeMapper, User).join(User, UserTypeMapper.user_id==User.id).first()
This works fine; however, I would then like to be able to access these like so:
>>> my_admin_obj.is_admin
True
>>> my_admin_obj.name
Bob Smith
versus
>>> my_user_obj.is_admin
False
>>> my_user_obj.name
Bob Stevens
Currently, I have to specify: my_user_obj.UserTypeMapper.is_admin and my_user_obj.User.name. From what I've been reading, I need to map the tables so that I don't need to specify which table the attribute belongs to. My problem is that I do not understand how I can specify this given that I have two potential tables that the name attribute, for example, may come from.
This is the example I am referring to: Mapping a Class against Multiple Tables
How can I achieve this? Thank you.
You have discovered why the "dual purpose foreign key" is an antipattern.
There is a related problem that you haven't quite pointed out: there's no way to use a foreign key constraint to enforce that the data is in a valid state. You want to be sure that there's exactly one of something for each row in UserTypeMapper, but that 'something' is not any one table. Formally, you want a functional dependence on
user_type_mapper → (system_admin × 1) ∪ (user × 0)
But most SQL databases won't allow you to write a foreign key constraint expressing that.
It looks complicated because it is complicated.
Instead, let's consider what we really want to say: "every system_admin should be a user", or
system_admin → user
In SQL, that would be written:
CREATE TABLE user (
    id INTEGER PRIMARY KEY,
    name VARCHAR,
    email VARCHAR
);
CREATE TABLE system_admin (
    user_id INTEGER PRIMARY KEY REFERENCES user(id)
);
Or, in sqlalchemy declarative style
class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    email = Column(String)
class SystemAdmin(Base):
    __tablename__ = 'system_admin'
    user_id = Column(ForeignKey(User.id), primary_key=True)
What sort of questions does this schema allow us to ask?
"Is there a SystemAdmin by the name of 'john doe'"?
>>> print(session.query(User).join(SystemAdmin).filter(User.name == 'john doe').exists())
EXISTS (SELECT 1
FROM "user" JOIN system_admin ON "user".id = system_admin.user_id
WHERE "user".name = :name_1)
"How many users are there? How many sysadmins?"
>>> print(session.query(func.count(User.id), func.count(SystemAdmin.user_id)).outerjoin(SystemAdmin))
SELECT count("user".id) AS count_1, count(system_admin.user_id) AS count_2
FROM "user" LEFT OUTER JOIN system_admin ON "user".id = system_admin.user_id
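To make the "every system_admin is a user" schema concrete, here is a small sketch using the standard-library sqlite3 module. The sample rows and the add_admin helper are hypothetical. It shows that name and an is_admin flag come back together from one outer join, and that the foreign key rejects an admin row with no matching user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript(
    """
    CREATE TABLE "user" (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
    CREATE TABLE system_admin (
        user_id INTEGER PRIMARY KEY REFERENCES "user" (id)
    );
    INSERT INTO "user" VALUES (1, 'john doe', 'john@example.com'),
                              (2, 'jane roe', 'jane@example.com');
    INSERT INTO system_admin VALUES (1);
    """
)

# One row per user, with a flag derived from the presence of an admin row.
rows = conn.execute(
    """
    SELECT "user".name, system_admin.user_id IS NOT NULL AS is_admin
    FROM "user" LEFT OUTER JOIN system_admin ON "user".id = system_admin.user_id
    ORDER BY "user".id
    """
).fetchall()


def add_admin(user_id):
    """Try to promote a user; fails if no such user exists."""
    try:
        conn.execute("INSERT INTO system_admin VALUES (?)", (user_id,))
    except sqlite3.IntegrityError:
        return False
    return True


orphan_admin_allowed = add_admin(99)
```

With this layout the database itself guarantees the invariant that the dual-purpose user_id column could not express.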
I hope you can see why the above is preferable to the design you describe in your question. On the off chance that you don't have a choice (and only in that case; if you still feel what you've got is better, please refine your question), you can still cram that data into a single Python object, which will be very difficult to work with, by providing an alternate mapping to the tables, specifically one which follows the rough structure of the first equation.
We need to mention UserTypeMapper twice, once for each side of the union; for that, we need aliases.
>>> from sqlalchemy.orm import aliased
>>> utm1 = aliased(UserTypeMapper)
>>> utm2 = aliased(UserTypeMapper)
For the union bodies, join each alias to the appropriate table. Since SystemAdmin and User have the same columns in the same order, we don't need to describe them in detail; if they were at all different, we would need to make them "union compatible" by mentioning each column explicitly. This is left as an exercise.
>>> utm_sa = Query([utm1, SystemAdmin]).join(SystemAdmin, (utm1.user_id == SystemAdmin.id) & (utm1.is_admin == True))
>>> utm_u = Query([utm2, User]).join(User, (utm2.user_id == User.id) & (utm2.is_admin == False))
And then we join them together...
>>> print(utm_sa.union(utm_u))
SELECT anon_1.user_type_mapper_1_id AS anon_1_user_type_mapper_1_id, anon_1.user_type_mapper_1_is_admin AS anon_1_user_type_mapper_1_is_admin, anon_1.user_type_mapper_1_user_id AS anon_1_user_type_mapper_1_user_id, anon_1.system_admin_id AS anon_1_system_admin_id, anon_1.system_admin_name AS anon_1_system_admin_name, anon_1.system_admin_email AS anon_1_system_admin_email
FROM (SELECT user_type_mapper_1.id AS user_type_mapper_1_id, user_type_mapper_1.is_admin AS user_type_mapper_1_is_admin, user_type_mapper_1.user_id AS user_type_mapper_1_user_id, system_admin.id AS system_admin_id, system_admin.name AS system_admin_name, system_admin.email AS system_admin_email
FROM user_type_mapper AS user_type_mapper_1 JOIN system_admin ON user_type_mapper_1.user_id = system_admin.id AND user_type_mapper_1.is_admin = 1 UNION SELECT user_type_mapper_2.id AS user_type_mapper_2_id, user_type_mapper_2.is_admin AS user_type_mapper_2_is_admin, user_type_mapper_2.user_id AS user_type_mapper_2_user_id, "user".id AS user_id, "user".name AS user_name, "user".email AS user_email
FROM user_type_mapper AS user_type_mapper_2 JOIN "user" ON user_type_mapper_2.user_id = "user".id AND user_type_mapper_2.is_admin = 0) AS anon_1
While it's theoretically possible to wrap this all up into a Python class that looks a bit like standard SQLAlchemy ORM stuff, I would certainly not do that. Working with non-table mappings, especially when they are more than simple joins (this is a union), is lots of work for zero payoff.

sqlalchemy / table setup

I have items, warehouses, and items are in warehouses.
So I have table that has information about items (sku, description, cost ...) and a table that describes warehouses(location, code, name, ...). Now I need a way to store inventory so that I know I have X items in warehouse Y. An item can be in any warehouse.
How would I go about setting up the relationship between them and storing the qty?
class Item(DeclarativeBase):
    __tablename__ = 'items'
    item_id = Column(Integer, primary_key=True, autoincrement=True)
    item_code = Column(Unicode(35), unique=True)
    item_description = Column(Unicode(100))
    item_long_description = Column(Unicode())
    item_cost = Column(Numeric(precision=13, scale=4))
    item_list = Column(Numeric(precision=13, scale=2))
    def __init__(self, code, description, cost, list):
        self.item_code = code
        self.item_description = description
        self.item_cost = cost
        self.item_list = list
class Warehouse(DeclarativeBase):
    __tablename__ = 'warehouses'
    warehouse_id = Column(Integer, primary_key=True, autoincrement=True)
    warehouse_code = Column(Unicode(15), unique=True)
    warehouse_description = Column(Unicode(55))
If I am correct, I would set up the many-to-many relationship using an intermediate table, something like:
item_warehouse = Table(
    'item_warehouse', Base.metadata,
    Column('item_id', Integer, ForeignKey('items.item_id')),
    Column('warehouse_id', Integer, ForeignKey('warehouses.warehouse_id'))
)
But I would need to store the quantity available in this table, and since it's not its own class I am not sure how that would work.
What would be the "best" practice for modeling this and making it usable in my app?
Model:
As mentioned by @Lafada, you need an Association Object. As such, I would create a SA-persistent object and not only a table:
class ItemWarehouse(Base):
    # version-1:
    __tablename__ = 'item_warehouse'
    __table_args__ = (PrimaryKeyConstraint('item_id', 'warehouse_id', name='ItemWarehouse_PK'),)
    # version-2:
    #__table_args__ = (UniqueConstraint('item_id', 'warehouse_id', name='ItemWarehouse_PK'),)
    #id = Column(Integer, primary_key=True, autoincrement=True)
    # other columns
    item_id = Column(Integer, ForeignKey('items.id'), nullable=False)
    warehouse_id = Column(Integer, ForeignKey('warehouses.id'), nullable=False)
    quantity = Column(Integer, default=0)
This covers the model requirement with the following:
added a PrimaryKey
added a UniqueConstraint covering the (item_id, warehouse_id) pairs.
In the code above this is solved in two ways:
version-1: uses composite primary key (which must be unique)
version-2: uses simple primary key, but also adds an explicit unique constraint [I personally prefer this option]
Relationship: Association Object
Now you can use the Association Object as is, which will look similar to this:
w = Warehouse(...)
i = Item(name="kindle", price=...)
iw = ItemWarehouse(quantity=50)
iw.item = i
w.items.append(iw)
Relationship: Association Proxy extension
or, you could go one step further and use the Composite Association Proxies example, and you may configure dictionary-like access to the association object similar to this:
w = Warehouse(...)
i = Item(name="kindle", price=...)
w[i] = 50 # sets the quantity to 50 of item _i_ in warehouse _w_
i[w] = 50 # same as above, if you configure it symmetrically
Beware: the code for the relationship definitions might not be easy to read, but the usage pattern is really nice. So if this option is too much to digest, I would start with the Association Object, maybe with some helper functions to add/get/update the item stocks, and eventually move to the Association Proxy extension.
You have to use an "Association Object".
To give you a hint for your problem: create a table like the one you mention in your question, with an extra column for the quantity:
item_warehouse = Table(
    'item_warehouse',
    Base.metadata,
    Column('item_id', Integer, ForeignKey('items.item_id')),
    Column('warehouse_id', Integer, ForeignKey('warehouses.warehouse_id')),
    Column('qty', Integer, default=0),
)
Now you can store the warehouse, item and qty in a single row, and you have to write a method which takes a warehouse_id and an item_id and returns the sum of qty for those items.
Hope this will help you to solve your problem.
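As a rough sketch of the helper functions mentioned in the answers, the following uses the standard-library sqlite3 module instead of SQLAlchemy. The column names follow the question's models, while add_stock, total_stock and the sample data are hypothetical names chosen for the example. The composite primary key keeps one row per (item, warehouse) pair, so quantities are updated in place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE items (item_id INTEGER PRIMARY KEY, item_code TEXT UNIQUE);
    CREATE TABLE warehouses (warehouse_id INTEGER PRIMARY KEY, warehouse_code TEXT UNIQUE);
    CREATE TABLE item_warehouse (
        item_id INTEGER NOT NULL REFERENCES items (item_id),
        warehouse_id INTEGER NOT NULL REFERENCES warehouses (warehouse_id),
        qty INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (item_id, warehouse_id)
    );
    INSERT INTO items VALUES (1, 'KINDLE');
    INSERT INTO warehouses VALUES (1, 'EAST'), (2, 'WEST');
    """
)


def add_stock(item_id, warehouse_id, qty):
    """Increase the stored quantity, creating the row on first use."""
    with conn:  # one transaction per stock movement
        cur = conn.execute(
            "UPDATE item_warehouse SET qty = qty + ? "
            "WHERE item_id = ? AND warehouse_id = ?",
            (qty, item_id, warehouse_id),
        )
        if cur.rowcount == 0:
            conn.execute(
                "INSERT INTO item_warehouse (item_id, warehouse_id, qty) "
                "VALUES (?, ?, ?)",
                (item_id, warehouse_id, qty),
            )


def total_stock(item_id):
    """Sum the quantity of one item across all warehouses."""
    row = conn.execute(
        "SELECT COALESCE(SUM(qty), 0) FROM item_warehouse WHERE item_id = ?",
        (item_id,),
    ).fetchone()
    return row[0]


add_stock(1, 1, 50)
add_stock(1, 1, 5)
add_stock(1, 2, 20)
east_qty = conn.execute(
    "SELECT qty FROM item_warehouse WHERE item_id = 1 AND warehouse_id = 1"
).fetchone()[0]
```

In an ORM setting the same two operations would live as methods on the association class or as small service functions around it.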
