How to Add rows using subqueries in sqlalchemy? - python

I'm using Postgresql with SQLAlchemy but it seems sqlalchemy is having trouble adding rows when using subqueries.
In my example, I want to update a counter for a specific tag in a table.
In SqlAlchemy a test run class would look like the following:
class TestRun( base ):
    __tablename__ = 'test_runs'
    id = sqlalchemy.Column( 'id', sqlalchemy.Integer, sqlalchemy.Sequence('user_id_seq'), primary_key=True )
    tag = sqlalchemy.Column( 'tag', sqlalchemy.String )
    counter = sqlalchemy.Column( 'counter', sqlalchemy.Integer )
The insertion code should then look like the following:
tag = 'sampletag'
counterquery = session.query(sqlalchemy.func.coalesce(sqlalchemy.func.max(TestRun.counter),0) + 1).\
    filter(TestRun.tag == tag).\
    subquery()
testrun = TestRun()
testrun.tag = tag
testrun.counter = counterquery
session.add( testrun )
session.commit()
The problem is that running this code produces an interesting error. It tries to run the following SQL query:
'INSERT INTO test_runs (id, tag, counter)
VALUES (%(id)s,
%(tag)s,
SELECT coalesce(max(test_runs.counter), %(param_1)s) + %(coalesce_1)s AS anon_1
FROM test_runs
WHERE test_runs.tag = %(tag_1)s)'
{'coalesce_1': 1, 'param_1': 0, 'tag_1': 'mytag', 'tag': 'mytag', 'id': 267L}
This looks reasonable, except that it's missing parentheses around the SELECT. When I run the SQL query manually, it gives me exactly the same error that SQLAlchemy gives me, until I add the parentheses by hand, which fixes everything. It seems unlikely that SQLAlchemy would forget to add parentheses when they are needed, so my question is: am I missing a function for using subqueries correctly when adding rows with SQLAlchemy?

Instead of subquery(), call the as_scalar() method (named scalar_subquery() in SQLAlchemy 1.4 and later). From its documentation:
Return the full SELECT statement represented by this Query, converted
to a scalar subquery.
Example:
Models with a classic parent-child relationship:
class Parent(Base):
    __tablename__ = 'parents'
    id = Column(Integer, primary_key=True)
    counter = Column(Integer, nullable=False, default=0)
class Child(Base):
    __tablename__ = 'children'
    id = Column(Integer, primary_key=True)
    parent_id = Column(ForeignKey(Parent.id), nullable=False)
    parent = relationship(Parent)
Code to update counter field:
parent.counter = session.query(func.count(Child.id))\
    .filter_by(parent=parent).as_scalar()
Produced SQL (copied from the log):
UPDATE parents SET counter=(SELECT count(children.id) AS count_1
FROM children
WHERE ? = children.parent_id) WHERE parents.id = ?
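To check the SQL side of this independently of SQLAlchemy, here is a minimal sketch using the standard-library sqlite3 module. SQLite is only a stand-in for the PostgreSQL database in the question (an assumption), and the helper name add_run is made up for illustration. It shows that the scalar subquery is accepted inside VALUES once it is wrapped in parentheses, and rejected by the parser when it is not.

```python
import sqlite3

# In-memory SQLite database standing in for the PostgreSQL table in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_runs (id INTEGER PRIMARY KEY, tag TEXT, counter INTEGER)"
)


def add_run(conn, tag):
    """Insert a row whose counter is one more than the current max for this tag."""
    conn.execute(
        "INSERT INTO test_runs (tag, counter) "
        "VALUES (?, (SELECT coalesce(max(counter), 0) + 1 "
        "FROM test_runs WHERE tag = ?))",
        (tag, tag),
    )


add_run(conn, "sampletag")
add_run(conn, "sampletag")
add_run(conn, "othertag")
rows = conn.execute("SELECT tag, counter FROM test_runs ORDER BY id").fetchall()

# The same statement without the inner parentheses is a syntax error.
try:
    conn.execute(
        "INSERT INTO test_runs (tag, counter) "
        "VALUES (?, SELECT coalesce(max(counter), 0) + 1 "
        "FROM test_runs WHERE tag = ?)",
        ("sampletag", "sampletag"),
    )
    unparenthesized_ok = True
except sqlite3.OperationalError:
    unparenthesized_ok = False
```

The parenthesized form is what a scalar subquery renders to, which is why switching from subquery() to as_scalar() fixes the generated INSERT.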

Related

SQLAlchemy: how to detect/suppress duplicate JOIN clauses?

Consider the following example code (using SQLAlchemy 1.4):
import os
from sqlalchemy import Column, ForeignKey, Integer, String, select
from sqlalchemy.orm import backref, declarative_base, relationship
Base = declarative_base()
class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True, autoincrement=True)
    name = Column(String)
    type = Column(String)
class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True, autoincrement=True)
    name = Column(String)
    parent_id = Column(Integer, ForeignKey("parent.id"))
    parent = relationship("Parent", backref=backref("children"))
def select_parent_name(statement):
    return statement.join(Child.parent).add_columns(Parent.name)
def filter_by_parent_name(statement, parent_name):
    return statement.join(Child.parent).where(Parent.name == parent_name)
def build_query():
    statement = select(Child.id)
    if os.getenv("SELECT_PARENT_NAME", True):
        statement = select_parent_name(statement)
    if os.getenv("FILTER_BY_PARENT", True):
        statement = filter_by_parent_name(statement, "foo")
    return statement
if __name__ == "__main__":
    print(str(build_query()))
This produces invalid SQL, with the same JOIN clause represented twice:
SELECT child.id, parent.name
FROM child JOIN parent ON parent.id = child.parent_id JOIN parent ON parent.id = child.parent_id
WHERE parent.name = :name_1
If executed, it will result in:
(MySQLdb._exceptions.OperationalError) (1066, "Not unique table/alias: 'parent'")
It's a stripped-down trivial example, but the point I'm trying to demonstrate is that I'm building up an SQL statement by passing it around to different functions, each of which has a different responsibility and might require the addition of a JOIN which could have already been added to the statement.
Is there an easy way to suppress duplicate JOINs like this? Or to inspect the statement to see if the redundant JOIN is already present? Ideally this information would be easily determined from the statement object itself, rather than having to maintain and pass around that state separately.
In SQLAlchemy>=1.4 the joined tables can be found in statement._setup_joins (a private attribute, so it may change between releases):
joined_tables = [joins[0].parent.entity for joins in statement._setup_joins]
For SQLAlchemy<1.4 the joined tables can be found in statement._join_entities (also private):
joined_tables = [mapper.class_ for mapper in statement._join_entities]
Reference for SQLAlchemy<1.4: Can I inspect a sqlalchemy query object to find the already joined tables?
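If relying on private attributes is a concern, another option is to let the join record travel with the statement in a small wrapper. The sketch below is only one possible design, not part of SQLAlchemy: JoinOnce and FakeStatement are hypothetical names, and FakeStatement is a stand-in for a Select so the example runs without a database. The only thing assumed about the wrapped object is that it has a generative join() method that returns a new statement.

```python
class JoinOnce:
    """Wrap a statement and apply each join at most once.

    Joins are identified by a caller-chosen key (for example the string
    "Child.parent"), so helper functions can request a join without
    knowing whether another helper already added it.
    """

    def __init__(self, statement, joined=frozenset()):
        self.statement = statement
        self.joined = frozenset(joined)

    def join(self, key, *join_args, **join_kwargs):
        # Skip the join when this key has been applied before.
        if key in self.joined:
            return self
        new_statement = self.statement.join(*join_args, **join_kwargs)
        return JoinOnce(new_statement, self.joined | {key})

    def apply(self, fn, *args, **kwargs):
        # Run any other transformation while keeping the join record.
        return JoinOnce(fn(self.statement, *args, **kwargs), self.joined)


class FakeStatement:
    """Hypothetical stand-in for a Select; it only records join() calls."""

    def __init__(self, joins=()):
        self.joins = tuple(joins)

    def join(self, target):
        return FakeStatement(self.joins + (target,))


wrapped = JoinOnce(FakeStatement())
wrapped = wrapped.join("Child.parent", "Child.parent")
wrapped = wrapped.join("Child.parent", "Child.parent")  # ignored, already joined
applied_joins = wrapped.statement.joins
```

With a real statement, the helpers from the question would take and return the wrapper, calling something like wrapped.join("Child.parent", Child.parent) and using apply() for add_columns or where.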

SQLAlchemy: Query filter including extra data from association table

I am trying to build an ORM mapped SQLite database. The conception of the DB seems to work as intended but I can't seem to be able to query it properly for more complex cases. I have spent the day trying to find an existing answer to my question but nothing works. I am not sure if the issue is with my mapping, my query or both. Or if maybe querying with attributes from a many to many association table with extra data works differently.
This is the DB setup:
engine = create_engine('sqlite:///')
Base = declarative_base(bind=engine)
Session = sessionmaker(bind=engine)
class User(Base):
    __tablename__ = 'users'
    # Columns
    id = Column('id', Integer, primary_key=True)
    first = Column('first_name', String(100))
    last = Column('last_name', String(100))
    age = Column('age', Integer)
    quality = Column('quality', String(100))
    unit = Column('unit', String(100))
    # Relationships
    cases = relationship('UserCaseLink', back_populates='user_data')
    def __repr__(self):
        return f"<User(first='{self.first}', last='{self.last}', quality='{self.quality}', unit='{self.unit}')>"
class Case(Base):
    __tablename__ = 'cases'
    # Columns
    id = Column('id', Integer, primary_key=True)
    num = Column('case_number', String(100))
    type = Column('case_type', String(100))
    # Relationships
    users = relationship('UserCaseLink', back_populates='case_data')
    def __repr__(self):
        return f"<Case(num='{self.num}', type='{self.type}')>"
class UserCaseLink(Base):
    __tablename__ = 'users_cases'
    # Columns
    user_id = Column('user_id', Integer, ForeignKey('users.id'), primary_key=True)
    case_id = Column('case_id', Integer, ForeignKey('cases.id'), primary_key=True)
    role = Column('role', String(100))
    # Relationships
    user_data = relationship('User', back_populates='cases')
    case_data = relationship('Case', back_populates='users')
if __name__ == '__main__':
    Base.metadata.create_all()
    session = Session()
I would like to retrieve all the cases that a particular person is working on under a certain role.
For example, I want a list of all the cases that a person named 'Alex' is working on as an 'Administrator'.
In other words I would like the result of this query:
SELECT [cases].*,
[main].[users_cases].role
FROM [main].[cases]
INNER JOIN [main].[users_cases] ON [main].[cases].[id] = [main].[users_cases].[case_id]
INNER JOIN [main].[users] ON [main].[users].[id] = [main].[users_cases].[user_id]
WHERE [main].[users].[first_name] = 'Alex'
AND [main].[users_cases].[role] = 'Administrator';
So far I have tried many things along the lines of:
cases = session.query(Case).filter(User.first == 'Alex', UserCaseLink.role == 'Administrator')
but it is not working as I would like it to.
How can I modify the ORM mapping so that it does the joining for me and allows me to query easily (something like the query I tried)?
According to your classes, the equivalent query for:
SELECT [cases].*,
[main].[users_cases].role
FROM [main].[cases]
INNER JOIN [main].[users_cases] ON [main].[cases].[id] = [main].[users_cases].[case_id]
INNER JOIN [main].[users] ON [main].[users].[id] = [main].[users_cases].[user_id]
WHERE [main].[users].[first_name] = 'Alex'
AND [main].[users_cases].[role] = 'Administrator';
is
cases = session.query(
    Case.id, Case.num, Case.type,
    UserCaseLink.role
).filter(
    (Case.id == UserCaseLink.case_id)
    & (User.id == UserCaseLink.user_id)
    & (User.first == 'Alex')
    & (UserCaseLink.role == 'Administrator')
).all()
Alternatively, you can write the joins explicitly:
cases = session.query(Case)\
    .join(UserCaseLink, Case.id == UserCaseLink.case_id)\
    .join(User, User.id == UserCaseLink.user_id)\
    .filter((User.first == 'Alex') & (UserCaseLink.role == 'Administrator'))\
    .all()
Good luck.
After comment
Based on your comment, I think you want something like:
cases = session.query(Case)\
    .filter((Case.users.user_data.first == 'Alex') & (Case.users.role == 'Administrator'))\
    .all()
where users connects Case to UserCaseLink and user_data connects UserCaseLink to User, as in your relationships.
But that raises an error of this form:
AttributeError: Neither 'InstrumentedAttribute' object nor 'Comparator' object associated with dimpor.org_type has an attribute 'org_type_id'
The message shows that attributes used in filter() must belong directly to the mapped class; they cannot be reached by chaining through relationship attributes.
So I ended up having to compromise.
It seems the query cannot be aware of all the relationships present in the ORM mapping at all times. Instead I had to manually give it the path between the different classes for it to find all the data I wanted:
cases = session.query(Case)\
    .join(Case.users)\
    .join(UserCaseLink.user_data)\
    .filter(User.first == 'Alex', UserCaseLink.role == 'Administrator')\
    .all()
However, as it does not meet all the criteria of my original question (i.e., I still have to specify the joins), I will not mark this answer as the accepted one.
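To see at the SQL level why the joins have to be spelled out, here is a small sketch with the standard-library sqlite3 module. The table and column names follow the mapping in the question, but the sample rows are invented for illustration. When three tables are listed in FROM with no join conditions, which is what a filter-only query renders to, the result is a Cartesian product. With the two joins in place only the matching case comes back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT);
CREATE TABLE cases (id INTEGER PRIMARY KEY, case_number TEXT, case_type TEXT);
CREATE TABLE users_cases (
    user_id INTEGER REFERENCES users(id),
    case_id INTEGER REFERENCES cases(id),
    role TEXT,
    PRIMARY KEY (user_id, case_id)
);
-- Invented sample data.
INSERT INTO users VALUES (1, 'Alex'), (2, 'Sam');
INSERT INTO cases VALUES (10, 'C-10', 'civil'), (11, 'C-11', 'criminal');
INSERT INTO users_cases VALUES
    (1, 10, 'Administrator'),
    (1, 11, 'Reviewer'),
    (2, 11, 'Administrator');
""")

params = ("Alex", "Administrator")

# No join conditions: every case is paired with every matching
# users row and every matching users_cases row.
unjoined = conn.execute("""
SELECT cases.case_number
FROM cases, users, users_cases
WHERE users.first_name = ? AND users_cases.role = ?
""", params).fetchall()

# Explicit joins, as in the target SQL from the question.
joined = conn.execute("""
SELECT cases.case_number, users_cases.role
FROM cases
JOIN users_cases ON cases.id = users_cases.case_id
JOIN users ON users.id = users_cases.user_id
WHERE users.first_name = ? AND users_cases.role = ?
""", params).fetchall()
```

Here joined holds the single case Alex administers, while unjoined contains four rows because nothing ties the three tables together.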

Is there a way to populate one table from multiple on SQLAlchemy

I'm trying to build a database with SQLAlchemy. My problem is that I have two tables with the same column names, and I'm trying to populate a third table from the other two. A simple diagram below illustrates this:
I usually set the foreign key on one table and the relationship on the other, like this:
class TableA(Base):
    __tablename__ = "tableA"
    id = Column(Integer, primary_key=True)
    name = Column(String(100))
    age = Column(Integer)
    name_relation = relationship("TableC", backref='owner')
class TableC(Base):
    __tablename__ = "tableC"
    id = Column(Integer, primary_key=True)
    name = Column(String(100), ForeignKey('tableA.name'))
    age = Column(Integer)
You can see that this method only works with two tables, because the ForeignKey on tableC's name column points specifically at tableA.
Is there a way to do this with both source tables?
Thanks
In SQL, the query you'd be looking for is
INSERT INTO C (id, name, age) (
SELECT *
FROM A
UNION ALL
SELECT *
FROM B
)
As per this answer, the SQLAlchemy equivalent is as follows (insert() and select() are methods of the underlying Table, reached through __table__ on a declarative class):
session = Session()
query = session.query(TableA).union_all(session.query(TableB))
stmt = TableC.__table__.insert().from_select(['id', 'name', 'age'], query)
or equivalently
stmt = TableC.__table__.insert().from_select(
    ['id', 'name', 'age'],
    TableA.__table__.select().union_all(TableB.__table__.select())
)
You can then execute it using connection.execute(stmt) or session.execute(stmt), depending on what you're using.
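One thing to watch for with INSERT ... SELECT over a UNION ALL is the primary key: if both source tables number their rows from 1, copying id as well will collide. The sketch below uses the standard-library sqlite3 module, with invented sample rows and SQLite standing in for whatever database is actually used. It shows the collision first, then a variant that copies only name and age and lets tableC assign its own ids.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tableA (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
CREATE TABLE tableB (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
CREATE TABLE tableC (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
-- Invented sample data; both tables start their ids at 1.
INSERT INTO tableA (name, age) VALUES ('Ann', 30), ('Bob', 41);
INSERT INTO tableB (name, age) VALUES ('Cy', 25);
""")

# Copying every column, id included, fails because id 1 exists in both sources.
try:
    conn.execute(
        "INSERT INTO tableC SELECT * FROM tableA UNION ALL SELECT * FROM tableB"
    )
    id_collision = False
except sqlite3.IntegrityError:
    id_collision = True

# Copying only the data columns lets tableC generate fresh ids.
conn.execute("""
INSERT INTO tableC (name, age)
SELECT name, age FROM tableA
UNION ALL
SELECT name, age FROM tableB
""")

copied = sorted(conn.execute("SELECT name, age FROM tableC").fetchall())
distinct_ids = conn.execute("SELECT count(DISTINCT id) FROM tableC").fetchone()[0]
```

In the SQLAlchemy version, the same idea would mean passing ['name', 'age'] to from_select() and selecting only those two columns from each source table.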

sqlalchemy: Select from table where column in QUERY

I have a situation where I am trying to count the number of rows in a table where a column value appears in a subquery. For example, let's say that I have some SQL like so:
select count(*) from table1
where column1 in (select column2 from table2);
I have my tables defined like so:
class table1(Base):
    __tablename__ = "table1"
    __table_args__ = {'schema': 'myschema'}
    acct_id = Column(DECIMAL(precision=15), primary_key=True)
class table2(Base):
    __tablename__ = "table2"
    __table_args__ = {'schema': 'myschema'}
    ban = Column(String(length=128), primary_key=True)
The tables are reflected from the database so there are other attributes present that aren't explicitly specified in the class definition.
I can try to write my query but here is where I am getting stuck...
qry=self.session.query(func.?(...)) # what to put here?
res = qry.one()
I tried looking through the documentation here but I don't see any comparable implementation to the 'in' keyword which is a feature of many SQL dialects.
I am using Teradata as my backend if that matters.
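Use the column's in_() operator and pass it a query for the other column:
sub_stmt = session.query(table2.ban)
stmt = session.query(table1).filter(table1.acct_id.in_(sub_stmt))
data = stmt.all()  # the matching rows
count = stmt.count()  # the number of matching rows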
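For reference, the SQL being targeted can be tried directly with the standard-library sqlite3 module. This is only a sketch: SQLite stands in for Teradata, the schema prefix is dropped, both columns are declared INTEGER to keep the comparison simple (in the question they are DECIMAL and String), and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (acct_id INTEGER PRIMARY KEY);
CREATE TABLE table2 (ban INTEGER PRIMARY KEY);
-- Invented sample data: 100 and 300 appear in both tables.
INSERT INTO table1 VALUES (100), (200), (300);
INSERT INTO table2 VALUES (100), (300), (999);
""")

# Count the table1 rows whose acct_id appears in the subquery on table2.
matching = conn.execute("""
SELECT count(*)
FROM table1
WHERE acct_id IN (SELECT ban FROM table2)
""").fetchone()[0]
```

The in_() call in the SQLAlchemy query is what produces the IN (SELECT ...) clause shown here.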

SQLAlchemy Relationship / Hybrid Property to specific instance of One-To-Many

I'm trying to create a hybrid property or a relationship (either works) to pick out a single model from the "Many" side of a One-To-Many relationship.
The accepted answer for How to set one to many and one to one relationship at same time in Flask-SQLAlchemy? doesn't work for me, as I need an expression-level construct to use in additional queries.
Relevant model details are as follows:
class ItemIdentifierType(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    code = db.Column(db.String(12))
    priority = db.Column(db.Integer)
class ItemIdentifier(db.Model):
    id = db.Column(db.String(8), primary_key=True)
    type_id = db.Column(db.ForeignKey('item_identifier_type.id'))
    type = db.relationship('ItemIdentifierType')
    item_id = db.Column(db.ForeignKey('item.id'))
    item = db.relationship('Item', back_populates='identifiers')
class Item(db.Model):
    id = db.Column(db.String(8), primary_key=True)
    name = db.Column(db.String(40))
    identifiers = db.relationship('ItemIdentifier', back_populates='item', lazy='dynamic')
    @hybrid_property
    def primary_identifier(self):
        return sorted(self.identifiers, key=lambda x: x.type.priority)[0]
    @primary_identifier.expression
    def primary_identifier(cls):
        primary_identifiers = select([
            ItemIdentifier.item_id,
            ItemIdentifierType.code,
            ItemIdentifier.value
        ]).select_from(join(ItemIdentifier, ItemIdentifierType,
                            ItemIdentifier.type_id == ItemIdentifierType.id))\
            .order_by(ItemIdentifier.item_id,
                      ItemIdentifierType.priority.asc())\
            .distinct(ItemIdentifier.item_id)\
            .alias()
        # <<< psycopg2 throws the error shown below >>>
        return select([ItemIdentifierType.code, ItemIdentifier.value])\
            .select_from(primary_identifiers)\
            .where(primary_identifiers.c.item_id == cls.id)
The error this throws when attempting to use the SQL expression:
(psycopg2.ProgrammingError) subquery in FROM must have an alias
LINE 2: FROM (SELECT item_identifier_type.code AS code, instru...
^
HINT: For example, FROM (SELECT ...) [AS] foo.
[SQL: 'SELECT code AS code, value AS value
FROM (SELECT item_identifier_type.code AS code, item_identifier.value AS value
FROM item_identifier_type, item_identifier, (SELECT DISTINCT item_identifier.item_id AS item_id, item_identifier.id AS id
FROM item_identifier JOIN item_identifier_type ON item_identifier.type_id = item_identifier_type.id ORDER BY item_identifier.item_id, item_identifier_type.priority ASC, item_identifier.id) AS primary_identifiers, item
WHERE primary_identifiers.item_id = item.id) ORDER BY item.name ASC']
The following query pulls out what I'm after, no problem:
SELECT
DISTINCT ON (item_identifier.item_id)
item_identifier.item_id,
item_identifier_type.code,
item_identifier.value
FROM item_identifier
JOIN item_identifier_type
ON item_identifier.type_id = item_identifier_type.id
ORDER BY
item_identifier.item_id,
item_identifier_type.priority ASC;
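DISTINCT ON is specific to PostgreSQL. As a point of comparison only, and not the query from the question, the same "highest-priority identifier per item" result can be expressed with a correlated scalar subquery that orders by priority and takes the first row. The sketch below runs that SQL with the standard-library sqlite3 module. The value column is taken from the question's query, while the sample rows and the codes 'ISIN' and 'SKU' are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE item_identifier_type (
    id INTEGER PRIMARY KEY, code TEXT, priority INTEGER
);
CREATE TABLE item_identifier (
    id TEXT PRIMARY KEY,
    type_id INTEGER REFERENCES item_identifier_type(id),
    item_id TEXT REFERENCES item(id),
    value TEXT
);
-- Invented sample data; a lower priority number means more important.
INSERT INTO item VALUES ('i1', 'Widget'), ('i2', 'Gadget');
INSERT INTO item_identifier_type VALUES (1, 'ISIN', 1), (2, 'SKU', 2);
INSERT INTO item_identifier VALUES
    ('a', 2, 'i1', 'SKU-001'),
    ('b', 1, 'i1', 'ISIN-001'),
    ('c', 2, 'i2', 'SKU-002');
""")

# For each item, the correlated subquery picks the identifier value
# whose type has the smallest priority number.
primary = conn.execute("""
SELECT item.name,
       (SELECT ii.value
        FROM item_identifier AS ii
        JOIN item_identifier_type AS t ON ii.type_id = t.id
        WHERE ii.item_id = item.id
        ORDER BY t.priority ASC
        LIMIT 1) AS primary_value
FROM item
ORDER BY item.name
""").fetchall()
```

A scalar subquery of this shape returns a single column per item, which is the form a hybrid property expression generally needs in order to be used inside other queries.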
