I have been struggling for a day to translate a working SQL SELECT statement into the equivalent SQLAlchemy code. It involves two tables.
A Tags table
class Tags(Base):
    __tablename__ = 't_tags'
    uid = Column(Integer, primary_key=True)
    category = Column(Enum('service', 'event', 'attribute', name='enum_tag_category'))
    name = Column(String(32))
And a table that maps them to their originating parents
class R_Incident_Tags(Base):
    __tablename__ = 'r_incident_tags'
    incident_uid = Column(String(48), ForeignKey('t_incident.uid'), primary_key=True)
    tag_uid = Column(Integer, ForeignKey('t_tags.uid'), primary_key=True)
    tag = relationship("Tags", backref="r_incident_tags")
incident_uid is a unique string to identify the parent.
The SELECT I have been struggling to represent in SQLAlchemy is as follows
SELECT DISTINCT s.name, e.name, count(e.name)
FROM "t_tags" AS s,
"t_tags" AS e,
"r_incident_tags" AS sr,
"r_incident_tags" AS er
WHERE s.category='service' AND
e.category='event' AND
e.uid = er.tag_uid AND
s.uid = sr.tag_uid AND
er.incident_uid = sr.incident_uid
GROUP BY s.name, e.name
Any assistance would be appreciated as I haven't even got close to getting something working after a whole day of effort.
Kindest Regards!
This should do the job:
from sqlalchemy import func
from sqlalchemy.orm import aliased

# two aliases per table, because each table appears twice in the FROM clause
s = aliased(Tags)
e = aliased(Tags)
sr = aliased(R_Incident_Tags)
er = aliased(R_Incident_Tags)
qry = (session.query(s.name, e.name, func.count(e.name)).
       select_from(s, e, sr, er).
       filter(s.category == 'service').
       filter(e.category == 'event').
       filter(e.uid == er.tag_uid).
       filter(s.uid == sr.tag_uid).
       filter(er.incident_uid == sr.incident_uid).
       group_by(s.name, e.name)
       )
But you could also use relationship-based JOINs instead of simple WHERE clauses.
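A JOIN-based variant could look roughly like the sketch below. It uses explicit ON clauses rather than the `tag` relationship, runs against an in-memory SQLite database, and drops the ForeignKey to t_incident so that no third table is needed. The sample rows and the helper name service_event_counts are made up for illustration.

```python
from sqlalchemy import Column, Enum, ForeignKey, Integer, String, create_engine, func
from sqlalchemy.orm import Session, aliased, declarative_base

Base = declarative_base()


class Tags(Base):
    __tablename__ = "t_tags"
    uid = Column(Integer, primary_key=True)
    category = Column(Enum("service", "event", "attribute", name="enum_tag_category"))
    name = Column(String(32))


class R_Incident_Tags(Base):
    __tablename__ = "r_incident_tags"
    # Assumption: the ForeignKey to t_incident is left out to keep the sketch self-contained.
    incident_uid = Column(String(48), primary_key=True)
    tag_uid = Column(Integer, ForeignKey("t_tags.uid"), primary_key=True)


def service_event_counts(session):
    """Hypothetical helper: count incidents per (service tag, event tag) pair."""
    s, e = aliased(Tags), aliased(Tags)
    sr, er = aliased(R_Incident_Tags), aliased(R_Incident_Tags)
    return (
        session.query(s.name, e.name, func.count(e.name))
        .select_from(s)
        .join(sr, sr.tag_uid == s.uid)                  # service tag -> its incidents
        .join(er, er.incident_uid == sr.incident_uid)   # same incident, other tag rows
        .join(e, e.uid == er.tag_uid)                   # tag rows -> event tag
        .filter(s.category == "service", e.category == "event")
        .group_by(s.name, e.name)
        .order_by(s.name, e.name)
        .all()
    )


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Made-up sample data: one service tag, two event tags, three incidents.
    session.add_all([
        Tags(uid=1, category="service", name="email"),
        Tags(uid=2, category="event", name="outage"),
        Tags(uid=3, category="event", name="slow"),
        R_Incident_Tags(incident_uid="inc-1", tag_uid=1),
        R_Incident_Tags(incident_uid="inc-1", tag_uid=2),
        R_Incident_Tags(incident_uid="inc-2", tag_uid=1),
        R_Incident_Tags(incident_uid="inc-2", tag_uid=2),
        R_Incident_Tags(incident_uid="inc-3", tag_uid=1),
        R_Incident_Tags(incident_uid="inc-3", tag_uid=3),
    ])
    session.commit()
    rows = [tuple(r) for r in service_event_counts(session)]
    print(rows)
```

The DISTINCT from the original SQL is left out here, since GROUP BY already yields one row per (service, event) pair.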
Hello SQLAlchemy masters,
I am having trouble expressing the following SQL query with the SQLAlchemy ORM in Python:
SELECT systems.name,
(
SELECT date
FROM accounting A
WHERE A.ticker = C.ticker AND A.info = 'Trade_opened'
) AS entry,
C.*
FROM accounting C
JOIN systems ON C.system_id = systems.id
WHERE C.status = 'open'
And I can't get aliased() to work the right way:
H = aliased(Accounting, name='H')
C = aliased(Accounting, name='C')
his = db.session.query(H.date) \
    .filter(H.ticker == C.ticker, H.info == r'Trade_opened')
sql = db.session.query(Systems.name, C, his) \
    .join(Systems, C.system_id == Systems.id) \
    .filter(C.status == r'Open') \
    .statement
print(sql)
Can you help me, please?
I think you need:
scalar_subquery, to be able to use the subquery as a column
select_from, to be able to set the "left" side of the join to something other than the entity of the first column (i.e. C instead of systems).
I didn't test this with actual data, so I don't know if it works correctly. It helps if you post your schema and some test data. I used Account because it has an easy plural, accounts, to set up a test.
from sqlalchemy import Column, Date, ForeignKey, Integer, String, and_
from sqlalchemy.orm import Session, aliased, declarative_base, relationship

Base = declarative_base()

class Account(Base):
    __tablename__ = 'accounts'
    id = Column(Integer, primary_key=True)
    date = Column(Date)
    ticker = Column(String(length=200))
    info = Column(String(length=200))
    status = Column(String(length=200))
    system = relationship('System', backref='accounts')
    system_id = Column(Integer, ForeignKey('systems.id'))

class System(Base):
    __tablename__ = 'systems'
    id = Column(Integer, primary_key=True)
    name = Column(String(length=200))

# engine is assumed to have been created elsewhere with create_engine(...)
with Session(engine) as session:
    C = aliased(Account, name='C')
    A = aliased(Account, name='A')
    # correlated scalar subquery, usable as a column of the outer query
    date_subq = session.query(A.date).filter(and_(A.ticker == C.ticker, A.info == 'Trade_opened')).scalar_subquery()
    # select_from(C) makes C, not systems, the left side of the JOIN
    q = session.query(System.name, date_subq.label('entry'), C).select_from(C).join(C.system).filter(C.status == 'open')
    print(q)
Formatted SQL:
SELECT
systems.name AS systems_name,
(SELECT "A".date
FROM accounts AS "A" WHERE "A".ticker = "C".ticker AND "A".info = %(info_1)s) AS entry,
"C".id AS "C_id",
"C".date AS "C_date",
"C".ticker AS "C_ticker",
"C".info AS "C_info",
"C".status AS "C_status",
"C".system_id AS "C_system_id"
FROM accounts AS "C"
JOIN systems ON systems.id = "C".system_id
WHERE "C".status = %(status_1)s
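Since the answer was not tested against data, here is a small sketch that runs the same query shape on an in-memory SQLite database. The two sample rows (ticker "ABC", system "trend") are invented for illustration, and the subquery only behaves as a scalar if at most one 'Trade_opened' row exists per ticker.

```python
import datetime

from sqlalchemy import Column, Date, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, aliased, declarative_base, relationship

Base = declarative_base()


class System(Base):
    __tablename__ = "systems"
    id = Column(Integer, primary_key=True)
    name = Column(String(200))


class Account(Base):
    __tablename__ = "accounts"
    id = Column(Integer, primary_key=True)
    date = Column(Date)
    ticker = Column(String(200))
    info = Column(String(200))
    status = Column(String(200))
    system_id = Column(Integer, ForeignKey("systems.id"))
    system = relationship("System", backref="accounts")


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Made-up data: one opening row and one currently open row for the same ticker.
    trend = System(name="trend")
    session.add_all([
        Account(date=datetime.date(2021, 1, 4), ticker="ABC",
                info="Trade_opened", status="closed", system=trend),
        Account(date=datetime.date(2021, 1, 8), ticker="ABC",
                info="Mark", status="open", system=trend),
    ])
    session.commit()

    C = aliased(Account, name="C")
    A = aliased(Account, name="A")
    # Correlated scalar subquery: date of the 'Trade_opened' row with the same ticker as C.
    entry = (
        session.query(A.date)
        .filter(A.ticker == C.ticker, A.info == "Trade_opened")
        .scalar_subquery()
    )
    rows = (
        session.query(System.name, entry.label("entry"), C)
        .select_from(C)
        .join(C.system)
        .filter(C.status == "open")
        .all()
    )
    result = [(name, opened, acc.ticker, acc.date) for name, opened, acc in rows]
    print(result)
```

Each result row carries the system name, the entry date taken from the subquery, and the full aliased Account entity for the open position.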
I'm trying to build a database with SQLAlchemy. My problem is that I have two tables with the same column names, and I'm trying to populate a third table from the other two. The original post included a simple diagram to illustrate this.
I usually set the foreign key on one table and the relationship on the other, like this:
class TableA(Base):
    __tablename__ = "tableA"
    id = Column(Integer, primary_key=True)
    name = Column(String(100))
    age = Column(Integer)
    name_relation = relationship("TableC", backref='owner')
class TableC(Base):
    __tablename__ = "tableC"
    id = Column(Integer, primary_key=True)
    name = Column(String(100), ForeignKey('tableA.name'))
    age = Column(Integer)
You can see that this method only works with two tables, because the ForeignKey on tableC's name column points specifically at the name column of tableA.
Is there a way to do that?
Thanks
In SQL, the query you'd be looking for is
INSERT INTO C (id, name, age) (
SELECT *
FROM A
UNION ALL
SELECT *
FROM B
)
As per this answer, the equivalent SQLAlchemy is
session = Session()
query = session.query(TableA).union_all(session.query(TableB))
# declarative classes have no insert() method of their own; use the underlying Table
stmt = TableC.__table__.insert().from_select(['id', 'name', 'age'], query)
or equivalently
stmt = TableC.__table__.insert().from_select(
    ['id', 'name', 'age'],
    TableA.__table__.select().union_all(TableB.__table__.select())
)
After that, you can execute it using connection.execute(stmt) or session.execute(stmt), depending on what you're using.
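As a rough end-to-end sketch of the INSERT ... SELECT ... UNION ALL approach, the following runs against an in-memory SQLite database. TableB is assumed to mirror TableA, the sample rows are invented, and only name and age are copied so that tableC generates its own ids (copying id from both sources could collide on the primary key).

```python
from sqlalchemy import Column, Integer, String, create_engine, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class TableA(Base):
    __tablename__ = "tableA"
    id = Column(Integer, primary_key=True)
    name = Column(String(100))
    age = Column(Integer)


class TableB(Base):  # assumed to have the same columns as TableA
    __tablename__ = "tableB"
    id = Column(Integer, primary_key=True)
    name = Column(String(100))
    age = Column(Integer)


class TableC(Base):
    __tablename__ = "tableC"
    id = Column(Integer, primary_key=True)
    name = Column(String(100))
    age = Column(Integer)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Made-up sample rows.
    session.add_all([TableA(name="alice", age=30), TableB(name="bob", age=25)])
    session.commit()

    # SELECT name, age FROM tableA UNION ALL SELECT name, age FROM tableB
    combined = select(TableA.name, TableA.age).union_all(
        select(TableB.name, TableB.age)
    )
    # INSERT INTO tableC (name, age) <combined select>
    stmt = TableC.__table__.insert().from_select(["name", "age"], combined)
    session.execute(stmt)
    session.commit()

    copied = [
        tuple(row)
        for row in session.query(TableC.name, TableC.age).order_by(TableC.name)
    ]
    print(copied)
```

The copy happens entirely inside the database, so no rows are loaded into Python before being inserted.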
I have this SQL expression that I'm trying to write in SQLAlchemy
select * from candidates1 c
inner join uploaded_emails1 e
on c.id=e.candidate_id
group by e.thread_id
How would I go about doing that?
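The execute method of a connection can be used to run raw SQL, like so:
from sqlalchemy import text
sql = text('select * from candidates1 c inner join uploaded_emails1 e on c.id=e.candidate_id group by e.thread_id')
# Engine.execute() was removed in SQLAlchemy 2.0; executing on a connection works in 1.4 and 2.0
with db.engine.connect() as conn:
    result = conn.execute(sql)
    # ... do stuff ...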
If you have some models that you're working with, you could use the relationship field type to create a one-to-many relationship between the Candidate and the UploadedEmail, like so:
class Candidate(Base):
    __tablename__ = 'candidates1'
    id = Column(Integer, primary_key=True)
    uploaded_emails = relationship("UploadedEmail", lazy='dynamic')
class UploadedEmail(Base):
    __tablename__ = 'uploaded_emails1'
    id = Column(Integer, primary_key=True)
    # the target must match Candidate.__tablename__ ('candidates1')
    candidate_id = Column(Integer, ForeignKey('candidates1.id'))
    thread_id = Column(Integer)
And in your code, you might use that like this (including the group_by)
candidate_id = 1
c = Candidate.query.filter_by(id=candidate_id).first()
thread_id_results = c.uploaded_emails.with_entities(UploadedEmail.thread_id).group_by(UploadedEmail.thread_id).all()
thread_ids = [row[0] for row in thread_id_results]
Note that you have to use .with_entities() to restrict the SELECT list to the thread_id column, which is the column you group by. If you don't do this, MySQL will raise errors along the lines of "Expression #X of SELECT list is not in GROUP BY clause and contains nonaggregated column ... which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by".
Sorry I didn't provide enough information to answer the question. This ended up working:
x = db_session.query(Candidate1, Uploaded_Emails1).filter(Candidate1.id == Uploaded_Emails1.candidate_id).group_by(Uploaded_Emails1.thread_id).all()
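For comparison, a grouped query that selects only the grouping column plus an aggregate stays valid under strict GROUP BY rules. The sketch below is one possible shape, run on in-memory SQLite with made-up rows. The model names follow the answer above, and counting emails per thread is an assumed goal, not something stated in the question.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Candidate(Base):
    __tablename__ = "candidates1"
    id = Column(Integer, primary_key=True)


class UploadedEmail(Base):
    __tablename__ = "uploaded_emails1"
    id = Column(Integer, primary_key=True)
    candidate_id = Column(Integer, ForeignKey("candidates1.id"))
    thread_id = Column(Integer)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Made-up sample data: one candidate with three emails in two threads.
    session.add(Candidate(id=1))
    session.add_all([
        UploadedEmail(candidate_id=1, thread_id=10),
        UploadedEmail(candidate_id=1, thread_id=10),
        UploadedEmail(candidate_id=1, thread_id=11),
    ])
    session.commit()

    # Only the grouped column and an aggregate appear in the SELECT list.
    per_thread = (
        session.query(UploadedEmail.thread_id, func.count(UploadedEmail.id))
        .join(Candidate, Candidate.id == UploadedEmail.candidate_id)
        .group_by(UploadedEmail.thread_id)
        .order_by(UploadedEmail.thread_id)
        .all()
    )
    thread_counts = [tuple(row) for row in per_thread]
    print(thread_counts)
```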
So I have two tables Employee and Details like this.
class Employee(Base):
    __tablename__ = 'employees'
    id = Column(Integer, Sequence('employee_id_seq'), primary_key=True)
    name = Column(String(50), nullable=False)
    ............
class Detail(Base):
    __tablename__ = 'details'
    id = Column(Integer, Sequence('detail_id_seq'), primary_key=True)
    start_date = Column(String(50), nullable=False)
    email = Column(String(50))
    # the target must match Employee.__tablename__ ('employees')
    employee_id = Column(Integer, ForeignKey('employees.id'))
    employee = relationship("Employee", backref=backref('details', order_by=id))
    ............
Now what I want to do is get all the employees and their corresponding details, here is what I tried.
for e, d in session.query(Employee, Detail).filter(Employee.id == Detail.employee_id).all():
    print(e.name, d.email)
The problem with this is that it prints everything twice. I tried using .join() and it also prints the results twice.
What I want to achieve is like
print Employee.name
print Employee.details.email
If you really care about only a few columns, you can specify them in the query directly:
q = session.query(Employee.name, Detail.email).filter(Employee.id == Detail.employee_id).all()
for e, d in q:
    print(e, d)
If you do really want to load object instances, then I would do it differently:
from sqlalchemy.orm import contains_eager

# query all employees
q = (session.query(Employee)
     # load Details in the same query
     .outerjoin(Employee.details)
     # let SA know that the relationship "Employee.details" is already loaded in this
     # query, so that accessing it does not trigger another query against the database
     .options(contains_eager(Employee.details))
     ).all()
# navigate the results simply as defined in the relationship configuration
for e in q:
    print(e)
    for d in e.details:
        print(" ->", d)
As to your duplicate result problem, I believe you have some "extra" in your real code which produces this error...
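One way an employee's name can show up more than once, offered as a guess rather than a diagnosis of the original code, is that a join returns one row per (employee, detail) pair. The sketch below contrasts that with the contains_eager approach on in-memory SQLite. The sample data is invented, and back_populates is swapped in for the question's backref to keep both sides of the relationship explicit.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, contains_eager, declarative_base, relationship

Base = declarative_base()


class Employee(Base):
    __tablename__ = "employees"
    id = Column(Integer, primary_key=True)
    name = Column(String(50), nullable=False)
    details = relationship("Detail", back_populates="employee")


class Detail(Base):
    __tablename__ = "details"
    id = Column(Integer, primary_key=True)
    email = Column(String(50))
    employee_id = Column(Integer, ForeignKey("employees.id"))
    employee = relationship("Employee", back_populates="details")


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Made-up data: ann has two detail rows, bob has none.
    session.add_all([
        Employee(name="ann", details=[Detail(email="ann@example.com"),
                                      Detail(email="ann@work.example")]),
        Employee(name="bob"),
    ])
    session.commit()

    # Joining the two entities yields one row per matching pair,
    # so "ann" appears once for each of her detail rows.
    pairs = (
        session.query(Employee.name, Detail.email)
        .filter(Employee.id == Detail.employee_id)
        .all()
    )

    # Loading Employee objects with their details populated from the same
    # query yields each employee once, including those without details.
    employees = (
        session.query(Employee)
        .outerjoin(Employee.details)
        .options(contains_eager(Employee.details))
        .order_by(Employee.name)
        .all()
    )
    summary = [(e.name, sorted(d.email for d in e.details)) for e in employees]
    print(len(pairs), summary)
```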
I'm using PostgreSQL with SQLAlchemy, but it seems SQLAlchemy is having trouble adding rows when using subqueries.
In my example, I want to update a counter for a specific tag in a table.
In SQLAlchemy, a test run class would look like the following:
class TestRun( base ):
    __tablename__ = 'test_runs'
    id = sqlalchemy.Column( 'id', sqlalchemy.Integer, sqlalchemy.Sequence('user_id_seq'), primary_key=True )
    tag = sqlalchemy.Column( 'tag', sqlalchemy.String )
    counter = sqlalchemy.Column( 'counter', sqlalchemy.Integer )
The insertion code should then look like the following:
tag = 'sampletag'
counterquery = session.query(sqlalchemy.func.coalesce(sqlalchemy.func.max(TestRun.counter),0) + 1).\
    filter(TestRun.tag == tag).\
    subquery()
testrun = TestRun()
testrun.tag = tag
testrun.counter = counterquery
session.add( testrun )
session.commit()
The problem is that running this code produces a very interesting error. SQLAlchemy tries to run the following SQL query:
'INSERT INTO test_runs (id, tag, counter)
VALUES (%(id)s,
%(tag)s,
SELECT coalesce(max(test_runs.counter), %(param_1)s) + %(coalesce_1)s AS anon_1
FROM test_runs
WHERE test_runs.tag = %(tag_1)s)'
{'coalesce_1': 1, 'param_1': 0, 'tag_1': 'mytag', 'tag': 'mytag', 'id': 267L}
This looks reasonable, except that it is missing parentheses around the SELECT. When I run the SQL query manually, I get exactly the same error that SQLAlchemy gives me, until I add the parentheses by hand, which fixes everything. It seems unlikely that SQLAlchemy would forget to add parentheses where they are needed, so my question is: am I missing a function for using subqueries correctly when adding rows with SQLAlchemy?
Instead of using subquery(), call the as_scalar() method (since SQLAlchemy 1.4 it is deprecated in favor of the equivalent scalar_subquery()):
Return the full SELECT statement represented by this Query, converted
to a scalar subquery.
Example:
Models with a classic parent-child relationship:
class Parent(Base):
    __tablename__ = 'parents'
    id = Column(Integer, primary_key=True)
    counter = Column(Integer, nullable=False, default=0)
class Child(Base):
    __tablename__ = 'children'
    id = Column(Integer, primary_key=True)
    parent_id = Column(ForeignKey(Parent.id), nullable=False)
    parent = relationship(Parent)
Code to update counter field:
parent.counter = session.query(func.count(Child.id))\
    .filter_by(parent=parent).as_scalar()
Produced SQL (copied from the log):
UPDATE parents SET counter=(SELECT count(children.id) AS count_1
FROM children
WHERE ? = children.parent_id) WHERE parents.id = ?
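Applied to the TestRun model from the question, the same idea might look like the sketch below. It assumes SQLAlchemy 1.4 or later (hence scalar_subquery() rather than as_scalar()), runs on in-memory SQLite instead of PostgreSQL, and omits the Sequence. The helper name add_run is hypothetical.

```python
from sqlalchemy import Column, Integer, String, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class TestRun(Base):
    __tablename__ = "test_runs"
    id = Column(Integer, primary_key=True)
    tag = Column(String)
    counter = Column(Integer)


def add_run(session, tag):
    """Hypothetical helper: insert a run whose counter is max(counter) + 1 for its tag."""
    next_counter = (
        session.query(func.coalesce(func.max(TestRun.counter), 0) + 1)
        .filter(TestRun.tag == tag)
        .scalar_subquery()  # rendered as a parenthesized SELECT inside VALUES
    )
    run = TestRun(tag=tag, counter=next_counter)
    session.add(run)
    session.commit()
    # The attribute was assigned a SQL expression, so the computed value is
    # loaded from the database when it is accessed here.
    return run.counter


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    counters = [
        add_run(session, "sampletag"),
        add_run(session, "sampletag"),
        add_run(session, "other"),
    ]
    print(counters)
```

Two concurrent transactions could still compute the same counter value, so a unique constraint on (tag, counter) may be worth considering if that matters.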