Turning SQL expression into SQLAlchemy query - python

I have this SQL expression that I'm trying to write in SQLAlchemy:
select * from candidates1 c
inner join uploaded_emails1 e
on c.id=e.candidate_id
group by e.thread_id
How would I go about doing that?

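The execute method can be used to run raw SQL. Engine.execute() was removed in SQLAlchemy 2.0, so run the statement on a connection instead, like so:
from sqlalchemy import text
sql = text('select * from candidates1 c inner join uploaded_emails1 e on c.id=e.candidate_id group by e.thread_id')
with db.engine.connect() as conn:
    result = conn.execute(sql)
    # ... do stuff with result ...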
If you have some models that you're working with, you could use the relationship field type to create a one-to-many relationship between the Candidate and the UploadedEmail, like so:
class Candidate(Base):
    __tablename__ = 'candidates1'
    id = Column(Integer, primary_key=True)
    uploaded_emails = relationship("UploadedEmail", lazy='dynamic')
class UploadedEmail(Base):
    __tablename__ = 'uploaded_emails1'
    id = Column(Integer, primary_key=True)
    # The foreign key must name the table ('candidates1'), not the class
    candidate_id = Column(Integer, ForeignKey('candidates1.id'))
    thread_id = Column(Integer)
And in your code, you might use that like this (including the group_by):
candidate_id = 1
c = Candidate.query.filter_by(id=candidate_id).first()
thread_id_results = c.uploaded_emails.with_entities(UploadedEmail.thread_id).group_by(UploadedEmail.thread_id).all()
thread_ids = [row[0] for row in thread_id_results]
Note that you have to use .with_entities() to specify the columns you would like to select, and that the only column selected here is thread_id, the same column used in the GROUP BY. If you don't do this, you'll get errors along the lines of "Expression #X of SELECT list is not in GROUP BY clause and contains nonaggregated column ... which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by".

Sorry I didn't provide enough information to answer the question. This ended up working:
x = db_session.query(Candidate1, Uploaded_Emails1).filter(Candidate1.id == Uploaded_Emails1.candidate_id).group_by(Uploaded_Emails1.thread_id).all()
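For anyone who wants to try the join plus GROUP BY end to end, here is a minimal sketch. It assumes SQLAlchemy 1.4 or newer, an in-memory SQLite database, and made-up sample rows; the model names follow the answer above. It selects only the grouped column and an aggregate, which is one way to avoid the only_full_group_by error mentioned earlier.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Candidate(Base):
    __tablename__ = "candidates1"
    id = Column(Integer, primary_key=True)


class UploadedEmail(Base):
    __tablename__ = "uploaded_emails1"
    id = Column(Integer, primary_key=True)
    candidate_id = Column(Integer, ForeignKey("candidates1.id"))
    thread_id = Column(Integer)


# Assumption: throwaway in-memory SQLite database with invented sample data.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Candidate(id=1),
        Candidate(id=2),
        UploadedEmail(id=1, candidate_id=1, thread_id=10),
        UploadedEmail(id=2, candidate_id=1, thread_id=10),
        UploadedEmail(id=3, candidate_id=2, thread_id=20),
    ])
    session.commit()

    # INNER JOIN candidates1 to uploaded_emails1, then GROUP BY thread_id.
    stmt = (
        select(UploadedEmail.thread_id, func.count(UploadedEmail.id))
        .join(Candidate, Candidate.id == UploadedEmail.candidate_id)
        .group_by(UploadedEmail.thread_id)
        .order_by(UploadedEmail.thread_id)
    )
    emails_per_thread = [tuple(row) for row in session.execute(stmt)]

print(emails_per_thread)
```

Each row is a (thread_id, number of emails) pair. If you need whole Candidate objects per thread, you would have to decide which candidate represents each group, which plain GROUP BY does not define.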

Related

sqlalchemy: Select from table where column in QUERY

I have a situation where I am trying to count the number of rows in a table where a column value appears in a subquery. For example, let's say that I have some SQL like so:
select count(*) from table1
where column1 in (select column2 from table2);
I have my tables defined like so:
class table1(Base):
    __tablename__ = "table1"
    __table_args__ = {'schema': 'myschema'}
    acct_id = Column(DECIMAL(precision=15), primary_key=True)
class table2(Base):
    __tablename__ = "table2"
    __table_args__ = {'schema': 'myschema'}
    ban = Column(String(length=128), primary_key=True)
The tables are reflected from the database so there are other attributes present that aren't explicitly specified in the class definition.
I can try to write my query but here is where I am getting stuck...
qry=self.session.query(func.?(...)) # what to put here?
res = qry.one()
I tried looking through the documentation here but I don't see any comparable implementation to the 'in' keyword which is a feature of many SQL dialects.
I am using Teradata as my backend if that matters.
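# Build the subquery first; some_id and id stand in for your real column names
sub_stmt = session.query(table2.some_id)
# in_() renders the "IN (SELECT ...)" clause
stmt = session.query(table1).filter(table1.id.in_(sub_stmt))
data = stmt.all()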
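To get the count itself rather than the rows, the same in_() filter can be combined with func.count(). The sketch below is only an illustration under some assumptions: SQLAlchemy 1.4 or newer, an in-memory SQLite database instead of Teradata, no schema, and both columns simplified to integers so they compare cleanly. The class names Table1 and Table2 are mine.

```python
from sqlalchemy import Column, Integer, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Table1(Base):
    __tablename__ = "table1"
    # Assumption: Integer instead of the DECIMAL(15) in the question.
    acct_id = Column(Integer, primary_key=True)


class Table2(Base):
    __tablename__ = "table2"
    # Assumption: Integer instead of String(128), to match acct_id.
    ban = Column(Integer, primary_key=True)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([Table1(acct_id=n) for n in (1, 2, 3)])
    session.add_all([Table2(ban=n) for n in (2, 3, 4)])
    session.commit()

    # SELECT count(*) FROM table1 WHERE acct_id IN (SELECT ban FROM table2)
    stmt = (
        select(func.count())
        .select_from(Table1)
        .where(Table1.acct_id.in_(select(Table2.ban)))
    )
    matching_rows = session.execute(stmt).scalar_one()

print(matching_rows)
```

With the legacy Query API the equivalent shape would be session.query(func.count()).select_from(...).filter(...).scalar().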

SQLAlchemy loses column label on chained union/except_

I have a somewhat complex query where I need to join a subquery. That subquery contains EXCEPT and UNION. In raw SQL it looks something like this:
SELECT ... FROM table t
JOIN (SELECT id AS foo_id FROM foo WHERE select_me
      EXCEPT SELECT foo_id FROM bar WHERE add_or_remove = 'remove'
      UNION SELECT foo_id FROM bar WHERE add_or_remove = 'add') subq
ON t.foo_id = subq.foo_id;
Where foo and bar tables are defined like this:
class Foo(Base):
    __tablename__ = 'foo'
    id = Column(Integer, primary_key=True, autoincrement=True)
    select_me = Column(Boolean)
class Bar(Base):
    __tablename__ = 'bar'
    foo_id = Column(Integer, primary_key=True)
    add_or_remove = Column(Enum('add', 'remove', name='add_or_remove'), primary_key=True)
When I try to build this subquery in SQLAlchemy, it loses the column label as soon as I add a second union/except_.
Here is what I'm talking about:
q = session.query(Foo.id.label('foo_id')).filter(Foo.select_me)
print(q.subquery().c)
This prints ['%(140275696626880 anon)s.foo_id'], which still contains the correct label.
q = q.union(session.query(Bar.foo_id.label('foo_id')).filter(Bar.add_or_remove == 'add'))
print(q.subquery().c)
This prints ['%(140275696767384 anon)s.foo_id'], which still contains the correct label.
q = q.except_(session.query(Bar.foo_id.label('foo_id')).filter(Bar.add_or_remove == 'remove'))
print(q.subquery().c)
This prints ['%(140275696769064 anon)s.%(140275696769008 anon)s_foo_id']. Now the column is labeled with an autogenerated name, and I cannot use it to specify the join condition.
For now I think I can just take the first column and use it. But this is a hacky solution, so I wonder whether this is a bug in SQLAlchemy or I'm doing something wrong.
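One possible workaround, offered as a sketch rather than a statement about how Query.union() is meant to behave: build the set operations with the Core except_() and union() functions and name each intermediate step as a subquery, so the foo_id label is re-selected explicitly at every level. This assumes SQLAlchemy 1.4 or newer; the String column in place of the Enum, the subquery names step1 and subq, and the sample rows are all made up for the example.

```python
from sqlalchemy import (Boolean, Column, Integer, String, create_engine,
                        except_, select, union)
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Foo(Base):
    __tablename__ = "foo"
    id = Column(Integer, primary_key=True)
    select_me = Column(Boolean)


class Bar(Base):
    __tablename__ = "bar"
    foo_id = Column(Integer, primary_key=True)
    # Assumption: plain String instead of the Enum used in the question.
    add_or_remove = Column(String(8), primary_key=True)


selected = select(Foo.id.label("foo_id")).where(Foo.select_me)
removed = select(Bar.foo_id).where(Bar.add_or_remove == "remove")
added = select(Bar.foo_id).where(Bar.add_or_remove == "add")

# Step 1: selected EXCEPT removed, wrapped as a named subquery.
step1 = except_(selected, removed).subquery("step1")
# Step 2: re-select the labeled column, then UNION with the added ids.
subq = union(select(step1.c.foo_id), added).subquery("subq")
column_names = list(subq.c.keys())

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Foo(id=1, select_me=True),
        Foo(id=2, select_me=True),
        Foo(id=3, select_me=False),
        Bar(foo_id=2, add_or_remove="remove"),
        Bar(foo_id=3, add_or_remove="add"),
    ])
    session.commit()
    stmt = select(subq.c.foo_id).order_by(subq.c.foo_id)
    foo_ids = session.execute(stmt).scalars().all()

print(column_names, foo_ids)
```

Because subq.c.foo_id is addressable by name, it can be used in an ON clause such as t.foo_id == subq.c.foo_id. Wrapping the inner step as a subquery also avoids nesting one compound select directly inside another.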

SQL to SQLAlchemy translation

I have a somewhat odd query that gets me all the items in a parent table that have no matches in the corresponding child table.
If possible, I'd like to turn it into an SQLAlchemy query, but I have no idea how. I can do basic gets and filters, but this one is beyond my experience so far. Any help you folks might give would be greatly appreciated.
class customerTranslations(Base):
    """parent table. holds customer names"""
    __tablename__ = 'customer_translation'
    id = Column(Integer, primary_key=True)
class customerEmails(Base):
    """child table. hold emails for customers in translation table"""
    __tablename__ = 'customer_emails'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('customer_translation.id'))
I want to build:
SELECT * FROM customer_translation
WHERE id NOT IN (SELECT parent_id FROM customer_emails)
You have a subquery, so create one first:
all_emails_stmnt = session.query(customerEmails.parent_id).subquery()
and then you can use that to filter your other table:
translations_with_no_email = session.query(customerTranslations).filter(
    ~customerTranslations.id.in_(all_emails_stmnt))
This produces the same SQL, but with all the column names expanded rather than using *, so that the ORM can create your objects:
>>> all_emails_stmnt = session.query(customerEmails.parent_id).subquery()
>>> print(all_emails_stmnt)
SELECT customer_emails.parent_id
FROM customer_emails
>>> translations_with_no_email = session.query(customerTranslations).filter(
... ~customerTranslations.id.in_(all_emails_stmnt))
>>> print(translations_with_no_email)
SELECT customer_translation.id AS customer_translation_id
FROM customer_translation
WHERE customer_translation.id NOT IN (SELECT customer_emails.parent_id
FROM customer_emails)
You could also use NOT EXISTS:
from sqlalchemy.sql import exists
has_no_email_stmnt = ~exists().where(customerTranslations.id == customerEmails.parent_id)
translations_with_no_email = session.query(customerTranslations).filter(has_no_email_stmnt)
or, if you have a backreference named emails on the customerTranslations class pointing to the emails, use .any() on the relationship and invert it:
session.query(customerTranslations).filter(
    ~customerTranslations.emails.any())
Back in 2010 NOT EXISTS was a little slower on MySQL but you may want to re-assess if that is still the case.
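Performance aside, the two forms can return different rows when the child column contains NULLs, because x NOT IN (..., NULL) is never true in SQL. Here is a small sketch of that difference, assuming SQLAlchemy 1.4 or newer, an in-memory SQLite database, and invented sample rows, one of which has a NULL parent_id.

```python
from sqlalchemy import (Column, ForeignKey, Integer, create_engine, exists,
                        select)
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class customerTranslations(Base):
    __tablename__ = "customer_translation"
    id = Column(Integer, primary_key=True)


class customerEmails(Base):
    __tablename__ = "customer_emails"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("customer_translation.id"))


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        customerTranslations(id=1),
        customerTranslations(id=2),
        customerTranslations(id=3),
        customerEmails(id=1, parent_id=1),
        customerEmails(id=2, parent_id=None),  # orphan email row, NULL parent
    ])
    session.commit()

    # NOT IN: the NULL in the subquery makes the test unknown for every row.
    not_in_stmt = (
        select(customerTranslations.id)
        .where(~customerTranslations.id.in_(select(customerEmails.parent_id)))
        .order_by(customerTranslations.id)
    )
    # NOT EXISTS: correlated per parent row, unaffected by the NULL.
    not_exists_stmt = (
        select(customerTranslations.id)
        .where(~exists().where(
            customerEmails.parent_id == customerTranslations.id))
        .order_by(customerTranslations.id)
    )
    not_in_ids = session.execute(not_in_stmt).scalars().all()
    not_exists_ids = session.execute(not_exists_stmt).scalars().all()

print(not_in_ids, not_exists_ids)
```

If parent_id can never be NULL in your data, both statements return the same rows and the choice comes down to what your database optimizes better.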

SQLAlchemy class method for subquery

I have a table of time series data that I frequently need to get records where the date is equal to the max date in the table. In SQL this is easily accomplished via subquery, i.e.:
SELECT * from my_table where date = (select max(date) from my_table);
The model for this table would look like:
class MyTable(Base):
    __tablename__ = 'my_table'
    id = Column(Integer, primary_key=True)
    date = Column(Date)
And I can accomplish the desired behavior in SQLAlchemy with two separate queries, i.e.:
maxdate = session.query(func.max(MyTable.date)).first()[0]
desired_results = session.query(MyTable).filter(MyTable.date == maxdate).all()
The problem is that I have this subquery sprinkled everywhere in my code and I feel it is an inelegant solution. Ideally I would like to write a class property or custom comparator that I can stick in the model definition, so that I can compress the subquery into a single line and reuse it constantly, something like:
session.query(MyTable).filter(MyTable.date == MyTable.max_date)
I have looked through the SQLAlchemy docs on this but haven't come up with anything that works. Does anybody have a neat solution for this kind of problem?
For posterity, here is the solution I came up with:
from sqlalchemy.sql import func
from sqlalchemy import select
class MyTable(Base):
    __tablename__ = 'my_table'
    id = Column(Integer, primary_key=True)
    date = Column(Date)
    # Legacy list form of select(); on SQLAlchemy 1.4+ write
    # select(func.max(date)).scalar_subquery() instead
    maxdate = select([func.max(date)])
desired_results = session.query(MyTable).filter(MyTable.date == MyTable.maxdate).all()
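A variation on the same idea, sketched for SQLAlchemy 1.4 or newer: put the subquery behind a classmethod so it is built on demand from the mapped class. The method name max_date, the in-memory SQLite database, and the sample dates are assumptions made for the example, not part of the solution above.

```python
import datetime

from sqlalchemy import Column, Date, Integer, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class MyTable(Base):
    __tablename__ = "my_table"
    id = Column(Integer, primary_key=True)
    date = Column(Date)

    @classmethod
    def max_date(cls):
        # Renders as (SELECT max(my_table.date) FROM my_table)
        return select(func.max(cls.date)).scalar_subquery()


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        MyTable(id=1, date=datetime.date(2020, 1, 1)),
        MyTable(id=2, date=datetime.date(2020, 2, 1)),
        MyTable(id=3, date=datetime.date(2020, 2, 1)),
    ])
    session.commit()

    # One round trip: the max date is computed inside the same statement.
    stmt = (
        select(MyTable.id)
        .where(MyTable.date == MyTable.max_date())
        .order_by(MyTable.id)
    )
    latest_ids = session.execute(stmt).scalars().all()

print(latest_ids)
```

The same expression also works with the legacy API, as in session.query(MyTable).filter(MyTable.date == MyTable.max_date()).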

SQLAlchemy: Converting Self-Ref JOIN, COUNT, GROUP BY SELECT

I have been struggling for a day to get an SQL Select statement that works into the equivalent SQLAlchemy code. It involves two tables.
A Tags table
class Tags(Base):
    __tablename__ = 't_tags'
    uid = Column(Integer, primary_key=True)
    category = Column(Enum('service', 'event', 'attribute', name='enum_tag_category'))
    name = Column(String(32))
And a table that maps them to their originating parents
class R_Incident_Tags(Base):
    __tablename__ = 'r_incident_tags'
    incident_uid = Column(String(48), ForeignKey('t_incident.uid'), primary_key=True)
    tag_uid = Column(Integer, ForeignKey('t_tags.uid'), primary_key=True)
    tag = relationship("Tags", backref="r_incident_tags")
incident_uid is a unique string to identify the parent.
The SELECT I have been struggling to represent in SQLAlchemy is as follows
SELECT DISTINCT s.name, e.name, count(e.name)
FROM "t_tags" AS s,
"t_tags" AS e,
"r_incident_tags" AS sr,
"r_incident_tags" AS er
WHERE s.category='service' AND
e.category='event' AND
e.uid = er.tag_uid AND
s.uid = sr.tag_uid AND
er.incident_uid = sr.incident_uid
GROUP BY s.name, e.name
Any assistance would be appreciated as I haven't even got close to getting something working after a whole day of effort.
Kindest Regards!
This should do the job:
from sqlalchemy import func
from sqlalchemy.orm import aliased
# Each table appears twice in the FROM clause, so it needs two aliases
s = aliased(Tags)
e = aliased(Tags)
sr = aliased(R_Incident_Tags)
er = aliased(R_Incident_Tags)
qry = (
    session.query(s.name, e.name, func.count(e.name))
    .select_from(s, e, sr, er)
    .filter(s.category == 'service')
    .filter(e.category == 'event')
    .filter(e.uid == er.tag_uid)
    .filter(s.uid == sr.tag_uid)
    .filter(er.incident_uid == sr.incident_uid)
    .group_by(s.name, e.name)
)
But you could also use relationship-based JOINs instead of simple WHERE clauses.
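As a rough illustration of the relationship-based variant, the sketch below joins through the tag relationship with of_type() instead of listing every table and filtering. It assumes SQLAlchemy 1.4 or newer and an in-memory SQLite database. To keep it self-contained, the Enum is replaced by a String and the foreign key to t_incident is dropped; the sample rows are invented.

```python
from sqlalchemy import (Column, ForeignKey, Integer, String, create_engine,
                        func, select)
from sqlalchemy.orm import Session, aliased, declarative_base, relationship

Base = declarative_base()


class Tags(Base):
    __tablename__ = "t_tags"
    uid = Column(Integer, primary_key=True)
    category = Column(String(16))  # assumption: String instead of Enum
    name = Column(String(32))


class R_Incident_Tags(Base):
    __tablename__ = "r_incident_tags"
    # Assumption: no ForeignKey to t_incident, which is not defined here.
    incident_uid = Column(String(48), primary_key=True)
    tag_uid = Column(Integer, ForeignKey("t_tags.uid"), primary_key=True)
    tag = relationship("Tags", backref="r_incident_tags")


s, e = aliased(Tags), aliased(Tags)
sr, er = aliased(R_Incident_Tags), aliased(R_Incident_Tags)

stmt = (
    select(s.name, e.name, func.count(e.name))
    .select_from(sr)
    .join(sr.tag.of_type(s))                        # sr -> service tag
    .join(er, er.incident_uid == sr.incident_uid)   # same incident
    .join(er.tag.of_type(e))                        # er -> event tag
    .where(s.category == "service", e.category == "event")
    .group_by(s.name, e.name)
    .order_by(s.name, e.name)
)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Tags(uid=1, category="service", name="web"),
        Tags(uid=2, category="event", name="outage"),
        Tags(uid=3, category="service", name="db"),
        R_Incident_Tags(incident_uid="inc1", tag_uid=1),
        R_Incident_Tags(incident_uid="inc1", tag_uid=2),
        R_Incident_Tags(incident_uid="inc2", tag_uid=1),
        R_Incident_Tags(incident_uid="inc2", tag_uid=2),
        R_Incident_Tags(incident_uid="inc3", tag_uid=3),
        R_Incident_Tags(incident_uid="inc3", tag_uid=2),
    ])
    session.commit()
    counts = [tuple(row) for row in session.execute(stmt)]

print(counts)
```

The self-join between sr and er still needs an explicit ON clause, because no relationship links one r_incident_tags row to another.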
