I am having an issue constructing the SQLAlchemy code required to produce the following raw SQL query.
WITH RECURSIVE recruiters AS (
SELECT
recruiter.id
FROM
recruiter
JOIN
recruiter_member ON recruiter.id = recruiter_member.recruiter_id
WHERE
recruiter_member.user_id = 'f12c617a-415c-4f8c-add0-81a597545be8'
UNION ALL
SELECT
children.id
FROM
recruiters AS parents,
recruiter AS children
WHERE
children.recruiter_id = parents.id
)
SELECT
*
FROM
recruiters
The models here are Recruiter and RecruiterMember. I just can't seem to get the UNION right.
Without more details, this was the best I could come up with:
from sqlalchemy import orm
# Anchor member: the recruiters this user belongs to, declared as a named recursive CTE.
top_q = (
    orm.query.Query([Recruiter.id.label('id')])
    .join(RecruiterMember, Recruiter.id == RecruiterMember.recruiter_id)
    .filter(RecruiterMember.user_id == 'f12c617a-415c-4f8c-add0-81a597545be8')
    .cte(name='recruiters', recursive=True))
# The recursive member has to select from the CTE itself, so the parent is an alias of the CTE, not of Recruiter.
parent = orm.aliased(top_q)
child = orm.aliased(Recruiter)
bottom_q = (
    orm.query.Query([child.id.label('id')])
    .join(parent, parent.c.id == child.recruiter_id))
final_query = top_q.union_all(bottom_q)
orm.query.Query([final_query.c.id]).with_session(session).all()
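To sanity-check what the raw SQL should return before comparing it with the ORM output, here is a minimal sketch that runs the same recursive CTE with Python's built-in sqlite3 module instead of SQLAlchemy. The table layout, the sample rows, the short user ids and the helper name recruiter_tree are assumptions made up for illustration, not part of the original schema.

```python
import sqlite3

# In-memory database with a small, made-up recruiter hierarchy:
# 1 -> 2 -> 3 and 4 -> 5 (recruiter_id points at the parent recruiter).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE recruiter (
    id INTEGER PRIMARY KEY,
    recruiter_id INTEGER REFERENCES recruiter(id)
);
CREATE TABLE recruiter_member (recruiter_id INTEGER, user_id TEXT);
INSERT INTO recruiter (id, recruiter_id) VALUES (1, NULL), (2, 1), (3, 2), (4, NULL), (5, 4);
INSERT INTO recruiter_member (recruiter_id, user_id) VALUES (1, 'u1'), (4, 'u2');
""")

RECURSIVE_SQL = """
WITH RECURSIVE recruiters AS (
    -- anchor member: recruiters the user is a member of
    SELECT recruiter.id
    FROM recruiter
    JOIN recruiter_member ON recruiter.id = recruiter_member.recruiter_id
    WHERE recruiter_member.user_id = :user_id
    UNION ALL
    -- recursive member: children of anything already in the CTE
    SELECT children.id
    FROM recruiter AS children
    JOIN recruiters ON children.recruiter_id = recruiters.id
)
SELECT id FROM recruiters
"""


def recruiter_tree(user_id):
    """Return the sorted ids of the user's recruiters and all their descendants."""
    rows = conn.execute(RECURSIVE_SQL, {"user_id": user_id}).fetchall()
    return sorted(row[0] for row in rows)


print(recruiter_tree("u1"))
```

The key point is that the second SELECT reads from recruiters, the CTE itself. Whatever ORM code you end up with has to emit that self-reference, otherwise the query is a plain UNION ALL and never recurses.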
I want to query the count of bookings for a given event; if the event has bookings, I also want to pull the name of the "first" person to book it.
The tables look something like this: an Event has zero or more Bookings, and Booking.attendee is 1:1 with the User table. In pure SQL I can easily do what I want with window functions and a CTE. Something like:
WITH attendee AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY b.event_id ORDER BY b.created DESC) rn,
COUNT(*) OVER (PARTITION BY b.event_id) count
FROM
booking b JOIN "user" u ON u.id = b.attendee_id
WHERE
b.status != 'cancelled'
)
SELECT e.*, a.count, a.first_name, a.last_name
FROM event e
LEFT JOIN attendee a ON a.event_id = e.id
WHERE e.seats > COALESCE(a.count, 0) AND (a.rn = 1 OR a.rn IS NULL) AND e.cancelled != true;
This gets everything I want. When I try to turn this into a CTE and use Peewee however, I get errors about: Relation does not exist.
Not exact code, but I'm doing something like this with some dynamic where clauses for filtering based on params.
cte = (
BookingModel.select(
BookingModel,
peewee.fn.ROW_NUMBER().over(partition_by=[BookingModel.event_id], order_by=[BookingModel.created.desc()]).alias("rn"),
peewee.fn.COUNT(BookingModel.id).over(partition_by=[BookingModel.event_id]).alias("count"),
UserModel.first_name,
UserModel.last_name
)
.join(
UserModel,
peewee.JOIN.LEFT_OUTER,
on=(UserModel.id == BookingModel.attendee)
)
.where(BookingModel.status != "cancelled")
.cte("test")
)
query = (
EventModel.select(
EventModel,
UserModel,
cte.c.event_id,
cte.c.first_name,
cte.c.last_name,
cte.c.rn,
cte.c.count
)
.join(UserModel, on=(EventModel.host == UserModel.id))
.switch(EventModel)
.join(cte, peewee.JOIN.LEFT_OUTER, on=(EventModel.id == cte.c.event_id))
.where(where_clause)
.order_by(EventModel.start_time.asc(), EventModel.id.asc())
.limit(10)
.with_cte(cte)
)
After reading the docs twenty+ times, I can't figure out what isn't right about this. It looks like the samples... but the query will fail, because "relation "test" does not exist". I've played with "columns" being explicitly defined, but then that throws an error that "rn is ambiguous".
I'm stuck and not sure how I can get Peewee CTE to work.
I want to convert this SQL query to SQLAlchemy:
SELECT * FROM dbcloud.client_feedback as a
join (select distinct(max(submitted_on)) sub,pb_channel_id pb, mail_thread_id mail from client_feedback group by pb_channel_id, mail_thread_id) as b
where (a.submitted_on = b.sub and a.pb_channel_id = b.pb) or ( a.submitted_on = b.sub and a.mail_thread_id = b.mail )
I can't find an equivalent of the AS keyword in SQLAlchemy.
I think that what you may be looking for is .label(name).
Assuming you have a model
class MyModel(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String)
here is an example of how .label(name) can be used
query = db.session.query(MyModel.name.label('a'))
will produce the SQL
SELECT my_model.name as a FROM my_model
I have two tables, ProjectData and Label, like this.
class ProjectData(db.Model):
    __tablename__ = "project_data"
    id = db.Column(db.Integer, primary_key=True)

class Label(db.Model):
    __tablename__ = "labels"
    id = db.Column(db.Integer, primary_key=True)
    data_id = db.Column(db.Integer, db.ForeignKey('project_data.id'))
What I want to do is select all records from ProjectData that are not represented in Label - basically the opposite of a join, or a right outer join, which is not a feature SQLAlchemy offers.
I have tried to do it like this, but it doesn't work.
db.session.query(ProjectData).select_from(Label).outerjoin(
ProjectData
).all()
Finding records in one table with no match in another is known as an anti-join.
You can do this with a NOT EXISTS query:
from sqlalchemy.sql import exists
stmt = exists().where(Label.data_id == ProjectData.id)
q = db.session.query(ProjectData).filter(~stmt)
which generates this SQL:
SELECT project_data.id AS project_data_id
FROM project_data
WHERE NOT (
EXISTS (
SELECT *
FROM labels
WHERE labels.data_id = project_data.id
)
)
Or by doing a LEFT JOIN and filtering for null ids in the other table:
q = (db.session.query(ProjectData)
.outerjoin(Label, ProjectData.id == Label.data_id)
.filter(Label.id.is_(None))
)
which generates this SQL:
SELECT project_data.id AS project_data_id
FROM project_data
LEFT OUTER JOIN labels ON project_data.id = labels.data_id
WHERE labels.id IS NULL
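As a quick check that the two generated statements really are interchangeable, here is a small sketch that runs both against an in-memory SQLite database using only the standard library. The sample rows and the helper name unlabeled_ids are made up for illustration; the table and column names follow the models in the question.

```python
import sqlite3

# Four project_data rows; only ids 1 and 3 have labels (id 1 has two).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE project_data (id INTEGER PRIMARY KEY);
CREATE TABLE labels (
    id INTEGER PRIMARY KEY,
    data_id INTEGER REFERENCES project_data(id)
);
INSERT INTO project_data (id) VALUES (1), (2), (3), (4);
INSERT INTO labels (id, data_id) VALUES (10, 1), (11, 1), (12, 3);
""")

NOT_EXISTS_SQL = """
SELECT project_data.id
FROM project_data
WHERE NOT EXISTS (
    SELECT * FROM labels WHERE labels.data_id = project_data.id
)
"""

LEFT_JOIN_SQL = """
SELECT project_data.id
FROM project_data
LEFT OUTER JOIN labels ON project_data.id = labels.data_id
WHERE labels.id IS NULL
"""


def unlabeled_ids(sql):
    """Run one of the anti-join statements and return the matching ids, sorted."""
    return sorted(row[0] for row in conn.execute(sql))


not_exists_ids = unlabeled_ids(NOT_EXISTS_SQL)
left_join_ids = unlabeled_ids(LEFT_JOIN_SQL)
print(not_exists_ids, left_join_ids)
```

Both forms return the rows with no label. Note that the LEFT JOIN version filters on labels.id, the primary key of the joined table, which can only be NULL when there was no match.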
If you know your desired SQL statement to run, you can utilize the 'text' function from sqlalchemy in order to execute a complex query
https://docs.sqlalchemy.org/en/13/core/sqlelement.html
from sqlalchemy import text
t = text("SELECT * "
"FROM users "
"where user_id=:user_id "
).params(user_id=user_id)
results = db.session.execute(t).fetchall()
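The :user_id placeholder keeps the value out of the SQL string itself. The same named-parameter idea can be tried without SQLAlchemy; the sketch below uses the standard library's sqlite3 module, and the users table, its rows and the helper name find_user are assumptions for illustration only.

```python
import sqlite3

# Throwaway table standing in for the "users" table from the snippet above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO users (user_id, name) VALUES (1, 'alice'), (2, 'bob');
""")


def find_user(user_id):
    """Look up a user with a named bound parameter instead of string formatting."""
    sql = "SELECT user_id, name FROM users WHERE user_id = :user_id"
    return conn.execute(sql, {"user_id": user_id}).fetchall()


print(find_user(2))
```

Because the value is bound rather than interpolated, it is always treated as data and never as part of the statement.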
I'd like to write a select that joins against a sub-query, but I don't know how to do that. I have searched all over the internet but haven't found what I want.
The select is:
SELECT order_status.*
FROM `order`
LEFT OUTER JOIN
(
SELECT *
FROM (
SELECT *
FROM order_status
ORDER BY created_date DESC LIMIT 1) s
WHERE status IN ('NEW', 'FINISH','SENDED','PROCESSING')
) AS order_status ON order.id = order_status.order_id;
my code:
subqy = self.session.query(OrderStatus).order_by(OrderStatus.created_date.desc()).limit(1).subquery()
query = self.session.query(Order).outerjoin(subqy)
return query.filter(and_(in_(conditions))).all()
I'd change the query a little bit to remove one subquery:
SELECT order_status.*
FROM `order`
LEFT OUTER JOIN
(
SELECT *
FROM order_status
ORDER BY created_date DESC
LIMIT 1
) AS order_status ON order.id = order_status.order_id
AND order_status.status IN ('NEW', 'FINISH','SENDED','PROCESSING')
Then the code becomes
subquery = self.session\
.query(OrderStatus)\
.order_by(OrderStatus.created_date.desc())\
.limit(1)\
.subquery()
query = self.session.query(Order)\
.outerjoin(subquery,
(subquery.c.order_id == Order.id)
& subquery.c.status.in_(('NEW', 'FINISH','SENDED','PROCESSING')))
return query.all()
I think the thing you missed is that for subqueries, you need to access the columns through table.c.column, instead of Table.column, as you're used to.
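One thing worth checking about this shape of query: the ORDER BY ... LIMIT 1 is applied inside the derived table, before the join, so the subquery holds a single row for the whole order_status table rather than one row per order. The sketch below shows that with the standard library's sqlite3 module; the two orders, their status rows and the dates are made up for illustration.

```python
import sqlite3

# Two orders, each with one status row; order 2's status is the newest overall.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "order" (id INTEGER PRIMARY KEY);
CREATE TABLE order_status (
    id INTEGER PRIMARY KEY,
    order_id INTEGER REFERENCES "order"(id),
    status TEXT,
    created_date TEXT
);
INSERT INTO "order" (id) VALUES (1), (2);
INSERT INTO order_status (id, order_id, status, created_date) VALUES
    (1, 1, 'NEW', '2024-01-01'),
    (2, 2, 'FINISH', '2024-01-02');
""")

LATEST_STATUS_SQL = """
SELECT o.id, s.status
FROM "order" AS o
LEFT OUTER JOIN (
    SELECT *
    FROM order_status
    ORDER BY created_date DESC
    LIMIT 1
) AS s ON o.id = s.order_id
      AND s.status IN ('NEW', 'FINISH', 'SENDED', 'PROCESSING')
ORDER BY o.id
"""

rows = conn.execute(LATEST_STATUS_SQL).fetchall()
print(rows)
```

Only order 2 gets a status here, because the single newest row belongs to it; order 1 comes back with NULL. If the goal is the latest status per order, the subquery needs to be correlated or partitioned by order_id instead.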
I'm trying to translate the following query into a SQLAlchemy ORM query:
SELECT applications.*,
appversions.version
FROM applications
JOIN appversions
ON appversions.id = (SELECT id
FROM appversions
WHERE appversions.app_id = applications.id
ORDER BY sort_ver DESC
LIMIT 1)
The model for the tables are as follows:
Base = declarative_base()

class Application(Base):
    __tablename__ = 'applications'
    id = Column(Integer, primary_key=True)
    group = Column(Unicode(128))
    artifact = Column(Unicode(128))
    versions = relationship("AppVersion", backref="application")

class AppVersion(Base):
    __tablename__ = 'versions'
    id = Column(Integer, primary_key=True)
    app_id = Column(Integer, ForeignKey('applications.id'))
    version = Column(Unicode(64))
    sort_ver = Column(Unicode(64))
And the query I've so far come up with is:
subquery = select([AppVersion.id]). \
where(AppVersion.app_id == Application.id). \
order_by(AppVersion.sort_ver.desc()). \
limit(1). \
alias()
query = session.query(Application). \
join(AppVersion, AppVersion.id == subquery.c.id) \
.all()
However, this is producing the following SQL statement and error:
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such column: anon_1.id
[SQL: SELECT applications.id AS applications_id, applications."group" AS applications_group, applications.artifact AS applications_artifact
FROM applications JOIN versions ON versions.id = anon_1.id]
I have tried various different methods to produce the subquery and attempting to 'tack on' the sub-SELECT command, but without any positive impact.
Is there a way to coerce the SQLAlchemy query builder to correctly append the sub-SELECT?
With thanks to @Ilja Everilä for the nudge in the right direction, the code to generate the correct query is:
subquery = select([AppVersion.id]). \
where(AppVersion.app_id == Application.id). \
order_by(AppVersion.sort_ver.desc()). \
limit(1). \
correlate(Application)
query = session.query(Application). \
join(AppVersion, AppVersion.id == subquery) \
.all()
The main change is to use the correlate() method, which alters how SQLAlchemy constructs the subquery.
To explain why this works requires some understanding of how SQL subqueries are categorised and handled. The best explanation I have found is from https://www.geeksforgeeks.org/sql-correlated-subqueries/:
With a normal nested subquery, the inner SELECT query runs first and executes once, returning values to be used by the main query. A correlated subquery, however, executes once for each candidate row considered by the outer query. In other words, the inner query is driven by the outer query.
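To see the correlated form at work outside the ORM, here is a minimal sketch that runs the target SQL with the standard library's sqlite3 module. The table is called versions to match the model's __tablename__, the inner table is aliased as v purely for readability, and the sample applications and version strings are made up for illustration.

```python
import sqlite3

# Two applications; "alpha" has two versions, "beta" has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE applications (id INTEGER PRIMARY KEY, artifact TEXT);
CREATE TABLE versions (
    id INTEGER PRIMARY KEY,
    app_id INTEGER REFERENCES applications(id),
    version TEXT,
    sort_ver TEXT
);
INSERT INTO applications (id, artifact) VALUES (1, 'alpha'), (2, 'beta');
INSERT INTO versions (id, app_id, version, sort_ver) VALUES
    (1, 1, '1.0', '0001.0000'),
    (2, 1, '1.2', '0001.0002'),
    (3, 2, '0.9', '0000.0009');
""")

LATEST_VERSION_SQL = """
SELECT applications.artifact, versions.version
FROM applications
JOIN versions
  ON versions.id = (
      -- correlated: re-evaluated for each applications row
      SELECT v.id
      FROM versions AS v
      WHERE v.app_id = applications.id
      ORDER BY v.sort_ver DESC
      LIMIT 1
  )
ORDER BY applications.id
"""

latest = conn.execute(LATEST_VERSION_SQL).fetchall()
print(latest)
```

The inner SELECT refers to applications.id from the outer query, which is exactly the reference that correlate(Application) tells SQLAlchemy to leave pointing at the enclosing statement instead of adding applications to the subquery's own FROM list.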