I want to query the count of bookings for a given event. If the event has bookings, I also want to pull the name of the "first" person to book it.
The schema looks something like this: an Event has zero or more Bookings, and Booking.attendee is 1:1 with the User table. In pure SQL I can easily do what I want with window functions and a CTE. Something like:
WITH attendee AS (
    SELECT *,
        ROW_NUMBER() OVER (PARTITION BY b.event_id ORDER BY b.created DESC) rn,
        COUNT(*) OVER (PARTITION BY b.event_id) count
    FROM
        booking b JOIN "user" u ON u.id = b.attendee_id
    WHERE
        b.status != 'cancelled'
)
SELECT e.*, a.count, a.first_name, a.last_name
FROM event e
LEFT JOIN attendee a ON a.event_id = e.id
WHERE e.seats > COALESCE(a.count, 0)
    AND (a.rn = 1 OR a.rn IS NULL)
    AND e.cancelled != true;
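As a sanity check of the SQL shape, independent of Peewee, the same window-function CTE can be run against an in-memory SQLite database. This is only a sketch: the table layout, the sample rows, and the alias booking_count are assumptions made for illustration, and it needs an SQLite build with window-function support (3.25 or newer). Note that ORDER BY created DESC makes rn = 1 the most recent booking; ASC would pick the earliest one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal schema and sample data, for illustration only.
conn.executescript("""
CREATE TABLE event (id INTEGER PRIMARY KEY, seats INTEGER, cancelled INTEGER DEFAULT 0);
CREATE TABLE "user" (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE booking (
    id INTEGER PRIMARY KEY, event_id INTEGER, attendee_id INTEGER,
    status TEXT, created TEXT
);
INSERT INTO event (id, seats) VALUES (1, 5), (2, 3);
INSERT INTO "user" VALUES (1, 'Ada', 'Lovelace'), (2, 'Alan', 'Turing');
INSERT INTO booking VALUES
    (1, 1, 1, 'confirmed', '2024-01-01'),
    (2, 1, 2, 'confirmed', '2024-01-02'),
    (3, 1, 2, 'cancelled', '2024-01-03');
""")

sql = """
WITH attendee AS (
    SELECT b.event_id, u.first_name, u.last_name,
           ROW_NUMBER() OVER (PARTITION BY b.event_id ORDER BY b.created DESC) AS rn,
           COUNT(*) OVER (PARTITION BY b.event_id) AS booking_count
    FROM booking b JOIN "user" u ON u.id = b.attendee_id
    WHERE b.status != 'cancelled'
)
SELECT e.id, COALESCE(a.booking_count, 0), a.first_name, a.last_name
FROM event e
LEFT JOIN attendee a ON a.event_id = e.id
WHERE e.seats > COALESCE(a.booking_count, 0)
  AND (a.rn = 1 OR a.rn IS NULL)
  AND e.cancelled != 1
ORDER BY e.id
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Events without any bookings still come back, with a count of 0 and NULL names, because of the LEFT JOIN combined with the "a.rn IS NULL" branch.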
This gets everything I want. However, when I try to build this as a CTE with Peewee, I get an error: relation does not exist.
This is not the exact code, but I'm doing something like this, with some dynamic where clauses for filtering based on params.
cte = (
    BookingModel.select(
        BookingModel,
        peewee.fn.ROW_NUMBER().over(
            partition_by=[BookingModel.event_id],
            order_by=[BookingModel.created.desc()],
        ).alias("rn"),
        peewee.fn.COUNT(BookingModel.id).over(
            partition_by=[BookingModel.event_id],
        ).alias("count"),
        UserModel.first_name,
        UserModel.last_name,
    )
    .join(
        UserModel,
        peewee.JOIN.LEFT_OUTER,
        on=(UserModel.id == BookingModel.attendee),
    )
    .where(BookingModel.status != "cancelled")
    .cte("test")
)
query = (
    EventModel.select(
        EventModel,
        UserModel,
        cte.c.event_id,
        cte.c.first_name,
        cte.c.last_name,
        cte.c.rn,
        cte.c.count,
    )
    .join(UserModel, on=(EventModel.host == UserModel.id))
    .switch(EventModel)
    .join(cte, peewee.JOIN.LEFT_OUTER, on=(EventModel.id == cte.c.event_id))
    .where(where_clause)
    .order_by(EventModel.start_time.asc(), EventModel.id.asc())
    .limit(10)
    .with_cte(cte)
)
After reading the docs twenty+ times, I can't figure out what isn't right about this. It looks like the samples... but the query will fail, because "relation "test" does not exist". I've played with "columns" being explicitly defined, but then that throws an error that "rn is ambiguous".
I'm stuck and not sure how I can get Peewee CTE to work.
Related
What I would like returned is all the seat_ids in the performance table that have a booking_id that matches all the booking_ids where night = 1 in the booking table - is an INNER JOIN the best way to do it?
Or is it more along the lines of """SELECT seat_id FROM performance WHERE booking_id=(SELECT * FROM booking WHERE night = ?""", (night_number))
With the above I get sqlite3.OperationalError: incomplete input error.
connection = sqlite3.connect('collyers_booking_system.db')
cursor = connection.cursor()
cursor.execute(booking_table)
cursor.execute(performance_table)
connection.commit()
booking_table = """CREATE TABLE IF NOT EXISTS
booking(
booking_id TEXT PRIMARY KEY,
customer_id INTEGER,
night INTEGER,
cost REAL,
FOREIGN KEY (customer_id) REFERENCES customer(customer_id)
)"""
performance_table = """CREATE TABLE IF NOT EXISTS
performance(
performance_id TEXT PRIMARY KEY,
seat_id TEXT,
booking_id INTEGER,
FOREIGN KEY (seat_id) REFERENCES seat(seat_id),
FOREIGN KEY (booking_id) REFERENCES booking(booking_id),
)"""
night_number = 1
cursor.execute("""SELECT seat_id FROM performance INNER JOIN booking ON night=?""", (night_number))
booked_seats = cursor.fetchall()
print(booked_seats)
With this I get ValueError: parameters are of unsupported type error.
First, if this is your actual code, there is a typo in the CREATE statement of the table performance.
You must remove the , at the end of:
FOREIGN KEY (booking_id) REFERENCES booking(booking_id),
Then, here:
cursor.execute("""SELECT seat_id FROM performance WHERE booking_id=(SELECT * FROM booking WHERE night = ?""", (night_number))
you missed a closing parenthesis for the SQL statement, and the subquery may return more than one row, so instead of = you should use IN. The subquery must also return a single column, so select booking_id rather than *.
Also, the parameter night_number should be passed as a tuple and not just a number, by adding a , inside the parentheses:
cursor.execute("""SELECT seat_id FROM performance WHERE booking_id IN (SELECT booking_id FROM booking WHERE night = ?)""", (night_number,))
For the join you need a proper ON clause that links the tables, and again a , to create the tuple for night_number:
sql = """
    SELECT p.seat_id
    FROM performance p INNER JOIN booking b
        ON b.booking_id = p.booking_id
    WHERE b.night = ?
"""
cursor.execute(sql, (night_number,))
Both ways, the operator IN and the join will work.
There is another option which sometimes performs better and this is EXISTS:
sql = """
SELECT p.seat_id
FROM performance p
WHERE EXISTS (
SELECT 1 FROM booking b
WHERE b.night=? AND b.booking_id = p.booking_id
)
"""
cursor.execute(sql, (night_number,))
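To compare the three approaches side by side, here is a small self-contained sketch against an in-memory SQLite database. The sample rows are made up, the tables are trimmed to the columns the queries need, and booking_id is declared TEXT in both tables as an assumption so that the join keys match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical versions of the two tables with sample data.
conn.executescript("""
CREATE TABLE booking (booking_id TEXT PRIMARY KEY, customer_id INTEGER, night INTEGER, cost REAL);
CREATE TABLE performance (performance_id TEXT PRIMARY KEY, seat_id TEXT, booking_id TEXT);
INSERT INTO booking VALUES ('B1', 1, 1, 10.0), ('B2', 2, 2, 10.0), ('B3', 3, 1, 12.5);
INSERT INTO performance VALUES ('P1', 'A1', 'B1'), ('P2', 'A2', 'B2'), ('P3', 'A3', 'B3');
""")

night_number = 1
params = (night_number,)  # one-element tuple: note the trailing comma

in_sql = """
    SELECT seat_id FROM performance
    WHERE booking_id IN (SELECT booking_id FROM booking WHERE night = ?)
    ORDER BY seat_id
"""
join_sql = """
    SELECT p.seat_id
    FROM performance p INNER JOIN booking b ON b.booking_id = p.booking_id
    WHERE b.night = ?
    ORDER BY p.seat_id
"""
exists_sql = """
    SELECT p.seat_id
    FROM performance p
    WHERE EXISTS (
        SELECT 1 FROM booking b
        WHERE b.night = ? AND b.booking_id = p.booking_id
    )
    ORDER BY p.seat_id
"""

# Run each variant with the same parameter and collect the rows.
results = [conn.execute(q, params).fetchall() for q in (in_sql, join_sql, exists_sql)]
for rows in results:
    print(rows)
```

With this sample data all three statements return the same seats; which one performs best on a real database depends on the data and indexes.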
You are comparing a multi-row result with a single value.
The subquery SELECT * FROM booking WHERE night = ? returns N rows (and several columns),
while the outer query SELECT seat_id FROM performance WHERE booking_id=? expects a single value.
You have to use IN with a single-column subquery, something like this:
cursor.execute("""SELECT seat_id FROM performance WHERE booking_id IN (SELECT booking_id FROM booking WHERE night = ?)""", (night_number,))
Goal
I am aiming to insert database records into MySQL using Python. But with an extra detail, I'll explain as I go along..
This is my current script (Fully functional & working):
#Get data from SQL
sqlCursor = mjmConnection.cursor()
sqlCursor.execute("SELECT sol.id, p.id, p.code,p.description, p.searchRef1, so.number, c.code, c.name, sol.requiredQty \
FROM salesorderline sol JOIN \
salesorder so \
ON sol.salesorderid = so.id JOIN \
product p \
ON sol.productid = p.id JOIN \
customer c \
ON so.customerid = c.id \
WHERE so.orderdate > DATEADD(dd,-35,CAST(GETDATE() AS date));")
# Send the data received from the SQL Server query above to the MySQL database
print("Sending MJM records to MySQL Database")
mjmCursorMysql = productionConnection.cursor()
for x in sqlCursor.fetchall():
    a,b,c,d,e,f,g,h,i = x
    mjmCursorMysql.execute("INSERT ignore INTO mjm_python (id, product_id, product_code, product_description, product_weight, \
        salesorder_number, customer_code, customer_name, requiredQty) VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s);", (a,b,c,d,e,f,g,h,i))
productionConnection.commit()
mjmCursorMysql.close()
sqlCursor.close()
What it does
The above script does the following:
Gets data from SQL Server
Inserts that data into MySQL
I have specifically used IGNORE in the MySQL query, to prevent duplicate id numbers.
Data will look like this:
Next..
Now I'd like to add a column named sales_id_increment. It should start from 1, increment for each row with the same salesorder_number, and reset back to 1 when the salesorder_number changes. So I want it to look something like this:
Question
How do I achieve this? Where do I need to look, in my Python script or the MySQL query?
You can get this column when you select the rows from SQL Server with window functions ROW_NUMBER() or DENSE_RANK() (if there are duplicate ids):
SELECT sol.id, p.id, p.code,p.description, p.searchRef1, so.number, c.code, c.name, sol.requiredQty,
ROW_NUMBER() OVER (PARTITION BY so.number ORDER BY sol.id) sales_id_increment
FROM salesorderline sol
JOIN salesorder so ON sol.salesorderid = so.id
JOIN product p ON sol.productid = p.id
JOIN customer c ON so.customerid = c.id
WHERE so.orderdate > DATEADD(dd,-35,CAST(GETDATE() AS date));
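If changing the SQL Server query is not an option, the same numbering could also be computed in the Python script before inserting. The following is a minimal sketch, not part of the original answer: the helper name add_sales_id_increment and the sample rows are hypothetical, and it assumes so.number sits at index 5 of each fetched row, as in the SELECT list above. Rows are numbered in the order they arrive, so sort by sol.id first if that order matters.

```python
from collections import defaultdict


def add_sales_id_increment(rows, order_index=5):
    """Append a per-sales-order running counter to each row tuple."""
    counters = defaultdict(int)  # sales order number -> rows seen so far
    numbered = []
    for row in rows:
        key = row[order_index]
        counters[key] += 1
        numbered.append(tuple(row) + (counters[key],))
    return numbered


# Hypothetical rows shaped like the SELECT result (9 columns each).
sample = [
    (1, 10, "P-A", "Widget", "1.5", "SO-100", "C1", "Acme", 2),
    (2, 11, "P-B", "Gadget", "0.7", "SO-100", "C1", "Acme", 1),
    (3, 10, "P-A", "Widget", "1.5", "SO-200", "C2", "Bolt", 4),
    (4, 12, "P-C", "Gizmo", "2.0", "SO-100", "C1", "Acme", 6),
]
numbered = add_sales_id_increment(sample)
increments = [row[-1] for row in numbered]
print(increments)
```

Each resulting tuple has ten values, so the INSERT statement would need a tenth column (sales_id_increment) and a tenth %s placeholder.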
I'd like to build a select with a subquery, but I don't know how to do that. I have searched all over the internet but have not found what I want.
The select is:
SELECT order_status.*
FROM `order`
LEFT OUTER JOIN
(
    SELECT *
    FROM (
        SELECT *
        FROM order_status
        ORDER BY created_date DESC LIMIT 1) s
    WHERE status IN ('NEW', 'FINISH','SENDED','PROCESSING')
) AS order_status ON `order`.id = order_status.order_id;
my code:
subqy = self.session.query(OrderStatus).order_by(OrderStatus.created_date.desc()).limit(1).subquery()
query = self.session.query(Order).outerjoin(subqy)
return query.filter(and_(in_(conditions))).all()
I'd change the query a little bit to remove one subquery:
SELECT order_status.*
FROM `order`
LEFT OUTER JOIN
(
    SELECT *
    FROM order_status
    ORDER BY created_date DESC
    LIMIT 1
) AS order_status ON `order`.id = order_status.order_id
    AND order_status.status IN ('NEW', 'FINISH','SENDED','PROCESSING')
Then the code becomes
subquery = self.session\
    .query(OrderStatus)\
    .order_by(OrderStatus.created_date.desc())\
    .limit(1)\
    .subquery()
query = self.session.query(Order)\
    .outerjoin(subquery,
               (subquery.c.order_id == Order.id)
               & subquery.c.status.in_(('NEW', 'FINISH', 'SENDED', 'PROCESSING')))
return query.all()
I think the thing you missed is that for subqueries, you need to access the columns through table.c.column, instead of Table.column, as you're used to.
I am having an issue constructing the SQLAlchemy code required to produce the following raw SQL query.
WITH RECURSIVE recruiters AS (
SELECT
recruiter.id
FROM
recruiter
JOIN
recruiter_member ON recruiter.id = recruiter_member.recruiter_id
WHERE
recruiter_member.user_id = 'f12c617a-415c-4f8c-add0-81a597545be8'
UNION ALL
SELECT
children.id
FROM
recruiters AS parents,
recruiter AS children
WHERE
children.recruiter_id = parents.id
)
SELECT
*
FROM
recruiters
The models here are Recruiter and RecruiterMember. I just can't seem to get the UNION right.
Without more details, this was the best I could come up with:
from sqlalchemy import orm
child = orm.aliased(Recruiter)
# Anchor (non-recursive) part: recruiters the user is a member of.
top_q = (
    orm.query.Query([Recruiter.id.label('id')])
    .join(RecruiterMember, Recruiter.id == RecruiterMember.recruiter_id)
    .filter(RecruiterMember.user_id == 'f12c617a-415c-4f8c-add0-81a597545be8')
    .cte('recruiters', recursive=True))
# Recursive part: children whose parent is already in the CTE.
bottom_q = (
    orm.query.Query([child.id.label('id')])
    .join(top_q, top_q.c.id == child.recruiter_id))
final_query = top_q.union_all(bottom_q)
orm.query.Query([final_query.c.id]).with_session(session).all()
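To check what the recursive CTE itself is expected to return, independent of the ORM, the raw SQL can be run against an in-memory SQLite database. This is only a sketch: the two-table schema, the sample hierarchy, and the user ids 'user-a' and 'user-b' are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: recruiter.recruiter_id points at the parent recruiter.
conn.executescript("""
CREATE TABLE recruiter (id INTEGER PRIMARY KEY, recruiter_id INTEGER REFERENCES recruiter(id));
CREATE TABLE recruiter_member (recruiter_id INTEGER, user_id TEXT);
INSERT INTO recruiter VALUES (1, NULL), (2, 1), (3, 2), (4, NULL);
INSERT INTO recruiter_member VALUES (1, 'user-a'), (4, 'user-b');
""")

sql = """
WITH RECURSIVE recruiters(id) AS (
    -- anchor: recruiters the user belongs to directly
    SELECT recruiter.id
    FROM recruiter
    JOIN recruiter_member ON recruiter.id = recruiter_member.recruiter_id
    WHERE recruiter_member.user_id = ?
    UNION ALL
    -- recursive step: children of anything already collected
    SELECT children.id
    FROM recruiters AS parents, recruiter AS children
    WHERE children.recruiter_id = parents.id
)
SELECT id FROM recruiters ORDER BY id
"""
ids = [row[0] for row in conn.execute(sql, ("user-a",))]
print(ids)
```

Recruiter 4 is left out because it belongs to a different user and is not a descendant of recruiter 1. The result of an ORM-built query can be compared against this list.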
How can I write a subquery in a Django queryset? For example, if I have:
select name, age from person, employee where person.id = employee.id and
employee.id in (select id from employee where employee.company = 'Private')
This is what I have done so far:
Person.objects.values('name', 'age')
Employee.objects.filter(company='Private')
But it is not working, because it returns two outputs...
As mentioned by ypercube, your use case doesn't require a subquery.
But since many people land on this page to learn how to write a subquery, here is how it's done:
employee_query = Employee.objects.filter(company='Private').only('id').all()
Person.objects.values('name', 'age').filter(id__in=employee_query)
Source:
http://mattrobenolt.com/the-django-orm-and-subqueries/
ids = Employee.objects.filter(company='Private').values_list('id', flat=True)
Person.objects.filter(id__in=ids).values('name', 'age')
The correct answer to your question is here: https://docs.djangoproject.com/en/2.1/ref/models/expressions/#subquery-expressions
As an example:
>>> from django.db.models import OuterRef, Subquery
>>> newest = Comment.objects.filter(post=OuterRef('pk')).order_by('-created_at')
>>> Post.objects.annotate(newest_commenter_email=Subquery(newest.values('email')[:1]))
You can create subqueries in Django by using an unevaluated queryset to filter your main queryset. In your case, it would look something like this:
employee_query = Employee.objects.filter(company='Private')
people = Person.objects.filter(employee__in=employee_query)
I'm assuming that you have a reverse relationship from Person to Employee named employee. I found it helpful to look at the SQL query generated by a queryset when I was trying to understand how the filters work.
print(people.query)
As others have said, you don't really need a subquery for your example. You could just join to the employee table:
people2 = Person.objects.filter(employee__company='Private')
hero_qs = Hero.objects.filter(category=OuterRef("pk")).order_by("-benevolence_factor")
Category.objects.all().annotate(most_benevolent_hero=Subquery(hero_qs.values('name')[:1]))
The generated SQL:
SELECT "entities_category"."id",
"entities_category"."name",
(SELECT U0."name"
FROM "entities_hero" U0
WHERE U0."category_id" = ("entities_category"."id")
ORDER BY U0."benevolence_factor" DESC
LIMIT 1) AS "most_benevolent_hero"
FROM "entities_category"
For more details, see this article.
Take care with only() if your subqueries don't select the primary key.
Example:
class Customer:
    pass
class Order:
    customer: Customer
    pass
class OrderItem:
    order: Order
    is_recalled: bool
Customer has Orders
Order has OrderItems
Now we are trying to find all customers with at least one recalled order-item.(1)
This will not work properly
order_ids = OrderItem.objects \
    .filter(is_recalled=True) \
    .only("order_id")
customer_ids = Order.objects \
    .filter(id__in=order_ids) \
    .only('customer_id')
# BROKEN! Each subquery selects the primary key, not the column named in only()
customers = Customer.objects.filter(id__in=customer_ids)
The code above looks fine, but it produces the following query:
select * from customer where id in (
    select id -- should be customer_id
    from orders
    where id in (
        select id -- should be order_id
        from order_items
        where is_recalled = true))
Instead, one should use values(), which makes the subquery select the named column:
order_ids = OrderItem.objects \
    .filter(is_recalled=True) \
    .values("order_id")
customer_ids = Order.objects \
    .filter(id__in=order_ids) \
    .values('customer_id')
customers = Customer.objects.filter(id__in=customer_ids)
(1) Note: in a real case we might consider 'WHERE EXISTS'
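To make the difference concrete at the SQL level, the sketch below runs both the broken nested IN query shown above and a WHERE EXISTS variant against an in-memory SQLite database. The table layout and sample rows are assumptions for illustration, and the EXISTS query is only one possible way to write it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables mirroring Customer -> Order -> OrderItem.
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER REFERENCES customer(id));
CREATE TABLE order_items (id INTEGER PRIMARY KEY, order_id INTEGER REFERENCES orders(id), is_recalled INTEGER);
INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO orders VALUES (10, 1), (20, 2), (30, 3);
INSERT INTO order_items VALUES (100, 10, 0), (101, 10, 1), (200, 20, 0), (300, 30, 1);
""")

# The query produced by the only() version: every level compares primary keys.
broken_sql = """
SELECT id, name FROM customer WHERE id IN (
    SELECT id FROM orders WHERE id IN (
        SELECT id FROM order_items WHERE is_recalled = 1))
ORDER BY id
"""

# Correlated EXISTS: customers with at least one recalled order item.
exists_sql = """
SELECT c.id, c.name FROM customer c
WHERE EXISTS (
    SELECT 1
    FROM orders o JOIN order_items i ON i.order_id = o.id
    WHERE o.customer_id = c.id AND i.is_recalled = 1
)
ORDER BY c.id
"""

broken_rows = conn.execute(broken_sql).fetchall()
exists_rows = conn.execute(exists_sql).fetchall()
print(broken_rows)
print(exists_rows)
```

With this sample data the broken query finds nothing, because order-item ids never match order ids, while the EXISTS query returns the two customers who have a recalled item.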