How to reset `increment` value back to 1 when data changes? - python

Goal
I am inserting database records into MySQL using Python, with one extra requirement that I explain below.
This is my current script (fully functional and working):
#Get data from SQL
sqlCursor = mjmConnection.cursor()
sqlCursor.execute("SELECT sol.id, p.id, p.code,p.description, p.searchRef1, so.number, c.code, c.name, sol.requiredQty \
FROM salesorderline sol JOIN \
salesorder so \
ON sol.salesorderid = so.id JOIN \
product p \
ON sol.productid = p.id JOIN \
customer c \
ON so.customerid = c.id \
WHERE so.orderdate > DATEADD(dd,-35,CAST(GETDATE() AS date));")
#Send received data from the SQL query above to the MySQL database
print("Sending MJM records to MySQL Database")
mjmCursorMysql = productionConnection.cursor()
for x in sqlCursor.fetchall():
    a,b,c,d,e,f,g,h,i = x
    mjmCursorMysql.execute("INSERT ignore INTO mjm_python (id, product_id, product_code, product_description, product_weight, \
salesorder_number, customer_code, customer_name, requiredQty) VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s);", (a,b,c,d,e,f,g,h,i))
productionConnection.commit()
mjmCursorMysql.close()
sqlCursor.close()
What it does
The above script does the following:
Gets data from SQL Server
Inserts that data into MySQL
I have specifically used IGNORE in the MySQL query, to prevent duplicate id numbers.
Data will look like this:
Next
Now I'd like to add a column named sales_id_increment. It should start at 1, increment for each row with the same salesorder_number, and reset to 1 when the salesorder_number changes. So I want it to look something like this:
Question
How do I achieve this? Where do I need to look, in my Python script or the MySQL query?

You can get this column when you select the rows from SQL Server, using the window function ROW_NUMBER() (or DENSE_RANK() if there are duplicate ids):
SELECT sol.id, p.id, p.code,p.description, p.searchRef1, so.number, c.code, c.name, sol.requiredQty,
ROW_NUMBER() OVER (PARTITION BY so.number ORDER BY sol.id) sales_id_increment
FROM salesorderline sol
JOIN salesorder so ON sol.salesorderid = so.id
JOIN product p ON sol.productid = p.id
JOIN customer c ON so.customerid = c.id
WHERE so.orderdate > DATEADD(dd,-35,CAST(GETDATE() AS date));
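If changing the SQL Server query is not an option, the same numbering could also be computed in Python before the insert. The following is a minimal sketch, assuming each fetched row is shaped like the original nine-column SELECT with salesorder_number at index 5; the helper name add_sales_increment and the sample rows are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter

ORDER_NUMBER_INDEX = 5  # position of so.number in each fetched row (assumption)


def add_sales_increment(rows):
    """Append a per-order counter (1, 2, 3, ...) to each row."""
    # Sort by order number, then by line id, mirroring
    # PARTITION BY so.number ORDER BY sol.id.
    ordered = sorted(rows, key=itemgetter(ORDER_NUMBER_INDEX, 0))
    numbered = []
    for _, group in groupby(ordered, key=itemgetter(ORDER_NUMBER_INDEX)):
        # The counter restarts at 1 for every new order number.
        for counter, row in enumerate(group, start=1):
            numbered.append(tuple(row) + (counter,))
    return numbered


# Hypothetical sample rows standing in for sqlCursor.fetchall()
sample_rows = [
    (101, 1, "P1", "Widget", 2.5, "SO-1", "C1", "Acme", 4),
    (102, 2, "P2", "Gadget", 1.0, "SO-1", "C1", "Acme", 1),
    (103, 1, "P1", "Widget", 2.5, "SO-2", "C2", "Globex", 7),
]
numbered_rows = add_sales_increment(sample_rows)
for row in numbered_rows:
    print(row[0], row[5], row[-1])
```

Each returned tuple carries ten values, so the INSERT statement would need a tenth column (sales_id_increment) and a tenth %s placeholder.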

Related

Peewee: Relation does not exist when querying with CTE

I want to query the number of bookings for a given event; if the event has bookings, I also want the name of the "first" person who booked it.
The tables look something like this: an Event has zero or more Bookings, and Booking.attendee is 1:1 with the User table. In pure SQL I can easily do what I want by using window functions and a CTE. Something like:
WITH attendee AS (
    SELECT *,
        ROW_NUMBER() OVER (PARTITION BY b.event_id ORDER BY b.created DESC) rn,
        COUNT(*) OVER (PARTITION BY b.event_id) count
    FROM
        booking b JOIN "user" u on u.id = b.attendee_id
    WHERE
        b.status != 'cancelled'
)
SELECT e.*, a.count, a.first_name, a.last_name FROM event e LEFT JOIN attendee a ON a.event_id = e.id WHERE e.seats > COALESCE(a.count, 0) and (a.rn = 1 or a.rn is null) and e.cancelled != true;
This gets everything I want. When I try to turn this into a CTE and use Peewee however, I get errors about: Relation does not exist.
Not exact code, but I'm doing something like this with some dynamic where clauses for filtering based on params.
cte = (
    BookingModel.select(
        BookingModel,
        peewee.fn.ROW_NUMBER().over(partition_by=[BookingModel.event_id], order_by=[BookingModel.created.desc()]).alias("rn"),
        peewee.fn.COUNT(BookingModel.id).over(partition_by=[BookingModel.event_id]).alias("count"),
        UserModel.first_name,
        UserModel.last_name
    )
    .join(
        UserModel,
        peewee.JOIN.LEFT_OUTER,
        on=(UserModel.id == BookingModel.attendee)
    )
    .where(BookingModel.status != "cancelled")
    .cte("test")
)
query = (
    EventModel.select(
        EventModel,
        UserModel,
        cte.c.event_id,
        cte.c.first_name,
        cte.c.last_name,
        cte.c.rn,
        cte.c.count
    )
    .join(UserModel, on=(EventModel.host == UserModel.id))
    .switch(EventModel)
    .join(cte, peewee.JOIN.LEFT_OUTER, on=(EventModel.id == cte.c.event_id))
    .where(where_clause)
    .order_by(EventModel.start_time.asc(), EventModel.id.asc())
    .limit(10)
    .with_cte(cte)
)
After reading the docs twenty+ times, I can't figure out what isn't right about this. It looks like the samples... but the query will fail, because "relation "test" does not exist". I've played with "columns" being explicitly defined, but then that throws an error that "rn is ambiguous".
I'm stuck and not sure how I can get Peewee CTE to work.

Update SQL parameter using Python

I want to ask whether I can update a parameter in a SQL query using Python. I want to read a SQL query first and then process the returned data in Python. However, the query filters on a value that I currently have to set inside the SQL itself, and I wonder whether I can set that parameter from Python instead of editing the SQL query.
The SQL query is like the following:
set nocount on;
declare @pdate datetime
set @pdate = '2022-12-31'
select
cast(L.Date as datetime) as Date
, Amount
, AccountNumber
, Property
, County
, ZipCode
, Price
, Owner
from Account.Detail L
inner join Owner.Detail M
on L.Date = M.Date
and L.Number = M.Number
inner join Purchase.Detail P
on L.Date = P.Date
and L.Purchase.Number = P.Purchase.Number
where L.Date = @pdate
and Purchase.Number not in ('CL1', 'CL2')
and Amount > 0
And I want to run Python code like the following:
import pandas as pd
import pyodbc
server = 'my_server_name'
database = 'my_database_name'
connection = pyodbc.connect(Trusted_Connection = "yes", DRIVER = "{SQL Server}", SERVER = server, DATABASE = database)
cursor = connection.cursor()
query = open('Pathway_for_SQL_Query.sql').read()
data = pd.read_sql(query, connection)
connection.close()
I need to declare @pdate in the SQL query every time. Can I set @pdate from Python instead?
Instead of parsing and replacing an SQL script, you could use bind variables and have Python control the value (note the "?" in the query):
pdate = "some value"
# query could be read from file, given here for simplicity
query = """
select
cast(L.Date as datetime) as Date
, Amount
, AccountNumber
, Property
, County
, ZipCode
, Price
, Owner
from Account.Detail L
inner join Owner.Detail M
on L.Date = M.Date
and L.Number = M.Number
inner join Purchase.Detail P
on L.Date = P.Date
and L.Purchase.Number = P.Purchase.Number
where L.Date = ?
and Purchase.Number not in ('CL1', 'CL2')
and Amount > 0
"""
data = pd.read_sql(query, connection, params=(pdate,))
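To see the bind-variable idea in isolation, here is a minimal sketch that uses sqlite3 from the standard library as a stand-in for the pyodbc connection (an assumption made only so the example runs without a SQL Server; both drivers accept "?" placeholders). The detail table, its rows, and the fetch_for_date helper are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the real connection (assumption).
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE detail (date TEXT, amount REAL)")
connection.executemany(
    "INSERT INTO detail (date, amount) VALUES (?, ?)",
    [("2022-12-31", 100.0), ("2022-12-31", 0.0), ("2023-01-31", 250.0)],
)

# The query text never changes; only the bound value does.
QUERY = "SELECT date, amount FROM detail WHERE date = ? AND amount > 0"


def fetch_for_date(conn, pdate):
    """Run QUERY with pdate bound to the single ? placeholder."""
    return conn.execute(QUERY, (pdate,)).fetchall()


december = fetch_for_date(connection, "2022-12-31")
january = fetch_for_date(connection, "2023-01-31")
print(december)
print(january)
```

The same query string is reused for both dates, which is the point of the approach: the SQL file stays untouched and Python decides the value at call time.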

Create Tempview from sql query

df_sales = spark.sql(
    "SELECT \
    s.TRANS_DT, \
    s.STORE_KEY, \
    s.PROD_KEY, \
    s.SALES_QTY \
    FROM sales s \
    JOIN inventory i ON s.cal_dt=i.cal_dt and s.store_key=i.store_key and s.prod_key=i.prod_key;"
)
I created df_sales from a SQL query over two temp views (inventory and sales). How can I register df_sales as a temp view again so that I can run a new SQL query against it?
Call createOrReplaceTempView on the DataFrame returned by spark.sql, then query the new view by name:
spark.sql(
    """
    SELECT
        s.TRANS_DT,
        s.STORE_KEY,
        s.PROD_KEY,
        s.SALES_QTY
    FROM sales s
    JOIN inventory i ON s.cal_dt=i.cal_dt and s.store_key=i.store_key and s.prod_key=i.prod_key;
    """
).createOrReplaceTempView("tmpViewName")

SQLAlchemy select_from a single table

In trying to replicate a MySQL query in SQL Alchemy, I've hit a snag in specifying which tables to select from.
The query that works is
SELECT c.*
FROM attacks AS a INNER JOIN hosts h ON a.host_id = h.id
INNER JOIN cities c ON h.city_id = c.id
GROUP BY c.id;
I try to accomplish this in SQLAlchemy using the following function
def all_cities():
    session = connection.globe.get_session()
    destination_city = aliased(City, name='destination_city')
    query = session.query(City). \
        select_from(Attack).\
        join((Host, Attack.host_id == Host.id)).\
        join((destination_city, Host.city_id == destination_city.id)).\
        group_by(destination_city.id)
    print query
    results = [result.serialize() for result in query]
    session.close()
    file(os.path.join(os.path.dirname(__file__), "servers.geojson"), 'a').write(geojson.feature_collection(results))
When printing the query, I end up with ALMOST the right query
SELECT
cities.id AS cities_id,
cities.country_id AS cities_country_id,
cities.province AS cities_province,
cities.latitude AS cities_latitude,
cities.longitude AS cities_longitude,
cities.name AS cities_name
FROM cities, attacks
INNER JOIN hosts ON attacks.host_id = hosts.id
INNER JOIN cities AS destination_city ON hosts.city_id = destination_city.id
GROUP BY destination_city.id
However, you will note that it is selecting from cities, attacks...
How can I get it to select only from the attacks table?
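The line
query = session.query(City)
selects from the unaliased City table, while the joins only reach cities through the destination_city alias. SQLAlchemy therefore adds cities to the FROM clause as a separate table, which is why you get
FROM cities, attacks
Querying the alias instead, session.query(destination_city), should leave only attacks and its joins in the FROM clause.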

MySQL + python table FETCH module

name=input("input CUSTOMERID to search :")
# Prepare SQL query to view all records of a specific person from
# the SALESPRODUCTS TABLE LINKED WITH SALESPERSON TABLE.
sql = "SELECT * selling_products.customer \
FROM customer \
WHERE customer_products.CUSTOMERID == name"
# Execute the SQL command
cursor.execute(sql)
# Fetch all the rows the sql result of SQL1.
results = cursor.fetchall()
print("\n\n****** TABLE MASTERLIST*********")
print("CUSTOMERID \t PRODUCTID \t DATEOFPURCHASE")
print("**************")
for row in results:
    print (row[0],row[1],row[2])
Python would compile the code above, but it will not return any output. Help would be very much appreciated :)
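I think your SQL should select from a single table, compare with a single "=" (MySQL has no "==" operator), and pass the value as a parameter rather than as the literal text "name":
sql = """SELECT *
FROM customer
WHERE CUSTOMERID = %s"""
cursor.execute(sql, (name,))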
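Building the SQL with string formatting leaves a text value unquoted and opens the query to SQL injection; passing the value as a bound parameter avoids both. Below is a minimal sketch of the whole search-and-print flow, using sqlite3 from the standard library as a stand-in for the MySQL connection (an assumption made so the example runs anywhere). The customer table layout and the find_purchases helper are hypothetical. Note that sqlite3 uses "?" as its placeholder, while MySQL drivers such as PyMySQL or mysql-connector-python use "%s".

```python
import sqlite3


def find_purchases(conn, customer_id):
    """Return all purchase rows for one customer using a bound parameter."""
    sql = (
        "SELECT CUSTOMERID, PRODUCTID, DATEOFPURCHASE "
        "FROM customer "
        "WHERE CUSTOMERID = ? "  # single '=' for comparison; use %s with MySQL drivers
        "ORDER BY DATEOFPURCHASE"
    )
    cursor = conn.cursor()
    cursor.execute(sql, (customer_id,))
    return cursor.fetchall()


# Hypothetical table and data standing in for the real MySQL database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (CUSTOMERID TEXT, PRODUCTID TEXT, DATEOFPURCHASE TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [
        ("C001", "P10", "2023-01-05"),
        ("C002", "P20", "2023-01-09"),
        ("C001", "P11", "2023-02-14"),
    ],
)

rows = find_purchases(conn, "C001")
print("CUSTOMERID \t PRODUCTID \t DATEOFPURCHASE")
for row in rows:
    print(row[0], row[1], row[2])
```

In the interactive script, the value returned by input() would be passed as customer_id in place of the hard-coded "C001".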
