I am having trouble implementing the following query with peewee:
SELECT *
FROM (
SELECT datas.*, (rank() over(partition by tracking_id order by date_of_data DESC)) as rank_result
FROM datas
WHERE tracking_id in (1, 2, 3, 4, 5, 6)
)
WHERE rank_result < 3;
I have tried to do the following:
subquery = (Datas.select(Datas.tracking, Datas.value, Datas.date_of_data,
fn.rank().over(partition_by=[Datas.tracking],
order_by=[Datas.date_of_data.desc()]).alias('rank'))
.where(Datas.tracking.in_([1, 2, 3, 4, 5, 6])))
result = (Datas.select()
.from_(subquery)
.where(SQL('rank') < 3))
But since I'm calling Model.select(), I get all of the model's fields in the SQL SELECT, which I don't want and which breaks my query.
Here is the schema of my table:
CREATE TABLE IF NOT EXISTS "datas"
(
"id" INTEGER NOT NULL PRIMARY KEY,
"tracking_id" INTEGER NOT NULL,
"value" INTEGER NOT NULL,
"date_of_data" DATETIME NOT NULL,
FOREIGN KEY ("tracking_id") REFERENCES "follower" ("id")
);
CREATE INDEX "datas_tracking_id" ON "datas" ("tracking_id");
Thanks!
You probably want to use the .select_from() method on the subquery:
subq = (Datas.select(Datas.tracking, Datas.value, Datas.date_of_data,
fn.rank().over(partition_by=[Datas.tracking],
order_by=[Datas.date_of_data.desc()]).alias('rank'))
.where(Datas.tracking.in_([1, 2, 3, 4, 5, 6])))
result = subq.select_from(
subq.c.tracking, subq.c.value, subq.c.date_of_data,
subq.c.rank).where(subq.c.rank < 3)
Produces:
SELECT "t1"."tracking", "t1"."value", "t1"."date_of_data", "t1"."rank"
FROM (
SELECT "t2"."tracking",
"t2"."value",
"t2"."date_of_data",
rank() OVER (
PARTITION BY "t2"."tracking"
ORDER BY "t2"."date_of_data" DESC) AS "rank"
FROM "datas" AS "t2"
WHERE ("t2"."tracking" IN (?, ?, ?, ?, ?, ?))) AS "t1"
WHERE ("t1"."rank" < ?)
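To check what this kind of query returns, here is a minimal sketch that runs the original raw SQL against an in-memory SQLite database using Python's built-in sqlite3 module. The sample rows are invented for illustration, and the sketch assumes the bundled SQLite is version 3.25 or newer, since older versions lack window functions. It does not use peewee at all; it only shows the result the generated SQL is meant to produce.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE datas ("
    " id INTEGER NOT NULL PRIMARY KEY,"
    " tracking_id INTEGER NOT NULL,"
    " value INTEGER NOT NULL,"
    " date_of_data DATETIME NOT NULL)"
)

# Hypothetical sample data: (tracking_id, value, date_of_data)
sample_rows = [
    (1, 10, "2020-01-01"),
    (1, 11, "2020-01-02"),
    (1, 12, "2020-01-03"),
    (2, 20, "2020-01-01"),
    (2, 21, "2020-01-05"),
    (7, 70, "2020-01-01"),  # filtered out by the IN (...) clause
]
conn.executemany(
    "INSERT INTO datas (tracking_id, value, date_of_data) VALUES (?, ?, ?)",
    sample_rows,
)

# Rank rows per tracking_id by date (newest first), then keep the two newest.
query = """
SELECT tracking_id, value, rank_result
FROM (
    SELECT datas.*,
           rank() OVER (PARTITION BY tracking_id
                        ORDER BY date_of_data DESC) AS rank_result
    FROM datas
    WHERE tracking_id IN (1, 2, 3, 4, 5, 6)
) AS ranked
WHERE rank_result < 3
ORDER BY tracking_id, rank_result
"""
latest = conn.execute(query).fetchall()
print(latest)
```

Each tracking_id keeps at most its two most recent rows; ties on date_of_data would share a rank, so more than two rows could come back for one tracking_id.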
Related
I have connected to Teradata with sqlalchemy and am looking to execute multiple SQL statements at once. The queries are simple; here is an example:
INSERT INTO TABLE_A
SELECT * FROM TABLE_B WHERE ID IN (1, 2, 3, 4, 5)
;
INSERT INTO TABLE_A
SELECT * FROM TABLE_B WHERE ID IN (6, 7, 8, 9, 10)
;
I want both of these queries to kick off at the same time instead of running the first one and then the second one.
My sqlalchemy code is as follows:
query = f"""
INSERT INTO TABLE_A
SELECT * FROM TABLE_B WHERE ID IN (1, 2, 3, 4, 5)
;
INSERT INTO TABLE_A
SELECT * FROM TABLE_B WHERE ID IN (6, 7, 8, 9, 10)
;
"""
engine = create_engine(<connection string>)
pd.read_sql(query, engine)
MySQL limits a single query to 61 joined tables, so I am looking for an alternative to LEFT JOIN. Here is the table:
SET NAMES utf8mb4;
SET FOREIGN_KEY_CHECKS = 0;
-- ----------------------------
-- Table structure for test
-- ----------------------------
DROP TABLE IF EXISTS `test`;
CREATE TABLE `test` (
`id` int(11) DEFAULT NULL,
`pid` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
-- ----------------------------
-- Records of test
-- ----------------------------
BEGIN;
INSERT INTO `test` VALUES (1, 3);
INSERT INTO `test` VALUES (1, 4);
INSERT INTO `test` VALUES (2, 4);
INSERT INTO `test` VALUES (3, 5);
INSERT INTO `test` VALUES (3, 6);
COMMIT;
SET FOREIGN_KEY_CHECKS = 1;
This is my current MySQL query:
SELECT
t1.pid AS lev1,
t2.pid AS lev2,
t3.pid AS lev3,
t4.pid AS lev4
FROM
test AS t1
LEFT JOIN test AS t2 ON ( t2.id = t1.pid )
LEFT JOIN test AS t3 ON ( t3.id = t2.pid )
LEFT JOIN test AS t4 ON ( t4.id = t3.pid )
LEFT JOIN test AS t5 ON ( t5.id = t4.pid )
WHERE t1.id = 1 ;
I want output like this, but without using MySQL LEFT JOIN:
lev1,lev2,lev3,lev4
3, 6,
3, 5,
4, ,
If this can be done in Python, I would welcome that solution too!
The limit of 61 is there for a reason and your query might run somewhat slowly, but something like the following should work for your needs:
SELECT
t1.id AS lev1, t2.id AS lev2, t3.id AS lev3, t4.id AS lev4
FROM
t1, t2, t3, t4
WHERE
t2.pid = t1.id AND t3.pid = t1.id AND t4.pid = t1.id
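Since the question also asks for a Python option, here is a minimal sketch that avoids joins entirely: it loads the (id, pid) pairs into a dictionary and walks the hierarchy in memory. The function name expand_levels is hypothetical, and the rows are hard-coded from the question's sample data; in practice they would come from a plain SELECT id, pid FROM test.

```python
from collections import defaultdict


def expand_levels(rows, start_id):
    """Return every chain of pids reachable from start_id, one list per chain."""
    children = defaultdict(list)
    for node_id, pid in rows:
        children[node_id].append(pid)

    chains = []

    def walk(node_id, chain, seen):
        # Skip pids already on this chain so a cyclic table cannot loop forever.
        next_pids = [p for p in children.get(node_id, []) if p not in seen]
        if not next_pids:
            if chain:
                chains.append(chain)
            return
        for pid in next_pids:
            walk(pid, chain + [pid], seen | {pid})

    walk(start_id, [], {start_id})
    return chains


# Sample data from the question: (id, pid)
rows = [(1, 3), (1, 4), (2, 4), (3, 5), (3, 6)]
chains = expand_levels(rows, 1)

# Pad each chain to the same width to mimic the lev1..levN columns.
width = max(len(chain) for chain in chains)
for chain in chains:
    print(chain + [None] * (width - len(chain)))
```

The depth is limited only by Python's recursion limit rather than by the 61-table join limit; for very deep hierarchies an explicit stack would be the safer choice.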
I'm trying to get the top 10 prescriptions from the database into a dataframe using pd.read_sql(sql, uri), but it returned the following error:
~\AppData\Local\Continuum\anaconda3\envs\GISProjects\lib\site-packages\sqlalchemy\engine\result.py in _non_result(self, default)
1168 if self._metadata is None:
1169 raise exc.ResourceClosedError(
-> 1170 "This result object does not return rows. "
1171 "It has been closed automatically."
1172 )
ResourceClosedError: This result object does not return rows. It has been closed automatically.
My query uses user variables to track ranking so that it returns the top 10 prescriptions by practice. It works if I run it in MySQL Workbench but not when I use pd.read_sql():
sql = """
SET @current_practice = 0;
SET @practice_rank = 0;
select practice, bnf_code_9, total_items, practice_rank
FROM (select a.practice,
a.bnf_code_9,
a.total_items,
@practice_rank := IF(@current_practice = a.practice, @practice_rank + 1, 1) AS practice_rank,
@current_practice := a.practice
FROM (select rp.practice, rp.bnf_code_9, sum(rp.items) as total_items
from rx_prescribed rp
where ignore_flag = '0'
group by practice, bnf_code_9) a
order by a.practice, a.total_items desc) ranked
where practice_rank <= 10;
"""
df = pd.read_sql(sql, uri)
I expected it to return the data as a pandas dataframe, but it raised the error above. I assume the error comes from the first statement, which sets a variable. The first two statements are necessary so that the query returns the top 10.
The query runs without the first two statements, but then the practice_rank column contains 1 in every row rather than the expected values of 1, 2, 3 and so on.
Is there a way I can run multiple statements and return the results from the last statement executed?
Short answer
The stack of programs called by pandas.read_sql() is: pandas > SQLAlchemy > MySQLdb or pymysql > MySQL database. The database drivers mysqlclient (MySQLdb) and pymysql don't like multiple SQL statements in a single execute() call. Split them up into separate calls. Because user variables are scoped to the session, run the SET statements and the SELECT on the same connection.
Solution
import pandas as pd
from sqlalchemy import create_engine
# mysqldb is the default, use mysql+pymysql to use the pymysql driver
# URI format: mysql<+driver>://<user:password@>localhost/database
engine = create_engine('mysql://localhost/test')
# First two lines starting with SET removed
sql = '''
SELECT practice, bnf_code_9, total_items, practice_rank
FROM (
SELECT
a.practice,
a.bnf_code_9,
a.total_items,
@practice_rank := IF(@current_practice = a.practice, @practice_rank + 1, 1) AS practice_rank,
@current_practice := a.practice
FROM (
SELECT
rp.practice, rp.bnf_code_9, sum(rp.items) AS total_items
FROM rx_prescribed rp
WHERE ignore_flag = '0'
GROUP BY practice, bnf_code_9
) a
ORDER BY a.practice, a.total_items DESC
) ranked
WHERE practice_rank <= 10;
'''
with engine.connect() as con:
con.execute('SET @current_practice = 0;')
con.execute('SET @practice_rank = 0;')
df = pd.read_sql(sql, con)
print(df)
Results in:
practice bnf_code_9 total_items practice_rank
0 2 3 6.0 1
1 6 1 9.0 1
2 6 2 4.0 2
3 6 4 3.0 3
4 17 1 0.0 1
5 42 42 42.0 1
I used the following code to create a test database for your problem.
DROP TABLE IF EXISTS rx_prescribed;
CREATE TABLE rx_prescribed (
id INT AUTO_INCREMENT PRIMARY KEY,
practice INT,
bnf_code_9 INT,
items INT,
ignore_flag INT
);
INSERT INTO rx_prescribed (practice, bnf_code_9, items, ignore_flag) VALUES (2, 3, 4, 0);
INSERT INTO rx_prescribed (practice, bnf_code_9, items, ignore_flag) VALUES (2, 3, 2, 0);
INSERT INTO rx_prescribed (practice, bnf_code_9, items, ignore_flag) VALUES (6, 1, 9, 0);
INSERT INTO rx_prescribed (practice, bnf_code_9, items, ignore_flag) VALUES (6, 2, 4, 0);
INSERT INTO rx_prescribed (practice, bnf_code_9, items, ignore_flag) VALUES (6, 4, 3, 0);
INSERT INTO rx_prescribed (practice, bnf_code_9, items, ignore_flag) VALUES (9, 11, 1, 1);
INSERT INTO rx_prescribed (practice, bnf_code_9, items, ignore_flag) VALUES (17, 1, 0, 0);
INSERT INTO rx_prescribed (practice, bnf_code_9, items, ignore_flag) VALUES (42, 42, 42, 0);
Tested on MariaDB 10.3.
I am creating one table per user in my database and later storing data specific to that user. Since I have 100+ users, I was looking to automate the table creation process in my Python code.
Much like how I can automate row insertion into a table, I tried to automate table creation.
Row insertion code:
PAYLOAD_TEMPLATE = (
"INSERT INTO metadata "
"(to_date, customer_name, subdomain, internal_users)"
"VALUES (%s, %s, %s, %s)"
)
How I use it:
connection = mysql.connector.connect(**config)
cursor = connection.cursor()
# Opening csv table to feed data
with open('/csv-table-path', 'r') as weeklyInsight:
reader = csv.DictReader(weeklyInsight)
for dataDict in reader:
# Changing date to %m/%d/%Y format
to_date = dataDict['To'][:5] + "20" + dataDict['To'][5:]
payload_data = (
datetime.strptime(to_date, '%m/%d/%Y'),
dataDict['CustomerName'],
dataDict['Subdomain'],
dataDict['InternalUsers']
)
cursor.execute(PAYLOAD_TEMPLATE, payload_data)
How can I create a 'TABLE_TEMPLATE' that can be executed in a similar way to create a table?
I wish to create it such that I can execute the template code from my cursor after replacing certain fields with others.
TABLE_TEMPLATE = (
" CREATE TABLE '{customer_name}' (" # Change customer_name for new table
"'To' DATE NOT NULL,"
"'Users' INT(11) NOT NULL,"
"'Valid' VARCHAR(3) NOT NULL"
") ENGINE=InnoDB"
)
There is no technical¹ need to create a separate table for each client. It is simpler and cleaner to have a single table, e.g.
-- A simple users table; you probably already have something like this
create table users (
id integer not null auto_increment,
name varchar(50),
primary key (id)
);
create table weekly_numbers (
id integer not null auto_increment,
-- By referring to the id column of our users table we link each
-- row with a user
user_id integer references users(id),
`date` date not null,
user_count integer(11) not null,
primary key (id)
);
Let's add some sample data:
insert into users (id, name)
values (1, 'Kirk'),
(2, 'Picard');
insert into weekly_numbers (user_id, `date`, user_count)
values (1, '2017-06-13', 5),
(1, '2017-06-20', 7),
(2, '2017-06-13', 3),
(1, '2017-06-27', 10),
(2, '2017-06-27', 9),
(2, '2017-06-20', 12);
Now let's look at Captain Kirk's numbers:
select `date`, user_count
from weekly_numbers
-- By filtering on user_id we can see one user's numbers
where user_id = 1
order by `date` asc;
¹There may be business reasons to keep your users' data separate. A common use case would be isolating your clients' data, but in that case a separate database per client seems like a better fit.
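To try the single-table design from Python without a MySQL server, here is a minimal sketch using the built-in sqlite3 module. The schema is adapted to SQLite (INTEGER PRIMARY KEY instead of auto_increment), and the helper name numbers_for is hypothetical. With a MySQL driver the placeholders would be %s rather than ?.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        name VARCHAR(50)
    );
    CREATE TABLE weekly_numbers (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),
        date DATE NOT NULL,
        user_count INTEGER NOT NULL
    );
    """
)

conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(1, "Kirk"), (2, "Picard")],
)
conn.executemany(
    "INSERT INTO weekly_numbers (user_id, date, user_count) VALUES (?, ?, ?)",
    [
        (1, "2017-06-13", 5),
        (1, "2017-06-20", 7),
        (2, "2017-06-13", 3),
        (1, "2017-06-27", 10),
        (2, "2017-06-27", 9),
        (2, "2017-06-20", 12),
    ],
)


def numbers_for(connection, user_id):
    """Return (date, user_count) rows for one user, oldest first."""
    return connection.execute(
        "SELECT date, user_count FROM weekly_numbers "
        "WHERE user_id = ? ORDER BY date ASC",
        (user_id,),
    ).fetchall()


kirk = numbers_for(conn, 1)
print(kirk)
```

Adding a new user is then one more row in users rather than a new table, so the same parameterized INSERT template works for every customer.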
I want to select values from MySQL as follows
do_not_select = [1,2,3]
cursor = database.cursor()
cursor.executemany("""SELECT * FROM table_a WHERE id != %s""",(do_not_select))
data = cursor.fetchall()
The query returns all the values in the db apart from the first id (1). However, I don't want it to select id 1, 2 or 3.
Is this possible using the executemany command?
Give NOT IN a go:
do_not_select = [1, 2, 3]
cursor.execute("""SELECT * FROM table_a
WHERE id NOT IN ({}, {}, {})""".format(do_not_select[0],
do_not_select[1],
do_not_select[2]))
data = cursor.fetchall()
I suspect (though I haven't tested this) that this would work better if do_not_select was a tuple, then I think you could just fire it straight into your query:
do_not_select = (1, 2, 3)
cursor.execute("""SELECT * FROM table_a
WHERE id NOT IN {}""".format(do_not_select))
data = cursor.fetchall()
I'd be interested to know if this works - if you try it please let me know :)
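One caveat with formatting a tuple straight into the query is that a single-element tuple renders as (1,), which is not valid SQL, and string formatting leaves the query open to SQL injection if the values come from user input. A more defensive sketch is to generate one placeholder per value and pass the values as parameters. The helper build_not_in_query below is hypothetical, the table and column names are assumed to be trusted constants, and the demo runs on sqlite3 (placeholder ?) so that it is runnable without a MySQL server; MySQL drivers such as mysql.connector use %s.

```python
import sqlite3


def build_not_in_query(table, column, values, placeholder="%s"):
    """Build a parameterized NOT IN query; table/column must be trusted names."""
    if not values:
        raise ValueError("values must not be empty: NOT IN () is invalid SQL")
    marks = ", ".join([placeholder] * len(values))
    sql = f"SELECT * FROM {table} WHERE {column} NOT IN ({marks})"
    return sql, tuple(values)


# Demo on an in-memory SQLite table with ids 1..5.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (id INTEGER PRIMARY KEY, label TEXT)")
conn.executemany(
    "INSERT INTO table_a (id, label) VALUES (?, ?)",
    [(i, f"row{i}") for i in range(1, 6)],
)

do_not_select = [1, 2, 3]
sql, params = build_not_in_query("table_a", "id", do_not_select, placeholder="?")
remaining = conn.execute(sql, params).fetchall()
print(sql)
print(remaining)
```

With a MySQL cursor the call would be cursor.execute(*build_not_in_query("table_a", "id", do_not_select)), using the default %s placeholder. Note that executemany is not the right tool here: it runs the statement once per parameter set, and it is intended for INSERT/UPDATE-style statements rather than for SELECT.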