Running multiple Teradata SQL queries at once with python

I have connected to Teradata with sqlalchemy and am looking to execute multiple SQL statements at once. The queries are simple; here is an example:
INSERT INTO TABLE_A
SELECT * FROM TABLE_B WHERE ID IN (1, 2, 3, 4, 5)
;
INSERT INTO TABLE_A
SELECT * FROM TABLE_B WHERE ID IN (6, 7, 8, 9, 10)
;
I want both of these queries to kick off at the same time instead of running the first one and then the second one.
My sqlalchemy connection is as follows:
query = f"""
INSERT INTO TABLE_A
SELECT * FROM TABLE_B WHERE ID IN (1, 2, 3, 4, 5)
;
INSERT INTO TABLE_A
SELECT * FROM TABLE_B WHERE ID IN (6, 7, 8, 9, 10)
;
"""
engine = create_engine(<connection string>)
pd.read_sql(query, engine)

Related

Select from a query with peewee

I have some troubles implementing the following query with peewee:
SELECT *
FROM (
SELECT datas.*, (rank() over(partition by tracking_id order by date_of_data DESC)) as rank_result
FROM datas
WHERE tracking_id in (1, 2, 3, 4, 5, 6)
)
WHERE rank_result < 3;
I have tried to do the following:
subquery = (Datas.select(Datas.tracking, Datas.value, Datas.date_of_data,
fn.rank().over(partition_by=[Datas.tracking],
order_by=[Datas.date_of_data.desc()]).alias('rank'))
.where(Datas.tracking.in_([1, 2, 3, 4, 5, 6])))
result = (Datas.select()
.from_(subquery)
.where(SQL('rank') < 3))
But since I'm calling Model.select(), all of the model's fields end up in the outer SQL SELECT, which I don't want and which breaks the query.
Here is the schema of my table:
CREATE TABLE IF NOT EXISTS "datas"
(
"id" INTEGER NOT NULL PRIMARY KEY,
"tracking_id" INTEGER NOT NULL,
"value" INTEGER NOT NULL,
"date_of_data" DATETIME NOT NULL,
FOREIGN KEY ("tracking_id") REFERENCES "follower" ("id")
);
CREATE INDEX "datas_tracking_id" ON "datas" ("tracking_id");
Thanks!

You probably want to use the .select_from() method on the subquery:
subq = (Datas.select(Datas.tracking, Datas.value, Datas.date_of_data,
fn.rank().over(partition_by=[Datas.tracking],
order_by=[Datas.date_of_data.desc()]).alias('rank'))
.where(Datas.tracking.in_([1, 2, 3, 4, 5, 6])))
result = subq.select_from(
subq.c.tracking, subq.c.value, subq.c.date_of_data,
subq.c.rank).where(subq.c.rank < 3)
Produces:
SELECT "t1"."tracking", "t1"."value", "t1"."date_of_data", "t1"."rank"
FROM (
SELECT "t2"."tracking",
"t2"."value",
"t2"."date_of_data",
rank() OVER (
PARTITION BY "t2"."tracking"
ORDER BY "t2"."date_of_data" DESC) AS "rank"
FROM "datas" AS "t2"
WHERE ("t2"."tracking" IN (?, ?, ?, ?, ?, ?))) AS "t1"
WHERE ("t1"."rank" < ?)
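To see what a query of this shape returns, the SQL from the question can be run directly against a small in-memory table. This is a sketch using the standard-library sqlite3 module rather than peewee; the sample rows are made up for illustration, and it assumes the bundled SQLite is version 3.25 or newer, which is when window functions were added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE datas ("
    "id INTEGER PRIMARY KEY, tracking_id INTEGER NOT NULL, "
    "value INTEGER NOT NULL, date_of_data DATETIME NOT NULL)"
)

# Made-up sample data: three readings for tracking_id 1, two for tracking_id 2.
rows = [
    (1, 10, "2021-01-01"),
    (1, 11, "2021-01-02"),
    (1, 12, "2021-01-03"),
    (2, 20, "2021-01-01"),
    (2, 21, "2021-01-02"),
]
conn.executemany(
    "INSERT INTO datas (tracking_id, value, date_of_data) VALUES (?, ?, ?)", rows
)

# Rank rows per tracking_id by date (newest first), then keep ranks 1 and 2.
query = """
SELECT tracking_id, value, rank_result
FROM (
    SELECT datas.*,
           rank() OVER (PARTITION BY tracking_id
                        ORDER BY date_of_data DESC) AS rank_result
    FROM datas
    WHERE tracking_id IN (1, 2, 3, 4, 5, 6)
)
WHERE rank_result < 3
ORDER BY tracking_id, rank_result
"""
latest_two = conn.execute(query).fetchall()
conn.close()
print(latest_two)
```

Only the two most recent rows per tracking_id survive the outer filter, which is the behaviour the peewee query is meant to reproduce.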

Updating string based on dictionary key value in python where key is partial string

I have a string that I want to update with values based on specific keys. My challenge is that one of the values in my dictionary is itself partially a quoted string value.
parameters = {"$1":"'name'",
"$2":"hierarchy LIKE '%|user_name|%'"}
The string in which I'd like to replace $1 and $2 with those values is:
query = """
SELECT
$1 AS org,
FROM
table
WHERE
input ='Active'
AND $2
GROUP BY 1, 2, 3, 4, 5, 6
"""
I then run it through the below code:
for key in parameters.keys():
query = query.replace(key, parameters[key])
It works fine for replacing $1 with 'name' but it does not replace $2 appropriately. We end up getting:
print(query)
output:
"""
SELECT
'name' AS org,
FROM
table
WHERE
input ='Active'
AND hierarchy
GROUP BY 1, 2, 3, 4, 5, 6
"""
The issue is when replacing $2, it leaves out the full string. Any tips?

You can use re.sub:
import re
print(re.sub(r'\$\d+', lambda x: parameters[x.group()], query))
Output
SELECT
'name' AS org,
FROM
table
WHERE
input ='Active'
AND hierarchy LIKE '%|user_name|%'
GROUP BY 1, 2, 3, 4, 5, 6
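One reason the single-pass re.sub approach can be more robust than repeated str.replace calls is that each placeholder is matched as a whole, so "$1" cannot accidentally consume the start of "$10", and already-substituted text is never scanned again. The sketch below illustrates this with a shortened query; the "$10" entry and the function name substitute are additions for illustration only.

```python
import re

parameters = {
    "$1": "'name'",
    "$2": "hierarchy LIKE '%|user_name|%'",
    "$10": "'tenth'",  # hypothetical extra key to show whole-token matching
}

query = "SELECT $1 AS org, $10 AS extra FROM table WHERE input = 'Active' AND $2"


def substitute(match):
    # match.group() is the full placeholder, e.g. "$10", not just "$1".
    return parameters[match.group()]


# \$ matches a literal dollar sign; \d+ greedily takes all following digits.
filled = re.sub(r"\$\d+", substitute, query)
print(filled)

# For comparison: naive sequential replacement corrupts "$10".
naive = query
for key, value in parameters.items():
    naive = naive.replace(key, value)
print(naive)
```

Because the replacement is a function, its return value is inserted literally, so characters such as % or | in the values need no escaping.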

parameters = {
"$1": "'name'",
"$2": "hierarchy LIKE '%|user_name|%'",
}
str.replace
query = """
SELECT
$1 AS org,
FROM
table
WHERE
input ='Active'
AND $2
GROUP BY 1, 2, 3, 4, 5, 6
"""
for key, value in parameters.items():
query = query.replace(key, value)
The old way: str % (params)
query = """
SELECT
%s AS org,
FROM
table
WHERE
input ='Active'
AND %s
GROUP BY 1, 2, 3, 4, 5, 6
""" % (parameters['$1'], parameters['$2'])
Better: str.format
query = """
SELECT
{$1} AS org,
FROM
table
WHERE
input ='Active'
AND {$2}
GROUP BY 1, 2, 3, 4, 5, 6
""".format(**parameters)
New way: f'string'
query = f"""
SELECT
{parameters['$1']} AS org,
FROM
table
WHERE
input ='Active'
AND {parameters['$2']}
GROUP BY 1, 2, 3, 4, 5, 6
"""
Tip: Do not use bare string formatting when working with SQL. Pay close attention to these parts of the code, since SQL injection is a widely known vulnerability.
Tip: Check whether your Python version is outdated (the newest stable version at the time of writing is 3.9), but don't rush into upgrading either.
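The warning about SQL injection above can be made concrete: values that come from outside the program are safer when passed as bound parameters than when formatted into the SQL text. The following is a minimal sketch using the standard-library sqlite3 module; the table orgs and its rows are invented for the example, and other drivers use different placeholder styles (for instance %s instead of ?).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orgs (name TEXT, hierarchy TEXT, input TEXT)")
conn.executemany(
    "INSERT INTO orgs VALUES (?, ?, ?)",
    [
        ("alpha", "root|user_name|leaf", "Active"),
        ("beta", "root|someone_else|leaf", "Active"),
        ("gamma", "root|user_name|leaf", "Inactive"),
    ],
)

# The value is wrapped in wildcards in Python, then bound as one parameter.
user_value = "user_name"
pattern = f"%|{user_value}|%"

# SQL structure stays fixed; only the values travel as parameters.
matching = conn.execute(
    "SELECT name FROM orgs WHERE input = ? AND hierarchy LIKE ?",
    ("Active", pattern),
).fetchall()
conn.close()
print(matching)
```

Only values can be bound this way. Fragments of SQL such as column names or whole conditions still have to come from trusted, hard-coded text.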

Is there a way to prevent JOIN from joining new data sets with old?

conn = sqlite3.connect("lite.db")
cur = conn.cursor()
cur.execute("SELECT * FROM database JOIN database1")
rows = cur.fetchall()
for row in rows:
print(row)
tree.insert("", tk.END, values=row)
conn.close()
I have two SQLite3 databases, database and database1. When I join database1 to database and the user enters all of the information to be stored in the database it joins them as expected. However, if the user adds a new set of data the old set joins the new set again.
E.g. user enters: Frank in database and Apple in database1
Displayed results: Frank Apple
User enters: William in database and Orange in database1
Displayed results: Frank Apple
Frank Orange
William Orange
William Apple
How do I stop the database from modifying any other stored values?
Expected: Frank Apple
William Orange

You don't JOIN databases, you JOIN tables. To use multiple databases in one connection, you open (connect to) one database and ATTACH the others, giving each a schema name to tell them apart. The originally connected database has the schema name main.
If you don't provide a condition for a JOIN, the result contains every combination of rows from the two tables (a Cartesian product).
You specify the condition(s) with ON the_condition(s).
As a demo of ATTACH and JOIN, with and without conditions :-
import sqlite3
# First database
connection1 = sqlite3.connect("database")
connection1.execute("CREATE TABLE IF NOT EXISTS table1 (id INTEGER PRIMARY KEY, name TEXT)")
connection1.commit()
connection1.execute("DELETE FROM table1")
connection1.execute("INSERT INTO table1 (name) VALUES('Frank'),('Mary'),('Joan')")
connection1.commit()
cursor = connection1.execute("SELECT * FROM table1")
print("\nData in table1 (in 1st Database)")
rows = cursor.fetchall()
for row in rows:
print(row)
# Second database
connection2 = sqlite3.connect("database1")
connection2.execute("CREATE TABLE IF NOT EXISTS table2 (id INTEGER PRIMARY KEY, name TEXT)")
connection2.commit()
connection2.execute("DELETE FROM table2")
connection2.execute("INSERT INTO table2 (name) VALUES('Apple'),('Banana'),('Pear')")
connection2.commit()
cursor = connection2.execute("SELECT * FROM table2")
print("\nData in table 2 (in 2nd Database)")
rows = cursor.fetchall()
for row in rows:
print(row)
connection2.close()
# Attach 2nd database to first
connection1.execute("ATTACH DATABASE 'database1' AS schema_database2")
connection1.commit()
# Use a JOIN
cursor = connection1.execute("SELECT * FROM main.table1 JOIN schema_database2.table2 ON table1.id = table2.id")
print("\nJoin Example 1 (schema not needed)")
rows = cursor.fetchall()
for row in rows:
print(row)
# If there is no ambiguity of names (table names) then just table name can be used
print("\nJoin Example 2 (schema not needed)")
cursor = connection1.execute("SELECT * FROM table1 JOIN table2 ON table1.id = table2.id")
rows = cursor.fetchall()
for row in rows:
print(row)
print("\nWithout JOIN conditions (joins everything to everything)")
cursor = connection1.execute("SELECT * FROM table1 JOIN table2")
rows = cursor.fetchall()
for row in rows:
print(row)
Demo Result :-
Data in table1 (in 1st Database)
(1, 'Frank')
(2, 'Mary')
(3, 'Joan')
Data in table 2 (in 2nd Database)
(1, 'Apple')
(2, 'Banana')
(3, 'Pear')
Join Example 1 (schema not needed)
(1, 'Frank', 1, 'Apple')
(2, 'Mary', 2, 'Banana')
(3, 'Joan', 3, 'Pear')
Join Example 2 (schema not needed)
(1, 'Frank', 1, 'Apple')
(2, 'Mary', 2, 'Banana')
(3, 'Joan', 3, 'Pear')
Without JOIN conditions (joins everything to everything)
(1, 'Frank', 1, 'Apple')
(1, 'Frank', 2, 'Banana')
(1, 'Frank', 3, 'Pear')
(2, 'Mary', 1, 'Apple')
(2, 'Mary', 2, 'Banana')
(2, 'Mary', 3, 'Pear')
(3, 'Joan', 1, 'Apple')
(3, 'Joan', 2, 'Banana')
(3, 'Joan', 3, 'Pear')

SQLAlchemy SQL parameter list substitution with pyodbc

I am trying to bind a list to a parameter in a raw SQL query in sqlalchemy. This post suggests a great way to do so with psycopg2 as below.
some_ids = [1, 2, 3, 4]
query = "SELECT * FROM my_table WHERE id = ANY(:ids);"
engine.execute(sqlalchemy.sql.text(query), ids=some_ids)
However, this does not seem to work in my environment, which is SQL Server with pyodbc. Only one "?" gets added instead of 4.
sqlalchemy.exc.ProgrammingError: (pyodbc.ProgrammingError)
('Invalid parameter type. param-index=0 param-type=tuple', 'HY105')
[SQL: 'SELECT * FROM my_table WHERE id = ANY(?);'] [parameters: ((1, 2, 3, 4),)]
Is there any way to make this work? I would like to avoid manually creating placeholders if possible.
sqlalchemy version=1.0.13, pyodbc version=4.0.16

This seems to be working for me:
from sqlalchemy import create_engine
from sqlalchemy.sql import table, column, select, literal_column
engine = create_engine('mssql+pyodbc://SQLmyDb')
cnxn = engine.connect()
some_ids = [2, 3]
sql = select([literal_column('*')]).where(column('ID').in_(some_ids)).select_from(table('People'))
print(sql)
print('')
params = {":ID_" + str(x+1): some_ids[x] for x in range(len(some_ids))}
rows = cnxn.execute(sql, params).fetchall()
print(rows)
which prints the generated SQL statement, followed by the results of the query
SELECT *
FROM "People"
WHERE "ID" IN (:ID_1, :ID_2)
[(2, 'Anne', 'Elk'), (3, 'Gord', 'Thompson')]

Python MySQL executemany in WHERE clause

I want to select values from MySQL as follows
do_not_select = [1,2,3]
cursor = database.cursor()
cursor.executemany("""SELECT * FROM table_a WHERE id != %s""",(do_not_select))
data = cursor.fetchall()
The query returns all the values in the db apart from the first id (1). However, I don't want it to select id 1, 2 or 3.
Is this possible using the executemany command?

Give NOT IN a go:
do_not_select = [1, 2, 3]
cursor.execute("""SELECT * FROM table_a
WHERE id NOT IN ({}, {}, {})""".format(do_not_select[0],
do_not_select[1],
do_not_select[2]))
data = cursor.fetchall()
I suspect (though I haven't tested this) that this would work better if do_not_select was a tuple; then I think you could just fire it straight into your query:
do_not_select = (1, 2, 3)
cursor.execute("""SELECT * FROM table_a
WHERE id NOT IN {}""".format(do_not_select))
data = cursor.fetchall()
I'd be interested to know if this works - if you try it please let me know :)
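An alternative that avoids formatting the values into the SQL text is to generate one placeholder per list element and pass the list as parameters to a single execute call. executemany is not suited to this, since it runs the statement once per parameter set rather than once with all of them. The sketch below uses the standard-library sqlite3 module as a stand-in for MySQL, so the placeholder is ? here; with a MySQL driver it would typically be %s.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO table_a (id) VALUES (?)", [(i,) for i in range(1, 7)])

do_not_select = [1, 2, 3]

# Build "?, ?, ?" with one placeholder per value; only placeholders are
# formatted into the SQL, never the values themselves.
placeholders = ", ".join("?" for _ in do_not_select)
sql = f"SELECT id FROM table_a WHERE id NOT IN ({placeholders}) ORDER BY id"

remaining_ids = [row[0] for row in conn.execute(sql, do_not_select)]
conn.close()
print(remaining_ids)
```

This works for lists of any length, although an empty list would produce NOT IN (), which some databases reject, so that case may need separate handling.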
