Using format in two queries in sql - python

I have a query that uses EXCEPT. I want to pass the table path with format() when running the select query.
query_2="""select *
from {}.{}
where date(etl_date) = current_date
except select *
from {}_test.{}
where date(etl_date)=current_date"""
.format(liste[0],liste[1])
But naturally I get an error like this.
IndexError: tuple index out of range
How else can I use the format function here? Thanks...

Do not use plain string formatting for SQL queries; use sql.Identifier for table and field names, and pass values (if needed) through the second argument of the execute method.
import psycopg2
from psycopg2.sql import Identifier, SQL
connection = psycopg2.connect("...")
cursor = connection.cursor()
suffix = "_test"
identifiers = [Identifier("some_schema"), Identifier("some_table"), Identifier("other_schema%s" % suffix), Identifier("other_table")]
query_2 = SQL("""select * from {}.{} where date(etl_date) = current_date
except select * from {}.{} where date(etl_date)=current_date""").format(*identifiers)
print(query_2.as_string(cursor)) # if you want to see the final query
cursor.execute(query_2)
Output
select * from "some_schema"."some_table" where date(etl_date) = current_date
except select * from "other_schema_test"."other_table" where date(etl_date)=current_date
This assumes you have multiple schemas in the same database as you can't easily do cross database queries in PostgreSQL.
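The IndexError in the question arises because the template contains four {} placeholders while .format() receives only two arguments. As a minimal sketch of the mechanism only, numbered placeholders let one argument fill several slots. The schema and table names here are made up, and for a query that is actually executed the Identifier approach above remains the safer choice.

```python
# Hypothetical schema/table names standing in for liste[0] and liste[1].
liste = ["some_schema", "some_table"]

template = """select *
from {0}.{1}
where date(etl_date) = current_date
except select *
from {0}_test.{1}
where date(etl_date) = current_date"""

# {0} and {1} may each appear more than once, so two arguments are enough.
query_2 = template.format(liste[0], liste[1])
print(query_2)
```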

Related

Use of variable in SQL query

I want to use a variable in the SELECT query, but it fails. Can someone show me how to insert this variable into my SQL query?
project = test
# Our SQL Query
Query_1 = """
SELECT * FROM project.dataplex_dq.dq-results`
"""
# labelling our query job
query_job_1 = client.query(Query_1)
# results as a dataframe
Table = query_job_1.result().to_dataframe()
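One possible approach, sketched under assumptions: BigQuery query parameters cover values, not project or table names, so the project has to be placed into the query text itself. The helper name build_query and its validation pattern are hypothetical; the dataset and table names are taken from the question.

```python
import re


def build_query(project: str) -> str:
    """Return the SELECT text with the project id interpolated (hypothetical helper)."""
    # Assumed whitelist: letters, digits, hyphens and underscores only,
    # so nothing can break out of the backtick-quoted table path.
    if not re.fullmatch(r"[A-Za-z0-9_-]+", project):
        raise ValueError(f"unexpected project id: {project!r}")
    return f"SELECT * FROM `{project}.dataplex_dq.dq-results`"


project = "test"  # the value must be a string, hence the quotes
Query_1 = build_query(project)
print(Query_1)
```

The resulting string can then be passed to client.query(Query_1) as in the question.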

Illegal Variable Name/Number when Passing in Python List

I'm trying to run SQL statements through Python on a list.
I'm passing in a list, in this case of dates, because I want to run multiple SELECT queries and return their results.
I've tested this by passing in integers, but when I try to pass in a date I get an ORA-01036 error (illegal variable name/number). I'm using an Oracle DB.
cursor = connection.cursor()
date = ["'01-DEC-21'", "'02-DEC-21'"]
sql = "select * from table1 where datestamp = :date"
for item in date:
    cursor.execute(sql,id=item)
    res=cursor.fetchall()
    print(res)
Any suggestions to make this run?
You can't name a bind variable date, it's an illegal name. Also your named variable in cursor.execute should match the bind variable name. Try something like:
sql = "select * from table1 where datestamp = :date_input"
for item in date:
    cursor.execute(sql,date_input=item)
    res=cursor.fetchall()
    print(res)
Some recommendations and warnings about your approach:
You should not depend on your default NLS date setting when binding a string (e.g. "'01-DEC-21'") to a DATE column. (You probably also need to remove one pair of the quotes.)
You should avoid fetching data in a loop if you can fetch it in one query (using an IN list).
Use a prepared statement.
Example
date = ['01-DEC-21', '02-DEC-21']
This generates the query that uses bind variables for your input list
in_list = ','.join([f" TO_DATE(:d{ind},'DD-MON-RR','NLS_DATE_LANGUAGE = American')" for ind, d in enumerate(date)])
sql_query = "select * from table1 where datestamp in ( " + in_list + " )"
The generated sql_query is
select * from table1 where datestamp in
( TO_DATE(:d0,'DD-MON-RR','NLS_DATE_LANGUAGE = American'), TO_DATE(:d1,'DD-MON-RR','NLS_DATE_LANGUAGE = American') )
Note that the IN list contains one bind variable for each member of your input list.
Note also the use of to_date with an explicit format mask and a fixed language, which avoids problems interpreting the month abbreviation (e.g. ORA-01843: not a valid month).
Now you can use the query to fetch the data in one pass:
cur.prepare(sql_query)
cur.execute(None, date)
res = cur.fetchall()
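The query-building step can be wrapped in a small function, sketched here without a database connection. The name build_in_query is hypothetical; table1 and datestamp come from the question.

```python
def build_in_query(dates):
    """Return (sql, binds) with one bind variable per date (hypothetical helper)."""
    if not dates:
        # An empty IN list is not valid SQL, so refuse it early.
        raise ValueError("dates must not be empty")
    placeholders = ", ".join(
        f"TO_DATE(:d{i}, 'DD-MON-RR', 'NLS_DATE_LANGUAGE = American')"
        for i in range(len(dates))
    )
    sql = f"select * from table1 where datestamp in ({placeholders})"
    # The values are returned as a list so they can be bound by position.
    return sql, list(dates)


sql_query, binds = build_in_query(["01-DEC-21", "02-DEC-21"])
print(sql_query)
print(binds)
```

The returned pair would then be used as cur.execute(sql_query, binds), in the same way as the prepared statement above.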

Insert Python DuckDB table into SQL statement

I am trying to use a registered virtual table as a table in a SQL statement using a connection to another database. I can't just turn the column into a string and use that; I need the table/dataframe itself to work in the statement and join with the other tables in the SQL statement. I'm trying this out on an Access database to start. This is what I have so far:
import os
import pyodbc
import pandas as pd
import duckdb
conn = duckdb.connect()
starterset = pd.read_excel (r'e:\Data Analytics\Python_Projects\Applications\DB_Test.xlsx')
conn.register("test_starter", starterset)
IDS = conn.execute("SELECT * FROM test_starter WHERE ProjectID > 1").fetchdf()
StartDate = '1/1/2015'
EndDate = '12/1/2021'
# establish the connection
connt = pyodbc.connect(r'Driver={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=E:\Databases\Offline.accdb;')
cursor = conn.cursor()
# Run the query
query = ("Select ProjectID, Revenue, ClosedDate from Projects INNER JOIN " + IDS + " Z on Z.ProjectID = Projects.ProjectID "
"where ClosedDate between #" + StartDate + "# and #" + EndDate + "# AND Revenue > 0 order by ClosedDate")
df = pd.read_sql(query, connt)
df.to_excel(r'TEMP.xlsx', index=False)
os.system("start EXCEL.EXE TEMP.xlsx")
# Close the connection
cursor.close()
connt.close()
I have a list of IDs in the excel sheet that I'm trying to use as a filter from the database query. Ultimately, this will form into several criteria from the same table: dates, revenue, and IDs among others.
Honestly, I'm surprised I'm having so much trouble doing this. In SAS, with PROC SQL, it's so easy, but I can't get a dataframe to interface within the SQL parameters how I need it to. Am I making a syntax mistake?
Most common error so far is "UFuncTypeError: ufunc 'add' did not contain a loop with signature matching types (dtype('<U55'), dtype('<U55')) -> dtype('<U55')", but the types are the same.
It looks like you are pushing the contents of a DataFrame into an Access database query. I don't think there is a native way to do this in Pandas. The technique I use is database vendor specific, but I just build up a text string as either a CTE/WITH Clause or a temporary table.
Ex:
"""WITH my_data as (
SELECT 'raw_text_within_df' as df_column1, 'raw_text_within_df' as df_column2
UNION ALL
SELECT 'raw_text_within_df' as df_column1, 'raw_text_within_df' as df_column2
UNION ALL
...
)
[Your original query here]
"""

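The string-building technique from the answer above could look roughly like this. It is a sketch under assumptions: build_cte and sql_literal are hypothetical helpers, the rows are made-up data in the shape that df.to_dict("records") produces, column names are assumed to be trusted, and whether a WITH clause is accepted at all depends on the target database.

```python
def sql_literal(value):
    """Render a Python value as a SQL literal (strings quoted, quotes doubled)."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def build_cte(name, rows):
    """Build a WITH clause from a list of dicts that share the same keys."""
    if not rows:
        raise ValueError("rows must not be empty")
    selects = [
        "SELECT " + ", ".join(f"{sql_literal(v)} AS {col}" for col, v in row.items())
        for row in rows
    ]
    return f"WITH {name} AS (\n" + "\nUNION ALL\n".join(selects) + "\n)"


# Made-up rows; with pandas they could come from df.to_dict("records").
rows = [{"ProjectID": 2, "Label": "O'Brien"}, {"ProjectID": 3, "Label": "beta"}]
cte = build_cte("my_data", rows)
print(cte + "\nSELECT * FROM my_data")
```
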
Alter query according to user selection in sqlite python

I have an SQLite database named StudentDB which has 3 columns: Roll number, Name, Marks. I want to fetch only the columns that the user selects in the IDE. The user can select one column, two, or all three. How can I alter the query accordingly using Python?
I tried:
import sqlite3
sel={"Roll Number":12}
query = 'select * from StudentDB Where({seq})'.format(seq=','.join(['?']*len(sel))),[i for k,i in sel.items()]
con = sqlite3.connect(database)
cur = con.cursor()
cur.execute(query)
all_data = cur.fetchall()
all_data
I am getting:
operation parameter must be str
You should control the text of the query. The WHERE clause should always be in the form WHERE colname=value [AND colname2=...] or (better) WHERE colname=? [AND ...] if you want to build a parameterized query.
So you want:
query = 'select * from StudentDB Where ' + ' AND '.join('"{}"=?'.format(col)
for col in sel.keys())
...
cur.execute(query, tuple(sel.values()))
In your code, query is a tuple instead of a str, which is why you get the error.
I assume you want to execute a query like the one below:
select * from StudentDB Where "Roll number"=?
Then you can change the SQL query like this (assuming you want AND and not OR):
query = "select * from StudentDB Where {seq}".format(seq=" and ".join('"{}"=?'.format(k) for k in sel.keys()))
and execute the query like this:
cur.execute(query, tuple(sel.values()))
Please make sure that database is defined in your code and contains the database name, and that StudentDB is indeed the table name and not the database name.
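Both answers can be tried end to end against an in-memory SQLite database. This is a sketch under assumptions: the sample rows, the fetch_students helper and the ALLOWED_COLUMNS whitelist are made up for illustration. Column names cannot be bound as parameters, so they are checked against the whitelist before being placed in the query text.

```python
import sqlite3

# Assumed whitelist of column names; user input is checked against it.
ALLOWED_COLUMNS = {"Roll number", "Name", "Marks"}


def fetch_students(con, sel):
    """Return rows matching every column=value pair in sel."""
    unknown = set(sel) - ALLOWED_COLUMNS
    if unknown:
        raise ValueError(f"unknown columns: {sorted(unknown)}")
    where = " AND ".join(f'"{col}" = ?' for col in sel)
    query = "SELECT * FROM StudentDB"
    if where:
        query += " WHERE " + where
    # Values are bound separately, never formatted into the query text.
    return con.execute(query, tuple(sel.values())).fetchall()


con = sqlite3.connect(":memory:")
con.execute('CREATE TABLE StudentDB ("Roll number" INTEGER, Name TEXT, Marks INTEGER)')
con.executemany("INSERT INTO StudentDB VALUES (?, ?, ?)",
                [(12, "Asha", 81), (13, "Ravi", 74)])

print(fetch_students(con, {"Roll number": 12}))
print(fetch_students(con, {"Name": "Ravi", "Marks": 74}))
```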

How to use dynamic variables(tuple) in pandasql query?

number_tuple = (1,4,6,3)
sensex_quaterly_df = psql.sqldf("SELECT * FROM sensex_df
WHERE 'Num' IN ('number_tuple')")
"HERE number_tuple has the values that I want to retrieve from sensex_df database"
Because pandasql allows you to run SQL on data frames, you can build the SQL by joining the tuple's values into a comma-separated string with str.join().
number_tuple = (1,4,6,3)
in_values = ", ".join(str(i) for i in number_tuple)
sql = f"SELECT * FROM sensex_df WHERE Num IN ({in_values})"
sensex_quaterly_df = psql.sqldf(sql)
However, concatenating SQL strings is not recommended if you use an actual relational database as the backend. In that case, use parameterization: write a prepared SQL statement with placeholders such as %s or ?, and bind the values in a subsequent step. The example below demonstrates this with pandas read_sql:
number_tuple = (1,4,6,3)
in_values = ", ".join('?' for i in number_tuple)
sql = f"SELECT * FROM sensex_df WHERE Num IN ({in_values})"
sensex_quaterly_df = pd.read_sql(sql, conn, params=number_tuple)
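The same placeholder technique can be checked without pandas by running it against an in-memory SQLite table. This is only a sketch: the table contents and the Price column are invented stand-ins for sensex_df.

```python
import sqlite3

number_tuple = (1, 4, 6, 3)

# In-memory stand-in for sensex_df; the Price column is made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sensex_df (Num INTEGER, Price REAL)")
conn.executemany("INSERT INTO sensex_df VALUES (?, ?)",
                 [(n, n * 10.0) for n in range(1, 8)])

# One ? per value; the values themselves are bound at execution time.
placeholders = ", ".join("?" for _ in number_tuple)
sql = f"SELECT Num FROM sensex_df WHERE Num IN ({placeholders}) ORDER BY Num"
rows = conn.execute(sql, number_tuple).fetchall()
print(rows)
```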
