I am trying to retrieve data from sqlite3 with the help of variables. It works fine with an execute() statement, but I would also like to retrieve the column names, and for that purpose I am using read_sql_query(). However, I am unable to pass variables to read_sql_query(). Please see the code below:
import pandas

# conn is an existing sqlite3 connection
def cal():
    tab = ['LCOLOutput']
    column_name = 'CUSTOMER_EMAIL_ID'
    xyz = '**AVarma1@ra.rockwell.com'
    for index, m in enumerate(tab):
        table_name = m
        # this is the part that does not work
        sq = "SELECT * FROM ? where ?=?;", (table_name, column_name, xyz,)
        df = pandas.read_sql_query(sq, conn)
        writer = pandas.ExcelWriter('D:\pandas_simple.xlsx', engine='xlsxwriter')
        df.to_excel(writer, sheet_name='Sheet1')
        writer.save()
You need to change the syntax of your call to the pandas method read_sql_query(); check the doc.
For sqlite, it should work with:
sq = "SELECT * FROM ? where ?=?;"
param = (table_name, column_name, xyz,)
df = pandas.read_sql_query(sq, conn, params=param)
EDIT:
Otherwise, try the following formatting for the table and column names (parameter placeholders can only bind values, not identifiers):
sq = "SELECT * FROM {} where {}=?;".format(table_name, column_name)
param = (xyz,)
df = pandas.read_sql_query(sq, conn, params=param)
Check this answer explaining why a table name cannot be passed as a parameter directly; the same applies to column names.
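Since identifiers can only go in via string formatting, it is worth validating them first. Here is a minimal sketch (assuming the sqlite3 connection conn from the question; safe_query is a made-up helper name) that checks the table and column against the database's own metadata before formatting them in:

import pandas

def safe_query(conn, table_name, column_name, value):
    # Validate the table name against sqlite_master.
    tables = {r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")}
    if table_name not in tables:
        raise ValueError("unknown table: " + table_name)
    # Validate the column name against the table's schema.
    columns = {r[1] for r in conn.execute(
        "PRAGMA table_info({})".format(table_name))}
    if column_name not in columns:
        raise ValueError("unknown column: " + column_name)
    # Identifiers are now safe to format in; the value is still bound.
    sq = "SELECT * FROM {} WHERE {} = ?;".format(table_name, column_name)
    return pandas.read_sql_query(sq, conn, params=(value,))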
When I have to pass a parameter before running a SQL query, I usually do
date = '20220101'
query = f"SELECT * FROM TABLE WHERE DATE = '{date}'"
In an attempt to reduce the length of the code, I created a query.sql file with the query above, but I'm failing to pass the date variable into my query before running the SQL.
For reading I'm using
sql_query = open("query.sql", "r")
sql_as_string = sql_query.read()
df = pd.read_sql(sql_as_string, conn)
Is there a way around this, instead of pasting the whole SQL query into my .py code?
I'm using pyodbc, ODBC Driver 17 for SQL Server
Use a parametrized query, not string formatting.
The file should just contain the query, with a ? placeholder for the variable.
SELECT * FROM TABLE WHERE DATE = ?
Then you can do
with open("query.sql", "r") as f:
sql_query = f.read()
df = pd.read_sql(sql_query, conn, params=(date, ))
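If several queries live in .sql files, a small helper keeps the call sites short. This is just a sketch under the same assumptions (a pyodbc connection conn, ? placeholders in the file; read_sql_file is a made-up name):

import pandas as pd

def read_sql_file(path, conn, params=()):
    # Load the statement from disk and let the driver bind the values.
    with open(path, "r") as f:
        sql = f.read()
    return pd.read_sql(sql, conn, params=params)

df = read_sql_file("query.sql", conn, params=(date,))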
I am trying to create a view that contains a variable in Snowflake SQL. The whole thing is done in a Python script. Initially I tried the binding-variable approach, but binding does not work in view-creation SQL. Is there any other way I can proceed with this? I have given the code below.
Code:
import snowflake.connector as sf
import pandas

ctx = sf.connect(
    user='floatinginthecloud89',
    password='',
    account='nq13914.southeast-asia.azure',
    warehouse='compute_wh',
    database='util_db',
    schema='public'
)
print("Got the context object")
cs = ctx.cursor()
print("Got the cursor object")
column1 = 'attr_name'
try:
    row = cs.execute("select listagg(('''' || attr_name || ''''), ',') from util_db.public.TBL_DIM;")
    rows = cs.fetchall()
    for row in rows:
        print(row)
    print(rows)
    row1 = cs.execute("""CREATE OR REPLACE table util_db.public.HIERARCHY_VIEW_2 AS
        SELECT * FROM (SELECT MSTR.PROD_CODE AS PROD_CODE, DIM.ATTR_NAME AS ATTR_NAME, MSTR.ATTR_VALUE AS ATTR_VALUE
        FROM TBL_DIM DIM INNER JOIN TBL_MSTR MSTR ON DIM.ATTR_KEY=MSTR.ATTR_KEY) Q
        PIVOT (MAX (Q.ATTR_VALUE) FOR Q.ATTR_NAME IN (*row))
        AS P
        ORDER BY P.PROD_CODE;""")
    rows1 = cs.fetchall()
    for row1 in rows1:
        print(row1)
finally:
    cs.close()
    ctx.close()
Error:
File "C:\Users\Anand Singh\anaconda3\lib\site-packages\snowflake\connector\errors.py", line 179, in default_errorhandler
raise error_class(
ProgrammingError: 001003 (42000): SQL compilation error:
syntax error line 2 at position 65 unexpected 'row'.
Looking at the Python binding example and your code, it appears you need
row1 = cs.execute("""CREATE OR REPLACE table util_db.public.HIERARCHY_VIEW_2 AS
SELECT * FROM (
SELECT MSTR.PROD_CODE AS PROD_CODE,DIM.ATTR_NAME AS ATTR_NAME,MSTR.ATTR_VALUE AS ATTR_VALUE
FROM TBL_DIM DIM
INNER JOIN TBL_MSTR MSTR
ON DIM.ATTR_KEY=MSTR.ATTR_KEY
) Q
PIVOT (MAX (Q.ATTR_VALUE) FOR Q.ATTR_NAME IN (%s))
AS P
ORDER BY P.PROD_CODE;""", row)
but *row would pass many separate arguments, so I have changed it to build a single comma-separated string.
A more Pythonic way to implement this is using an f-string:
row1 = cs.execute(f"""CREATE OR REPLACE table util_db.public.HIERARCHY_VIEW_2 AS
SELECT * FROM (
SELECT MSTR.PROD_CODE AS PROD_CODE,DIM.ATTR_NAME AS ATTR_NAME,MSTR.ATTR_VALUE AS ATTR_VALUE
FROM TBL_DIM DIM
INNER JOIN TBL_MSTR MSTR
ON DIM.ATTR_KEY=MSTR.ATTR_KEY
) Q
PIVOT (MAX (Q.ATTR_VALUE) FOR Q.ATTR_NAME IN ({row}))
AS P
ORDER BY P.PROD_CODE;""")
It is also more readable, especially if you have multiple parameters in the f-string.
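For completeness, here is a minimal sketch of how the LISTAGG result can be pulled out of the cursor before interpolating it (assuming, as in the question's query, that the first column of the first row holds the comma-separated list):

cs.execute("select listagg(('''' || attr_name || ''''), ',') from util_db.public.TBL_DIM;")
attr_list = cs.fetchone()[0]  # e.g. "'COLOR','SIZE','WEIGHT'" (made-up values)
cs.execute(f"""CREATE OR REPLACE table util_db.public.HIERARCHY_VIEW_2 AS
    SELECT * FROM (
        SELECT MSTR.PROD_CODE AS PROD_CODE, DIM.ATTR_NAME AS ATTR_NAME, MSTR.ATTR_VALUE AS ATTR_VALUE
        FROM TBL_DIM DIM
        INNER JOIN TBL_MSTR MSTR ON DIM.ATTR_KEY = DIM.ATTR_KEY
    ) Q
    PIVOT (MAX(Q.ATTR_VALUE) FOR Q.ATTR_NAME IN ({attr_list}))
    AS P
    ORDER BY P.PROD_CODE;""")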
Issue resolved! Thanks a lot, Simeon, for your help.
import snowflake.connector as sf
import pandas

ctx = sf.connect(
    user='floatinginthecloud89',
    password='AzureSn0flake#123',
    account='nq13914.southeast-asia.azure',
    warehouse='compute_wh',
    database='util_db',
    schema='public'
)
print("Got the context object")
cs = ctx.cursor()
print("Got the cursor object")
column1 = 'attr_name'
try:
    row = cs.execute("select listagg(('''' || attr_name || ''''), ',') from util_db.public.TBL_DIM;")
    rows = cs.fetchall()
    for row in rows:
        print(row)
    print(rows)
    row1 = cs.execute("""CREATE OR REPLACE table util_db.public.HIERARCHY_VIEW_2 AS
        SELECT * FROM (
            SELECT MSTR.PROD_CODE AS PROD_CODE, DIM.ATTR_NAME AS ATTR_NAME, MSTR.ATTR_VALUE AS ATTR_VALUE
            FROM TBL_DIM DIM
            INNER JOIN TBL_MSTR MSTR
            ON DIM.ATTR_KEY=MSTR.ATTR_KEY
        ) Q
        PIVOT (MAX (Q.ATTR_VALUE) FOR Q.ATTR_NAME IN (%s))
        AS P
        ORDER BY P.PROD_CODE;""", ','.join(row))
    rows1 = cs.fetchall()
    for row1 in rows1:
        print(row1)
finally:
    cs.close()
    ctx.close()
I am running a Redshift query that returns 40 million records, but when I save them to a CSV file only about 7 thousand records show up. Could you please help me figure out how to solve this?
Example:
Code:
conn = gcso_conn1()
with conn.cursor() as cur:
    query = "select * from (select a.src_nm Source_System ,b.day_id Date,b.qty Market_Volume,b.cntng_unt Volume_Units,b.sls_in_lcl_crncy Market_Value,b.crncy_cd Value_Currency,a.panel Sales_Channel,a.cmpny Competitor_Name,a.lcl_mnfcr Local_Manufacturer ,a.src_systm_id SKU_PackID_ProductNumber,upper(a.mol_list) Molecule_Name,a.brnd_nm BrandName_Intl,a.lcl_prod_nm BrandName_Local,d.atc3_desc Brand_Indication,a.prsd_strngth_1_nbr Strength,a.prsd_strngth_1_unt Strength_Units,a.pck_desc Pack_Size_Number,a.prod_nm Product_Description,c.iso3_cntry_cd Country_ISO_Code,c.cntry_nm Country_Name from gcso_prd_cpy.dim_prod a join gcso_prd_cpy.fct_sales b on (a.SRC_NM='IMS' and b.SRC_NM='IMS' and a.prod_id = b.prod_id) join gcso_prd_cpy.dim_cntry c on (a.cntry_id = c.cntry_id) left outer join gcso_prd_cpy.dim_thrc_area d on (a.prod_id = d.prod_id) WHERE a.SRC_NM='IMS' and c.iso3_cntry_cd in ('JPN','IND','CAN','USA') and upper(a.mol_list) in ('AMBRISENTAN', 'BERAPROST','BOSENTAN') ORDER BY b.day_id ) a"
    #print(query)
    cur.execute(query)
    result = cur.fetchall()
    conn.commit()
    column = [i[0] for i in cur.description]
    sqldf = pd.DataFrame(result, columns=column)
    print(sqldf.count())
    #print(df3)
    sqldf.to_csv(Output_Path, index=False, sep='\001', encoding='utf-8')
Everything should work correctly. I think the main problem is that you are debugging with count(). You expect the number of records, but the docs say:
Count non-NA cells for each column or row.
When debugging a DataFrame, it is better to use:
print(len(df))
print(df.shape)
print(df.info())
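To illustrate the difference on a made-up frame with missing values:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, np.nan, 3], 'b': [np.nan, np.nan, 6]})
print(df.count())  # a: 2, b: 1 -- non-NA cells per column, not the row count
print(len(df))     # 3 -- the actual number of rows
print(df.shape)    # (3, 2)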
You can also do this more simply with read_sql, reading in chunks:
import pandas as pd
from sqlalchemy import create_engine

header = True
for chunk in pd.read_sql(
    'your query here - SELECT * FROM ... ',
    con=create_engine('creds', echo=True),  # set creds - postgres+psycopg2://user:password@host:5432/db_name
    chunksize=1000,  # read by chunks
):
    file_path = '/tmp/path_to_your.csv'
    chunk.to_csv(
        file_path,
        header=header,
        mode='a',
        index=False,
    )
    header = False
I am writing a SQL query where I want to pass a WHERE condition with parameters to pandas.read_sql_query.
It works fine for the value, but I run into problems with the variable (the column name).
My workaround is a concatenated string that I pass to pandas, but I don't like how my code looks this way.
I already figured out that the column name ends up written wrongly in the query: it is e.g. 'colname' instead of colname.
I wrote the SQL as a string:
command = ("SELECT * FROM review r "
           "WHERE 1=1 "
           "AND " + selected_var + "= " + selected_val)
And then I passed it to pandas:
self.reviews = pd.read_sql_query(command, con=self.cnxn)
But I would like to do this without the workaround.
import pandas as pd
import mysql.connector

self.reviews = pd.read_sql_query("""
    SELECT *
    FROM review r
    WHERE 1=1
    AND %(sel_var)s = %(sel_val)s;
    """, con=self.cnxn, params={'sel_var': selected_var,
                                'sel_val': selected_val})
I expect the query to return results without writing everything as a command string.
What about string formatting?
input_params = {'sel_var': selected_var,
                'sel_val': selected_val}
self.reviews = pd.read_sql_query("""SELECT * FROM review r WHERE 1=1
    AND {sel_var}={sel_val};""".format(**input_params),
    con=self.cnxn)
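Note that formatting the value in as well breaks for strings (no quotes around {sel_val}) and is open to SQL injection. A safer variant, sketched here, formats only the column name and lets the driver bind the value (%s is the mysql.connector placeholder style):

# Only the identifier is formatted in; the value is bound by the driver.
query = "SELECT * FROM review r WHERE 1=1 AND {} = %s;".format(selected_var)
self.reviews = pd.read_sql_query(query, con=self.cnxn, params=(selected_val,))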
So sorry, I have one sqlite file and it includes many tables (like table_A to table_Z).
I only know how to run c.execute('SELECT ST_Name FROM table_A') and then do it again for table_B.
How can I use a loop to do this? I have already searched all day, but I don't get the answer. Please help me.
Thx!!
Please refer to my code below:
import sqlite3
import numpy as np
Sqlite_Path = 'D:\Student.sqlite'
conn = sqlite3.connect(Sqlite_Path)
c = conn.cursor()
c.execute('SELECT ST_Name FROM Table_A')
data = c.fetchall()
# do something
c.execute('SELECT ST_Name FROM Table_B')
data = c.fetchall()
# do something again
This looks like homework to me, but anyway:
tables = ["Table_A", "Table_B", "Table_C"]
for table in tables:
    c.execute('SELECT ST_Name FROM {}'.format(table))
    data = c.fetchall()
    # do something
If you don't know all the tables, or for some other reason, you can do something like this:
tables = [r[0] for r in db.execute('select name from sqlite_master where name like "table_%" and type = "table"')]
for table in tables:
    stmt = 'SELECT * FROM {};'.format(table)
    c = db.execute(stmt)
    rows = c.fetchall()
    # ... do something with the results
This queries sqlite_master, which lists ALL the table names in your sqlite file (here filtered to names starting with table_).
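As a variant, the LIKE pattern itself can be bound as an ordinary value, since only the table names are identifiers that need formatting. A sketch assuming the same db connection:

for (table,) in db.execute(
        "select name from sqlite_master where type = 'table' and name like ?",
        ('table_%',)):
    c = db.execute('SELECT ST_Name FROM {}'.format(table))
    for row in c.fetchall():
        # do something with each row
        print(row)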