How to execute multiple queries in pandas? - python

How to execute the following queries with sqlalchemy?
import pandas as pd
import urllib
from sqlalchemy import create_engine
from sqlalchemy.types import NVARCHAR
params = urllib.parse.quote_plus(r'DRIVER={SQL Server};SERVER=localhost\SQLEXPRESS;Trusted_Connection=yes;DATABASE=my_db;autocommit=true;MultipleActiveResultSets=True')
conn_str = 'mssql+pyodbc:///?odbc_connect={}'.format(params)
engine = create_engine(conn_str, encoding = 'utf-8-sig')
with engine.connect() as con:
    con.execute('Declare @latest_date nvarchar(8);')
    con.execute('SELECT @latest_date = max(date) FROM my_table')
    df = pd.read_sql_query('SELECT * from my_db where date = @latest_date', conn_str)
However, an error occurred:
sqlalchemy.exc.ProgrammingError: (pyodbc.ProgrammingError) ('42000', '[42000] [Microsoft][ODBC SQL Server Driver][SQL Server]Must declare the scalar variable "@latest_date". (137) (SQLExecDirectW)')
How to solve this problem?
Thanks.

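You don't need to declare a variable or run several queries. Each execute call is sent as its own batch, and a T-SQL variable only lives for the batch that declares it, which is why the later statements cannot see it. You can do the whole thing with one query: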
SELECT *
FROM my_db
WHERE date = (SELECT max(date)
FROM my_db)
Then run it through pandas. The column name is quoted because date is also a type name; SQL Server quotes identifiers with square brackets (backticks are MySQL syntax):
with engine.connect() as con:
    query = "SELECT * FROM my_db WHERE [date] = (SELECT max([date]) FROM my_db)"
    df = pd.read_sql(query, con=con)
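The single-query approach can be tried without a SQL Server instance. The sketch below is only an illustration: it swaps in an in-memory SQLite database from the standard library, and the table contents and the value column are made up. SQLite quotes identifiers with double quotes rather than square brackets.

```python
import sqlite3

# In-memory SQLite stands in for the real database (assumption for illustration).
con = sqlite3.connect(":memory:")
con.execute('CREATE TABLE my_db ("date" TEXT, value INTEGER)')
con.executemany(
    "INSERT INTO my_db VALUES (?, ?)",
    [("20240101", 1), ("20240102", 2), ("20240102", 3)],
)

# One statement: the subquery finds the latest date, the outer query filters on it.
query = 'SELECT * FROM my_db WHERE "date" = (SELECT max("date") FROM my_db)'
latest_rows = con.execute(query).fetchall()
con.close()
```

Because the latest date is computed inside the statement, nothing has to survive between two execute calls.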

Related

Query String Composition in Psycopg2

I am trying to run a SQL "SELECT" query in Postgres from Python using psycopg2 version 2.9. I compose the query string as below, but I get an error message.
from psycopg2 import sql
tablename = "mytab"
schema = "public"
query = sql.SQL("SELECT table_name from information_schema.tables where table_name = {tablename} and table_schema = {schema};")
query = query.format(tablename=sql.Identifier(tablename), schema=sql.Identifier(schema))
cursor.execute(query)
result = cursor.fetchone()[0]
Error:
psycopg2.errors.InFailedSqlTransaction: current transaction is aborted, commands ignored until end of transaction block
Can someone please help? Thanks.
In the (a bit strange) query
select table_name
from information_schema.tables
where table_name = 'mytab'
and table_schema = 'public';
'mytab' and 'public' are literals, not identifiers. For comparison, mytab is an identifier here:
select *
from mytab;
Thus your format statement should look like this:
query = query.format(tablename=sql.Literal(tablename), schema=sql.Literal(schema))
Note that the quoted error message is somewhat misleading as it is about executing a query other than what is shown in the question.
Since this query only involves dynamic values, not dynamic identifiers, it can be simplified to use ordinary placeholders:
import psycopg2
con = psycopg2.connect(<params>)
cursor = con.cursor()
tablename = "mytab"
schema = "public"
# Regular placeholders
query = """SELECT
table_name
from
information_schema.tables
where
table_name = %s and table_schema = %s"""
cursor.execute(query, [tablename, schema])
result = cursor.fetchone()[0]
# Named placeholders
query = """SELECT
table_name
from
information_schema.tables
where
table_name = %(table)s and table_schema = %(schema)s"""
cursor.execute(query, {"table": tablename, "schema": schema})
result = cursor.fetchone()[0]
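The literal-versus-identifier distinction can be seen without a Postgres server. This is a rough analogue rather than psycopg2 itself: it uses the standard-library sqlite3 module, whose catalog table is sqlite_master and whose placeholder styles are ? and :name instead of %s and %(name)s.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE mytab (id INTEGER)")

tablename = "mytab"

# In a catalog lookup the table name is compared as a value,
# so it can be bound as an ordinary parameter.
positional = con.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' AND name = ?",
    [tablename],
).fetchone()[0]

named = con.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' AND name = :table",
    {"table": tablename},
).fetchone()[0]

# In a FROM clause the same name is an identifier; a placeholder cannot replace it.
try:
    con.execute("SELECT * FROM ?", [tablename])
    identifier_bound = True
except sqlite3.OperationalError:
    identifier_bound = False
con.close()
```

The last statement fails to parse, which is the case where psycopg2's sql.Identifier is the right tool.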

Pandas DataFrame to PostgresSql (pandas.io.sql.DatabaseError)

I am new to Postgres. Can anyone tell me how to get this working?
What I want to do is write a Pandas dataframe to a PostgreSQL database. I have already created a database 'customer' and a table 'users'.
I am creating a simple Pandas dataframe as follows:
data = {'Col1':[1,2,3,4,5], 'Col2':[1,2,3,4,5]}
df = pd.DataFrame(data)
After that I am creating a Postgres database connection to my 'customer' database as follows:
conn = psycopg2.connect(
    database="customer", user='postgres', password='password', host='127.0.0.1', port='5432')
Then, I am using the following command to insert records from dataframe into table 'users':
df.to_sql('users', conn, if_exists='replace')
conn.commit()
conn.close()
Error that I am getting is:
pandas.io.sql.DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master WHERE type='table' AND name=?;': syntax error at or near ";"
LINE 1: ...ELECT name FROM sqlite_master WHERE type='table' AND name=?;
^
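df.to_sql() does not accept a raw psycopg2 connection. It expects a SQLAlchemy engine or connection (or a sqlite3 connection); given any other DBAPI connection, pandas falls back to its SQLite code path, which is why the failing statement queries sqlite_master. With psycopg2 alone, insert the rows yourself: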
Step 1: Creation of an empty table
First you need to create a cursor and then create a table:
cursor = conn.cursor()
cursor.execute("CREATE TABLE users_table (col1 integer, col2 integer)")
conn.commit()
Step 2: Insert pandas df to the users_table
tuples = [tuple(x) for x in df.to_numpy()]
cols = ','.join(list(df.columns))
query = "INSERT INTO %s(%s) VALUES(%%s,%%s)" % ('users_table', cols)  # two columns, table name as a string
cursor.executemany(query, tuples)
conn.commit()
If you want to use df.to_sql():
from sqlalchemy import create_engine
engine = create_engine('postgresql+psycopg2://user:password@hostname/database_name')
df.to_sql('users', engine)
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_sql.html
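The manual insert in Step 2 hard-codes two placeholders. A minimal sketch of the same idea with the placeholder list built from the column count is shown below. It is an illustration only: it uses the standard-library sqlite3 module (placeholder ?) instead of psycopg2 (placeholder %s), and it builds the rows from the plain dict rather than from a DataFrame.

```python
import sqlite3

data = {"Col1": [1, 2, 3, 4, 5], "Col2": [1, 2, 3, 4, 5]}

cols = list(data)
rows = list(zip(*(data[c] for c in cols)))  # one tuple per row

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users_table (Col1 INTEGER, Col2 INTEGER)")

# Column names come from trusted code; only the values go through placeholders.
placeholders = ", ".join("?" for _ in cols)
query = "INSERT INTO users_table ({}) VALUES ({})".format(", ".join(cols), placeholders)
con.executemany(query, rows)
con.commit()

inserted = con.execute("SELECT count(*), sum(Col1) FROM users_table").fetchone()
con.close()
```

With psycopg2 the same pattern should apply with "%s" in place of "?".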

How to import subset from MS Access based on condition criteria

I'm trying to use Python to create a dataframe which consists of certain rows (based on condition criteria) extracted from an MS Access table.
I can't seem to get the condition to work.
The MS Access table has column names such as Date, Course, Horse etc.
I want to, for example, get all the rows with Date = "01-Dec-2021" and Course = "Kempton".
I have managed to get the following code working with one criterion:
import pyodbc
connStr = (r"DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};" r"DBQ=C:\Users\chris\Documents\UKHR\SFF_Cum\SFFCum_py.accdb;")
conn = pyodbc.connect(connStr)
cursor = conn.cursor()
sql = "select * FROM SFF_cumQ_O where Course = ?"
cursor.execute(sql, ["Kempton"])
#print(cursor.fetchone())
print(cursor.fetchall())
cursor.close()
conn.close()
Here is my import of the rows based on Date = "01-Dec-2021" and Course = "Kempton"
import pyodbc
connStr = (r"DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};" r"DBQ=C:\Users\chris\Documents\UKHR\SFF_Cum\SFFCum_py.accdb;")
conn = pyodbc.connect(connStr)
cursor = conn.cursor()
sql = "select * FROM SFF_cumQ_O WHERE Date = '01-Dec-2021' and Course = 'Kempton'"
cursor.execute(sql)
print(cursor.fetchall())
However, when I try to import the rows based on Date = "01-Dec-2021" and Course = "Kempton" I run into this error :
"Exception has occurred: Error
('07002', '[07002] [Microsoft][ODBC Microsoft Access Driver] Too few parameters. Expected 1. (-3010) (SQLExecDirectW)')"
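I found the problem: the criteria needed to be bracketed. The working query also filters on RaceDate rather than Date, which may be what actually resolved the error, since Access reports "Too few parameters" when it cannot match a name in the query to a field.
Note that the table name is optional in front of the field name, so SFF_cumQ_O.Course can be written as just Course.
Final code looks like this: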
import pyodbc
connStr = (r"DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};" r"DBQ=C:\Users\chris\Documents\UKHR\SFF_Cum\SFFCum_py.accdb;")
conn = pyodbc.connect(connStr)
cursor = conn.cursor()
sql = "select * FROM SFF_cumQ_O WHERE ((SFF_cumQ_O.Course)='Kempton') AND ((SFF_cumQ_O.RaceDate)='01-Dec-21')"
#sql = "select * FROM SFF_cumQ_O WHERE Date = '01-Dec-21' and Course = ?"
#cursor.execute(sql, ["Kempton"])
cursor.execute(sql)
print(cursor.fetchall())
cursor.close()
conn.close()
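Both criteria can also be passed as parameters, the same way the single-criterion version passes "Kempton". The sketch below is an assumption-laden illustration: it uses an in-memory SQLite table with made-up rows instead of the Access file, and it assumes the date field is named RaceDate and stored as text. pyodbc uses the same ? placeholder.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE SFF_cumQ_O (RaceDate TEXT, Course TEXT, Horse TEXT)")
con.executemany(
    "INSERT INTO SFF_cumQ_O VALUES (?, ?, ?)",
    [
        ("01-Dec-21", "Kempton", "Horse A"),
        ("01-Dec-21", "Lingfield", "Horse B"),
        ("02-Dec-21", "Kempton", "Horse C"),
    ],
)

# One placeholder per criterion; values are supplied in the same order.
sql = "SELECT * FROM SFF_cumQ_O WHERE Course = ? AND RaceDate = ?"
rows = con.execute(sql, ["Kempton", "01-Dec-21"]).fetchall()
con.close()
```

Only field names that exist in the table appear in the SQL text, and every value is bound.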

SQL Alchemy Parametrized Query , binding table name as parameter gives error

I am using a parametrized query built with the text object in SQLAlchemy, and I get different results depending on what I bind.
Working example:
import sqlalchemy as sqlal
from sqlalchemy.sql import text
db_table = 'Cars'
id_cars = 8
query = text("""SELECT *
FROM Cars
WHERE idCars = :p2
""")
self.engine.execute(query, {'p2': id_cars})
Example that produces sqlalchemy.exc.ProgrammingError: (pymysql.err.ProgrammingError) (1064, "You have an error in your SQL syntax)
import sqlalchemy as sqlal
from sqlalchemy.sql import text
db_table = 'Cars'
id_cars = 8
query = text("""SELECT *
FROM :p1
WHERE idCars = :p2
""")
self.engine.execute(query, {'p1': db_table, 'p2': id_cars})
Any idea how I can run the query with a dynamic table name that is also protected from SQL injection?
I use PostgreSQL and psycopg2 backend. I was able to do it using:
from psycopg2 import sql
import sqlalchemy
connection: sqlalchemy.engine.Connection  # an existing SQLAlchemy connection
connection.connection.cursor().execute(
    sql.SQL('SELECT * FROM {} where idCars = %s').format(sql.Identifier(db_table)),
    (id_cars, )
)
For PyMySQL it's not supported.
With other drivers you can build the statement in Python, as long as the table name comes from trusted code or is checked against a known list first (concatenating an unchecked name is open to SQL injection). The value should still be bound as a parameter:
Libraries to use:
import urllib.parse
from sqlalchemy import create_engine
from sqlalchemy.sql import text
Connection (not shown in your example; here a connection to SQL Azure):
params = urllib.parse.quote_plus(r'...')
conn_str = 'mssql+pyodbc:///?odbc_connect={}'.format(params)
engine_azure = create_engine(conn_str, echo=True)
Your example:
db_table = 'Cars'
id_cars = 8
# db_table cannot be bound as a parameter, so it is concatenated; id_cars is bound
query = text('SELECT * FROM ' + db_table + ' WHERE idCars = :p2')
connection = engine_azure.connect()
connection.execute(query, {'p2': id_cars})
connection.close()
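One way to keep a dynamic table name safe is an allow-list check before the name is placed in the SQL text. The sketch below is a hypothetical illustration, not SQLAlchemy's API: ALLOWED_TABLES and select_by_id are made-up names, and it runs against an in-memory SQLite database so it can be tried without MySQL or Postgres.

```python
import sqlite3

ALLOWED_TABLES = {"Cars", "Owners"}  # hypothetical allow-list of known tables


def select_by_id(con, table, id_cars):
    """Interpolate the table name only after it passes the allow-list check."""
    if table not in ALLOWED_TABLES:
        raise ValueError("unexpected table name: {!r}".format(table))
    # The identifier is interpolated; the value is still bound as a parameter.
    query = 'SELECT * FROM "{}" WHERE idCars = ?'.format(table)
    return con.execute(query, (id_cars,)).fetchall()


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Cars (idCars INTEGER, model TEXT)")
con.execute("INSERT INTO Cars VALUES (8, 'hatchback')")

found = select_by_id(con, "Cars", 8)

try:
    select_by_id(con, "Cars; DROP TABLE Cars", 8)
    rejected = False
except ValueError:
    rejected = True
con.close()
```

The same check can sit in front of a SQLAlchemy text() statement; only the execution call changes.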

Storing a dataframe with sqlalchemy, pyodbc: SQL syntax error

I would like to store a dataframe in a Teradata database using DataFrame.to_sql, but I get a SQL syntax error. The error appears to come from the built-in method, and I don't know how to deal with it.
My code:
import pandas as pd
import datetime as dt
import sqlalchemy, pyodbc
todays_date = dt.datetime.now().date()
index = pd.date_range(todays_date-dt.timedelta(10), periods=10, freq='D')
columns = ['A','B', 'C']
df_ = pd.DataFrame(index=index, columns=columns)
df_ = df_.fillna(0)
engine = sqlalchemy.create_engine("mssql+pyodbc://" + user + ":" + passwd + "@" + dsnname)
df_.to_sql(name= 'TableTest', con = engine, if_exists='replace')
And the error I get:
ProgrammingError: (pyodbc.ProgrammingError) ('42000', "[42000] [Teradata][ODBC Teradata Driver][Teradata Database] Syntax error: expected something between '(' and ')'. (-3706) (SQLExecDirectW)") [SQL: 'SELECT schema_name()']
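Here's a two-part answer. The mssql+pyodbc dialect sends SQL Server-specific statements such as SELECT schema_name(), which Teradata rejects, so a Teradata dialect is needed:
1. Install sqlalchemy-teradata.
2. Create the engine and the table as follows:
engine = sqlalchemy.create_engine("teradata://" + user + ":" + passwd + "@" + dsnname)
df_.to_sql(name='TableTest', con=engine, index=False, schema='database_name', if_exists='replace')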
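The engine URLs above concatenate the user name and password directly. If either contains characters such as @, : or /, the URL no longer parses as intended, so escaping them first is a reasonable precaution. A minimal sketch, with made-up credentials and DSN name:

```python
from urllib.parse import quote_plus

# Hypothetical credentials; the password contains URL-significant characters.
user = "dbuser"
passwd = "p@ss:word/1"
dsnname = "my_dsn"

# Escape user and password so @, : and / cannot be mistaken for URL delimiters.
url = "teradata://{}:{}@{}".format(quote_plus(user), quote_plus(passwd), dsnname)
```

The resulting string can then be passed to sqlalchemy.create_engine in place of the concatenated one.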
