Reading and executing SQL queries into a pandas DataFrame (Python)

I have a very long SQL query that runs fine from Python and loads into a data frame,
but I have hundreds of them, so I tried writing a function that reads each query from a file and executes it.
The SQL statements look like this:
"SELECT IIf(Left([Milestone_Next_Expected],4)='Proc',1, \
....\
120 lines
....\
dbo.MY_data_value"
This is the function
def Execute_SQL_from_a_File(filename,home,conn1):
    FORMAT1 = '%Y%m%d%H%M'
    fd = open(filename, 'r')
    sqlFile = fd.read()
    fd.close()
    KIC53 = pd.read_sql(sqlFile, conn1)
    f_out = home + out1 + ".xls"
    writer = pd.ExcelWriter(f_out)
    KIC53.to_excel(writer,f_out)
    writer.save()
This is what calls the function:
Execute_SQL_from_a_File(QRYHOME + "qryBook" + str(BNUM) + "_" + str(IND) + ".sql", BNUM, home, conn1)
When I run the query through the function, I get this error:
: ('42000', "[42000] [Microsoft][ODBC SQL Server Driver][SQL Server]The
identifier that starts with 'SELECT
IIf(Left([Milestone_Next_Expected],4)='Proc',1,
\\\nIIf(Left([Milestone_Next_Expected],4)='Subm',2,
\\\nIIf(Left([Milestone_N' is too long. Maximum length is 128.")
I can't figure out why I'm getting the length error, because I can run the same query when I create sqlFile directly as one long string:
"SELECT IIf(Left([Milestone_Next_Expected],4)='Proc',1, \
....\
120 lines
....\
dbo.MY_data_value"
ANY help would be greatly appreciated!

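The correct method is:
1. The SQL script does not need any line-continuation symbols ("\") and does not need to be enclosed in quotes. When the file is read, the quotes and backslashes become part of the query text, so SQL Server treats the whole double-quoted statement as a single identifier, which is what triggers the 128-character limit error.
2. The correct way to read the input file is:
with open(filename, 'r') as file:
    SQLfile = " ".join(file.readlines())
Now, when the code is executed via pd.read_sql_query(SQLfile, conn1), there are no errors.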
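To make the point about the file contents concrete, here is a minimal sketch. It uses the standard-library sqlite3 module as a stand-in for the SQL Server connection, and the helper name read_sql_file and the two-column query are made up for illustration. The string it returns is the kind of plain text you would hand to pd.read_sql_query.

```python
import os
import sqlite3
import tempfile

def read_sql_file(path):
    """Return the contents of a .sql file as one query string."""
    with open(path, "r") as f:
        return f.read()

# Write a multi-line query to a temporary file: no quotes, no backslashes.
query_text = "SELECT 1 AS a,\n       2 AS b\n"
with tempfile.NamedTemporaryFile("w", suffix=".sql", delete=False) as tmp:
    tmp.write(query_text)
    sql_path = tmp.name

sql = read_sql_file(sql_path)
os.remove(sql_path)

# sqlite3 is only a stand-in here for the real database connection.
conn = sqlite3.connect(":memory:")
row = conn.execute(sql).fetchone()
conn.close()
print(row)
```

Newlines inside the file are ordinary whitespace to the database, which is why no continuation characters are needed.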

Related

Write variable after reading .sql query

When I have to pass a parameter before running a sql query, I usually do
date = '20220101'
query = f"""SELECT * FROM TABLE WHERE DATE = '{date}'"""
In an attempt to reduce the length of my code, I created a query.sql file containing the query above, but I can't work out how to pass the date variable into the query before running it.
For reading I'm using
sql_query = open("query.sql", "r")
sql_as_string = sql_query.read()
df = pd.read_sql(sql_as_string, conn)
Is there a way around this, instead of pasting the whole SQL query into my .py code?
I'm using pyodbc, ODBC Driver 17 for SQL Server
Use a parametrized query, not string formatting.
The file should just contain the query, with a ? placeholder for the variable.
SELECT * FROM TABLE WHERE DATE = ?
Then you can do
with open("query.sql", "r") as f:
    sql_query = f.read()
df = pd.read_sql(sql_query, conn, params=(date, ))
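A small runnable sketch of the same idea, assuming sqlite3 in place of pyodbc (both use ? as the placeholder). The events table and its rows are invented for the example, and sql_query stands in for the text read from query.sql.

```python
import sqlite3

# Stand-in for the text read from query.sql; note the ? placeholder.
sql_query = "SELECT * FROM events WHERE date = ?"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (date TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("20220101", 10), ("20220102", 20)],
)

date = "20220101"
# The value travels separately from the SQL text, so no quoting is needed.
rows = conn.execute(sql_query, (date,)).fetchall()
conn.close()
print(rows)
```

The query file never changes; only the tuple of parameters does.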

Inserted data is different from actual data being sent

I am trying to insert data into my Postgres database. A row is inserted, but the value stored is not the value I sent.
This is the function that generates the data and passes it to my query-execution function
(Python, Flask):
def runid():
    cmdStr = 'cd /root/Dart/log; ls -rt DartRunner*.log | tail -1 '
    ret = execCmdLocal(cmdStr)
    logName = ret[1].strip()
    runId = ""
    print('The DartRunner Log generated is: %s'%logName)
    with open('/root/Dart/log/' + logName, "r") as fd:
        for line in fd:
            if 'runId' in line:
                runId = line.split()[-1]
                print('Run Id: %s'%runId)
                break
    print (runId) # output : Run Id: 180628-22
    post_runid(runId) # output is given in below link
    return jsonify({"run_id": runId})
This is my database (Postgres) execution method
(Python):
def post_runid(run_id):
    query = "insert into runid(runid) values(%s)"
    cur.execute(query %run_id)
    conn.commit()
My output looks something like this:
The top two rows were inserted manually by me; the bottom two were inserted by the code. The bottom two should match the top two, but instead of the original value they contain different numbers that look like a series.
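Changing %s to '%s' in the query fixed the problem. Without the quotes, the formatted statement was values(180628-22), which Postgres evaluates as a subtraction, so a number was stored instead of the run id.
A safer fix, assuming a DB-API driver such as psycopg2, is to leave the placeholder unquoted and pass the value as a parameter so the driver does the quoting:
def post_runid(run_id):
    query = "insert into runid(runid) values(%s)"
    cur.execute(query, (run_id,))
    conn.commit()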
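The effect is easy to reproduce without a Postgres server. The sketch below uses sqlite3 purely as a stand-in (its placeholder is ? rather than %s) to show what happens to the run id when it is pasted into the SQL text versus bound as a parameter.

```python
import sqlite3

run_id = "180628-22"
conn = sqlite3.connect(":memory:")

# String formatting pastes the text into the SQL, so it is parsed as 180628 - 22.
formatted = conn.execute("SELECT %s" % run_id).fetchone()[0]

# A bound parameter is sent as a value and arrives unchanged.
bound = conn.execute("SELECT ?", (run_id,)).fetchone()[0]

conn.close()
print(formatted, bound)
```

This would explain why the inserted rows looked like unrelated numbers: each run id was being evaluated as arithmetic.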

Python script to move data from a SQL server db to an oracle db keeps giving 'ORA-01036: illegal variable name/number'

So I am using python to pull data from a sql server with a simple select that grabs 15 columns. The data looks like this
2016-06-01 05:45:06.003,5270,240,1,1,0,5000,1,null,null,7801009661561,0,null,null,null
The columns on the Oracle table are all numbers except for the first column, which is a date. The sizes are all correct.
After I get all the data, I run it through this little function to get rid of the pyodbc.Row types.
def BuildBindList(recordsToWrite):
    closingRecords = []
    for rec in recordsToWrite:
        closingRecords.append((rec[0], rec[1], rec[2], rec[3], rec[4], rec[5], rec[6], rec[7], rec[7], rec[8], rec[9], rec[10], rec[11], rec[12], rec[13], rec[14]))
    return closingRecords
I get a list of tuples.
Then to write to the oracle table I wrote this function that takes in the list of tuples.
def write_to_table(recordsToWrite):
    SQL = '''INSERT INTO ####### (DATETIME, ID, TZ, DOMAINID, EVENTNAME, REASONCODE, TARGETID, STATE, KEY, PERIPHERALKEY, RECOVERYKEY, DIRECTION, ROUTERDAY, ROUTERCKEY, ROUTERNUMBER)
             VALUES(:1, :2, :3, :4, :5, :6, :7, :8, :9, :10, :11, :12, :13, :14, :15)'''
    try:
        trgtcrsr.prepare(SQL)
    except cx_Oracle.DatabaseError, exception:
        print ('Failed to prepare cursor')
        print Exception(exception)
        exit (1)
    try:
        trgtcrsr.executemany(None, recordsToWrite)
    except cx_Oracle.DatabaseError, exception:
        print ('Failed to insert rows')
        print Exception(exception)
        exit (1)
    target_connection.commit()
    target_connection.close()
I make the oracle connection like this
try:
    cnn = cx_Oracle.connect(user="####", password = "####", dsn = "####")
    trgtcrsr = cnn.cursor()
    print "Connected to Oracle"
except Exception as e:
    print e
    raise RuntimeError("Could not connect to Oracle")
The connection works fine. But when the line trgtcrsr.executemany(None, recordsToWrite) is executed it gives me a 'ORA-01036: illegal variable name/number' error
I have another script that writes a list of tuples to an Oracle table with the same trgtcrsr.prepare(SQL)/trgtcrsr.executemany(None, recordsToWrite) method and it works fine (granted, it's Oracle to Oracle), so I am not sure why I keep getting this error. I have tried changing data types and googling the error but can't find anything similar.
Any ideas?
rec[7] appears twice in the function BuildBindList().
I'm guessing this is what makes the insert fail: each tuple carries 16 values, while the INSERT statement defines only 15 bind variables (:1 to :15).
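One way to avoid this kind of slip is to stop listing the indexes by hand. The sketch below is only a suggestion: build_bind_list and its expected_columns argument are hypothetical names, and the sample rows are plain lists standing in for pyodbc rows.

```python
def build_bind_list(records_to_write, expected_columns=15):
    """Convert row objects to plain tuples and check their length."""
    closing_records = []
    for rec in records_to_write:
        row = tuple(rec)  # copies every field exactly once, in order
        if len(row) != expected_columns:
            raise ValueError(
                "expected %d values, got %d" % (expected_columns, len(row))
            )
        closing_records.append(row)
    return closing_records

# Plain lists stand in for the rows fetched from SQL Server.
sample = [list(range(15)), list(range(100, 115))]
bind_list = build_bind_list(sample)
print(len(bind_list), len(bind_list[0]))
```

The length check fails early in Python with a readable message, instead of surfacing later as ORA-01036.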
#PYTHON SCRIPT TO COPY DATA FROM ORACLE TO SQL SERVER
import cx_Oracle
import pyodbc
#Server Variables
orServer = '10.xxx.x.xxx'
orPort = 'xxxx'
orService = 'MYSERV'
orUser = 'ORMYUSER'
orPassword = 'orpassword'
sqlServer = 'SQLSERVER'
sqlDatabase = 'MYDB'
#SQL Server Connection
sqlConn = pyodbc.connect('Driver={SQL Server};'
'Server='+sqlServer+';'
'Database='+sqlDatabase+';'
'Trusted_Connection=yes;')
sqlCursor = sqlConn.cursor()
#Oracle Connection
dsn_tns = cx_Oracle.makedsn(orServer, orPort, service_name= orService )
orConn = cx_Oracle.connect(user= orUser, password= orPassword, dsn=dsn_tns)
orCursor = orConn.cursor()
#Get data from Oracle Server
orCursor.execute("""SELECT ID
,NAME
,SEX
,ADDRESS
,PHONE
FROM DetailsTable"""
)
orColumns =['ID',
'NAME',
'SEX',
'ADDRESS',
'PHONE']
#Creating Strings for insert statement to load data into SQL Server
cValues = "(" + ", ".join(orColumns) + ")"
#One ? placeholder per column
values = "(" + ", ".join("?" for _ in orColumns) + ")"
#Load data to SQL server
sqlCursor.executemany("INSERT INTO [MYDB].[dbo].[DetailsTable]"+ cValues+ " VALUES "+ values ,orCursor)
sqlConn.commit()
sqlConn.close()
orConn.close()
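The string-building step can be checked on its own, without either database. This is a minimal sketch; build_insert is a hypothetical helper name, and the table and column names are the ones from the script above.

```python
def build_insert(table, columns):
    """Return an INSERT statement with one ? placeholder per column."""
    column_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    return "INSERT INTO %s (%s) VALUES (%s)" % (table, column_list, placeholders)

statement = build_insert("DetailsTable", ["ID", "NAME", "SEX", "ADDRESS", "PHONE"])
print(statement)
```

Only the identifiers are formatted into the text; the row values still go through executemany as parameters.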

Passing a folder location as an SQL parameter in python causes an error

I am fairly new to python and the only SQL I know is from this project so forgive the lack of technical knowledge:
def importFolder(self):
    user = getuser()
    filename = askopenfilename(title = "Choose an image from the folder to import", initialdir='C:/Users/%s' % user)
    for i in range (0,len(filename) - 1):
        if filename[-i] == "/":
            folderLocation = filename[:len(filename) - i]
            break
    # Raw string so the backslashes in the Windows path are not read as escapes
    cnxn = pyodbc.connect(r'DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=C:\Users\Public\dbsDetectorBookingSystem.accdb')
    cursor = cnxn.cursor()
    cursor.execute("SELECT * FROM tblRuns")
    cursor.execute("insert into tblRuns(RunID,RunFilePath,TotalAlphaCount,TotalBetaCount,TotalGammaCount) values (%s,%s,0,0,0)" %(str(self.runsCount + 1), folderLocation))
    cnxn.commit()
    self.runsCount = cursor.rowcount
    rowString = str(self.runsCount) + " " + folderLocation + " " + str(0) + " " + str(0) + " " + str(0) + " " + str(0)
    self.runsTreeView.insert("","end", text = "", values = (rowString))
That is one routine from my current program, meant to create a new record that is mostly empty apart from an index and a file location. This location needs to be saved as a string; however, when it is passed as a parameter to the SQL string, the following error occurs:
cursor.execute("insert into tblRuns(RunID,RunFilePath,TotalAlphaCount,TotalBetaCount,TotalGammaCount) values (%s,%s,0,0,0)" %(str(self.runsCount + 1), folderLocation))
ProgrammingError: ('42000', "[42000] [Microsoft][ODBC Microsoft Access Driver] Syntax error (missing operator) in query expression 'C:/Users/Jacob/Documents/USB backup'. (-3100) (SQLExecDirectW)")
I assume this is because SQL recognises a file path and wants to use it. Does anybody know how to fix this?
You're not using the DB-API correctly. Instead of using string formatting to pass your query params, which is error-prone (as you just noticed) AND a security issue, you want to pass them as arguments to cursor.execute(), i.e.:
sql = "insert into tblRuns(RunID, RunFilePath, TotalAlphaCount, TotalBetaCount, TotalGammaCount) values (?, ?, 0, 0, 0)"
cursor.execute(sql, (self.runsCount + 1, folderLocation))
Note that we DON'T use string formatting here (no "%" between sql and the params).
NB: the placeholder for parameterized queries depends on your db connector. pyodbc, which you are using, expects ?; python-MySQLdb, for example, uses %s.
With regard to your exact problem: since you didn't put quotes around your placeholders, the SQL query you send looks something like:
"insert into tblRuns(
RunID, RunFilePath,
TotalAlphaCount, TotalBetaCount, TotalGammaCount
)
values (1,/path/to/folder,0,0,0)"
Which cannot work, obviously (it needs quotes around /path/to/folder to be valid SQL).
By passing query parameters the right way, your db connector will take care of all the quoting and escaping.
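Here is a small sketch of that last point, assuming sqlite3 as a stand-in for the Access connection (it uses the same ? placeholder as pyodbc). The table is cut down to two columns and the folder path is a made-up variation with an apostrophe in it, which would break a hand-quoted query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblRuns (RunID INTEGER, RunFilePath TEXT)")

# A path with spaces and an apostrophe, which would break hand-built SQL.
folder_location = "C:/Users/Jacob/Documents/Jacob's USB backup"

# The driver handles quoting and escaping of both parameters.
conn.execute(
    "INSERT INTO tblRuns (RunID, RunFilePath) VALUES (?, ?)",
    (1, folder_location),
)
stored = conn.execute(
    "SELECT RunFilePath FROM tblRuns WHERE RunID = 1"
).fetchone()[0]
conn.close()
print(stored)
```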

reading external sql script in python

I am learning how to execute SQL in Python (I know SQL, not Python).
I have an external sql file. It creates and inserts data into three tables 'Zookeeper', 'Handles', 'Animal'.
Then I have a series of queries to run off the tables. The below queries are in the zookeeper.sql file that I load in at the top of the python script. Example for the first two are:
--1.1
SELECT ANAME,zookeepid
FROM ANIMAL, HANDLES
WHERE AID=ANIMALID;
--1.2
SELECT ZNAME, SUM(TIMETOFEED)
FROM ZOOKEEPER, ANIMAL, HANDLES
WHERE AID=ANIMALID AND ZOOKEEPID=ZID
GROUP BY zookeeper.zname;
These all execute fine in SQL. Now I need to execute them from within Python. I have been given code to read in the file, which I have completed, and it executes all the queries in a loop.
Queries 1.1 and 1.2 are where I am getting confused. I believe this is the line in the loop where I should put something to run the first and then the second query.
result = c.execute("SELECT * FROM %s;" % table);
but what? I think I am missing something very obvious. I think what is throwing me off is % table. In query 1.1 and 1.2, I am not creating a table, but rather looking for a query result.
My entire python code is below.
import sqlite3
from sqlite3 import OperationalError
conn = sqlite3.connect('csc455_HW3.db')
c = conn.cursor()
# Open and read the file as a single buffer
fd = open('ZooDatabase.sql', 'r')
sqlFile = fd.read()
fd.close()
# all SQL commands (split on ';')
sqlCommands = sqlFile.split(';')
# Execute every command from the input file
for command in sqlCommands:
    # This will skip and report errors
    # For example, if the tables do not yet exist, this will skip over
    # the DROP TABLE commands
    try:
        c.execute(command)
    except OperationalError, msg:
        print "Command skipped: ", msg
# For each of the 3 tables, query the database and print the contents
for table in ['ZooKeeper', 'Animal', 'Handles']:
    # Plug in the name of the table into SELECT * query
    result = c.execute("SELECT * FROM %s;" % table);
    # Get all rows.
    rows = result.fetchall();
    # \n represents an end-of-line
    print "\n--- TABLE ", table, "\n"
    # This will print the name of the columns, padding each name up
    # to 22 characters. Note that comma at the end prevents new lines
    for desc in result.description:
        print desc[0].rjust(22, ' '),
    # End the line with column names
    print ""
    for row in rows:
        for value in row:
            # Print each value, padding it up with ' ' to 22 characters on the right
            print str(value).rjust(22, ' '),
        # End the values from the row
        print ""
c.close()
conn.close()
Your code already contains a beautiful way to execute all statements from a specified sql file
# Open and read the file as a single buffer
fd = open('ZooDatabase.sql', 'r')
sqlFile = fd.read()
fd.close()
# all SQL commands (split on ';')
sqlCommands = sqlFile.split(';')
# Execute every command from the input file
for command in sqlCommands:
    # This will skip and report errors
    # For example, if the tables do not yet exist, this will skip over
    # the DROP TABLE commands
    try:
        c.execute(command)
    except OperationalError as msg:
        print("Command skipped: ", msg)
Wrap this in a function and you can reuse it.
def executeScriptsFromFile(filename):
    # Open and read the file as a single buffer
    fd = open(filename, 'r')
    sqlFile = fd.read()
    fd.close()
    # all SQL commands (split on ';')
    sqlCommands = sqlFile.split(';')
    # Execute every command from the input file
    for command in sqlCommands:
        # This will skip and report errors
        # For example, if the tables do not yet exist, this will skip over
        # the DROP TABLE commands
        try:
            c.execute(command)
        except OperationalError as msg:
            print("Command skipped: ", msg)
To use it
executeScriptsFromFile('zookeeper.sql')
You said you were confused by
result = c.execute("SELECT * FROM %s;" % table);
In Python, you can insert values into a string using string formatting.
In a string such as "Some string with %s", the %s is a placeholder for something else. To replace the placeholder, you add % ("what you want to replace it with") after the string.
ex:
a = "Hi, my name is %s and I have a %s hat" % ("Azeirah", "cool")
print(a)
>>> Hi, my name is Azeirah and I have a cool hat
Bit of a childish example, but it should be clear.
Now, what
result = c.execute("SELECT * FROM %s;" % table);
means is that it replaces %s with the value of the table variable, which is created in
for table in ['ZooKeeper', 'Animal', 'Handles']:
# for loop example
for fruit in ["apple", "pear", "orange"]:
    print(fruit)
>>> apple
>>> pear
>>> orange
If you have any additional questions, poke me.
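As for running queries 1.1 and 1.2 themselves: they are ordinary SELECT statements, so each can be passed to c.execute() as a string and the rows fetched, with no table name to plug in. A rough sketch follows, using an in-memory database with a made-up miniature schema and data (the real column layout is an assumption based on the queries in the question); the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()

# Hypothetical miniature schema and data, only for illustration.
c.execute("CREATE TABLE ANIMAL (AID INTEGER, ANAME TEXT)")
c.execute("CREATE TABLE HANDLES (ZOOKEEPID INTEGER, ANIMALID INTEGER)")
c.executemany("INSERT INTO ANIMAL VALUES (?, ?)",
              [(1, "Galapagos Penguin"), (2, "Emperor Penguin")])
c.executemany("INSERT INTO HANDLES VALUES (?, ?)", [(10, 1), (11, 2)])

# Query 1.1 from the question, kept as a plain string.
query_1_1 = """
SELECT ANAME, zookeepid
FROM ANIMAL, HANDLES
WHERE AID = ANIMALID
ORDER BY AID;
"""

result = c.execute(query_1_1)
rows = result.fetchall()
conn.close()

for row in rows:
    print(row)
```

Query 1.2 would work the same way: put its text in a string, execute it, and loop over fetchall().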
A very simple way to read an external script into an sqlite database in python is using executescript():
import sqlite3
conn = sqlite3.connect('csc455_HW3.db')
with open('ZooDatabase.sql', 'r') as sql_file:
    conn.executescript(sql_file.read())
conn.close()
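A self-contained sketch of executescript(), using an in-memory database and an invented notes table. It also shows one case where splitting the file on ';' by hand would go wrong: a semicolon inside a string literal.

```python
import sqlite3

script = """
-- comments and several statements are fine in one script
CREATE TABLE notes (id INTEGER, body TEXT);
INSERT INTO notes VALUES (1, 'first; still one value');
INSERT INTO notes VALUES (2, 'second');
"""

conn = sqlite3.connect(":memory:")
conn.executescript(script)

# executescript() does not return rows, so query results need a separate execute().
bodies = [row[0] for row in conn.execute("SELECT body FROM notes ORDER BY id")]
conn.close()
print(bodies)
```

So executescript() suits the setup part of the file (CREATE and INSERT), while the numbered SELECT queries are better run one at a time with execute().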
First make sure the table exists (create it if it does not), then load the rows.
import sqlite3
from sqlite3 import OperationalError
conn = sqlite3.connect('Client_DB.db')
c = conn.cursor()
def execute_sqlfile(filename):
    # Create the target table only if it is not there yet
    c.execute("CREATE TABLE IF NOT EXISTS clients_parameters (adress text, ie text)")
    # Read the file; this assumes a header line followed by lines of two ';'-separated values
    fd = open(filename, 'r')
    sqlFile = fd.readlines()
    fd.close()
    lvalues = [tuple(v.strip().split(';')) for v in sqlFile[1:]]
    try:
        c.executemany("INSERT INTO clients_parameters VALUES (?, ?)", lvalues)
        conn.commit()
    except OperationalError as msg:
        print("Command skipped: ", msg)
execute_sqlfile('clients.sql')
print(c.rowcount)
As far as I know, this is not possible directly.
Solution:
first import the .sql file into the MySQL server,
then, in Python:
import mysql.connector
import pandas as pd
and then query the imported tables and convert the result to a DataFrame.
