cx_Oracle - Unable to Bind Date Value within SQL - python

I'm using cx_Oracle and I'm trying to bind a date variable value inside SQL, but I'm not able to resolve the errors. Can someone please offer insight on how to fix it?
The code below gives me an error: "DatabaseError: ORA-00936: missing expression"
dat_ptd_sql = """
select
univ_prop_id,
chain_id
from BA4DBOP1.zs_ptd_stack
where chain_ord = 1
and sale_valtn_dt >= date :cu_perf_beg
"""
cudb_cur.execute(dat_ptd_sql, cu_perf_beg = "'2022-09-01'")

Another option is to bind an actual date from Python. In that case, drop the DATE keyword: a DATE literal only accepts a fixed string, not a bind variable:
import datetime

dat_ptd_sql = """
select
univ_prop_id,
chain_id
from BA4DBOP1.zs_ptd_stack
where chain_ord = 1
and sale_valtn_dt >= date :cu_perf_beg
"""
cudb_cur.execute(dat_ptd_sql, cu_perf_beg = datetime.datetime(2022, 9, 1))

Try something like this. Note the to_date(). This example uses python-oracledb, which is the new name for the latest version of cx_Oracle.
import getpass
import os
import oracledb
un = os.environ.get('PYTHON_USERNAME')
cs = os.environ.get('PYTHON_CONNECTSTRING')
pw = getpass.getpass(f'Enter password for {un}@{cs}: ')
connection = oracledb.connect(user=un, password=pw, dsn=cs)
with connection.cursor() as cursor:
    sql = """select
                 ename
             from emp
             where empno = 7654
             and hiredate >= to_date(:bv, 'YYYY-MM-DD')"""
    for r in cursor.execute(sql, bv='1981-04-01'):
        print(r)
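If you'd rather not build the date in SQL, python-oracledb can also bind a datetime.date directly; the driver converts it to an Oracle DATE. A minimal sketch reusing the connection above (same demo emp table):
import datetime

with connection.cursor() as cursor:
    sql = """select ename
             from emp
             where empno = 7654
             and hiredate >= :bv"""
    for r in cursor.execute(sql, bv=datetime.date(1981, 4, 1)):
        print(r)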

Related

Python looping to obtain different dataframes from a SQL database

I'm trying to connect to an SQL database and, within a loop, create separate dataframes for each different instance of Id, containing all the data related to that Id. I've tried a number of ways, without any success so far. I'm pretty new to all of this, so I'm probably making some rookie mistakes.
Attempt 1:
import pandas as pd
import pyodbc

conn = pyodbc.connect('Driver={SQL Server};'
                      'Server=Server_name;'
                      'Database=Database;'
                      'UID=Username;'
                      'PWD=password;'
                      'Trusted_Connection=yes;')

Name = ['HR','ZA','PR','FW']
for x in Name:
    SQL = '''
    SELECT *
    FROM Database
    WHERE Id = {x}'''.format(x = x)
    cursor = conn.cursor()
    cursor.execute(SQL)
    df = pd.read_sql_query(SQL)
On this code, I get an 'invalid column name' programming error on the first Name, 'HR'.
Attempt 2:
import pandas as pd
import pyodbc

conn = pyodbc.connect('Driver={SQL Server};'
                      'Server=Server_name;'
                      'Database=Database;'
                      'UID=Username;'
                      'PWD=password;'
                      'Trusted_Connection=yes;')

SQL = '''
SELECT *
FROM Database
'''
cursor = conn.cursor()
conn.autocommit = True
cursor.execute(SQL)
for [Id] in cursor:
    df = pd.Dataframe(SQL,conn)
On this code, I get a 'ValueError: too many values to unpack (expected 1)' - on the for statement.
I want to put a lot more code in the for loop so I need it to be set up to work through each Id. I hope that makes sense. Any guidance would be greatly appreciated. Thanks
UPDATE:
Thanks for all the comments/answers. For some reason I just couldn't get it to work in either of the formats above, so I took it back to where I started from, now that I understand how to include the syntax for the loop variable. The following now works:
import pandas as pd
import pyodbc

conn = pyodbc.connect('Driver={SQL Server};'
                      'Server=Server_name;'
                      'Database=Database;'
                      'UID=Username;'
                      'PWD=password;'
                      'Trusted_Connection=yes;')

Name = ['HR','ZA','PR','FW']
for x in Name:
    SQL = pd.read_sql_query(
        '''
        SELECT *
        FROM Database_table
        WHERE Id = '{x}'
        '''.format(x = x), conn)
    df = pd.DataFrame(SQL)
I think that if you try a variation on your first attempt like:
for x in Name:
    SQL = '''
    SELECT *
    FROM Database
    WHERE Id = ?'''
    df = pd.read_sql_query(SQL, conn, params=[x])
It should probably work :)
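For completeness, a minimal sketch of the whole loop under the question's placeholder names (Database_table, the pyodbc connection string), collecting one DataFrame per Id in a dict:
import pandas as pd
import pyodbc

conn = pyodbc.connect('Driver={SQL Server};'
                      'Server=Server_name;'
                      'Database=Database;'
                      'Trusted_Connection=yes;')

SQL = '''
SELECT *
FROM Database_table
WHERE Id = ?'''

frames = {}  # one DataFrame per Id, keyed by the Id itself
for x in ['HR', 'ZA', 'PR', 'FW']:
    frames[x] = pd.read_sql_query(SQL, conn, params=[x])

print(frames['HR'].head())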

How to Pass a Variable to SEARCH database in Python - Sqlite

I am trying to pass a variable to search for a row in an SQLite DB and print out the results. Here is the code that's causing the problem:
find_domain = 'domain.com'

def searchdomain(locate):
    row = sql.execute("SELECT * FROM blocked_domains WHERE name = ?;", (locate,))
    print(row)

searchdomain(find_domain)
No error comes up, it just comes back blank.
Ensure that you have created a cursor object for data retrieval:
import sqlite3
conn = sqlite3.connect('tablename.db')
data = list(conn.cursor().execute("SELECT * FROM blocked_domains WHERE name = ?;", (locate,)))
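Putting that into the question's function, a minimal sketch (file and table names follow the question; an empty list simply means no matching row):
import sqlite3

conn = sqlite3.connect('tablename.db')
cur = conn.cursor()

def searchdomain(locate):
    cur.execute("SELECT * FROM blocked_domains WHERE name = ?;", (locate,))
    rows = cur.fetchall()  # [] when nothing matches
    print(rows)

searchdomain('domain.com')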

pyodbc - 'The SQL contains 0 parameter markers, but 1 parameters were supplied', 'HY000'

I am using Python 3.6, pyodbc, and connect to SQL Server.
I am trying to make a connection to a database, then create a query with parameters.
Here is the code:
import sys
import pyodbc
# connection parameters
nHost = 'host'
nBase = 'base'
nUser = 'user'
nPasw = 'pass'
# make connection start
def sqlconnect(nHost,nBase,nUser,nPasw):
    try:
        return pyodbc.connect('DRIVER={SQL Server};SERVER='+nHost+';DATABASE='+nBase+';UID='+nUser+';PWD='+nPasw)
        print("connection successful")
    except:
        print("connection failed check authorization parameters")

con = sqlconnect(nHost,nBase,nUser,nPasw)
cursor = con.cursor()
# make connection stop
# if run WITHOUT parameters THEN everything is OK
ask = input('Go WITHOUT parameters y/n ?')
if ask == 'y':
    # SQL without parameters start
    res = cursor.execute('''
        SELECT * FROM TABLE
        WHERE TABLE.TIMESTAMP BETWEEN '2017-03-01T00:00:00.000' AND '2017-03-01T01:00:00.000'
        ''')
    # SQL without parameters stop
    # print result to console start
    row = res.fetchone()
    while row:
        print(row)
        row = res.fetchone()
    # print result to console stop
# if run WITH parameters THEN ERROR
ask = input('Go WITH parameters y/n ?')
if ask == 'y':
    # parameters start
    STARTDATE = "'2017-03-01T00:00:00.000'"
    ENDDATE = "'2017-03-01T01:00:00.000'"
    # parameters end
    # SQL with parameters start
    res = cursor.execute('''
        SELECT * FROM TABLE
        WHERE TABLE.TIMESTAMP BETWEEN :STARTDATE AND :ENDDATE
        ''', {"STARTDATE": STARTDATE, "ENDDATE": ENDDATE})
    # SQL with parameters stop
    # print result to console start
    row = res.fetchone()
    while row:
        print(row)
        row = res.fetchone()
    # print result to console stop
When I run the program without parameters in SQL, it works.
When I try running it with parameters, an error occurs.
Parameters in an SQL statement via ODBC are positional, and marked by a ?. Thus:
# SQL with parameters start
res = cursor.execute('''
SELECT * FROM TABLE
WHERE TABLE.TIMESTAMP BETWEEN ? AND ?
''', STARTDATE, ENDDATE)
# SQL with parameters stop
Plus, it's better to avoid passing dates as strings. Let pyodbc take care of that using Python's datetime:
from datetime import datetime
...
STARTDATE = datetime(year=2017, month=3, day=1)
ENDDATE = datetime(year=2017, month=3, day=1, hour=0, minute=0, second=1)
then just pass the parameters as above. If you prefer string parsing, see this answer.
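Putting the two pieces together, a minimal sketch (TABLE and TIMESTAMP are the question's placeholder names, and cursor is the one created above):
from datetime import datetime

STARTDATE = datetime(2017, 3, 1)
ENDDATE = datetime(2017, 3, 1, 1)  # one hour later

res = cursor.execute('''
    SELECT * FROM TABLE
    WHERE TABLE.TIMESTAMP BETWEEN ? AND ?
    ''', STARTDATE, ENDDATE)

for row in res.fetchall():
    print(row)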
If you're trying to use pd.to_sql() like me, I fixed the problem by passing a parameter called chunksize:
df.to_sql("tableName", engine ,if_exists='append', chunksize=50)
Hope this helps.
I tried and got a lot of different errors: 42000, 22007, 07002, and others. The working version is below:
import sys
import pyodbc
import datetime
# connection parameters
nHost = 'host'
nBase = 'DBname'
nUser = 'user'
nPasw = 'pass'
# make connection start
def sqlconnect(nHost,nBase,nUser,nPasw):
    try:
        return pyodbc.connect('DRIVER={SQL Server};SERVER='+nHost+';DATABASE='+nBase+';UID='+nUser+';PWD='+nPasw)
    except:
        print("connection failed check authorization parameters")

con = sqlconnect(nHost,nBase,nUser,nPasw)
cursor = con.cursor()
# make connection stop
STARTDATE = '11/2/2017'
ENDDATE = '12/2/2017'
params = (STARTDATE, ENDDATE)
# SQL with parameters start
sql = ('''
SELECT * FROM TABLE
WHERE TABLE.TIMESTAMP BETWEEN CAST(? as datetime) AND CAST(? as datetime)
''')
# SQL with parameters stop
# print result to console start
query = cursor.execute(sql, params)
row = query.fetchone()
while row:
    print(row)
    row = query.fetchone()
# print result to console stop
say = input('everything is ok, you can close console')
I fixed this issue with the code below, for when you are inserting values from a CSV.
for i, row in read_csv_data.iterrows():
    cursor.execute('INSERT INTO ' + self.schema + '.' + self.table +
                   '(first_name, last_name, email, ssn, mobile) VALUES (?,?,?,?,?)',
                   tuple(row))
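If the CSV is large, cursor.executemany() with a list of tuples is usually faster than calling execute() per row. A sketch under the same assumptions (read_csv_data is a pandas DataFrame; self.schema and self.table as above):
rows = [tuple(r) for _, r in read_csv_data.iterrows()]
cursor.executemany(
    'INSERT INTO ' + self.schema + '.' + self.table +
    '(first_name, last_name, email, ssn, mobile) VALUES (?,?,?,?,?)',
    rows)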
I had a similar issue. I saw that downgrading PyODBC to 4.0.6 and SQLAlchemy to 1.2.9 fixed the error, using Python 3.6.

Sqlite3 naming db file with a variable in python

How can I use the current date to name my db file so when it runs it creates a db file which is named after the current date. This is what I have so far:
import sqlite3
import time
timedbname = time.strftime("%d/%m/%Y")
# Connecting to the database file
conn = sqlite3.connect(???)
This gives the error below; it's the same with '/', '-' or '.' in "%d/%m/%Y":
conn = sqlite3.connect(timedbname, '.db')
TypeError: a float is required
27.01.2016
Try using:
time.strftime("%d-%m-%Y")
I guess it doesn't work because of the slashes in the generated date.
You can't have dashes in your table name. Also it can't start with a digit.
import sqlite3
from datetime import date
timedbname = '_' + str(date.today()).replace('-','_')
# Connecting to the database file
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()
cursor.execute('''CREATE TABLE %s (col1 int, col2 int)''' % (timedbname))
cursor.execute('''INSERT INTO %s VALUES (1, 2)''' % (timedbname))
cursor.execute('''SELECT * FROM %s'''%timedbname).fetchall()
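Note that SQLite bind parameters (?) can only stand in for values, never table names, which is why the snippet above falls back on % formatting. If the name ever comes from outside the program, it's worth validating it first; a hedged sketch reusing cursor and timedbname from above:
import re

def safe_table_name(name):
    # allow only letters, digits and underscores, not starting with a digit
    if not re.fullmatch(r'[A-Za-z_][A-Za-z0-9_]*', name):
        raise ValueError('unsafe table name: %r' % name)
    return name

cursor.execute('CREATE TABLE %s (col1 int, col2 int)' % safe_table_name(timedbname))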
This worked:
import sqlite3
import time
timedbname = time.strftime("_" + "%d.%m.%Y")
conn = sqlite3.connect(timedbname + '.db')

How to convert SQL Query result to PANDAS Data Structure?

Any help on this problem will be greatly appreciated.
So basically I want to run a query to my SQL database and store the returned data as Pandas data structure.
I have attached code for query.
I am reading the documentation on Pandas, but I have a problem identifying the return type of my query. I tried to print the query result, but it doesn't give any useful information.
Thanks!!!!
from sqlalchemy import create_engine
engine2 = create_engine('mysql://THE DATABASE I AM ACCESSING')
connection2 = engine2.connect()
dataid = 1022
resoverall = connection2.execute("""
    SELECT
        sum(BLABLA) AS BLA,
        sum(BLABLABLA2) AS BLABLABLA2,
        sum(SOME_INT) AS SOME_INT,
        sum(SOME_INT2) AS SOME_INT2,
        100*sum(SOME_INT2)/sum(SOME_INT) AS ctr,
        sum(SOME_INT2)/sum(SOME_INT) AS cpc
    FROM daily_report_cooked
    WHERE campaign_id = '%s'
    """ % dataid)
So I sort of want to understand what's the format/datatype of my variable "resoverall" and how to put it with PANDAS data structure.
Here's the shortest code that will do the job:
from pandas import DataFrame
df = DataFrame(resoverall.fetchall())
df.columns = resoverall.keys()
You can go fancier and parse the types as in Paul's answer.
Edit: Mar. 2015
As noted below, pandas now uses SQLAlchemy to both read from (read_sql) and insert into (to_sql) a database. The following should work
import pandas as pd
df = pd.read_sql(sql, cnxn)
Previous answer:
Via mikebmassey from a similar question
import pyodbc
import pandas.io.sql as psql
cnxn = pyodbc.connect(connection_info)
cursor = cnxn.cursor()
sql = "SELECT * FROM TABLE"
df = psql.frame_query(sql, cnxn)
cnxn.close()
If you are using SQLAlchemy's ORM rather than the expression language, you might find yourself wanting to convert an object of type sqlalchemy.orm.query.Query to a Pandas data frame.
The cleanest approach is to get the generated SQL from the query's statement attribute, and then execute it with pandas's read_sql() method. E.g., starting with a Query object called query:
df = pd.read_sql(query.statement, query.session.bind)
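For instance, a self-contained sketch assuming SQLAlchemy 1.4+ (the User model and in-memory database are hypothetical):
import pandas as pd
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class User(Base):  # a minimal, hypothetical mapped class
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    age = Column(Integer)

engine = create_engine('sqlite:///:memory:')  # placeholder database
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

query = session.query(User).filter(User.age > 21)
df = pd.read_sql(query.statement, query.session.bind)
print(df)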
Edit 2014-09-30:
pandas now has a read_sql function. You definitely want to use that instead.
Original answer:
I can't help you with SQLAlchemy -- I always use pyodbc, MySQLdb, or psycopg2 as needed. But when doing so, a function as simple as the one below tends to suit my needs:
import datetime
import decimal
import pyodbc  # just corrected a typo here
import numpy as np
import pandas

cnn, cur = myConnectToDBfunction()
cmd = "SELECT * FROM myTable"
cur.execute(cmd)
dataframe = __processCursor(cur, dataframe=True)
def __processCursor(cur, dataframe=False, index=None):
    '''
    Processes a database cursor with data on it into either
    a structured numpy array or a pandas dataframe.
    input:
    cur - a pyodbc cursor that has just received data
    dataframe - bool. if false, a numpy record array is returned
                if true, return a pandas dataframe
    index - list of column(s) to use as index in a pandas dataframe
    '''
    datatypes = []
    colinfo = cur.description
    for col in colinfo:
        if col[1] == unicode:
            datatypes.append((col[0], 'U%d' % col[3]))
        elif col[1] == str:
            datatypes.append((col[0], 'S%d' % col[3]))
        elif col[1] in [float, decimal.Decimal]:
            datatypes.append((col[0], 'f4'))
        elif col[1] == datetime.datetime:
            datatypes.append((col[0], 'O4'))
        elif col[1] == int:
            datatypes.append((col[0], 'i4'))

    data = []
    for row in cur:
        data.append(tuple(row))

    array = np.array(data, dtype=datatypes)

    if dataframe:
        output = pandas.DataFrame.from_records(array)
        if index is not None:
            output = output.set_index(index)
    else:
        output = array

    return output
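On recent pandas versions the cursor-to-DataFrame step usually doesn't need the manual type mapping at all; a minimal sketch, assuming cur holds an executed pyodbc query as above:
import pandas as pd

rows = [tuple(r) for r in cur.fetchall()]
cols = [c[0] for c in cur.description]  # column names from cursor metadata
df = pd.DataFrame.from_records(rows, columns=cols)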
1. Using MySQL-connector-python
# pip install mysql-connector-python
import mysql.connector
import pandas as pd
mydb = mysql.connector.connect(
    host = 'host',
    user = 'username',
    passwd = 'pass',
    database = 'db_name'
)
query = 'select * from table_name'
df = pd.read_sql(query, con = mydb)
print(df)
2. Using SQLAlchemy
# pip install pymysql
# pip install sqlalchemy
import pandas as pd
import sqlalchemy
engine = sqlalchemy.create_engine('mysql+pymysql://username:password@localhost:3306/db_name')
query = '''
select * from table_name
'''
df = pd.read_sql_query(query, engine)
print(df)
MySQL Connector
For those who work with the mysql connector, you can use this code as a start. (Thanks to @Daniel Velkov)
Used refs:
Querying Data Using Connector/Python
Connecting to MYSQL with Python in 3 steps
import pandas as pd
import mysql.connector
# Setup MySQL connection
db = mysql.connector.connect(
    host="<IP>",           # your host, usually localhost
    user="<USER>",         # your username
    password="<PASS>",     # your password
    database="<DATABASE>"  # name of the data base
)
# You must create a Cursor object. It will let you execute all the queries you need
cur = db.cursor()
# Use all the SQL you like
cur.execute("SELECT * FROM <TABLE>")
# Put it all to a data frame
sql_data = pd.DataFrame(cur.fetchall())
sql_data.columns = cur.column_names
# Close the session
db.close()
# Show the data
print(sql_data.head())
Here's the code I use. Hope this helps.
import pandas as pd
from sqlalchemy import create_engine
def getData():
    # Parameters
    ServerName = "my_server"
    Database = "my_db"
    UserPwd = "user:pwd"
    Driver = "driver=SQL Server Native Client 11.0"

    # Create the connection
    engine = create_engine('mssql+pyodbc://' + UserPwd + '@' + ServerName + '/' + Database + "?" + Driver)

    sql = "select * from mytable"
    df = pd.read_sql(sql, engine)
    return df

df2 = getData()
print(df2)
This is a short and crisp answer to your problem:
from __future__ import print_function
import MySQLdb
import numpy as np
import pandas as pd
import xlrd
# Connecting to MySQL Database
connection = MySQLdb.connect(
    host="hostname",
    port=0000,
    user="userID",
    passwd="password",
    db="table_documents",
    charset='utf8'
)
print(connection)
#getting data from database into a dataframe
sql_for_df = 'select * from tabledata'
df_from_database = pd.read_sql(sql_for_df , connection)
Like Nathan, I often want to dump the results of a sqlalchemy or sqlsoup Query into a Pandas data frame. My own solution for this is:
query = session.query(tbl.Field1, tbl.Field2)
DataFrame(query.all(), columns=[column['name'] for column in query.column_descriptions])
resoverall is a sqlalchemy ResultProxy object. You can read more about it in the sqlalchemy docs; the latter explain the basic usage of working with Engines and Connections. Important here is that resoverall is dict-like.
Pandas likes dict-like objects when creating its data structures; see the online docs.
Good luck with sqlalchemy and pandas.
Simply use pandas and pyodbc together. You'll have to modify your connection string (connstr) according to your database specifications.
import pyodbc
import pandas as pd
# MSSQL Connection String Example
connstr = "Driver={SQL Server};Server=myServerAddress;Database=myDB;UID=myUsername;PWD=myPass;"
# Query Database and Create DataFrame Using Results
df = pd.read_sql("select * from myTable", pyodbc.connect(connstr))
I've used pyodbc with several enterprise databases (e.g. SQL Server, MySQL, MariaDB, IBM).
This question is old, but I wanted to add my two cents. I read the question as "I want to run a query to my [my]SQL database and store the returned data as a Pandas data structure [DataFrame]."
From the code it looks like you mean mysql database and assume you mean pandas DataFrame.
import MySQLdb as mdb
import pandas.io.sql as sql
from pandas import *
conn = mdb.connect('<server>','<user>','<pass>','<db>');
df = sql.read_frame('<query>', conn)
For example,
conn = mdb.connect('localhost','myname','mypass','testdb');
df = sql.read_frame('select * from testTable', conn)
This will import all rows of testTable into a DataFrame.
It's been a long time since the last post, but maybe it helps someone...
A shorter way than Paul H's:
rows = query.all()
my_df = pandas.DataFrame(rows)
Here is mine, just in case you are using "pymysql":
import pymysql
from pandas import DataFrame
host = 'localhost'
port = 3306
user = 'yourUserName'
passwd = 'yourPassword'
db = 'yourDatabase'
cnx = pymysql.connect(host=host, port=port, user=user, passwd=passwd, db=db)
cur = cnx.cursor()
query = """ SELECT * FROM yourTable LIMIT 10"""
cur.execute(query)
field_names = [i[0] for i in cur.description]
get_data = [xx for xx in cur]
cur.close()
cnx.close()
df = DataFrame(get_data)
df.columns = field_names
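The same result in fewer lines, reusing the cnx connection above, is a single read_sql call (a sketch; pandas may warn when given a raw DBAPI connection instead of a SQLAlchemy engine):
import pandas as pd

df = pd.read_sql("SELECT * FROM yourTable LIMIT 10", cnx)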
pandas.io.sql.write_frame is DEPRECATED:
https://pandas.pydata.org/pandas-docs/version/0.15.2/generated/pandas.io.sql.write_frame.html
You should use pandas.DataFrame.to_sql instead:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_sql.html
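A minimal to_sql() sketch (the SQLite connection string and table name are placeholders):
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('sqlite:///example.db')  # placeholder connection string
df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
df.to_sql('my_table', engine, if_exists='replace', index=False)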
There is another solution.
PYODBC to Pandas - DataFrame not working - Shape of passed values is (x,y), indices imply (w,z)
As of Pandas 0.12 (I believe) you can do:
import pandas
import pyodbc
sql = 'select * from table'
cnn = pyodbc.connect(...)
data = pandas.read_sql(sql, cnn)
Prior to 0.12, you could do:
import pandas
from pandas.io.sql import read_frame
import pyodbc
sql = 'select * from table'
cnn = pyodbc.connect(...)
data = read_frame(sql, cnn)
The best way I do this:
db = db_class()  # your own database wrapper class
db.execute(query)
mydata = [x for x in db.fetchall()]
df = pd.DataFrame(data=mydata)
If the result type is ResultSet, you should convert it to a dictionary first; then the DataFrame columns will be collected automatically.
This works in my case:
df = pd.DataFrame([dict(r) for r in resoverall])
Here is a simple solution I like:
Put your DB connection info in a YAML file in a secure location (do not version it in the code repo).
---
host: 'hostname'
port: port_number_integer
database: 'databasename'
user: 'username'
password: 'password'
Then load the conf in a dictionary, open the db connection and load the result set of the SQL query in a data frame:
import yaml
import pymysql
import pandas as pd
db_conf_path = '/path/to/db-conf.yaml'
# Load DB conf
with open(db_conf_path) as db_conf_file:
    db_conf = yaml.safe_load(db_conf_file)
# Connect to the DB
db_connection = pymysql.connect(**db_conf)
# Load the data into a DF
query = '''
SELECT *
FROM my_table
LIMIT 10
'''
df = pd.read_sql(query, con=db_connection)
