I am trying to use Python's SQLAlchemy to query a view in our PostgreSQL database over ODBC, but I am getting the error
{ProgrammingError}(pyodbc.ProgrammingError) ('42883', '[42883] ERROR: function schema_name() does not exist;\nError while executing the query (1) (SQLExecDirectW)')
[SQL: SELECT schema_name()]
(Background on this error at: https://sqlalche.me/e/14/f405)
Using the code below, I can create the connection engine successfully, but executing the query fails.
When using 'pyodbc' or 'psycopg2' directly, establishing the connection and querying data works perfectly, but with a warning
'UserWarning: pandas only support SQLAlchemy connectable(engine/connection) ordatabase string URI or sqlite3 DBAPI2 connectionother DBAPI2 objects are not tested, please consider using SQLAlchemy
warnings.warn('
which is why we are looking into establishing the connection the SQLAlchemy way.
import config
import pandas as pd
import sqlalchemy
if __name__ == '__main__':
    connection_string = (config.odbc('database_odbc.txt'))
    connection_url = sqlalchemy.engine.url.URL.create("mssql+pyodbc", query={"odbc_connect": connection_string})
    conn = sqlalchemy.create_engine(connection_url)
    query_string = """SELECT [column name in view] FROM public.[name of view]"""
    df1 = pd.read_sql(query_string, conn)
    print(df1.to_string())
    conn.close()
    print('Database connection closed.')
As mentioned, the query runs perfectly using the other methods. I already tried different ways of referencing the database view, including
SELECT [column name in view] FROM [database name].public.[name of view]
SELECT [column name in view] FROM [name of view]
and more without success.
Any help is appreciated, thank you!
Thank you @Gord Thompson,
I followed the default PostgreSQL syntax at https://docs.sqlalchemy.org/en/14/core/engines.html
engine = create_engine('postgresql://scott:tiger@localhost/mydatabase')
Now the code looks like
import pandas as pd
from sqlalchemy import create_engine
if __name__ == '__main__':
    engine = create_engine('postgresql://[user]:[password]@[host]/[db]')
    conn = engine.connect()
    query_string = """SELECT [column name in view] FROM public.[name of view]"""
    df1 = pd.read_sql(query_string, conn)
    print(df1.to_string())
    conn.close()
    print('Database connection closed.')
and now it works perfectly, thank you!
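For anyone landing here with the same error: the dialect prefix of the URL decides which SQL SQLAlchemy emits, so an `mssql+pyodbc` URL pointed at PostgreSQL sends SQL Server statements such as the `SELECT schema_name()` seen in the error. Below is a minimal sketch of building the URL by hand, using only the standard library. The helper names `postgres_url` and `dialect_of` are hypothetical (they are not part of SQLAlchemy), and the password is percent-encoded so that characters like `@` do not break the URL.

```python
from urllib.parse import quote_plus


def postgres_url(user, password, host, database, driver=None):
    """Build a SQLAlchemy-style PostgreSQL URL (hypothetical helper)."""
    dialect = "postgresql" if driver is None else f"postgresql+{driver}"
    # Percent-encode credentials so '@', ':' or '/' inside them stay literal.
    return f"{dialect}://{quote_plus(user)}:{quote_plus(password)}@{host}/{database}"


def dialect_of(url):
    """Return the dialect part of a URL such as 'mssql+pyodbc://...'."""
    return url.split("://", 1)[0].split("+", 1)[0]


url = postgres_url("scott", "ti@ger", "localhost", "mydatabase")
print(url)
print(dialect_of(url))
print(dialect_of("mssql+pyodbc://?odbc_connect=..."))
```

The resulting string can then be passed to `create_engine(url)` and the engine to `pd.read_sql(query_string, engine)`.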
Related
I am trying to use pandas with SQLAlchemy; the code below is basically what I am trying to do. If I drop the table, the code creates it, but I want it to append to the existing table without having to rename tables. I have tried updating and changing versions of all the libraries, and I am at a loss. If I start with no table, it creates it; then I run the code again and it crashes. The error message just says the table already exists, which I know; that is why I am telling it to append. Also, before the load I read data using pymssql, and it reads fine into a DataFrame.
Python Command
def writeDFtoSSDatabase(tgtDefiniton,df):
    try:
        if int(tgtDefiniton.loadBatchSize) > 0:
            batchSize = int(tgtDefiniton.loadBatchSize)
        else:
            batchSize = 1000
        #Domain error using SQL Alchemy
        logging.debug("Writting Dataframe to SQL Server database")
        #hardcoded type because that is the only type for now
        with createDBConnection(tgtDefiniton.tgtDatabaseServer
                ,tgtDefiniton.tgtDatabaseDatabase
                ,tgtDefiniton.tgtDatabaseUser
                ,tgtDefiniton.tgtDatabasePassword,tgtDefiniton.tgtDataType).connect().execution_options(schema_translate_map={
                None: tgtDefiniton.tgtDatabaseSchema}) as conn:
            logging.debug("Writting DF to Database table {0}".format(tgtDefiniton.tgtDatabaseTable))
            logging.debug("ifTableExists: {0}.".format(tgtDefiniton.ifTableExists))
            if tgtDefiniton.ifTableExists == "append":
                logging.debug('Appending Data')
                df.to_sql(tgtDefiniton.tgtDatabaseTable,con=conn,if_exists='append',chunksize = batchSize,index=False)
            elif tgtDefiniton.ifTableExists == "replace":
                logging.debug('Replacing Table and Data')
                df.to_sql(tgtDefiniton.tgtDatabaseTable,con=conn,if_exists='replace',chunksize = batchSize,index=False)
            else:
                df.to_sql(tgtDefiniton.tgtDatabaseTable,con=conn,if_exists='fail',index=False)
            logging.debug("Data wrote to database")
    except Exception as e:
        logging.error(e)
        raise
Error
(Background on this error at: http://sqlalche.me/e/e3q8)
2021-08-30 13:31:42 ERROR (pymssql.OperationalError) (2714, b"There is already an object
named 'test' in the database.DB-Lib error message 20018, severity 16:\nGeneral SQL Server
error: Check messages from the SQL Server\n")
EDIT:
Log Entry
2021-08-30 13:31:36 DEBUG Writting Dataframe to SQL Server database
2021-08-30 13:31:36 DEBUG create_engine(mssql+pymssql://REST OF CONNECTION INFO
2021-08-30 13:31:36 DEBUG DB Engine Created
2021-08-30 13:31:36 DEBUG Writting DF to Database table test
2021-08-30 13:31:36 DEBUG ifTableExists: append.
2021-08-30 13:31:36 DEBUG Appending Data
2021-08-30 13:31:42 ERROR (pymssql.OperationalError) (2714, b"There is already an object named 'test' in the database.DB-Lib error message 20018, severity 16:\nGeneral SQL Server error: Check messages from the SQL Server\n")
[SQL:
I had the same problem and found two ways to solve it, although I lack the insight as to why they work:
Either pass the database name in the URL when creating the connection,
or pass the database name as the schema in df.to_sql.
Doing both does not hurt.
```
#create connection to MySQL DB via sqlalchemy & pymysql
user = credentials['user']
password = credentials['password']
port = credentials['port']
host = credentials['hostname']
dialect = 'mysql'
driver = 'pymysql'
db_name = 'test_db'
# setup SQLAlchemy
from sqlalchemy import create_engine
cnx = f'{dialect}+{driver}://{user}:{password}@{host}:{port}/'
engine = create_engine(cnx)
# create database
with engine.begin() as con:
    con.execute(f"CREATE DATABASE {db_name}")
############################################################
# either pass the db_name vvvv - HERE- vvvv after creating a database
cnx = f'{dialect}+{driver}://{user}:{password}@{host}:{port}/{db_name}'
############################################################
engine = create_engine(cnx)
table = 'test_table'
col = 'test_col'
with engine.begin() as con:
    # this would work here instead of creating a new engine with a new link
    # con.execute(f"USE {db_name}")
    con.execute(f"CREATE TABLE {table} ({col} CHAR(1));")
# insert into database
import pandas as pd
df = pd.DataFrame({col : ['a','b','c']})
with engine.begin() as con:
    # this has no effect here
    # con.execute(f"USE {db_name}")
    df.to_sql(
        name= table,
        if_exists='append',
        con=con,
        ############################################################
        # or pass it as a schema vvvv - HERE - vvvv
        #schema=db_name,
        ############################################################
        index=False
    )
```
Tested with python version 3.8.13 and sqlalchemy 1.4.32.
Same problem might have appeared here and here.
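One plausible reading of why both fixes help (an assumption, not something confirmed in this thread): appending only works if the existence check looks in the same database or schema that the write targets. If the check looks elsewhere, the table is reported as missing, a CREATE TABLE is attempted, and the server answers that the object already exists. The sketch below illustrates only the "check in the wrong place" part, using the standard-library `sqlite3` module with an attached database standing in for a second schema. `table_exists` is a hypothetical helper and this is not pandas' actual implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach a second in-memory database; it plays the role of a non-default schema.
conn.execute("ATTACH DATABASE ':memory:' AS test_db")
conn.execute("CREATE TABLE test_db.test_table (test_col CHAR(1))")


def table_exists(conn, schema, table):
    """Look the table up in one specific schema's catalog (hypothetical helper)."""
    # Schema names cannot be bound as parameters; this sketch assumes a trusted name.
    sql = f"SELECT 1 FROM {schema}.sqlite_master WHERE type='table' AND name=?"
    return conn.execute(sql, (table,)).fetchone() is not None


found_in_default = table_exists(conn, "main", "test_table")
found_in_schema = table_exists(conn, "test_db", "test_table")
print(found_in_default, found_in_schema)
conn.close()
```

The same table is invisible from the default schema and visible from the one it was created in, which is why naming the database in the URL or passing `schema=` to `to_sql` can change the outcome.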
If I understood you correctly, you are trying to upload a pandas DataFrame into a SQL table that already exists. In that case you just need to create a connection with SQLAlchemy and write your data to the table:
import pyodbc
import sqlalchemy
import urllib.parse
from sqlalchemy.pool import NullPool
serverName = 'Server_Name'
dataBase = 'Database_Name'
conn_str = urllib.parse.quote_plus(
    r'DRIVER={SQL Server};SERVER=' + serverName + r';DATABASE=' + dataBase + r';TRUSTED_CONNECTION=yes')
conn = 'mssql+pyodbc:///?odbc_connect={}'.format(conn_str)  # if you are using MS SQL Server Studio
engine = sqlalchemy.create_engine(conn, poolclass=NullPool)
connection = engine.connect()
# sql_table is the pandas DataFrame to upload
sql_table.to_sql('Your_Table_Name', engine, schema='Your_Schema_Name', if_exists='append', index=False,
                 chunksize=200)
connection.close()
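The part that most often goes wrong here is the quoting of the ODBC string, so here is a small standard-library sketch of just that step. `mssql_odbc_url` is a hypothetical helper, and the driver, server and database names are the placeholders from the answer above, not real values.

```python
from urllib.parse import quote_plus, unquote_plus


def mssql_odbc_url(**odbc_params):
    """Build an 'mssql+pyodbc' URL from ODBC key/value pairs (hypothetical helper)."""
    raw = ";".join(f"{key}={value}" for key, value in odbc_params.items())
    # Braces, semicolons and spaces must be percent-encoded inside the query string.
    return "mssql+pyodbc:///?odbc_connect=" + quote_plus(raw)


url = mssql_odbc_url(
    DRIVER="{SQL Server}",
    SERVER="Server_Name",
    DATABASE="Database_Name",
    TRUSTED_CONNECTION="yes",
)
print(url)
# Decoding the query value gives back the original ODBC connection string.
print(unquote_plus(url.split("odbc_connect=", 1)[1]))
```

The returned string is what gets passed to `sqlalchemy.create_engine(...)`.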
I'm trying to store the results of an Oracle SQL query in a DataFrame, and the execution hangs indefinitely. But when I print the query results instead, they come out instantly. What is going wrong when saving this as a DataFrame?
import cx_Oracle
import pandas as pd
dsn_tns = cx_Oracle.makedsn('HOST', 'PORT', service_name='SID')
conn = cx_Oracle.connect(user='USER', password='PASSWORD', dsn=dsn_tns)
curr = conn.cursor()
curr.execute('alter session set current_schema= apps')
df = pd.read_sql('select * from TABLE', curr)
####THE ALTERNATIVE CODE TO PRINT THE RESULTS
# curr.execute('select * from TABLE')
# for line in curr:
#     print(line)
curr.close()
conn.close()
pandas' read_sql requires a connection object for its con argument, not a cursor (or the result of a cursor's execute). Also, consider using SQLAlchemy, the recommended interface between pandas and databases, where you define the schema in the engine connection assignment. The engine also allows to_sql calls.
import pandas as pd
from sqlalchemy import create_engine
engine = create_engine("oracle+cx_oracle://user:pwd@host:port/dbname")
df = pd.read_sql('select * from TABLE', con=engine)
engine.dispose()
And as mentioned on this DBA post, in Oracle users and schemas are essentially the same thing (unlike in other RDBMSs). Therefore, try passing apps as the user in the create_engine call with the needed credentials:
engine = create_engine("oracle+cx_oracle://apps:PASSWORD@HOST:PORT/SID")
df = pd.read_sql('select * from TABLE', con=engine)
engine.dispose()
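If only a DBAPI cursor is at hand, the rows and column names can still be collected manually and handed to pandas afterwards. A minimal sketch, using the standard-library `sqlite3` module as a stand-in for cx_Oracle (an assumption made so the example runs anywhere; the `demo` table is hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO demo VALUES (?, ?)", [(1, "a"), (2, "b")])

cur = conn.cursor()
cur.execute("SELECT * FROM demo ORDER BY id")
# Per the DB-API, cursor.description holds one 7-item sequence per column;
# item 0 of each sequence is the column name.
columns = [col[0] for col in cur.description]
rows = cur.fetchall()
print(columns, rows)

cur.close()
conn.close()
# With pandas installed: df = pd.DataFrame(rows, columns=columns)
```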
While trying to write a pandas DataFrame to SQL Server, I get this error:
DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master WHERE type='table' AND name=?;': ('42S02', "[42S02] [Microsoft][SQL Server Native Client 11.0][SQL Server]Invalid object name 'sqlite_master'. (208) (SQLExecDirectW); [42000] [Microsoft][SQL Server Native Client 11.0][SQL Server]Statement(s) could not be prepared. (8180)")
It seems pandas is looking into sqlite instead of the real database.
It's not a connection problem since I can read from the sql-server with the same connection using pandas.read_sql
The connection has been set using
sqlalchemy.create_engine("mssql+pyodbc:///?odbc_connect=%s" % params)
It's not a database permission problem either since I can write line by line using the same connection parameters as:
cursor = conn.cursor()
cursor.execute("insert into test values (1, 'test', 10)")
conn.commit()
I could just write a loop to insert line by line, but I would like to know why to_sql isn't working for me, and I am afraid the loop won't be as efficient.
Environment:
Python: 2.7
Pandas: 0.20.1
sqlalchemy: 1.1.12
Thanks in advance.
runnable example:
import pandas as pd
from sqlalchemy import create_engine
import urllib
params = urllib.quote_plus("DRIVER={SQL Server Native Client 11.0};SERVER=<servername>;"
                           "DATABASE=<databasename>;UID=<username>;PWD=<password>")
engine = create_engine("mssql+pyodbc:///?odbc_connect=%s" % params)
test = pd.DataFrame({'col1':1, 'col2':'test', 'col3':10}, index=[0])
conn=engine.connect().connection
test.to_sql("dbo.test", con=conn, if_exists="append", index=False)
According to the to_sql doc, the con parameter is either a SQLAlchemy engine or the legacy DBAPI2 connection (sqlite3). Because you are passing the raw connection object rather than the SQLAlchemy engine object as the parameter, pandas infers that you're passing a DBAPI2 connection, and treats it as a SQLite3 connection since that is the only one supported. To remedy this, just do:
myeng = sqlalchemy.create_engine("mssql+pyodbc:///?odbc_connect=%s" % params)
# Code to create your df
...
# Now write to DB
df.to_sql('table', myeng, index=False)
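To see why the error mentions `sqlite_master`: that table is SQLite's own catalog, so the existence check quoted in the error message can only succeed on SQLite. A small sketch with the standard-library `sqlite3` module; the table layout mirrors the `test` DataFrame from the question and is otherwise an assumption.

```python
import sqlite3

# The exact statement from the error message in the question.
FALLBACK_CHECK = "SELECT name FROM sqlite_master WHERE type='table' AND name=?;"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (col1 INTEGER, col2 TEXT, col3 INTEGER)")

# On SQLite the catalog table exists, so the lookup succeeds.
rows = conn.execute(FALLBACK_CHECK, ("test",)).fetchall()
print(rows)
conn.close()
```

On SQL Server there is no object called `sqlite_master`, which matches the "Invalid object name" message above.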
Try this.
It works for connecting to MS SQL Server (SQL authentication) and writing data:
import urllib.parse
from sqlalchemy import create_engine
# server, database, username and password are assumed to be defined beforehand
params = urllib.parse.quote_plus(
    'DRIVER={ODBC Driver 13 for SQL Server};'+
    'SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+ password)
engine = create_engine("mssql+pyodbc:///?odbc_connect=%s" % params)
#df: pandas.dataframe; mTableName:table name in MS SQL
#warning: discard old table if exists
df.to_sql(mTableName, con=engine, if_exists='replace', index=False)
So I ran into this same thing. I tried looking through the code, couldn't figure out why it wasn't working but it looks like it gets stuck on this call.
pd.io.sql._is_sqlalchemy_connectable(engine)
I found that if I run this first it returns True, but as soon as I run it after running df.to_sql() it returns False. Right now I'm running it before I do the df.to_sql() and it actually works.
Hope this helps.
I'm executing the following code; the purpose of the execution is to create a lookup table in the Oracle database to speed up my data load. The table I want to load is simply a vector of ID values, so only one column is loaded.
The code is written per below:
lookup = df.id_variable.drop_duplicates()
conn = my_oracle_connection()
obj = lookup.to_sql(name = 'lookup', con = conn, if_exists = 'replace')
I get the following error when executing this:
DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master
WHERE type='table' AND name=?;': ORA-01036: illegal variable
name/number
I can execute a psql.read_sql() query, but the above fails.
I don't know exactly how to go about fixing it. I'm quite new to the technical aspects of getting this to work, so any pointers on what direction to take would be greatly appreciated.
Thanks for any time and input!
I had the same issue when using a cx_Oracle connection (I was able to use the .read_sql function, but not the .to_sql one).
Use a SQLAlchemy connection instead:
import sqlalchemy as sa
oracle_db = sa.create_engine('oracle://username:password@database')
connection = oracle_db.connect()
dataframe.to_sql('table_name', connection, schema='schema_name', if_exists='append', index=False)
I think the problem happens when writing to the Oracle DB using a connection object created by cx_Oracle. SQLAlchemy offers a workaround:
import cx_Oracle
from sqlalchemy import types, create_engine
conn = create_engine('oracle+cx_oracle://Jeremy:SuperSecret@databasehost:1521/?service_name=gdw')
df.to_sql('TEST', conn, if_exists='replace')
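A plausible reading of the ORA-01036 error in the question: the fallback path sends a SQLite-style statement with a `?` placeholder, while cx_Oracle, as far as I know, declares the `named` parameter style (`:name`). DB-API modules expose this as a module-level `paramstyle` attribute. A minimal sketch using `sqlite3` from the standard library; checking `cx_Oracle.paramstyle` the same way is left as an assumption, since it needs the driver installed.

```python
import sqlite3

# PEP 249 requires every DB-API module to declare its placeholder style.
print(sqlite3.paramstyle)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lookup (id_variable INTEGER)")
# 'qmark' style: positional '?' placeholders, as in the failing statement above.
conn.execute("INSERT INTO lookup VALUES (?)", (42,))
value = conn.execute(
    "SELECT id_variable FROM lookup WHERE id_variable = ?", (42,)
).fetchone()[0]
print(value)
conn.close()
```

Going through a SQLAlchemy engine avoids the mismatch because the dialect renders placeholders in the style the driver expects.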
I am connecting to a Sybase ASE 15 database from Python 3.4 using pyodbc and executing a stored procedure.
All works as expected if I use native pyodbc:
import pandas as pd
import pyodbc
con = pyodbc.connect('DSN=dsn_name;UID=username;PWD=password', autocommit=True)
df = pd.read_sql("exec p_procedure @GroupName='GROUP'", con)
[Driver is Adaptive Server Enterprise].
I have to set autocommit=True; if I do not, I get the following error:
DatabaseError: Execution failed on sql 'exec ....': ('ZZZZZ', "[ZZZZZ]
[SAP][ASE ODBC Driver][Adaptive Server Enterprise]Stored procedure
'p_procedure' may be run only in unchained transaction mode. The 'SET
CHAINED OFF' command will cause the current session to use unchained
transaction mode.\n (7713) (SQLExecDirectW)")
I attempt to achieve the same using SQLAlchemy (1.0.9):
import pandas as pd
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.sql import text
url = r'sybase+pyodbc://username:password@dsn'
engine = create_engine(url, echo=True)
sess = sessionmaker(bind=engine)()
conn = engine.connect()
df = pd.read_sql(text("exec p_procedure @GroupName='GROUP'"),conn.execution_options(autocommit=True))
The error message is the same despite the fact that I have specified autocommit=True on the connection. (I have also tested this at the session level, which should not be necessary, and it made no difference.)
DBAPIError: (pyodbc.Error) ('ZZZZZ', "[ZZZZZ] [SAP][ASE ODBC
Driver][Adaptive Server Enterprise]....
Can you see anything wrong here?
As always, any help would be much appreciated.
Passing the autocommit=True argument as an item in the connect_args argument dictionary does work:
connect_args = {'autocommit': True}
create_engine(url, connect_args=connect_args)
connect_args – a dictionary of options which will be passed directly
to the DBAPI’s connect() method as additional keyword arguments.
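To illustrate the forwarding idea without a Sybase server: the sketch below passes a dictionary of keyword arguments straight through to a DBAPI `connect()` call, here the standard-library `sqlite3`, where `isolation_level=None` selects autocommit mode. The `connect` wrapper is hypothetical and only mimics the principle; it is not how SQLAlchemy is implemented.

```python
import sqlite3


def connect(database, connect_args=None):
    """Forward extra keyword arguments to the DBAPI connect() call (hypothetical wrapper)."""
    return sqlite3.connect(database, **(connect_args or {}))


# isolation_level=None puts sqlite3 into autocommit mode.
auto = connect(":memory:", connect_args={"isolation_level": None})
auto.execute("CREATE TABLE t (x INTEGER)")
auto.execute("INSERT INTO t VALUES (1)")
autocommit_pending = auto.in_transaction

# Without the argument, the INSERT implicitly opens a transaction.
default = connect(":memory:")
default.execute("CREATE TABLE t (x INTEGER)")
default.execute("INSERT INTO t VALUES (1)")
default_pending = default.in_transaction

print(autocommit_pending, default_pending)
auto.close()
default.close()
```

The point is that the option reaches the driver at connect time, which is different from setting an execution option on an already opened SQLAlchemy connection.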
I had some problems with the autocommit option. The only thing that worked for me was to set this option to True after establishing the connection.
import pyodbc
ConnString = 'Driver=%SQL_DRIVER%;Server=%SQL_SERVER%;Uid=%SQL_LOGIN%;Pwd=%SQL_PASSWORD%;'
SQL_CONNECTION = pyodbc.connect(ConnString)
SQL_CONNECTION.autocommit = True