Python SQLAlchemy: Data source name not found and no default driver specified - python

Using Python: when connecting to SQL Server using pyodbc, everything works fine, but when I switch to sqlalchemy, the connection fails, giving me the error message:
('IM002', '[IM002] [Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified (0) (SQLDriverConnect)')
My code:
cnxn = pyodbc.connect('DRIVER={SQL Server};SERVER=servername;DATABASE=dbname;UID=username;PWD=password')
engine = sqlalchemy.create_engine("mssql+pyodbc://username:password#servername/dbname")
I can't find the error in my code, and don't understand why the first options works, but the second doesn't.
Help is highly appreciated!

Ran into this problem as well, appending a driver query string to the end of my connection path worked:
"mssql+pyodbc://" + uname + ":" + pword + "#" + server + "/" + dbname + "?driver=SQL+Server"
Update (July 2021) – As above, just modernized (Python 3.6+):
f"mssql+pyodbc://{uname}:{pword}#{server}:{port}/{dbname}?driver=ODBC+Driver+17+for+SQL+Server"
Note that driver= must be all lowercase.

It works using pymssql, instead of pyodbc.
Install pymssql using pip, then change your code to:
engine = sqlalchemy.create_engine("mssql+pymssql://username:password#servername/dbname")

Very late, but experienced the same such problem myself recently. Turned out it was a problem with the latest SQLAlchemy version. Had to rollback my version from 1.4.17 to 1.4.12 (unsure of in-between versions, just went with a version I knew worked).
pip install sqlalchemy==1.4.12

I had original poster's problem with a trusted connection to the Microsoft SQL Server database (pandas 1.5.3, SQLAlchemy 2.0.4). Using answers from this question, this did the trick for me:
import sqlalchemy
import pandas as pd
server = "servername"
database = "dbname"
driver = "ODBC+Driver+17+for+SQL+Server"
url = f"mssql+pyodbc://{server}/{database}?trusted_connection=yes&driver={driver}"
engine = sqlalchemy.create_engine(url)
query = """
SELECT [column1]
,[column2] as some_other_name
FROM [server].[dbo].[table]"""
with engine.begin() as conn:
sqla_query = sqlalchemy.text(query)
df = pd.read_sql(sqla_query, conn)
It should be noted that pandas is not yet fully compatible with SQLAlchemy 2.0: https://pandas.pydata.org/docs/whatsnew/v1.5.3.html

Related

Error when using fast_executemany = True in sql alchemy with driver='SQL+Server'

I am trying to use fast_executemany to speed up my df.to_sql insert.
I read the documentation and added it to my code like this:
import pandas as pd
import sqlalchemy
import numpy as np
import random
#connect to database
server = 'Test'
database = 'Test'
driver = 'SQL+Server'
driver1 = 'ODBC+Driver+13+for+SQL+Server'
engine_stmt = ("mssql+pyodbc://#%s/%s?driver=%s" % (server, database, driver))
engine = sqlalchemy.create_engine(engine_stmt, fast_executemany=True)
connection = engine.connect()
When I run this code without the fast_executemany it works, but it takes fairly long for the insert.
Therefore I wanted to use the command, but I get an error when using it with the 'SQL+Server' driver. Therefore I tried to change the driver according to the documentation to 'ODBC+Driver+13+for+SQL+Server' I get the following error:
def create_connect_args(self, url):
InterfaceError: (pyodbc.InterfaceError) ('IM002', '[IM002] [Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified (0) (SQLDriverConnect)')
So I guess this driver is not working for me? I tested a few different ones, but the only one that is working is the 'SQL+Server'
It looks like you're missing the ODBC 13 driver on your computer.
Try installing that and run your script again.
Alternatively, try swapping:
driver1 = 'ODBC+Driver+13+for+SQL+Server'
for
driver1 = 'ODBC+Driver+17+for+SQL+Server'

Python "REPLACE" command in a SQL query(Error message)

I am new in python and I need this query to check my database if it is working. There are lots of queries but it is only one of them.
import pyodbc
db_file = 'C:\\Users\\****\\Desktop\\bbbb.mdb' #define the location of your Access file
odbc_conn_str = 'DRIVER={Microsoft Access Driver (*.mdb)};DBQ=%s' %(db_file) # define the odbc connection parameter
conn = pyodbc.connect(odbc_conn_str)
cursor = conn.cursor() # create a cursor
sql_update_statement = "UPDATE NUMARATAJ SET KAPINUMERIK = IIf (ISNUMERIC (LEFT (REPLACE(KAPINO,'-','/'),4)=-1),LEFT(KAPINO,4))" # edit the SQL statement that you want to execute
cursor.execute(sql_update_statement) # execute the SQL statement
cursor.commit()
cursor.close()
conn.close()
When I try to run this code it says;
File "C:\Users\****\Desktop\aa2a.py", line 9, in <module>
cursor.execute(sql_update_statement) # execute the SQL statement
pyodbc.ProgrammingError: ('42000', "[42000] [Microsoft][ODBC Microsoft Access Sürücüsü] Undefined 'REPLACE' function in the expression. (-3102) (SQLExecDirectW)")
How can I fix it, or get it work, can you help me?
It could be:
sql_update_statement = "UPDATE NUMARATAJ SET KAPINUMERIK = LEFT(KAPINO,4) WHERE ISNUMERIC(LEFT(REPLACE(KAPINO,'-','/'),4)"
However, Replace is a VBA function, not Access SQL, so that is probably why you receive the error.
Try simply:
sql_update_statement = "UPDATE NUMARATAJ SET KAPINUMERIK = LEFT(KAPINO,4) WHERE ISNUMERIC(LEFT(KAPINO),4)"
There are several "VBA functions" that were not directly supported by the older "Jet" ODBC driver (Microsoft Access Driver (*.mdb)) but are directly supported by the newer "ACE" ODBC driver (Microsoft Access Driver (*.mdb, *.accdb)). Replace is one of those functions.
So if you really need to use the Replace function you can download and install the 32-bit version of the Access Database Engine Redistributable and use the newer ODBC driver.

Connecting and testing a JDBC driver from Python

I'm trying to do some testing on our JDBC driver using Python.
Initially figuring out JPype, I eventually managed to connect the driver and execute select queries like so (reproducing a generalized snippet):
from __future__ import print_function
from jpype import *
#Start JVM, attach the driver jar
jvmpath = 'path/to/libjvm.so'
classpath = 'path/to/JDBC_Driver.jar'
startJVM(jvmpath, '-ea', '-Djava.class.path=' + classpath)
# Magic line 1
driver = JPackage('sql').Our_Driver
# Initiating a connection via DriverManager()
jdbc_uri = 'jdbc:our_database://localhost:port/database','user', 'passwd')
conn = java.sql.DriverManager.getConnection(jdbc_uri)
# Executing a statement
stmt = conn.createStatement()
rs = stmt.executeQuery ('select top 10 * from some_table')
# Extracting results
while rs.next():
''' Magic #2 - rs.getStuff() only works inside a while loop '''
print (rs.getString('col_name'))
However, I've failed to to batch inserts, which is what I wanted to test. Even when executeBatch() returned a jpype int[], which should indicate a successful insert, the table was not updated.
I then decided to try out py4j.
My plight - I'm having a hard time figuring out how to do the same thing as above. It is said py4j does not start a JVM on its own, and that the Java code needs to be prearranged with a GatewayServer(), so I'm not sure it's even feasible.
On the other hand, there's a library named py4jdbc that does just that.
I tinkered through the dbapi.py code but didn't quite understand the flow, and am pretty much jammed.
If anyone understands how to load a JDBC driver from a .jar file with py4j and can point me in the right direction, I'd be much grateful.
add a commit after adding the records and before retrieving.
conn.commit()
I have met a similar problem in airflow, I used teradata jdbc jars and jaydebeapi to connect teradata database and execute sql:
[root#myhost transfer]# cat test_conn.py
import jaydebeapi
from contextlib import closing
jclassname='com.teradata.jdbc.TeraDriver'
jdbc_driver_loc = '/opt/spark-2.3.1/jars/terajdbc4-16.20.00.06.jar,/opt/spark-2.3.1/jars/tdgssconfig-16.20.00.06.jar'
jdbc_driver_name = 'com.teradata.jdbc.TeraDriver'
host='my_teradata.address'
url='jdbc:teradata://' + host + '/TMODE=TERA'
login="teradata_user_name"
psw="teradata_passwd"
sql = "SELECT COUNT(*) FROM A_TERADATA_TABLE_NAME where month_key='202009'"
conn = jaydebeapi.connect(jclassname=jdbc_driver_name,
url=url,
driver_args=[login, psw],
jars=jdbc_driver_loc.split(","))
with closing(conn) as conn:
with closing(conn.cursor()) as cur:
cur.execute(sql)
print(cur.fetchall())
[root#myhost transfer]# python test_conn.py
[(7734133,)]
[root#myhost transfer]#
In py4j, with your respective JDBC uri:
from py4j.java_gateway import JavaGateway
# Open JVM interface with the JDBC Jar
jdbc_jar_path = '/path/to/jdbc_driver.jar'
gateway = JavaGateway.launch_gateway(classpath=jdbc_jar_path)
# Load the JDBC Jar
jdbc_class = "com.vendor.VendorJDBC"
gateway.jvm.class.forName(jdbc_class)
# Initiate connection
jdbc_uri = "jdbc://vendor:192.168.x.y:zzzz;..."
con = gateway.jvm.DriverManager.getConnection(jdbc_uri)
# Run a query
sql = "select this from that"
stmt = con.createStatement(sql)
rs = stmt.executeQuery()
while rs.next():
rs.getInt(1)
rs.getFloat(2)
.
.
rs.close()
stmt.close()

DB2 sql query to a Pandas DataFrame from my Mac

I have a need to access data that resides in a remote db2 database via a sql statement and convert it to a Pandas DataFrame. All from my Mac. I looked at using Pandas' read_sql with the ibm_db_sa adapter, but it looks like the prerequisite client side software is not supported on the Mac
I came up with an jdbc option, which I'm posting, but I'm curious to know if anyone else has any ideas
Here's an option using jdbc, the pip installable JayDeBeApi and the appropriate db jar file
Note: this could be used for other jdbc/jaydebeapi compliant databases like Oracle, MS Sql Server, etc
import jaydebeapi
import pandas as pd
def read_jdbc(sql, jclassname, driver_args, jars=None, libs=None):
'''
Reads jdbc compliant data sources and returns a Pandas DataFrame
uses jaydebeapi.connect and doc strings :-)
https://pypi.python.org/pypi/JayDeBeApi/
:param sql: select statement
:param jclassname: Full qualified Java class name of the JDBC driver.
e.g. org.postgresql.Driver or com.ibm.db2.jcc.DB2Driver
:param driver_args: Argument or sequence of arguments to be passed to the
Java DriverManager.getConnection method. Usually the
database URL. See
http://docs.oracle.com/javase/6/docs/api/java/sql/DriverManager.html
for more details
:param jars: Jar filename or sequence of filenames for the JDBC driver
:param libs: Dll/so filenames or sequence of dlls/sos used as
shared library by the JDBC driver
:return: Pandas DataFrame
'''
try:
conn = jaydebeapi.connect(jclassname, driver_args, jars, libs)
except jaydebeapi.DatabaseError as de:
raise
try:
curs = conn.cursor()
curs.execute(sql)
columns = [desc[0] for desc in curs.description] #getting column headers
#convert the list of tuples from fetchall() to a df
return pd.DataFrame(curs.fetchall(), columns=columns)
except jaydebeapi.DatabaseError as de:
raise
finally:
curs.close()
conn.close()
Some examples
#DB2
conn = 'jdbc:db2://<host>:5032/<db>:currentSchema=<schema>;'
class_name = 'com.ibm.db2.jcc.DB2Driver'
sql = 'SELECT name FROM table_name FETCH FIRST 5 ROWS ONLY'
df = read_jdbc(sql, class_name, [conn, 'myname', 'mypwd'])
#PostgreSQL
conn = 'jdbc:postgresql://<host>:5432/<db>?currentSchema=<schema>'
class_name = 'org.postgresql.Driver'
jar = '/path/to/jar/postgresql-9.4.1212.jar'
sql = 'SELECT name FROM table_name LIMIT 5'
df = read_jdbc(sql, class_name, [conn, 'myname', 'mypwd'], jars=jar)
I got a simpler answer from https://stackoverflow.com/a/33805547/914967 where it uses pip module ibm_db only:
import ibm_db
import ibm_db_dbi
import pandas as pd
conn_handle = ibm_db.connect('DATABASE={};HOSTNAME={};PORT={};PROTOCOL=TCPIP;UID={};PWD={};'.format(db_name, hostname, port_number, user, password), '', '')
conn = ibm_db_dbi.Connection(conn_handle)
df = pd.read_sql(sql, conn)
Bob, you should check out ibmdbpy (https://pypi.python.org/pypi/ibmdbpy). It is a pandas data frame style API to DB2 and dashDB tables. It supports both underlying DB2 client drivers, ODBC and JDBC.
So as prerequisites you need to set up the DB2 client driver package for Mac that you can find here: http://www-01.ibm.com/support/docview.wss?uid=swg21385217
After #IanBjorhovde commented on my question I investigated another solution that allows me to use sqlalchemy and pandas' read_sql()
Here are the steps I took. Note: I got this working on OSX Yosemite (10.10.4) for python 3.4 and 3.5
1) Download IBM DB2 Express-C (no-cost community edition of DB2)
https://www-01.ibm.com/marketing/iwm/iwm/web/pick.do?source=swg-db2expressc&S_TACT=000000VR&lang=en_US&S_OFF_CD=10000761
2) After navigating to the unzipped dir
sudo ./db2_install
I accepted the default location of /opt/IBM/db2/V10.1
3) Install ibm_db and ibm_db_sa
pip install ibm_db
I built ibm_db_sa from source because the pip installed failed
python setup.py install
That should do it. You might get an error like 'Reason: image not found' when you try to connect to your db so read this for the fix. Note: might require a reboot
Example usage:
import ibm_db_sa
import pandas as pd
from sqlalchemy import select, create_engine
eng = create_engine('ibm_db_sa://<user_name>:<pwd>#<host>:5032/<db name>')
sql = 'SELECT name FROM table_name FETCH FIRST 5 ROWS ONLY'
df = pd.read_sql(sql, eng)

Reading table into pandas using sqlalchemy from SQL Server

I've been at this for many hours, and cannot figure out what's wrong with my approach. I'm trying to read a table into pandas using sqlalchemy (from a SQL server 2012 instance) and getting the following error:
DBAPIError: (Error) ('01000', "[01000] [unixODBC][Driver Manager]Can't open lib 'SQL Server' : file not found (0) (SQLDriverConnect)") None None
I'm using the following code:
import sqlalchemy as sql
from sqlalchemy import create_engine
import pyodbc
pyodbc.connect('DSN=MYDSN;UID=User;PWD=Password')
Which returns:
<pyodbc.Connection at 0x10a26a420>
Which I think is good. Then when I run the following:
connectionString = 'mssql+pyodbc://User:Password#IPAdress/Database'
engine = sql.create_engine(connectionString)
pd.read_sql("ecodata", engine)
I get the following error mentioned above.
Is there something wrong with my driver setup? I've been wrestling with the driver setup for days and thought I had it beat.
Any help is greatly appreciated.
For the record, the answer to my question was figured out by joris:
My syntax for the connection string was wrong, and when I changed it from this
connectionString = 'mssql+pyodbc://User:Password#IPAdress/Database'
to this
connectionString = 'mssql+pyodbc://User:Password#IPAddress:Port/Database?driver=FreeTDS'
It worked!
Thanks again for the help!
Try forming your connection like this. You need a few more parameters.
con = pyodbc.connect('DRIVER={FreeTDS};SERVER="yourserver";PORT=1433;DATABASE=yourdb;UID=youruser;PWD=yourpassword;TDS_Version=7.2;')
To figure out which TDS Version to use:
http://www.freetds.org/userguide/choosingtdsprotocol.htm
Hopefully, this helps!

Categories

Resources