I want to use SQLAlchemy with the Teradata dialect to push a CSV file into a table.
So far I have written this:
import pandas as pd
from sqlalchemy import create_engine
user = '******'
pasw = '******'
host = 'FTGPRDTD'
DATABASE = 'DB_FTG_SRS_DATALAB'
# connect
td_engine = create_engine('teradata://'+ user +':' + pasw + '@'+ DBCNAME + ':1025/')
print ('ok step one')
print(td_engine)
# execute sql
df = pd.read_csv(r'C:/Users/c92434/Desktop/Load.csv')
print('df loaded')
df.to_sql(name= 'mdc_load', con = td_engine, index=False, schema = DATABASE,
if_exists='replace')
print ('ok step two')
This is the error message I get :
DatabaseError: (teradata.api.DatabaseError) (0, '[08001] [TPT][ODBC SQL Server Wire Protocol driver]Invalid Connection Data., [TPT][ODBC SQL Server Wire Protocol driver]Invalid attribute in connection string: DBCNAME.')
(Background on this error at: http://sqlalche.me/e/4xp6)
What can I do?
Hopefully you've solved this by now, but I had success with this. Looking at what you provided, it looks like the host information you set is not being used in the connection string. My example includes the dtype parameter, which I use to define the data type for each column so they don't show up as CLOB.
database = "database_name"
table = "mdc_load"
user = "user"
password = "password"
host = 'FTGPRDTD:1025'
td_engine = create_engine(f'teradata://{user}:{password}@{host}/?database={database}&driver=Teradata&authentication=LDAP')
conn = td_engine.connect()
# data is the DataFrame to load; destType maps column names to SQLAlchemy types
data.to_sql(name=table, con=conn, index=False, if_exists='replace', dtype=destType)
conn.close()
The "teradata" dialect (sqlalchemy-teradata module) relies on a Teradata ODBC driver being separately installed on the client platform. If you have multiple ODBC drivers installed that include the word Teradata in the name (for example, because you installed TPT with the Teradata-branded drivers for other database platforms), you may need to explicitly specify the one to be used by appending an optional parameter to your connection string, e.g.
td_engine = create_engine('teradata://'+ user +':' + pasw + '@'+ DBCNAME + ':1025/?driver=Teradata Database ODBC Driver 16.20')
Alternatively, you could use the "teradatasql" dialect (teradatasqlalchemy module) which does not require ODBC.
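A minimal sketch of assembling such a connection URL, assuming the `teradata://user:password@host:port/?driver=...` layout used in the answers above. The helper name `build_teradata_url` is hypothetical, and the driver name is just the example value quoted above. The password is percent-encoded because characters such as `@` or `:` would otherwise be read as URL delimiters.

```python
from urllib.parse import quote_plus, urlencode


def build_teradata_url(user, password, host, port=1025, driver=None):
    """Return a teradata:// URL string suitable for sqlalchemy.create_engine.

    build_teradata_url is a hypothetical helper, not part of any library.
    """
    # Percent-encode the password so characters like '@' or ':' cannot be
    # mistaken for URL delimiters.
    url = f"teradata://{user}:{quote_plus(password)}@{host}:{port}/"
    if driver:
        # urlencode turns spaces into '+', which URL query parsing reads
        # back as spaces.
        url += "?" + urlencode({"driver": driver})
    return url


print(build_teradata_url("user", "p@ss", "FTGPRDTD",
                         driver="Teradata Database ODBC Driver 16.20"))
```

The resulting string can be passed to `create_engine` in place of the hand-concatenated one; whether a given driver name is accepted depends on what is installed on the client.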
I'm using Python JupyterLab inside a Docker container running on an AWS EC2 instance. The container has Oracle Instant Client installed, so the environment should be set up. The problem is that I still cannot connect from this container to my AWS RDS Oracle database, but only when using SQLAlchemy.
When I try the connection using cx-Oracle==8.2.1 engine:
host = '***********************'
user = '*********'
password = '**********'
port = '****'
service = '****'
dsn_tns = cx_Oracle.makedsn(host,
port,
service)
engine_oracle = cx_Oracle.connect(user=user, password=password, dsn=dsn_tns)
Everything works fine. I can read tables using pandas read_sql(), I can create tables using cx_Oracle execute(), etc.
But when I try to take a DataFrame and send it to my RDS using pandas to_sql(), my cx_Oracle connection returns the error:
DatabaseError: ORA-01036: illegal variable name/number
I then tried to use a SQLAlchemy==1.4.22 engine from the string:
tns = """
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = %s)(PORT = %s))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = %s)
)
)
""" % (host, port, service)
engine_alchemy = create_engine('oracle+cx_oracle://%s:%s@%s' % (user, password, tns))
But I get this error:
DatabaseError: ORA-12154: TNS:could not resolve the connect identifier specified
And I keep getting this error even when I try to use pandas read_sql with the SQLAlchemy engine. Thus, I ran out of options. Can somebody help me please?
EDIT*
I tried again with SQLAlchemy==1.3.9 and it worked. Does anybody know why?
The code I'm using for reading and sending a test table from and to Oracle is:
sql = """
SELECT
*
FROM
DADOS_MIS.DR_ACIO_ATIVOS_HASH
WHERE
ROWNUM <= 5"""
df = pd.read_sql(sql, engine_oracle)
dtyp1 = {c:'VARCHAR2('+str(df[c].str.len().max())+')'
for c in df.columns[df.dtypes == 'object'].tolist()}
dtyp2 = {c:'NUMBER'
for c in df.columns[df.dtypes == 'float64'].tolist()}
dtyp3 = {c:'DATE'
for c in df.columns[df.dtypes == 'datetime'].tolist()}
dtyp4 = {c:'NUMBER'
for c in df.columns[df.dtypes == 'int64'].tolist()}
dtyp_total = dtyp1
dtyp_total.update(dtyp2)
dtyp_total.update(dtyp3)
dtyp_total.update(dtyp4)
df.to_sql(name='teste', con=engine_oracle, if_exists='replace', dtype=dtyp_total, index=False)
The dtyp_total is:
{'IDENTIFICADOR': 'VARCHAR2(32)',
'IDENTIFICADOR_PRODUTO': 'VARCHAR2(32)',
'DATA_CHAMADA': 'VARCHAR2(19)',
'TABULACAO': 'VARCHAR2(25)'}
I am unable to create a SQL Server stored procedure using Python's pyodbc. The command executes without any error message, but the stored procedure does not appear on the server.
import pyodbc
host = 'myServer'
database = 'model'
conn = pyodbc.connect(
r'DRIVER={SQL Server Native Client 11.0};' +
r'SERVER=' + host + ';' +
r'DATABASE=' + database + ';' +
r'Trusted_Connection=yes'
)
cursor = conn.cursor()
sql = """
CREATE OR ALTER PROCEDURE [dbo].[Test] AS
SELECT 1
"""
cursor.execute(sql)
conn.close()
pyodbc connections default to having autocommit disabled as specified in Python's DB API 2.0 spec. In that mode, any changes to the database must be committed by calling commit() on the Connection.
If you want a connection with autocommit enabled, see this answer for details.
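As a rough sketch of the explicit-commit approach: the hypothetical helper `execute_and_commit` below works with any DB API 2.0 connection, which includes one returned by `pyodbc.connect`. The standard-library `sqlite3` driver stands in for pyodbc here only so the example runs without a SQL Server instance; the table name `t` is made up.

```python
import sqlite3


def execute_and_commit(conn, sql):
    """Run one statement on a DB API 2.0 connection and commit it.

    execute_and_commit is a hypothetical helper for illustration.
    """
    cursor = conn.cursor()
    try:
        cursor.execute(sql)
        conn.commit()      # make the change permanent
    except Exception:
        conn.rollback()    # leave the database unchanged on failure
        raise
    finally:
        cursor.close()


# Stand-in demo: sqlite3 instead of pyodbc, in-memory database
demo = sqlite3.connect(":memory:")
execute_and_commit(demo, "CREATE TABLE t (x INTEGER)")
execute_and_commit(demo, "INSERT INTO t VALUES (1)")
```

In the question's code, calling `execute_and_commit(conn, sql)` instead of `cursor.execute(sql)` before `conn.close()` would be one way to apply this. Passing `autocommit=True` to `pyodbc.connect` is the alternative that avoids the explicit commit.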
I'm trying to follow the method for inserting a pandas DataFrame into SQL Server that is mentioned here, as it appears to be the fastest way to import lots of rows.
However, I am struggling to figure out the connection parameter.
I am not using a DSN. I have a server name and a database name, and I am using a trusted connection (i.e. Windows login).
import sqlalchemy
import urllib
server = 'MYServer'
db = 'MyDB'
cxn_str = "DRIVER={SQL Server Native Client 11.0};SERVER=" + server +",1433;DATABASE="+db+";Trusted_Connection='Yes'"
#cxn_str = "Trusted_Connection='Yes',Driver='{ODBC Driver 13 for SQL Server}',Server="+server+",Database="+db
params = urllib.parse.quote_plus(cxn_str)
engine = sqlalchemy.create_engine("mssql+pyodbc:///?odbc_connect=%s" % params)
conn = engine.connect().connection
cursor = conn.cursor()
I'm just not sure what the correct way to specify my connection string is. Any suggestions?
I have been working with pandas and SQL server for a while and the fastest way I found to insert a lot of data in a table was in this way:
You can create a temporary CSV using:
df.to_csv('new_file_name.csv', sep=',', encoding='utf-8')
Then use pyodbc and the BULK INSERT Transact-SQL statement:
import pyodbc
conn = pyodbc.connect(DRIVER='{SQL Server}', Server='server_name', Database='Database_name', trusted_connection='yes')
cur = conn.cursor()
cur.execute("""BULK INSERT table_name
FROM 'C:\\Users\\folders path\\new_file_name.csv'
WITH
(
CODEPAGE = 'ACP',
FIRSTROW = 2,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
)""")
conn.commit()
cur.close()
conn.close()
Then you can delete the file:
import os
os.remove('new_file_name.csv')
It took about a second to load a lot of data at once into SQL Server. I hope this gives you an idea.
Note: don't forget to have a column in the table for the index, because to_csv writes the DataFrame index by default. That was my mistake when I started using this.
Connection string parameter values should not be enclosed in quotes so you should use Trusted_Connection=Yes instead of Trusted_Connection='Yes'.
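A minimal sketch of the corrected connection setup, assuming the same pass-through `odbc_connect` approach as in the question. The helper name `build_mssql_odbc_url` is hypothetical, and the default driver is simply the one named in the question; it has to be installed on the client.

```python
from urllib.parse import quote_plus


def build_mssql_odbc_url(server, database,
                         driver="SQL Server Native Client 11.0"):
    """Return a mssql+pyodbc URL that passes a raw ODBC string through.

    build_mssql_odbc_url is a hypothetical helper for illustration.
    """
    odbc_str = (
        f"DRIVER={{{driver}}};"
        f"SERVER={server};"
        f"DATABASE={database};"
        "Trusted_Connection=Yes"   # unquoted value, as noted above
    )
    # The whole ODBC string is percent-encoded into one query parameter.
    return "mssql+pyodbc:///?odbc_connect=" + quote_plus(odbc_str)


print(build_mssql_odbc_url("MYServer", "MyDB"))
```

The returned string would then go to `sqlalchemy.create_engine`, as in the question.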
I am trying to create a database using pyodbc. However, this seems to be a paradox: pyodbc needs to connect to a database first, and the new database is then created from within that connection. Please correct me if I am wrong.
In my case, I used following code to create a new database
conn = pyodbc.connect("driver={SQL Server};server= serverName; database=databaseName; trusted_connection=true")
cursor = conn.cursor()
sqlcommand = """
CREATE DATABASE ['+ @IndexDBName +'] ON PRIMARY
( NAME = N'''+ @IndexDBName+''', FILENAME = N''' + @mdfFileName + ''' , SIZE = 4000KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB )
LOG ON
( NAME = N'''+ @IndexDBName+'_log'', FILENAME = N''' + @ldfFileName + ''' , SIZE = 1024KB , MAXSIZE = 100GB , FILEGROWTH = 10%)'
"""
cursor.execute(sqlcommand)
cursor.commit()
conn.commit()
The above code works without errors, however, there is no database created.
So how can I create a database using pyodbc?
Thanks a lot.
If you try to create a database with the default autocommit value for the connection, you should receive an error like the following. If you're not seeing this error message, try updating the SQL Server native client for a more descriptive message:
pyodbc.ProgrammingError: ('42000', '[42000] [Microsoft][SQL Server Native Client 11.0]
[SQL Server]CREATE DATABASE statement not allowed within multi-statement transaction.
(226) (SQLExecDirectW)')
Turn on autocommit for the connection to resolve:
conn = pyodbc.connect("driver={SQL Server};server=serverName; database=master; trusted_connection=true",
autocommit=True)
Note two things:
autocommit is not part of the connection string; it is a separate keyword argument passed to the connect function
the initial database context for the connection is the master system database
As an aside, you may want to check that @IndexDBName, @mdfFileName, and @ldfFileName are being set appropriately in your T-SQL. With the code you provided, a database literally named '+ @IndexDBName +' would be created.
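One way to avoid the leftover T-SQL concatenation fragments is to fill in the names on the Python side. The sketch below assumes the names come from trusted code rather than user input; `build_create_database_sql` is a hypothetical helper, and the sizes are copied from the question.

```python
def build_create_database_sql(db_name, mdf_path, ldf_path):
    """Return a CREATE DATABASE statement with the names filled in by Python.

    build_create_database_sql is a hypothetical helper; it applies only
    basic escaping and is not meant for untrusted input.
    """
    # Bracket-quoted identifier: a literal ']' is written as ']]'
    ident = "[" + db_name.replace("]", "]]") + "]"

    def lit(value):
        # T-SQL Unicode string literal: double any embedded single quotes
        return "N'" + value.replace("'", "''") + "'"

    return (
        f"CREATE DATABASE {ident} ON PRIMARY\n"
        f"( NAME = {lit(db_name)}, FILENAME = {lit(mdf_path)}, "
        "SIZE = 4000KB, MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB )\n"
        "LOG ON\n"
        f"( NAME = {lit(db_name + '_log')}, FILENAME = {lit(ldf_path)}, "
        "SIZE = 1024KB, MAXSIZE = 100GB, FILEGROWTH = 10% )"
    )
```

The returned string can be passed to `cursor.execute` on a connection opened with `autocommit=True`.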
The accepted answer did not work for me but I managed to create a database using the following code on Ubuntu:
conn_str = r"Driver={/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.9.so.1.1};" + f"""
Server={server_ip};
UID=sa;
PWD=passwd;
"""
conn = pyodbc.connect(conn_str, autocommit=True)
cursor = conn.cursor()
cursor.execute(f"CREATE DATABASE {db_name}")
This uses the default master database when connecting. You can check whether the database was created with this query:
SELECT name FROM master.sys.databases
sqlalchemy, a db connection module for Python, uses SQL Authentication (database-defined user accounts) by default. If you want to use your Windows (domain or local) credentials to authenticate to the SQL Server, the connection string must be changed.
By default, as defined by sqlalchemy, the connection string to connect to the SQL Server is as follows:
sqlalchemy.create_engine('mssql://*username*:*password*@*server_name*/*database_name*')
If used with your Windows credentials, this would throw an error similar to this:
sqlalchemy.exc.DBAPIError: (Error) ('28000', "[28000] [Microsoft][ODBC SQL Server Driver][SQL Server]Login failed for user '***S\\username'. (18456) (SQLDriverConnect); [28000] [Microsoft][ODBC SQL Server Driver][SQL Server]Login failed for user '***S\\username'. (18456)") None None
In this error message, the code 18456 identifies the error message thrown by the SQL Server itself. This error signifies that the credentials are incorrect.
In order to use Windows Authentication with sqlalchemy and mssql, the following connection string is required:
ODBC Driver:
engine = sqlalchemy.create_engine('mssql://*server_name*/*database_name*?trusted_connection=yes')
SQL Express Instance:
engine = sqlalchemy.create_engine('mssql://*server_name*\\SQLEXPRESS/*database_name*?trusted_connection=yes')
If you're using a trusted connection/AD and not using username/password, or otherwise see the following:
SAWarning: No driver name specified; this is expected by PyODBC when using DSN-less connections
"No driver name specified; "
Then this method should work:
from sqlalchemy import create_engine
server = '<your_server_name>'      # placeholder: replace with your server name
database = '<your_database_name>'  # placeholder: replace with your database name
engine = create_engine('mssql+pyodbc://' + server + '/' + database + '?trusted_connection=yes&driver=ODBC+Driver+13+for+SQL+Server')
Here is a more recent response, in case you want to connect to the MSSQL DB as a different user than the one you're logged in as on Windows. It also works if you are connecting from a Linux machine with FreeTDS installed.
The following worked for me from both Windows 10 and Ubuntu 18.04 using Python 3.6 & 3.7:
import getpass
from sqlalchemy import create_engine
password = getpass.getpass()
eng_str = fr'mssql+pymssql://{domain}\{username}:{password}@{hostip}/{db}'
engine = create_engine(eng_str)
What changed was to add the Windows domain before \username.
You'll need to install the pymssql package.
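If the password may contain characters such as `@` or `/`, percent-encoding it before building the URL should keep the URL parseable. A minimal sketch, where `build_pymssql_url` is a hypothetical helper and all argument values are placeholders:

```python
from urllib.parse import quote_plus


def build_pymssql_url(domain, username, password, host, database):
    """Return a mssql+pymssql URL for a Windows domain account.

    build_pymssql_url is a hypothetical helper for illustration.
    """
    # Percent-encode the password so characters such as '@' or '/' do not
    # break URL parsing; the domain and user name are passed through
    # unchanged, matching the answer above.
    return (
        f"mssql+pymssql://{domain}\\{username}:{quote_plus(password)}"
        f"@{host}/{database}"
    )


print(build_pymssql_url("CORP", "alice", "p@ss/word", "10.0.0.5", "mydb"))
```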
Create Your SqlAlchemy Connection URL From Your pyodbc Connection String OR Your Known Connection Parameters
I found all the other answers to be educational, and I found the SqlAlchemy Docs on connection strings helpful too, but I kept failing to connect to MS SQL Server Express 19 where I was using no username or password and trusted_connection='yes' (just doing development at this point).
Then I found THIS method in the SqlAlchemy Docs on Connection URLs built from a pyodbc connection string (or just a connection string), which is also built from known connection parameters (i.e. this can simply be thought of as a connection string that is not necessarily used in pyodbc). Since I knew my pyodbc connection string was working, this seemed like it would work for me, and it did!
This method takes the guesswork out of creating the correct format for what you feed to the SqlAlchemy create_engine method. If you know your connection parameters, you put those into a simple string per the documentation exemplified by the code below, and the create method in the URL class of the sqlalchemy.engine module does the correct formatting for you.
The example code below runs as is and assumes a database named master and an existing table named table_one with the schema shown below. Also, I am using pandas to import my table data. Otherwise, we'd want to use a context manager to manage connecting to the database and then closing the connection like HERE in the SqlAlchemy docs.
import pandas as pd
import sqlalchemy
from sqlalchemy.engine import URL
# table_one dictionary:
table_one = {'name': 'table_one',
'columns': ['ident int IDENTITY(1,1) PRIMARY KEY',
'value_1 int NOT NULL',
'value_2 int NOT NULL']}
# pyodbc stuff for MS SQL Server Express
driver='{SQL Server}'
server=r'localhost\SQLEXPRESS'  # raw string so the backslash is not treated as an escape
database='master'
trusted_connection='yes'
# pyodbc connection string
connection_string = f'DRIVER={driver};SERVER={server};'
connection_string += f'DATABASE={database};'
connection_string += f'TRUSTED_CONNECTION={trusted_connection}'
# create sqlalchemy engine connection URL
connection_url = URL.create(
"mssql+pyodbc", query={"odbc_connect": connection_string})
""" more code not shown that uses pyodbc without sqlalchemy """
engine = sqlalchemy.create_engine(connection_url)
d = {'value_1': [1, 2], 'value_2': [3, 4]}
df = pd.DataFrame(data=d)
df.to_sql('table_one', engine, if_exists="append", index=False)
Update
Let's say you've installed SQL Server Express on your Linux machine. You can use the following commands to make sure you're using the correct strings for the following:
For the driver: odbcinst -q -d
For the server: sqlcmd -S localhost -U <username> -P <password> -Q 'select @@SERVERNAME'
I think that you need to put:
"+pyodbc" after mssql
try this:
from sqlalchemy import create_engine
engine = create_engine("mssql+pyodbc://user:password@host:port/databasename?driver=ODBC+Driver+17+for+SQL+Server")
cnxn = engine.connect()
It works for me
Luck!
If you are attempting to connect:
DSN-less
Windows Authentication for a server not locally hosted.
Without using ODBC connections.
Try the following:
import sqlalchemy
engine = sqlalchemy.create_engine('mssql+pyodbc://' + server + '/' + database + '?trusted_connection=yes&driver=SQL+Server')
This avoids using ODBC connections and thus avoids pyodbc interface errors from DBAPI2 vs DBAPI3 conflicts.
I would recommend using the URL creation tool instead of creating the url from scratch.
connection_url = sqlalchemy.engine.URL.create("mssql+pyodbc",database=databasename, host=servername, query = {'driver':'SQL Server'})
engine = sqlalchemy.create_engine(connection_url)
See this link for creating a connection string with SQL Server Authentication (non-domain, uses username and password)