I am trying to fetch a clob data from Oracle server and the connection is made through ssh tunnel.
When I tried to run the following code:
(id,clob) = cursor.fetchone()
print('one fetched')
clob_data = clob.read()
print(clob_data)
the execution freezes
Can someone help me with what's wrong here because I have referred to cx_oracle docs and the example code is just the same.
It is possible that there is a round trip taking place that is not being handled properly by the cx_Oracle driver. Please create an issue here (https://github.com/oracle/python-cx_Oracle/issues) with a few more details such as platform, Python version, Oracle database/client version, etc.
You can probably work around the issue, however, by simply returning the CLOBs as strings as can be seen in this sample: https://github.com/oracle/python-cx_Oracle/blob/master/samples/ReturnLobsAsStrings.py.
Related
I am aware a similiar question exists. It has not been marked as answered yet and I tried all suggestions so far. I am also not native speaker, please excuse spelling mistakes.
I have written a small class in python to interact with a SQL Database.
Now I want to be able to connect to either SQL or MYSQL Database with the same functionalities.
It would be perfect for me to just change the connection type in the instance initiation to keep my class maintainable. Else I would need to create a seconde class using for example the mysql.connector, which would result in two classes with nearly the same structure and content.
This is how I tried to use pyodbc so far:
conn = pyodbc.connect('Driver={SQL Server};'
'Server=xyzhbv;'
'Database=Test;'
'ENCRYPT=yes;'
'UID=root;'
'PWD=12345;')
Please note that I changed all credentials.
What do I need to change to use pyodbc for MySQL?
Is that even possible?
Or
Can I use both libaries within one class without confusing? (they share function names)
Many Thanks for any help.
Have a great day.
I have a pandas DataFrame in python and want this DataFrame directly to be written into a Netezza Database.
I would like to use the pandas.to_sql() method that is described here but it seems like that this method needs one to use SQLAlchemy to connect to the DataBase.
The Problem: SQLAlchemy does not support Netezza.
What I am using at the moment to connect to the database is pyodbc. But this o the other hand is not understood by pandas.to_sql() or am I wrong with this?
My workaround to this is to write the DataFrame into a csv file via pandas.to_csv() and send this to the Netezza Database via pyodbc.
Since I have big data, writing the csv first is a performance issue. I actually do not care if I have to use SQLAlchemy or pyodbc or something different but I cannot change the fact that I have a Netezza Database.
I am aware of deontologician project but as the author states itself "is far from complete, has a lot of bugs".
I got the package to work (see my solution below). But if someone nows a better solution, please let me know!
I figured it out. For my solution see accepted answer.
Solution
I found a solution that I want to share for everyone with the same problem.
I tried the netezza dialect from deontologician but it does not work with python3 so I made a fork and corrected some encoding issues. I uploaded to github and it is available here. Be aware that I just made some small changes and that is mostly work of deontologician and nobody is maintaining it.
Having the netezza dialect I got pandas.to_sql() to work directy with the Netezza database:
import netezza_dialect
from sqlalchemy import create_engine
engine = create_engine("netezza://ODBCDataSourceName")
df.to_sql("YourDatabase",
engine,
if_exists='append',
index=False,
dtype=your_dtypes,
chunksize=1600,
method='multi')
A little explaination to the to_sql() parameters:
It is essential that you use the method='multi' parameter if you do not want to take pandas for ever to write in the database. Because without it it would send an INSERT query per row. You can use 'multi' or you can define your own insertion method. Be aware that you have to have at least pandas v0.24.0 to use it. See the docs for more info.
When using method='multi' it can happen (happend at least to me) that you exceed the parameter limit. In my case it was 1600 so I had to add chunksize=1600 to avoid this.
Note
If you get a warning or error like the following:
C:\Users\USER\anaconda3\envs\myenv\lib\site-packages\sqlalchemy\connectors\pyodbc.py:79: SAWarning: No driver name specified; this is expected by PyODBC when using DSN-less connections
"No driver name specified; "
pyodbc.InterfaceError: ('IM002', '[IM002] [Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified (0) (SQLDriverConnect)')
Then you propably treid to connect to the database via
engine = create_engine(netezza://usr:pass#address:port/database_name)
You have to set up the database in the ODBC Data Source Administrator tool from Windows and then use the name you defined there.
engine = create_engine(netezza://ODBCDataSourceName)
Then it should have no problems to find the driver.
I know you already answered the question yourself (thanks for sharing the solution)
One general comment about large data-writes to Netezza:
I’d always choose to write data to a file and then use the external table/ODBC interface to insert the data. Instead of inserting 1600 rows at a time, you can probably insert millions of rows in the same timeframe.
We use UTF8 data in the flat file and CSV unless you want to load binary data which will probably require fixed width files.
I’m not a python savvy but I hope you can follow me ...
If you need a documentation link, you can start here: https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.1/com.ibm.nz.load.doc/c_load_create_external_tbl_syntax.html
I've got mssql 2005 running on my personal computer with a database I'd like to run some python scripts on. I'm looking for a way to do some really simple access on the data. I'd like to run some select statements, process the data and maybe have python save a text file with the results.
Unfortunately, even though I know a bit about python and a bit about databases, it's very difficult for me to tell, just from reading, if a library does what I want. Ideally, I'd like something that works for other versions of mssql, is free of charge and licensed to allow commercial use, is simple to use, and possibly works with ironpython.
Everyone else seems to have the cPython -> SQL Server side covered. If you want to use IronPython, you can use the standard ADO.NET API to talk to the database:
import clr
clr.AddReference('System.Data')
from System.Data.SqlClient import SqlConnection, SqlParameter
conn_string = 'data source=<machine>; initial catalog=<database>; trusted_connection=True'
connection = SqlConnection(conn_string)
connection.Open()
command = connection.CreateCommand()
command.CommandText = 'select id, name from people where group_id = #group_id'
command.Parameters.Add(SqlParameter('group_id', 23))
reader = command.ExecuteReader()
while reader.Read():
print reader['id'], reader['name']
connection.Close()
If you've already got IronPython, you don't need to install anything else.
Lots of docs available here and here.
I use SQL Alchemy with cPython (I don't know if it'll work with IronPython though). It'll be pretty familiar to you if you've used Hibernate/nHibernate. If that's a bit too verbose for you, you can use Elixir, which is a thin layer on top of SQL Alchemy. To use either one of those, you'll need pyodbc, but that's a pretty simple install.
Of course, if you want to write straight SQL and not use an ORM, you just need pyodbc.
pyodbc comes with Activestate Python, which can be downloaded from here. A minimal odbc script to connect to a SQL Server 2005 database looks like this:
import odbc
CONNECTION_STRING="""
Driver={SQL Native Client};
Server=[Insert Server Name Here];
Database=[Insert DB Here];
Trusted_Connection=yes;
"""
db = odbc.odbc(CONNECTION_STRING)
c = db.cursor()
c.execute ('select foo from bar')
rs = c.fetchall()
for r in rs:
print r[0]
I also successfully use pymssql with CPython. (With and without SQLAlchemy).
http://adodbapi.sourceforge.net/ can be used with either CPython or IronPython. I have been very pleased with it.
PyPyODBC (http://code.google.com/p/pypyodbc) works under PyPy, Ironpython and CPython.
This article shows a Hello World sample of accessing mssql in Python.
PyPyODBC has almostly same usage as pyodbc, as it can been seen as a re-implemenation of the pyodbc moudle. Because it's written in pure Python, it can also run on IronPython and PyPy.
Actually, when switch to pypyodbc in your existing script, you can do this:
#import pyodbc <-- Comment out the original pyodbc importing line
import pypyodbc as pyodbc # Let pypyodbc "pretend" the pyodbc
pyodbc.connect(...) # pypyodbc has 99% same APIs as pyodbc
...
I've used pymssql with standard python and liked it. Probably easier than the alternatives mentioned if you're just looking for basic database access.
Sample code.
If you are want the quick and dirty way with CPython (also works for 3.X python):
Install PYWIN32 after you install python http://sourceforge.net/projects/pywin32/files/pywin32/
Import the following library:
import odbc
I created the following method for getting the SQL Server odbc driver (it is slightly different in naming depending on your version of Windows, so this will get it regardless):
def getSQLServerDriver():
key = winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, r"SOFTWARE\ODBC\ODBCINST.INI")
sqlServerRegExp = re.compile('sql.*server', re.I | re.S)
try:
for i in range(0, 2048):
folder = winreg.EnumKey(key, i)
if sqlServerRegExp.match(folder):
return folder.strip()
except WindowsError:
pass
Note: if you use the above function, you'll need to also import these two libraries: winreg and re
Then you use the odbc API 1 information as defined here: http://www.python.org/dev/peps/pep-0248/
Your connection interface string should look something like this (assuming you are using my above method for getting the ODBC driver name, and it is a trusted connection):
dbString = "Driver={SQLDriver};Server=[SQL Server];Database=[Database Name];Trusted_Connection=yes;".replace('{SQLDriver}', '{' + getSQLServerDriver() + '}')
This method has many down sides. It is clumsy because of only supporting ODBC API 1, and there are a couple minor bugs in either the API or the ODBC driver that I've run across, but it does get the job done in all versions of CPython in Windows.
I come from a PHP background where MySQL easily works from PHP, and don't the process for getting MySQL to work from Python. From all the research i did and reading of similar yet non-exact questions, it seems to me that there are different ways to achieve this, which makes it even harder for me to wrap my head around. So far I have MySQL-python-1.2.3 installed for python 2.7.1 in Windows XP 32Bit. Can anyone give me an overview of what is necessary to get MySQL working from Python in Windows, or even what is next after my steps all the way to fetching a table row? Thanks in advance.
UPDATE:
#Mahmoud, using your suggestion i have triggered the following:
If you just want to use the DBAPI, then here's a simple snippet explaining how to issue a SELECT query.
import MySQLdb
db = MySQLdb.connect(host="host", user="username", passwd="your-pass", db="the-db-name")
To perform a query, you first need a cursor, and then you can execute queries on it:
cursor = db.cursor()
max_age = 42
cursor.execute("""SELECT name FROM employees WHERE age < %s""", (max_age,))
print cursor.fetchone()
However, you most likely want to use an ORM, I recommend SQLAlchemy. It essentially trivializes database interaction by providing a super-powerful abstraction layer.
I've got mssql 2005 running on my personal computer with a database I'd like to run some python scripts on. I'm looking for a way to do some really simple access on the data. I'd like to run some select statements, process the data and maybe have python save a text file with the results.
Unfortunately, even though I know a bit about python and a bit about databases, it's very difficult for me to tell, just from reading, if a library does what I want. Ideally, I'd like something that works for other versions of mssql, is free of charge and licensed to allow commercial use, is simple to use, and possibly works with ironpython.
Everyone else seems to have the cPython -> SQL Server side covered. If you want to use IronPython, you can use the standard ADO.NET API to talk to the database:
import clr
clr.AddReference('System.Data')
from System.Data.SqlClient import SqlConnection, SqlParameter
conn_string = 'data source=<machine>; initial catalog=<database>; trusted_connection=True'
connection = SqlConnection(conn_string)
connection.Open()
command = connection.CreateCommand()
command.CommandText = 'select id, name from people where group_id = #group_id'
command.Parameters.Add(SqlParameter('group_id', 23))
reader = command.ExecuteReader()
while reader.Read():
print reader['id'], reader['name']
connection.Close()
If you've already got IronPython, you don't need to install anything else.
Lots of docs available here and here.
I use SQL Alchemy with cPython (I don't know if it'll work with IronPython though). It'll be pretty familiar to you if you've used Hibernate/nHibernate. If that's a bit too verbose for you, you can use Elixir, which is a thin layer on top of SQL Alchemy. To use either one of those, you'll need pyodbc, but that's a pretty simple install.
Of course, if you want to write straight SQL and not use an ORM, you just need pyodbc.
pyodbc comes with Activestate Python, which can be downloaded from here. A minimal odbc script to connect to a SQL Server 2005 database looks like this:
import odbc
CONNECTION_STRING="""
Driver={SQL Native Client};
Server=[Insert Server Name Here];
Database=[Insert DB Here];
Trusted_Connection=yes;
"""
db = odbc.odbc(CONNECTION_STRING)
c = db.cursor()
c.execute ('select foo from bar')
rs = c.fetchall()
for r in rs:
print r[0]
I also successfully use pymssql with CPython. (With and without SQLAlchemy).
http://adodbapi.sourceforge.net/ can be used with either CPython or IronPython. I have been very pleased with it.
PyPyODBC (http://code.google.com/p/pypyodbc) works under PyPy, Ironpython and CPython.
This article shows a Hello World sample of accessing mssql in Python.
PyPyODBC has almostly same usage as pyodbc, as it can been seen as a re-implemenation of the pyodbc moudle. Because it's written in pure Python, it can also run on IronPython and PyPy.
Actually, when switch to pypyodbc in your existing script, you can do this:
#import pyodbc <-- Comment out the original pyodbc importing line
import pypyodbc as pyodbc # Let pypyodbc "pretend" the pyodbc
pyodbc.connect(...) # pypyodbc has 99% same APIs as pyodbc
...
I've used pymssql with standard python and liked it. Probably easier than the alternatives mentioned if you're just looking for basic database access.
Sample code.
If you are want the quick and dirty way with CPython (also works for 3.X python):
Install PYWIN32 after you install python http://sourceforge.net/projects/pywin32/files/pywin32/
Import the following library:
import odbc
I created the following method for getting the SQL Server odbc driver (it is slightly different in naming depending on your version of Windows, so this will get it regardless):
def getSQLServerDriver():
key = winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, r"SOFTWARE\ODBC\ODBCINST.INI")
sqlServerRegExp = re.compile('sql.*server', re.I | re.S)
try:
for i in range(0, 2048):
folder = winreg.EnumKey(key, i)
if sqlServerRegExp.match(folder):
return folder.strip()
except WindowsError:
pass
Note: if you use the above function, you'll need to also import these two libraries: winreg and re
Then you use the odbc API 1 information as defined here: http://www.python.org/dev/peps/pep-0248/
Your connection interface string should look something like this (assuming you are using my above method for getting the ODBC driver name, and it is a trusted connection):
dbString = "Driver={SQLDriver};Server=[SQL Server];Database=[Database Name];Trusted_Connection=yes;".replace('{SQLDriver}', '{' + getSQLServerDriver() + '}')
This method has many down sides. It is clumsy because of only supporting ODBC API 1, and there are a couple minor bugs in either the API or the ODBC driver that I've run across, but it does get the job done in all versions of CPython in Windows.