I am querying an Oracle database through cx_Oracle in Python and getting the following error after 12 queries. How can I run multiple queries within the same session? Or do I need to quit the session after each query? If so, how do I do that?
DatabaseError: (cx_Oracle.DatabaseError) ORA-02391: exceeded simultaneous SESSIONS_PER_USER limit
(Background on this error at: https://sqlalche.me/e/14/4xp6)
My code looks something like:
import pandas as pd
import pytz
import cx_Oracle
from sqlalchemy import create_engine

def get_query(item):
    ...
    return valid_query

cx_Oracle.init_oracle_client(lib_dir=r"\instantclient_21_3")

user = ...
password = ...
tz = pytz.timezone('US/Eastern')
dsn_tns = cx_Oracle.makedsn(...,
                            ...,
                            service_name=...)
cstr = f'oracle://{user}:{password}@{dsn_tns}'
engine = create_engine(cstr,
                       convert_unicode=False,
                       pool_recycle=10,
                       pool_size=50,
                       echo=False)

df_dicts = {}
for item in items:
    df = pd.read_sql_query(get_query(item), con=cstr)
    df_dicts[item.attribute] = df
Thank you!
You can use the cx_Oracle connection object directly in pandas; for Oracle connections I have always found that works better than SQLAlchemy for simple use as a pandas connection.
Something like:
conn = cx_Oracle.connect(f'{user}/{password}@{dsn_tns}')
df_dicts = {}
for item in items:
    df = pd.read_sql(sql=get_query(item), con=conn, parse_dates=True)
    df_dicts[item.attribute] = df
(I'm not sure whether you have date columns; I just remember parse_dates being needed to parse them.)
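If you would rather stay with SQLAlchemy, note that the code in the question passes the raw connection string (con=cstr) to read_sql_query, so pandas builds a fresh engine and connection for every query rather than using the engine you configured, which can easily exhaust SESSIONS_PER_USER. Reusing the one engine (or a single connection from it) keeps everything in one pooled session. A minimal sketch, assuming the engine, items and get_query() from the question:

# Sketch: reuse the existing engine instead of passing the raw string.
df_dicts = {}
with engine.connect() as conn:      # one session serves every query
    for item in items:
        df = pd.read_sql_query(get_query(item), con=conn)
        df_dicts[item.attribute] = df
engine.dispose()                    # release pooled connections when finished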
I am trying to import some data from the database (PostgreSQL) to work with it in Python. I tried the code below, which seems quite similar to the examples I've found on the internet.
import psycopg2
import sqlalchemy as db
import pandas as pd
engine = db.create_engine('database specifications')
connection = engine.connect()
metadata = db.MetaData()
data = db.Table(tabela, metadata, schema=shema, autoload=True, autoload_with=engine)
query = db.select([data])
ResultProxy = connection.execute(query)
ResultSet = ResultProxy.fetchall()
df = pd.DataFrame(ResultSet)
However, it returns data without column names. What did I forget?
It turned out that the only thing needed was to add
columns = data.columns.keys()
df.columns = columns
There is a great debate about that in this thread.
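The same fix can also be folded into the DataFrame construction in one step, for example (a small sketch using the objects from the question):

# Pass the reflected column names directly when building the DataFrame
df = pd.DataFrame(ResultSet, columns=data.columns.keys())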
I am running a loop, and in each iteration I query the Postgres database to find certain matching records. My issue is that it takes too long to open a new database connection each time through the loop.
import psycopg2
from psycopg2 import sql
from sqlalchemy import create_engine
engine = create_engine(mycredentials)
db_connect = engine.connect()
query = ("""SELECT ("CustomerId") AS "CustomerId",
("LastName") AS "LastName",
("FirstName") AS "FirstName",
("City") AS "City",
FROM
public."postgres_table"
WHERE (LEFT("LastName", 4) = %(blocking_lastname)s
AND LEFT("City", 4) = %(blocking_city)s)
OR (LEFT("FirstName", 3) = %(blocking_firstname)s
AND LEFT("City", 4) = %(blocking_city)s)
""")
for index, row in df_loop.iterrows():
    blocking_lastname = row['LastName'][:4]
    blocking_firstname = row['FirstName'][:3]
    blocking_city = row['City'][:4]

    df = pd.read_sql(query, db_connect, params={
        'blocking_firstname': blocking_firstname,
        'blocking_lastname': blocking_lastname,
        'blocking_city': blocking_city})
This code works, but it doesn't appear to be keeping the database connection open since there is too much latency with the query. (Note: The table columns have indexes.)
UPDATE: I updated the code above to create db_connect once outside the loop, rather than calling engine.connect() inside pd.read_sql() as I originally had it written.
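One way to confirm whether the latency really comes from re-opening connections rather than from the query itself is to time the two steps separately. A minimal sketch, reusing the engine and query from above (the literal parameter values are just placeholders):

import time

t0 = time.time()
check_conn = engine.connect()                     # connection cost, paid once
print('connect took', time.time() - t0, 'seconds')

t0 = time.time()
pd.read_sql(query, check_conn, params={
    'blocking_firstname': 'Joh',                  # placeholder values
    'blocking_lastname': 'Smit',
    'blocking_city': 'Bost'})
print('one query took', time.time() - t0, 'seconds')
check_conn.close()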
I have this code:
import teradata
import dask.dataframe as dd

login = login
pwd = password

udaExec = teradata.UdaExec(appName="CAF", version="1.0", logConsole=False)
session = udaExec.connect(method="odbc", DSN="Teradata",
                          USEREGIONALSETTINGS='N', username=login,
                          password=pwd, authentication="LDAP")
And the connection is working.
I want to get a dask dataframe. I have tried this:
sqlStmt = "SOME SQL STATEMENT"
df = dd.read_sql_table(sqlStmt, session, index_col='id')
And I'm getting this error message:
AttributeError: 'UdaExecConnection' object has no attribute '_instantiate_plugins'
Does anyone have a suggestion?
Thanks in advance.
read_sql_table expects a SQLAlchemy connection string, not a "session" as you are passing. I have not heard of Teradata being used via SQLAlchemy, but apparently there is at least one connector you could install, and possibly other solutions using the generic ODBC driver.
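If you do install such a connector, the call would look roughly like the sketch below. The URI scheme is an assumption that depends on which Teradata dialect you install, and note that read_sql_table takes a table name (plus an index column used for partitioning), not a full SQL statement:

import dask.dataframe as dd

# Hypothetical connection URI - the exact scheme depends on the installed dialect
uri = "teradatasql://login:password@host"
df = dd.read_sql_table("my_table",       # a table name, not a SQL statement
                       uri,
                       index_col="id",
                       npartitions=8)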
However, you may wish to use a more direct approach using delayed, something like
from dask import delayed

# make one statement per partition
statements = [sqlStmt + " where id > {} and id <= {}".format(*bounds)
              for bounds in boundslist]  # I don't know the exact syntax for Teradata

def get_part(statement):
    # however you make a concrete dataframe from a SQL statement
    udaExec = ..
    session = ..
    df = ..
    return df

# ideally you should provide the meta and divisions info here
df = dd.from_delayed([delayed(get_part)(stm) for stm in statements],
                     meta=None, divisions=None)
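For the teradata module in the question, get_part could be filled in roughly like this. It is only a sketch: it assumes pandas.read_sql will accept the UdaExec ODBC session as a DB-API connection, and it reuses the login/pwd variables from the question:

import pandas as pd
import teradata

def get_part(statement):
    # open a fresh session inside the task so each partition is independent
    udaExec = teradata.UdaExec(appName="CAF", version="1.0", logConsole=False)
    session = udaExec.connect(method="odbc", DSN="Teradata",
                              USEREGIONALSETTINGS='N', username=login,
                              password=pwd, authentication="LDAP")
    df = pd.read_sql(statement, session)
    session.close()
    return df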
We will be interested to hear of your success.
I have been using the following lines of code for the longest time, without any hitch, but today it seems to have produced the following error and I cannot figure out why. The strange thing is that I have other scripts that use the same code and they all seem to work...
import pandas as pd
import psycopg2
link_conn_string = "host='<host>' dbname='<db>' user='<user>' password='<pass>'"
conn = psycopg2.connect(link_conn_string)
df = pd.read_sql("SELECT * FROM link._link_bank_report_lms_loan_application", link_conn_string)
Error Message:
"Could not parse rfc1738 URL from string '%s'" % name)
sqlalchemy.exc.ArgumentError: Could not parse rfc1738 URL from string 'host='<host>' dbname='<db>' user='<user>' password='<pass>''
change link_conn_string to be of the form shown here:
postgresql://[user[:password]@][netloc][:port][/dbname][?param1=value1&...]
Eg:
>>> import psycopg2
>>> cs = 'postgresql://vao@localhost:5432/t'
>>> c = psycopg2.connect(cs)
>>> import pandas as pd
>>> df = pd.read_sql("SELECT now()",c)
>>> print df;
now
0 2017-02-27 21:58:27.520372+00:00
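Note also that in the code from the question the error comes from handing pd.read_sql the psycopg2 keyword string rather than the connection object that was already created from it; passing the connection works too:

conn = psycopg2.connect(link_conn_string)
df = pd.read_sql("SELECT * FROM link._link_bank_report_lms_loan_application", conn)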
Your connection string is wrong.
You should remove the line where you assign a string to the link_conn_string variable and replace the next line with something like this (remember to replace localhost, <db>, postgres and secret with the machine where PostgreSQL is running, the database name, and the user and password required to make that connection):
conn = psycopg2.connect(host="localhost", dbname="<db>", user="postgres", password="secret")
Also, you can check whether the database is reachable with the psql command; from the terminal, type (again, remember to change the user and database):
psql -U postgres <db>
Trying to list the names of the databases on a remote MS SQL server using Python (Just like the Object Explorer in MS SQL Server Management Studio).
Current solution: the required query is SELECT name FROM sys.databases;, so the current solution uses SQLAlchemy and pandas, which works fine, as shown below.
import pandas
from sqlalchemy import create_engine
#database='master'
engine = create_engine('mssql+pymssql://user:password@server:port/master')
query = "select name FROM sys.databases;"
data = pandas.read_sql(query, engine)
output:
name
0 master
1 tempdb
2 model
3 msdb
Question: How can I list the names of the databases on the server using
SQLAlchemy's inspect(engine), similar to listing table names under a database? Or is there any simpler way without importing pandas?
from sqlalchemy import inspect
#trial 1: with no database name
engine = create_engine('mssql+pymssql://user:password@server:port')
#this engine does not have a DB name
inspector = inspect(engine)
inspector.get_table_names() #returns []
inspector.get_schema_names() #returns [u'dbo', u'guest',...,u'INFORMATION_SCHEMA']
#trial 2: with database name 'master', same result
engine = create_engine('mssql+pymssql://user:password@server:port/master')
inspector = inspect(engine)
inspector.get_table_names() #returns []
inspector.get_schema_names() #returns [u'dbo', u'guest',...,u'INFORMATION_SCHEMA']
If all you really want to do is avoid importing pandas then the following works fine for me:
from sqlalchemy import create_engine
engine = create_engine('mssql+pymssql://sa:saPassword@localhost:52865/myDb')
conn = engine.connect()
rows = conn.execute("select name FROM sys.databases;")
for row in rows:
    print(row["name"])
producing
master
tempdb
model
msdb
myDb
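One caveat: with SQLAlchemy 2.0 (and as a deprecation already in 1.4) a raw SQL string can no longer be passed straight to Connection.execute(), so the same query needs to be wrapped in text(). A small sketch of the equivalent call:

from sqlalchemy import create_engine, text

engine = create_engine('mssql+pymssql://sa:saPassword@localhost:52865/myDb')
with engine.connect() as conn:
    for row in conn.execute(text("select name FROM sys.databases;")):
        print(row[0])    # the database name column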
It is also possible to obtain the tables of a specific schema by executing a single query with the driver below (pymssql, a DB-API interface to Microsoft SQL Server for Python).
pip install pymssql
import pymssql

# Connect to the database
conn = pymssql.connect(server='127.0.0.1', user='root', password='root',
                       database='my_database')

# Create a Cursor object
cur = conn.cursor()

# Execute the query: get the table names from my_database
cur.execute("select table_name from information_schema.tables")  # where table_schema = 'tableowner'

# Read and print the table names
for row in cur.fetchall():
    print(row[0])
output:
my_table_name_1
my_table_name_2
my_table_name_3
...
my_table_name_x
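If you only want the tables of one schema, the commented-out filter can be applied as a query parameter instead of being hard-coded ('dbo' below is just a placeholder schema name):

cur.execute("select table_name from information_schema.tables where table_schema = %s",
            ('dbo',))
for row in cur.fetchall():
    print(row[0])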
I believe the following snippet will list the names of the tables available in whatever database you choose to connect to. It returns a JSON object that will be displayed in your browser. This question is a bit old, but I hope this helps anyone curious who stops by.
from flask import Flask, request
from flask_restful import Resource, Api
from sqlalchemy import create_engine, inspect
from flask_jsonpify import jsonify

engine = create_engine('mssql+pymssql://user:password@server:port/master')

app = Flask(__name__)
api = Api(app)

class AllTables(Resource):
    def get(self):
        conn = engine.connect()
        inspector = inspect(conn)
        tableList = [item for item in inspector.get_table_names()]
        result = {'data': tableList}
        return jsonify(result)

api.add_resource(AllTables, '/alltables')
app.run(port='8080')
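Once the service is running you can query the endpoint, for example (assuming the default host and the port from the snippet):

curl http://localhost:8080/alltables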
Here is another solution, which fetches rows one by one:
import pymssql

connect = pymssql.connect(server, user, password, database)
cursor = connect.cursor(as_dict=True)
cursor.execute("SELECT name FROM sys.databases")

row = cursor.fetchone()
while row:
    for r in row.items():
        print(r[0], r[1])
    row = cursor.fetchone()