I am running a loop, and in each iteration I query the Postgres database for matching records. My issue is that opening a new database connection each time through the loop takes too long.
import pandas as pd
import psycopg2
from psycopg2 import sql
from sqlalchemy import create_engine
engine = create_engine(mycredentials)
db_connect = engine.connect()
query = ("""SELECT ("CustomerId") AS "CustomerId",
("LastName") AS "LastName",
("FirstName") AS "FirstName",
("City") AS "City",
FROM
public."postgres_table"
WHERE (LEFT("LastName", 4) = %(blocking_lastname)s
AND LEFT("City", 4) = %(blocking_city)s)
OR (LEFT("FirstName", 3) = %(blocking_firstname)s
AND LEFT("City", 4) = %(blocking_city)s)
""")
for index, row in df_loop.iterrows():
    blocking_lastname = row['LastName'][:4]
    blocking_firstname = row['FirstName'][:3]
    blocking_city = row['City'][:4]
    df = pd.read_sql(query, db_connect, params={
        'blocking_firstname': blocking_firstname,
        'blocking_lastname': blocking_lastname,
        'blocking_city': blocking_city})
This code works, but it does not appear to be keeping the database connection open, because each query still shows too much latency. (Note: the table columns are indexed.)
UPDATE: I updated the code above to create db_connect once outside the loop, rather than calling engine.connect() inside pd.read_sql() as I originally had it written.
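For reference, a minimal sketch of that pattern, with one addition that is not in the original code: a with block, so the pooled connection is also returned cleanly once the loop finishes.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine(mycredentials)  # created once; SQLAlchemy pools connections

# reuse a single checked-out connection for every iteration
with engine.connect() as db_connect:
    for index, row in df_loop.iterrows():
        df = pd.read_sql(query, db_connect, params={
            'blocking_lastname': row['LastName'][:4],
            'blocking_firstname': row['FirstName'][:3],
            'blocking_city': row['City'][:4]})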
I am querying an Oracle database through cx_Oracle in Python and getting the following error after 12 queries. How can I run multiple queries within the same session? Or do I need to quit the session after each query? If so, how do I do that?
DatabaseError: (cx_Oracle.DatabaseError) ORA-02391: exceeded
simultaneous SESSIONS_PER_USER limit (Background on this error at:
https://sqlalche.me/e/14/4xp6)
My code looks something like:
import pandas as pd
import pytz
import cx_Oracle
from sqlalchemy import create_engine
def get_query(item):
    ...
    return valid_query
cx_Oracle.init_oracle_client(lib_dir=r"\instantclient_21_3")
user = ...
password = ...
tz = pytz.timezone('US/Eastern')
dsn_tns = cx_Oracle.makedsn(...,
                            ...,
                            service_name=...)
cstr = f'oracle://{user}:{password}@{dsn_tns}'
engine = create_engine(cstr,
                       convert_unicode=False,
                       pool_recycle=10,
                       pool_size=50,
                       echo=False)
df_dicts = {}
for item in items:
    df = pd.read_sql_query(get_query(item), con=cstr)
    df_dicts[item.attribute] = df
Thank you!
You can use the cx_Oracle connection object directly in pandas; for Oracle connections, I have always found that works better than SQLAlchemy for simple use as a pandas connection.
Something like:
conn = cx_Oracle.connect(f'{user}/{password}@{dsn_tns}')
df_dicts = {}
for item in items:
    df = pd.read_sql(sql=get_query(item), con=conn, parse_dates=True)
    df_dicts[item.attribute] = df
(I am not sure whether you have date columns; I just remember parse_dates being a necessary element for parsing them.)
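A likely root cause worth noting here: in the loop above, pd.read_sql_query(..., con=cstr) passes the raw connection string, so pandas builds a fresh engine (and therefore a fresh session) for every query instead of using the pooled engine that was already created. Reusing that engine is a minimal alternative fix, sketched under that assumption:
# pass the already-created engine so its connection pool is reused
df_dicts = {}
for item in items:
    df = pd.read_sql_query(get_query(item), con=engine)
    df_dicts[item.attribute] = df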
I am trying to import some data from the database (PostgreSQL) to work with in Python. I tried the code below, which seems quite similar to the examples I have found on the internet.
import psycopg2
import sqlalchemy as db
import pandas as pd
engine = db.create_engine('database specifications')
connection = engine.connect()
metadata = db.MetaData()
data = db.Table(tabela, metadata, schema=shema, autoload=True, autoload_with=engine)
query = db.select([data])
ResultProxy = connection.execute(query)
ResultSet = ResultProxy.fetchall()
df = pd.DataFrame(ResultSet)
However, it returns data without column names. What did I forget?
It turned out the only thing needed was adding
columns = data.columns.keys()
df.columns = columns
There is a great debate about that in this thread.
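As an alternative sketch, the column names can also be taken straight from the result at query time via ResultProxy.keys(), avoiding the extra assignment:
ResultProxy = connection.execute(query)
ResultSet = ResultProxy.fetchall()
# build the frame with column names taken from the result itself
df = pd.DataFrame(ResultSet, columns=ResultProxy.keys())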
I have a dictionary with 3 keys which correspond to field names in a SQL Server table. The values of these keys come from an Excel file, and I store this dictionary in a dataframe which I now need to insert into a SQL table. This can all be seen in the code below:
import pandas as pd
import pymssql
df=[]
fp = "file path"
data = pd.read_excel(fp,sheetname ="CRM View" )
row_date = data.loc[3, ]
row_sita = "ABZPD"
row_event = data.iloc[12, :]
df = pd.DataFrame({'date': row_date,
                   'sita': row_sita,
                   'event': row_event
                   }, index=None)
df = df[4:]
df = df.fillna("")
print(df)
My question is how do I insert this dictionary into a SQL table now?
Also, as a side note, this code is part of a loop that needs to go through several Excel files one by one, insert the data into the dictionary, then into SQL, then clear the dictionary and start again with the next Excel file.
You could try something like this:
import MySQLdb
# connect
conn = MySQLdb.connect("127.0.0.1", "username", "password", "database_name")  # host, user, password, database
x = conn.cursor()
# write (parameterized; 'my_table' is a placeholder table name)
x.execute("INSERT INTO my_table (row_date, sita, event) VALUES (%s, %s, %s)",
          (row_date, sita, event))
# close
conn.commit()
conn.close()
You might have to change it a little based on your SQL restrictions, but it should give you a good start anyway.
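Since the question itself imports pymssql rather than MySQLdb, here is a comparable sketch using pymssql directly; the table name my_table and the credentials are placeholders, and the column names come from the dataframe built above:
import pymssql

conn = pymssql.connect(server='127.0.0.1', user='username',
                       password='password', database='database_name')
cur = conn.cursor()
# pymssql also uses %s placeholders; insert each dataframe row in turn
for _, r in df.iterrows():
    cur.execute("INSERT INTO my_table (row_date, sita, event) VALUES (%s, %s, %s)",
                (r['date'], r['sita'], r['event']))
conn.commit()
conn.close()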
For the pandas dataframe, you can use the built-in pandas method to_sql to store it in the database. Following is the way to use it.
import urllib.parse
import sqlalchemy as sa

params = urllib.parse.quote_plus(
    "DRIVER={};SERVER={};DATABASE={};Trusted_Connection=True;".format(
        "{SQL Server}", "<db_server_url>", "<db_name>"))
conn_str = 'mssql+pyodbc:///?odbc_connect={}'.format(params)
engine = sa.create_engine(conn_str)
df.to_sql("<table_name>", engine, schema="<schema_name>", if_exists="append", index=False)
For this method you will need to install the sqlalchemy package.
pip install sqlalchemy
You will also need to set up the MS SQL DSN on the machine.
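To cover the loop described in the question, the same to_sql call can simply run once per Excel file with if_exists="append", so each file's rows are added to the table before moving on. A rough sketch, where excel_files and the table name are placeholders:
for fp in excel_files:  # placeholder list of file paths
    data = pd.read_excel(fp, sheetname="CRM View")
    df = pd.DataFrame({'date': data.loc[3, ],
                       'sita': "ABZPD",
                       'event': data.iloc[12, :]}, index=None)
    df = df[4:].fillna("")
    # append this file's rows, then continue with the next file
    df.to_sql("<table_name>", engine, schema="<schema_name>", if_exists="append", index=False)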
I have the following Python code; it reads through a text file line by line and takes characters x to y of each line as the variable contract.
import os
import pyodbc
cnxn = pyodbc.connect(r'DRIVER={SQL Server};CENSORED;Trusted_Connection=yes;')
cursor = cnxn.cursor()
claimsfile = open('claims.txt','r')
for line in claimsfile:
    #ldata = claimsfile.readline()
    contract = line[18:26]
    print(contract)
    cursor.execute("USE calms SELECT XREF_PLAN_CODE FROM calms_schema.APP_QUOTE WHERE APPLICATION_ID = "+str(contract))
    print(cursor.fetchall())
When including the line cursor.fetchall(), the following error is returned:
Programming Error: Previous SQL was not a query.
The query runs in SSMS, and replacing str(contract) with the actual value of the variable returns results as expected.
Based on the data, the query will return one value as a result, formatted as NVARCHAR(4).
Most other examples have variables declared prior to the loop, and the proposed solution is to SET NOCOUNT ON; this does not apply to my problem, so I am slightly lost.
P.S. I have also put the query in its own standalone file, without the loop that iterates through the file, in case that was causing the problem, but without success.
In your SQL query, you are actually issuing two statements, USE and SELECT, and the cursor is not set up to handle multiple statements. Plus, with database connections, you should select the database in the connection string (i.e., the DATABASE argument), so T-SQL's USE is not needed.
Consider the following adjustment with parameterization, where APPLICATION_ID is assumed to be an integer type. Add credentials as needed:
constr = 'DRIVER={SQL Server};SERVER=CENSORED;Trusted_Connection=yes;' \
         'DATABASE=calms;UID=username;PWD=password'
cnxn = pyodbc.connect(constr)
cur = cnxn.cursor()
with open('claims.txt', 'r') as f:
    for line in f:
        contract = line[18:26]
        print(contract)
        # EXECUTE PARAMETERIZED QUERY (schema prefix kept from the original query)
        cur.execute("SELECT XREF_PLAN_CODE FROM calms_schema.APP_QUOTE WHERE APPLICATION_ID = ?",
                    [int(contract)])
        # FETCH ROWS ITERATIVELY
        for row in cur.fetchall():
            print(row)
cur.close()
cnxn.close()
I am trying to list the names of the databases on a remote MS SQL Server using Python (just like the Object Explorer in MS SQL Server Management Studio).
Current solution: the required query is SELECT name FROM sys.databases;. Using SQLAlchemy and pandas works fine, as shown below.
import pandas
from sqlalchemy import create_engine
#database='master'
engine = create_engine('mssql+pymssql://user:password@server:port/master')
query = "select name FROM sys.databases;"
data = pandas.read_sql(query, engine)
output:
name
0 master
1 tempdb
2 model
3 msdb
Question: How can I list the names of the databases on the server using SQLAlchemy's inspect(engine), similar to listing table names under a database? Or is there any simpler way without importing pandas?
from sqlalchemy import inspect

# trial 1: with no database name
engine = create_engine('mssql+pymssql://user:password@server:port')
# this engine does not have a DB name
inspector = inspect(engine)
inspector.get_table_names()   # returns []
inspector.get_schema_names()  # returns [u'dbo', u'guest', ..., u'INFORMATION_SCHEMA']

# trial 2: with database name 'master', same result
engine = create_engine('mssql+pymssql://user:password@server:port/master')
inspector = inspect(engine)
inspector.get_table_names()   # returns []
inspector.get_schema_names()  # returns [u'dbo', u'guest', ..., u'INFORMATION_SCHEMA']
If all you really want to do is avoid importing pandas, then the following works fine for me:
from sqlalchemy import create_engine

engine = create_engine('mssql+pymssql://sa:saPassword@localhost:52865/myDb')
conn = engine.connect()
rows = conn.execute("select name FROM sys.databases;")
for row in rows:
    print(row["name"])
producing
master
tempdb
model
msdb
myDb
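One caveat: on SQLAlchemy 1.4 and later, executing a plain string on a connection is deprecated (and removed in 2.0), so a more version-proof variant wraps the statement in text():
from sqlalchemy import create_engine, text

engine = create_engine('mssql+pymssql://sa:saPassword@localhost:52865/myDb')
with engine.connect() as conn:
    # text() keeps this working where raw string execution is deprecated
    for row in conn.execute(text("select name FROM sys.databases;")):
        print(row.name)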
It is also possible to obtain the tables of a specific schema by executing a single query with the driver below, a DB-API interface to Microsoft SQL Server for Python.
pip install pymssql
import pymssql
# Connect to the database
conn = pymssql.connect(server='127.0.0.1', user='root', password='root', database='my_database')
# Create a Cursor object
cur = conn.cursor()
# Execute the query to get the table names from my_database
cur.execute("select table_name from information_schema.tables")  # where table_schema = 'tableowner'
# Read and print the table names
for row in cur.fetchall():
    print(row[0])
output:
my_table_name_1
my_table_name_2
my_table_name_3
...
my_table_name_x
I believe the following snippet will list the names of the tables in whatever database you choose to connect to. It returns a JSON object that will be displayed in your browser. This question is a bit old, but I hope this helps anyone curious who stops by.
from flask import Flask, request
from flask_restful import Resource, Api
from sqlalchemy import create_engine, inspect
from flask_jsonpify import jsonify

# app and api were referenced but never defined in the original snippet
app = Flask(__name__)
api = Api(app)
engine = create_engine('mssql+pymssql://user:password@server:port/master')

class AllTables(Resource):
    def get(self):
        conn = engine.connect()
        inspector = inspect(conn)
        tableList = [item for item in inspector.get_table_names()]
        result = {'data': tableList}
        return jsonify(result)

api.add_resource(AllTables, '/alltables')
app.run(port='8080')
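Once the service is running, a GET request to http://localhost:8080/alltables (for example, from a browser or curl) returns the JSON object with the table list.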
Here is another solution, which fetches rows one by one:
import pymssql
connect = pymssql.connect(server, user, password, database)
cursor = connect.cursor(as_dict=True)
# the original snippet omitted the query itself; assuming the same
# sys.databases query used above
cursor.execute('SELECT name FROM sys.databases')
row = cursor.fetchone()
while row:
    for r in row.items():
        print(r[0], r[1])
    row = cursor.fetchone()